# ID Minter Demo

This notebook demonstrates how the ID Minter keeps canonical identifiers stable while records migrate between source systems.

## What this covers

1. **Batch lookup** - Efficient lookup of multiple source identifiers
2. **Batch minting** - Combined lookup and mint in a single transaction
3. **Predecessor inheritance** - How new records inherit canonical IDs from old systems
4. **New record minting** - Minting canonical IDs for brand new records from the pre-generated pool
5. **Race condition protection** - Demonstrating concurrent access safety
6. **Alias discovery** - Finding which source identifiers share a canonical ID

For the schema migration process, see [schema_migration.ipynb](schema_migration.ipynb).

## Prerequisites

- Docker installed
- [uv](https://docs.astral.sh/uv/) installed

## Setup

```bash
cd /Users/kennyr/workspace/docs/rfcs/XXX-stable_identifiers

# Install dependencies and create virtual environment
uv sync

# Start the MySQL container
docker-compose up -d
```

Then select the `.venv` Python interpreter for this notebook.

In [1]:
import pymysql
import csv
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

from id_minter import generate_canonical_id

# Database connection settings
DB_CONFIG = {
    'host': 'localhost',
    'port': 3306,
    'user': 'root',
    'password': 'rootpassword',
    'database': 'id_minter'
}

def get_connection():
    """Get a database connection."""
    return pymysql.connect(**DB_CONFIG, cursorclass=pymysql.cursors.DictCursor)

def execute_query(query: str, params: tuple = None, fetch: bool = False):
    """Execute a query and optionally fetch results."""
    conn = get_connection()
    try:
        with conn.cursor() as cursor:
            cursor.execute(query, params)
            if fetch:
                return cursor.fetchall()
            conn.commit()
            return cursor.rowcount
    finally:
        conn.close()

# Test connection
try:
    conn = get_connection()
    conn.close()
    print("✓ Connected to MySQL successfully")
except Exception as e:
    print(f"✗ Connection failed: {e}")
    print("\nMake sure docker-compose is running:")
    print("  cd /Users/kennyr/workspace/docs/rfcs/XXX-stable_identifiers")
    print("  docker-compose up -d")

✓ Connected to MySQL successfully


## Database Setup

Create the schema and load sample data for testing.

In [2]:
# Reset and create schema
execute_query("DROP TABLE IF EXISTS identifiers")
execute_query("DROP TABLE IF EXISTS identifiers_old")
execute_query("DROP TABLE IF EXISTS canonical_ids")

# Create canonical_ids table
execute_query("""
CREATE TABLE canonical_ids (
    CanonicalId VARCHAR(8) NOT NULL PRIMARY KEY,
    Status ENUM('free', 'assigned') NOT NULL DEFAULT 'free',
    CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_free (Status, CanonicalId)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
""")

# Create identifiers table
execute_query("""
CREATE TABLE identifiers (
    OntologyType VARCHAR(255) NOT NULL,
    SourceSystem VARCHAR(255) NOT NULL,
    SourceId VARCHAR(255) NOT NULL,
    CanonicalId VARCHAR(8) NOT NULL,
    CreatedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (OntologyType, SourceSystem, SourceId),
    FOREIGN KEY (CanonicalId) REFERENCES canonical_ids(CanonicalId),
    INDEX idx_canonical (CanonicalId)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
""")

print("✓ Schema created")

# Load sample data
csv_path = '/Users/kennyr/workspace/docs/rfcs/XXX-stable_identifiers/identifiers_sample.csv'
# newline='' lets the csv module handle line endings itself
with open(csv_path, newline='') as f:
    sample_data = list(csv.DictReader(f))

# Insert canonical IDs first
conn = get_connection()
cursor = conn.cursor()
for record in sample_data:
    cursor.execute(
        "INSERT IGNORE INTO canonical_ids (CanonicalId, Status) VALUES (%s, 'assigned')",
        (record['CanonicalId'],)
    )
conn.commit()

# Insert identifiers
for record in sample_data:
    cursor.execute(
        "INSERT INTO identifiers (OntologyType, SourceSystem, SourceId, CanonicalId) VALUES (%s, %s, %s, %s)",
        (record['OntologyType'], record['SourceSystem'], record['SourceId'], record['CanonicalId'])
    )
conn.commit()
cursor.close()
conn.close()

print(f"✓ Loaded {len(sample_data)} sample records")

# Pre-generate free IDs
conn = get_connection()
cursor = conn.cursor()
generated = 0
for _ in range(200):
    new_id = generate_canonical_id()
    cursor.execute(
        "INSERT IGNORE INTO canonical_ids (CanonicalId, Status) VALUES (%s, 'free')",
        (new_id,)
    )
    if cursor.rowcount > 0:
        generated += 1
conn.commit()
cursor.close()
conn.close()

print(f"✓ Pre-generated {generated} free IDs")

# Show pool status
result = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
print("\nCanonical ID pool status:")
for row in result:
    print(f"  {row['Status']:10} {row['count']:5} IDs")

✓ Schema created
✓ Loaded 10000 sample records
✓ Pre-generated 200 free IDs

Canonical ID pool status:
  free         200 IDs
  assigned   10000 IDs


## Initialize ID Minter

In [3]:
import importlib
import id_minter
importlib.reload(id_minter)
from id_minter import IDMinter

minter = IDMinter(get_connection)
print("✓ ID Minter initialized")

✓ ID Minter initialized


## 1. Batch Lookup

The `lookup_ids()` method efficiently fetches canonical IDs for multiple source identifiers in a single query. This is the hot path - most records processed already have canonical IDs.
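As a rough illustration of what a single-query batch lookup might look like, the sketch below builds one parameterised `SELECT` using a row-value `IN` list. `build_lookup_query` is a hypothetical helper, not part of `id_minter`; the actual `lookup_ids()` implementation is not shown in this notebook and may work differently.

```python
from typing import Iterable

SourceKey = tuple[str, str, str]  # (OntologyType, SourceSystem, SourceId)


def build_lookup_query(keys: Iterable[SourceKey]) -> tuple[str, list[str]]:
    """Build one SELECT that matches every (type, system, id) triple.

    Hypothetical helper for illustration only; column names follow the
    `identifiers` table created earlier in this notebook.
    """
    unique_keys = list(dict.fromkeys(keys))  # de-duplicate, keep order
    if not unique_keys:
        raise ValueError("at least one source key is required")
    # One "(%s, %s, %s)" group per key; the values are passed separately
    # so that the database driver escapes them.
    row_placeholders = ", ".join(["(%s, %s, %s)"] * len(unique_keys))
    sql = (
        "SELECT OntologyType, SourceSystem, SourceId, CanonicalId "
        "FROM identifiers "
        f"WHERE (OntologyType, SourceSystem, SourceId) IN ({row_placeholders})"
    )
    params = [value for key in unique_keys for value in key]
    return sql, params


sql, params = build_lookup_query([
    ("Work", "sierra-system-number", "1001768"),
    ("Image", "sierra-system-number", "1007828"),
])
print(sql)
print(params)
```

The query text grows with the batch, but the number of round trips to the database stays at one, which is what matters on the hot path.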

In [4]:
print("="*60)
print("Batch Lookup Demo")
print("="*60)

# Get some existing IDs to lookup
existing = execute_query("""
    SELECT OntologyType, SourceSystem, SourceId, CanonicalId 
    FROM identifiers 
    WHERE SourceSystem = 'sierra-system-number' AND OntologyType = 'Work'
    LIMIT 3
""", fetch=True)

# Create test batch: mix of existing and non-existing, different ontology types
test_batch = [
    ('Work', 'sierra-system-number', existing[0]['SourceId']),      # Exists
    ('Work', 'sierra-system-number', existing[1]['SourceId']),      # Exists
    ('Image', 'sierra-system-number', existing[2]['SourceId']),     # Different ontology (won't exist)
    ('Work', 'axiell-collections-id', 'DOES-NOT-EXIST'),           # Non-existent
]

print(f"\nLooking up {len(test_batch)} source IDs (mixed ontology types):")
for ont, sys, sid in test_batch:
    print(f"  {ont}/{sys}/{sid}")

found = minter.lookup_ids(test_batch)

print(f"\nFound {len(found)} existing canonical IDs:")
for (ont, sys, sid), cid in found.items():
    print(f"  ✓ {ont}/{sys}/{sid} -> {cid}")

missing = [key for key in test_batch if key not in found]
print(f"\nNot found (would need minting): {len(missing)}")
for ont, sys, sid in missing:
    print(f"  ✗ {ont}/{sys}/{sid}")

Batch Lookup Demo

Looking up 4 source IDs (mixed ontology types):
  Work/sierra-system-number/1001768
  Work/sierra-system-number/1007167
  Image/sierra-system-number/1007828
  Work/axiell-collections-id/DOES-NOT-EXIST

Found 2 existing canonical IDs:
  ✓ Work/sierra-system-number/1001768 -> nvkdnjxp
  ✓ Work/sierra-system-number/1007167 -> swbrj79k

Not found (would need minting): 2
  ✗ Image/sierra-system-number/1007828
  ✗ Work/axiell-collections-id/DOES-NOT-EXIST


## 2. Batch Minting

The `mint_ids()` method combines batch lookup with minting in a single transaction (~6 queries regardless of batch size):

1. Batch lookup source IDs + predecessor IDs
2. Fail fast if any predecessors missing
3. Batch INSERT for predecessor inheritance
4. Claim free IDs from pool
5. Batch INSERT for new IDs
6. Verify and mark as assigned
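Steps 1 and 2 amount to sorting each request into one of three buckets before anything is written. The sketch below shows one way to do that in plain Python. `plan_batch` and its return shape are assumptions made for illustration, not the `IDMinter` internals.

```python
from typing import Optional

SourceKey = tuple[str, str, str]  # (OntologyType, SourceSystem, SourceId)
Request = tuple[SourceKey, Optional[SourceKey]]  # (source key, predecessor)


def plan_batch(
    requests: list[Request], found: dict[SourceKey, str]
) -> tuple[dict[SourceKey, str], dict[SourceKey, str], list[SourceKey]]:
    """Split requests into (existing, inherited, needs_new_id).

    `found` maps every source or predecessor key that already has a row
    to its canonical ID, i.e. the result of the batch lookup in step 1.
    """
    existing: dict[SourceKey, str] = {}
    inherited: dict[SourceKey, str] = {}
    needs_new_id: list[SourceKey] = []
    missing_predecessors: list[SourceKey] = []

    for source_key, predecessor in requests:
        if source_key in found:
            existing[source_key] = found[source_key]
        elif predecessor is None:
            needs_new_id.append(source_key)
        elif predecessor in found:
            inherited[source_key] = found[predecessor]
        else:
            missing_predecessors.append(predecessor)

    # Step 2: fail fast, before any INSERT is attempted.
    if missing_predecessors:
        raise LookupError(f"predecessors not found: {missing_predecessors}")
    return existing, inherited, needs_new_id


found = {("Work", "sierra-system-number", "1001768"): "nvkdnjxp"}
requests = [
    (("Work", "sierra-system-number", "1001768"), None),
    (("Work", "axiell-collections-id", "AC-1"),
     ("Work", "sierra-system-number", "1001768")),
    (("Work", "axiell-collections-id", "AC-2"), None),
]
existing, inherited, needs_new_id = plan_batch(requests, found)
print(existing, inherited, needs_new_id, sep="\n")
```

Only the `needs_new_id` bucket has to touch the free pool, so a batch made up entirely of existing or inherited records consumes no free IDs.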

In [5]:
print("="*60)
print("Batch Minting Demo")
print("="*60)

# Get pool status before
pool_before = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_before = next(r['count'] for r in pool_before if r['Status'] == 'free')

# Get existing Sierra records for predecessor test
existing_works = execute_query("""
    SELECT SourceId, CanonicalId FROM identifiers 
    WHERE SourceSystem = 'sierra-system-number' AND OntologyType = 'Work'
    LIMIT 2
""", fetch=True)

# Create batch request with mix of:
# - Existing IDs (should be found)
# - New ID with predecessor (should inherit canonical ID)
# - Brand new ID (should claim from pool)
requests = [
    (('Work', 'sierra-system-number', existing_works[0]['SourceId']), None),  # Exists
    (('Work', 'sierra-system-number', existing_works[1]['SourceId']), None),  # Exists
    (('Work', 'axiell-collections-id', f'AC-BATCH-{random.randint(100000, 999999)}'), 
     ('Work', 'sierra-system-number', existing_works[0]['SourceId'])),  # Inherits
    (('Work', 'axiell-collections-id', f'AC-NEW-{random.randint(100000, 999999)}'), None),  # New
]

print(f"\nProcessing {len(requests)} requests:")
for i, (source_id, predecessor) in enumerate(requests):
    ont, sys, sid = source_id
    pred_str = f" <- {predecessor[0]}/{predecessor[1]}/{predecessor[2]}" if predecessor else ""
    print(f"  {i+1}. {ont}/{sys}/{sid}{pred_str}")

# Execute batch mint
results = minter.mint_ids(requests)

print(f"\nResults:")
for (ont, sys, sid), cid in results.items():
    # Determine status
    was_existing = any(sys == 'sierra-system-number' and sid == w['SourceId'] for w in existing_works)
    pred = next((r[1] for r in requests if r[0] == (ont, sys, sid) and r[1]), None)
    
    if was_existing:
        status = "found"
    elif pred:
        expected = existing_works[0]['CanonicalId']
        status = "inherited ✓" if cid == expected else "inherited ✗"
    else:
        status = "minted"
    
    print(f"  {ont}/{sys}/{sid} -> {cid} ({status})")

# Check pool status after
pool_after = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_after = next(r['count'] for r in pool_after if r['Status'] == 'free')

print(f"\nFree IDs consumed: {free_before - free_after}")

Batch Minting Demo

Processing 4 requests:
  1. Work/sierra-system-number/1001768
  2. Work/sierra-system-number/1007167
  3. Work/axiell-collections-id/AC-BATCH-312034 <- Work/sierra-system-number/1001768
  4. Work/axiell-collections-id/AC-NEW-390046

Results:
  Work/sierra-system-number/1001768 -> nvkdnjxp (found)
  Work/sierra-system-number/1007167 -> swbrj79k (found)
  Work/axiell-collections-id/AC-BATCH-312034 -> nvkdnjxp (inherited ✓)
  Work/axiell-collections-id/AC-NEW-390046 -> a5jxa52t (minted)

Free IDs consumed: 1


## 3. Predecessor Inheritance

When migrating from one source system to another (e.g., Sierra → Axiell Collections), new records can inherit the canonical ID of their predecessor.
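One way to express inheritance in SQL is an `INSERT ... SELECT` that copies the predecessor's `CanonicalId` onto the new source identifier. The sketch below uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without the container. The `inherit` helper and the trimmed-down table are assumptions for illustration, not the statement `mint_ids()` actually issues.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE identifiers (
        OntologyType TEXT NOT NULL,
        SourceSystem TEXT NOT NULL,
        SourceId TEXT NOT NULL,
        CanonicalId TEXT NOT NULL,
        PRIMARY KEY (OntologyType, SourceSystem, SourceId)
    )
""")
conn.execute(
    "INSERT INTO identifiers VALUES "
    "('Work', 'sierra-system-number', '1026798', 'jw8huput')"
)

KEY_FILTER = "OntologyType = ? AND SourceSystem = ? AND SourceId = ?"


def inherit(conn, new_key, predecessor_key):
    """Give `new_key` the canonical ID of `predecessor_key`.

    Returns the inherited canonical ID, or None when the predecessor has
    no row (in which case nothing is inserted).
    """
    conn.execute(
        "INSERT INTO identifiers "
        "(OntologyType, SourceSystem, SourceId, CanonicalId) "
        f"SELECT ?, ?, ?, CanonicalId FROM identifiers WHERE {KEY_FILTER}",
        (*new_key, *predecessor_key),
    )
    row = conn.execute(
        f"SELECT CanonicalId FROM identifiers WHERE {KEY_FILTER}", new_key
    ).fetchone()
    return row[0] if row else None


print(inherit(
    conn,
    ("Work", "axiell-collections-id", "AC-MIGRATE-1"),
    ("Work", "sierra-system-number", "1026798"),
))
```

Because the new row reuses an existing `CanonicalId`, no ID is taken from the free pool.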

In [6]:
print("="*60)
print("Predecessor Inheritance Demo")
print("="*60)

# Get Sierra records to migrate
sierra_records = execute_query("""
    SELECT SourceId, CanonicalId FROM identifiers 
    WHERE SourceSystem = 'sierra-system-number' AND OntologyType = 'Work'
    LIMIT 5 OFFSET 10
""", fetch=True)

print("\nExisting Sierra Work records to migrate:")
for rec in sierra_records:
    print(f"  sierra-system-number/{rec['SourceId']} -> {rec['CanonicalId']}")

print("\nMigrating to Axiell Collections...\n")

for i, sierra_rec in enumerate(sierra_records):
    axiell_id = f"AC-MIGRATE-{random.randint(100000, 999999)}"
    source_key = ('Work', 'axiell-collections-id', axiell_id)
    predecessor = ('Work', 'sierra-system-number', sierra_rec['SourceId'])
    
    results = minter.mint_ids([(source_key, predecessor)])
    canonical_id = results[source_key]
    
    # Verify inheritance
    inherited = canonical_id == sierra_rec['CanonicalId']
    status = "✓ Inherited" if inherited else "✗ Wrong ID!"
    
    print(f"  {i+1}. axiell-collections-id/{axiell_id}")
    print(f"     Predecessor: sierra-system-number/{sierra_rec['SourceId']}")
    print(f"     Canonical ID: {canonical_id} ({status})")
    print()

Predecessor Inheritance Demo

Existing Sierra Work records to migrate:
  sierra-system-number/1026798 -> jw8huput
  sierra-system-number/1029638 -> vkgj5qyd
  sierra-system-number/1042410 -> f7dd5efv
  sierra-system-number/1043931 -> h47ef5sr
  sierra-system-number/1045864 -> daha9q32

Migrating to Axiell Collections...

  1. axiell-collections-id/AC-MIGRATE-150289
     Predecessor: sierra-system-number/1026798
     Canonical ID: jw8huput (✓ Inherited)

  2. axiell-collections-id/AC-MIGRATE-954192
     Predecessor: sierra-system-number/1029638
     Canonical ID: vkgj5qyd (✓ Inherited)

  3. axiell-collections-id/AC-MIGRATE-281948
     Predecessor: sierra-system-number/1042410
     Canonical ID: f7dd5efv (✓ Inherited)

  4. axiell-collections-id/AC-MIGRATE-115277
     Predecessor: sierra-system-number/1043931
     Canonical ID: h47ef5sr (✓ Inherited)

  5. axiell-collections-id/AC-MIGRATE-669765
     Predecessor: sierra-system-number/1045864
     Canonical ID: daha9q32 (✓ Inherited)



## 4. New Record Minting

Brand new records without predecessors claim IDs from the pre-generated pool.
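Claiming from the pool boils down to "select some free rows, then flip them to assigned". The sketch below shows that shape using the standard-library `sqlite3` module as a stand-in for MySQL. The `claim_free_ids` helper is hypothetical, and the sketch deliberately leaves out the row locking that the real minter needs for concurrent callers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE canonical_ids (
        CanonicalId TEXT PRIMARY KEY,
        Status TEXT NOT NULL DEFAULT 'free'
    )
""")
conn.executemany(
    "INSERT INTO canonical_ids (CanonicalId) VALUES (?)",
    [("a8d7wbed",), ("a8ndwxru",), ("accgquf6",)],
)


def claim_free_ids(conn: sqlite3.Connection, count: int) -> list[str]:
    """Take `count` free IDs from the pool and mark them as assigned."""
    with conn:  # commits on success, rolls back if an exception escapes
        rows = conn.execute(
            "SELECT CanonicalId FROM canonical_ids "
            "WHERE Status = 'free' ORDER BY CanonicalId LIMIT ?",
            (count,),
        ).fetchall()
        if len(rows) < count:
            raise RuntimeError(
                f"pool exhausted: wanted {count}, found {len(rows)}"
            )
        claimed = [row[0] for row in rows]
        conn.executemany(
            "UPDATE canonical_ids SET Status = 'assigned' WHERE CanonicalId = ?",
            [(cid,) for cid in claimed],
        )
    return claimed


print(claim_free_ids(conn, 2))
```

Raising when the pool runs short is a choice made for this sketch; how the real minter reacts to an exhausted pool is not covered in this notebook.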

In [7]:
print("="*60)
print("New Record Minting Demo")
print("="*60)

# Get pool status before
pool_before = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_before = next(r['count'] for r in pool_before if r['Status'] == 'free')
assigned_before = next(r['count'] for r in pool_before if r['Status'] == 'assigned')

print(f"\nPool before: {free_before} free, {assigned_before} assigned")

# Mint 10 brand new records (no predecessors) - using batch
print("\nMinting 10 brand new records...\n")

requests = []
for i in range(10):
    axiell_id = f"AC-BRAND-NEW-{random.randint(100000, 999999)}"
    source_key = ('Work', 'axiell-collections-id', axiell_id)
    requests.append((source_key, None))

results = minter.mint_ids(requests)

for i, (source_key, _) in enumerate(requests):
    ont, sys, sid = source_key
    canonical_id = results[source_key]
    print(f"  {i+1}. {sys}/{sid} -> {canonical_id}")

# Get pool status after
pool_after = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_after = next(r['count'] for r in pool_after if r['Status'] == 'free')
assigned_after = next(r['count'] for r in pool_after if r['Status'] == 'assigned')

print(f"\nPool after: {free_after} free, {assigned_after} assigned")
print(f"Free IDs consumed: {free_before - free_after}")
print(f"Assigned IDs added: {assigned_after - assigned_before}")

New Record Minting Demo

Pool before: 199 free, 10001 assigned

Minting 10 brand new records...

  1. axiell-collections-id/AC-BRAND-NEW-984091 -> a8d7wbed
  2. axiell-collections-id/AC-BRAND-NEW-401343 -> a8ndwxru
  3. axiell-collections-id/AC-BRAND-NEW-231693 -> accgquf6
  4. axiell-collections-id/AC-BRAND-NEW-933341 -> at67nn5b
  5. axiell-collections-id/AC-BRAND-NEW-278528 -> awdb4wn3
  6. axiell-collections-id/AC-BRAND-NEW-499480 -> b3db6wda
  7. axiell-collections-id/AC-BRAND-NEW-513695 -> b4s9w353
  8. axiell-collections-id/AC-BRAND-NEW-736050 -> b69ekxkh
  9. axiell-collections-id/AC-BRAND-NEW-955302 -> b779f5vh
  10. axiell-collections-id/AC-BRAND-NEW-322800 -> b8xvbqtx

Pool after: 189 free, 10011 assigned
Free IDs consumed: 10
Assigned IDs added: 10


## 5. Race Condition Protection

The ID Minter uses `FOR UPDATE SKIP LOCKED` to handle concurrent access safely:

- Multiple processes can claim different free IDs without blocking
- If two processes try to mint the same source ID, one wins and the other's claimed ID stays free
- A verification query after the INSERT detects which claimed IDs were actually used
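The "one wins, the rest give their ID back" behaviour can be modelled without a database. The toy below replaces MySQL row locks with a `threading.Lock` and two in-memory structures. `InMemoryMinter` is an illustrative stand-in and does not mirror the real `IDMinter` code.

```python
import threading


class InMemoryMinter:
    """Toy stand-in for the database: a free pool plus an assignment map."""

    def __init__(self, free_ids):
        self._lock = threading.Lock()
        self.free = list(free_ids)
        self.assigned = {}

    def claim(self):
        # Loosely mimics SKIP LOCKED: every caller gets a different free ID.
        with self._lock:
            return self.free.pop(0)

    def insert_or_get(self, source_key, claimed_id):
        # Loosely mimics "INSERT, then verify": the first writer wins.
        with self._lock:
            winner = self.assigned.setdefault(source_key, claimed_id)
            if winner != claimed_id:
                # The loser's claimed ID returns to the pool unused.
                self.free.append(claimed_id)
            return winner


def mint(db, source_key, results, thread_id):
    claimed = db.claim()
    results[thread_id] = db.insert_or_get(source_key, claimed)


db = InMemoryMinter([f"id{i:02d}" for i in range(10)])
results = [None] * 5
threads = [
    threading.Thread(
        target=mint, args=(db, ("Work", "race-test", "X"), results, i)
    )
    for i in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"distinct IDs returned: {len(set(results))}")
print(f"free IDs left: {len(db.free)} of 10")
```

Whichever thread reaches `insert_or_get` first decides the canonical ID. Every other thread observes that value and releases the ID it had claimed.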

In [8]:
print("="*60)
print("Race Condition Protection Demo")
print("="*60)

# Track results from concurrent threads
race_results = {}
race_lock = threading.Lock()

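def mint_same_id(thread_id: int, source_id: str):
    """Mint a canonical ID for `source_id` and record it under `thread_id`.

    Test 1 passes the same `source_id` to every thread; Test 2 passes a different one to each.
    """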
    # Each thread gets its own minter with its own connection
    thread_minter = IDMinter(get_connection)
    try:
        source_key = ('Work', 'race-test', source_id)
        results = thread_minter.mint_ids([(source_key, None)])
        canonical_id = results[source_key]
        with race_lock:
            race_results[thread_id] = canonical_id
    finally:
        thread_minter.close()

# Get pool status before
pool_before = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_before = next(r['count'] for r in pool_before if r['Status'] == 'free')

print(f"\nFree IDs before test: {free_before}")

# Test 1: Multiple threads trying to mint the SAME source ID
print("\n--- Test 1: Same source ID, 5 concurrent threads ---")
test_source_id = f"RACE-TEST-1-{random.randint(100000, 999999)}"
print(f"Source ID: Work/race-test/{test_source_id}")

race_results.clear()
threads = []
for i in range(5):
    t = threading.Thread(target=mint_same_id, args=(i, test_source_id))
    threads.append(t)

# Start all threads back to back, then wait for every one to finish
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"\nResults from {len(race_results)} threads:")
for thread_id, cid in sorted(race_results.items()):
    print(f"  Thread {thread_id}: {cid}")

# All threads should return the same canonical ID
unique_ids = set(race_results.values())
if len(unique_ids) == 1:
    print(f"\n✓ All threads returned the same canonical ID: {list(unique_ids)[0]}")
else:
    print(f"\n✗ ERROR: Got different canonical IDs: {unique_ids}")

# Check that only 1 free ID was consumed (not 5)
pool_after_test1 = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_after_test1 = next(r['count'] for r in pool_after_test1 if r['Status'] == 'free')
ids_consumed = free_before - free_after_test1

print(f"\nFree IDs consumed: {ids_consumed}")
if ids_consumed == 1:
    print("✓ Only 1 ID was claimed (unused IDs stayed in pool)")
else:
    print(f"✗ Expected 1 ID consumed, got {ids_consumed}")

Race Condition Protection Demo

Free IDs before test: 189

--- Test 1: Same source ID, 5 concurrent threads ---
Source ID: Work/race-test/RACE-TEST-1-998827

Results from 5 threads:
  Thread 0: bb68hbe7
  Thread 1: bb68hbe7
  Thread 2: bb68hbe7
  Thread 3: bb68hbe7
  Thread 4: bb68hbe7

✓ All threads returned the same canonical ID: bb68hbe7

Free IDs consumed: 1
✓ Only 1 ID was claimed (unused IDs stayed in pool)


In [9]:
# Test 2: Multiple threads minting DIFFERENT source IDs
print("\n--- Test 2: Different source IDs, 10 concurrent threads ---")

pool_before_test2 = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_before_test2 = next(r['count'] for r in pool_before_test2 if r['Status'] == 'free')

race_results.clear()
threads = []
for i in range(10):
    unique_source_id = f"RACE-TEST-2-{i}-{random.randint(100000, 999999)}"
    t = threading.Thread(target=mint_same_id, args=(i, unique_source_id))
    threads.append(t)

# Start all threads back to back, then wait for every one to finish
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"\nResults from {len(race_results)} threads:")
for thread_id, cid in sorted(race_results.items()):
    print(f"  Thread {thread_id}: {cid}")

# All threads should have different canonical IDs
unique_ids = set(race_results.values())
if len(unique_ids) == 10:
    print("\n✓ All 10 threads got unique canonical IDs")
else:
    print(f"\n✗ Expected 10 unique IDs, got {len(unique_ids)}")

# Check that 10 free IDs were consumed
pool_after_test2 = execute_query("SELECT Status, COUNT(*) as count FROM canonical_ids GROUP BY Status", fetch=True)
free_after_test2 = next(r['count'] for r in pool_after_test2 if r['Status'] == 'free')
ids_consumed = free_before_test2 - free_after_test2

print(f"Free IDs consumed: {ids_consumed}")
if ids_consumed == 10:
    print("✓ Exactly 10 IDs claimed (one per thread)")
else:
    print(f"✗ Expected 10 IDs consumed, got {ids_consumed}")


--- Test 2: Different source IDs, 10 concurrent threads ---

Results from 10 threads:
  Thread 0: c2ssxa2x
  Thread 1: c6x5k57k
  Thread 2: c22c8pst
  Thread 3: c79j2uaa
  Thread 4: bjh53qgd
  Thread 5: c4vf6qqn
  Thread 6: c6rfrkef
  Thread 7: cbcewhwq
  Thread 8: bkckaubc
  Thread 9: bcbv5jzv

✓ All 10 threads got unique canonical IDs
Free IDs consumed: 10
✓ Exactly 10 IDs claimed (one per thread)


In [10]:
# Test 3: Idempotency - minting the same ID twice returns the same result
print("\n--- Test 3: Idempotency ---")

test_source_id = f"IDEMPOTENT-TEST-{random.randint(100000, 999999)}"
print(f"Source ID: Work/idempotent-test/{test_source_id}")

source_key = ('Work', 'idempotent-test', test_source_id)

# First mint
results = minter.mint_ids([(source_key, None)])
canonical_id_1 = results[source_key]
print(f"\nFirst mint:  {canonical_id_1}")

# Second mint (same source ID)
results = minter.mint_ids([(source_key, None)])
canonical_id_2 = results[source_key]
print(f"Second mint: {canonical_id_2}")

if canonical_id_1 == canonical_id_2:
    print("\n✓ Idempotent: same canonical ID returned")
else:
    print("\n✗ ERROR: different canonical IDs returned!")


--- Test 3: Idempotency ---
Source ID: Work/idempotent-test/IDEMPOTENT-TEST-312071

First mint:  cv7mybm7
Second mint: cv7mybm7

✓ Idempotent: same canonical ID returned


## 6. Alias Discovery

After migration, multiple source identifiers may share the same canonical ID. The `CreatedAt` timestamp indicates which was the original and which are aliases.
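The original-versus-alias rule is simply "the row with the earliest `CreatedAt` for a given canonical ID is the original". The sketch below applies that comparison to two rows using the standard-library `sqlite3` module in place of MySQL. The cut-down table is an assumption made to keep the example short, and the sample values are taken from this notebook's recorded run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE identifiers (
        SourceSystem TEXT, SourceId TEXT, CanonicalId TEXT, CreatedAt TEXT
    )
""")
conn.executemany(
    "INSERT INTO identifiers VALUES (?, ?, ?, ?)",
    [
        ("axiell-collections-id", "AC-MIGRATE-669765", "daha9q32",
         "2026-02-09 11:58:08"),
        ("sierra-system-number", "1045864", "daha9q32",
         "2026-02-09 11:57:52"),
    ],
)

rows = conn.execute(
    """
    SELECT i.SourceSystem,
           CASE WHEN i.CreatedAt = e.MinCreatedAt
                THEN 'Original' ELSE 'Alias' END AS Status
    FROM identifiers i
    JOIN (
        SELECT CanonicalId, MIN(CreatedAt) AS MinCreatedAt
        FROM identifiers
        GROUP BY CanonicalId
    ) e ON i.CanonicalId = e.CanonicalId
    WHERE i.CanonicalId = ?
    ORDER BY i.CreatedAt
    """,
    ("daha9q32",),
).fetchall()

for source_system, status in rows:
    print(f"[{status:8}] {source_system}")
```

One caveat: a MySQL `TIMESTAMP` column has one-second resolution unless fractional seconds are requested, so two identifiers created within the same second would both be labelled `Original` by this comparison.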

In [11]:
print("="*60)
print("Alias Discovery Demo")
print("="*60)

# Find canonical IDs with multiple source identifiers
aliased = execute_query("""
    SELECT CanonicalId, COUNT(*) as count
    FROM identifiers 
    GROUP BY CanonicalId 
    HAVING COUNT(*) > 1
    ORDER BY count DESC
    LIMIT 5
""", fetch=True)

print(f"\nFound {len(aliased)} canonical IDs with multiple source identifiers:\n")

for item in aliased:
    canonical_id = item['CanonicalId']
    
    # Get all source identifiers with alias status
    details = execute_query("""
        SELECT 
            i.*,
            CASE WHEN i.CreatedAt = earliest.MinCreatedAt THEN 'Original' ELSE 'Alias' END AS Status
        FROM identifiers i
        JOIN (
            SELECT CanonicalId, MIN(CreatedAt) AS MinCreatedAt 
            FROM identifiers 
            GROUP BY CanonicalId
        ) earliest ON i.CanonicalId = earliest.CanonicalId
        WHERE i.CanonicalId = %s
        ORDER BY i.CreatedAt
    """, (canonical_id,), fetch=True)
    
    print(f"Canonical ID: {canonical_id}")
    print("-" * 50)
    for d in details:
        print(f"  [{d['Status']:8}] {d['SourceSystem']}/{d['SourceId']}")
        print(f"           Created: {d['CreatedAt']}")
    print()

Alias Discovery Demo

Found 5 canonical IDs with multiple source identifiers:

Canonical ID: daha9q32
--------------------------------------------------
  [Original] sierra-system-number/1045864
           Created: 2026-02-09 11:57:52
  [Alias   ] axiell-collections-id/AC-MIGRATE-669765
           Created: 2026-02-09 11:58:08

Canonical ID: f7dd5efv
--------------------------------------------------
  [Original] sierra-system-number/1042410
           Created: 2026-02-09 11:57:53
  [Alias   ] axiell-collections-id/AC-MIGRATE-281948
           Created: 2026-02-09 11:58:08

Canonical ID: h47ef5sr
--------------------------------------------------
  [Original] sierra-system-number/1043931
           Created: 2026-02-09 11:57:53
  [Alias   ] axiell-collections-id/AC-MIGRATE-115277
           Created: 2026-02-09 11:58:08

Canonical ID: jw8huput
--------------------------------------------------
  [Original] sierra-system-number/1026798
           Created: 2026-02-09 11:57:52
  [Alias   ] ax

## Summary

This notebook demonstrated:

1. **Batch lookup** - Single query for multiple source IDs with mixed ontology types
2. **Batch minting** - Combined lookup + mint in ~6 queries regardless of batch size
3. **Predecessor inheritance** - Migrated records inherit canonical IDs from old systems
4. **New record minting** - Claims from pre-generated pool
5. **Race condition protection**:
   - Same source ID: All threads get same canonical ID, only 1 pool ID consumed
   - Different source IDs: Each thread gets unique ID, no blocking
   - Idempotency: Repeated mints return same result
6. **Alias discovery** - Identify original vs. alias by `CreatedAt` timestamp

### Key Properties

| Property | Behavior |
|----------|----------|
| Concurrency | `FOR UPDATE SKIP LOCKED` - no blocking |
| Race detection | Verification query after INSERT |
| ID pool | Unused IDs stay free (not wasted) |
| Idempotency | Same source ID always returns same canonical ID |
| Atomicity | All operations in single transaction |

## Cleanup

Run this cell to close the minter connection and to stop and remove the Docker container (the `-v` flag also removes its volumes):

In [24]:
# Close connections
if 'minter' in dir() and minter.conn and minter.conn.open:
    minter.conn.close()
    print("✓ Minter connection closed")

# Stop the container
!docker compose down -v
print("✓ Docker container stopped")

✓ Minter connection closed
[+] down 0/1
 ⠋ Container id-minter-mysql Stopping                                      0.1s