# Schema Migration

This tutorial covers evolving existing pipelines. You'll learn:

- **Schema changes** — Adding and modifying columns
- **The alter() method** — Syncing definitions with database
- **Migration patterns** — Safe evolution strategies
- **Limitations** — What cannot be changed

In [1]:
import datajoint as dj
import numpy as np

schema = dj.Schema('tutorial_migration')

[2026-01-09 18:30:34,894][INFO]: DataJoint 2.0.0a16 connected to root@127.0.0.1:3306


## Initial Schema

In [2]:
@schema
class Subject(dj.Manual):
    definition = """
    subject_id : varchar(16)
    ---
    species : varchar(32)
    """

Subject.insert([
    {'subject_id': 'M001', 'species': 'mouse'},
    {'subject_id': 'M002', 'species': 'mouse'},
])
Subject()

subject_id,species
M001,mouse
M002,mouse


## Adding a Column

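Update the class's `definition` string, then call `alter()` to apply the change to the database table. Here the new column defaults to `null`, so the existing rows stay valid: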

In [3]:
# Update definition
Subject.definition = """
subject_id : varchar(16)
---
species : varchar(32)
weight = null : float32   # New column
"""

# Apply change
Subject.alter(prompt=False)
Subject()

subject_id,species,weight
M001,mouse,
M002,mouse,


## Modifying Column Type
Changing a column's type follows the same steps: edit the definition, then call `alter()`. This example widens `species` from `varchar(32)` to `varchar(100)`:

In [4]:
# Widen varchar
Subject.definition = """
subject_id : varchar(16)
---
species : varchar(100)    # Was 32
weight = null : float32
"""

Subject.alter(prompt=False)
print(Subject.describe())

subject_id           : varchar(16)                  
---
species              : varchar(100)                 # Was 32
weight=null          : float32                      



## What Can Be Altered

| Change | Supported |
|--------|----------|
| Add columns | Yes |
| Drop columns | Yes |
| Modify types | Yes |
| Rename columns | Yes |
| **Primary keys** | **No** |
| **Foreign keys** | **No** |
| **Indexes** | **No** |

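Because primary keys are listed as unsupported, it can help to check whether an edited definition touches the primary key before calling `alter()`. The following is a minimal sketch that uses plain string comparison rather than DataJoint's own parser; `primary_key_part` and `primary_key_changed` are hypothetical helper names, not part of the DataJoint API.

```python
def primary_key_part(definition: str) -> list[str]:
    """Return the attribute lines above the '---' divider, normalized.

    Hypothetical helper: comments and all whitespace are removed so that
    purely cosmetic edits do not count as changes. It assumes '#' only
    appears as a comment marker.
    """
    lines = []
    for raw in definition.strip().splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if line.startswith("---"):
            break  # attributes below the divider are not part of the key
        if line:
            lines.append("".join(line.split()))
    return lines


def primary_key_changed(old: str, new: str) -> bool:
    """True if the two definitions differ above the divider."""
    return primary_key_part(old) != primary_key_part(new)


old = """
subject_id : varchar(16)
---
species : varchar(32)
"""
widened = """
subject_id : varchar(16)
---
species : varchar(100)    # Was 32
weight = null : float32
"""
new_key = """
subject_id : varchar(32)
---
species : varchar(32)
"""

print(primary_key_changed(old, widened))  # False
print(primary_key_changed(old, new_key))  # True
```

A `True` result suggests using the migration pattern instead of `alter()`.
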
## Migration Pattern for Unsupported Changes

Changing a primary key or foreign key requires creating a new table and copying the data into it:

```python
# 1. Create new table
@schema
class SubjectNew(dj.Manual):
    definition = """...new structure..."""

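# 2. Migrate data. transform() is a placeholder for your own function
#    that maps an old row (a dict) to a row matching the new definition.
SubjectNew.insert([transform(row) for row in Subject().to_dicts()])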

# 3. Update dependents
# 4. Drop old table
# 5. Rename if needed
```
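
The `transform` step in the pattern above is left to the user. As an illustration only, here is one possible shape for it, assuming the new table adds a `lab` attribute to the primary key. The `lab` attribute and its default value are hypothetical and do not come from this tutorial's schema.

```python
def transform(row: dict, default_lab: str = "lab_a") -> dict:
    """Map an old Subject row to a hypothetical new structure."""
    new_row = dict(row)            # copy so the source row is left untouched
    new_row["lab"] = default_lab   # assumed new primary-key attribute
    return new_row


# Rows shaped like the ones inserted into Subject earlier in this tutorial
old_rows = [
    {"subject_id": "M001", "species": "mouse"},
    {"subject_id": "M002", "species": "mouse"},
]
new_rows = [transform(row) for row in old_rows]
print(new_rows[0])
```

Keeping `transform` a pure function of one row makes it easy to test on a handful of dicts before running it against the real table.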

## DataJoint 2.0 Migration

When upgrading from DataJoint 0.x, update the following names:

| 0.x | 2.0 |
|-----|-----|
| `longblob` | `<blob>` |
| `blob@store` | `<blob@store>` |
| `attach` | `<attach>` |
| `schema.jobs` | `Table.jobs` |

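The type rows of this table can be applied to definition strings in your own source code with a small helper. This is a sketch built only from the table above, not a function from `datajoint.migrate`; `upgrade_type` is a hypothetical name, and it assumes store names consist of letters, digits, and underscores.

```python
import re

# Exact renames taken from the table above
_EXACT = {"longblob": "<blob>", "attach": "<attach>"}

# 'blob@store', where 'store' stands for any store name (assumed to be \w+)
_EXTERNAL_BLOB = re.compile(r"^blob@(\w+)$")


def upgrade_type(old_type: str) -> str:
    """Return the 2.0 spelling of a 0.x attribute type, if the table lists one."""
    old_type = old_type.strip()
    if old_type in _EXACT:
        return _EXACT[old_type]
    match = _EXTERNAL_BLOB.match(old_type)
    if match:
        return f"<blob@{match.group(1)}>"
    return old_type  # types not in the table are returned unchanged


print(upgrade_type("longblob"))     # <blob>
print(upgrade_type("blob@raw"))     # <blob@raw>
print(upgrade_type("varchar(32)"))  # varchar(32)
```
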
In [5]:
# Migrate blob columns
from datajoint.migrate import analyze_blob_columns, migrate_blob_columns

# Find columns needing migration
# results = analyze_blob_columns(schema)

# Apply (adds codec markers)
# migrate_blob_columns(schema, dry_run=False)

## Best Practices

1. **Test in development first**
2. **Backup before migration**
3. **Plan primary keys carefully**: `alter()` cannot change them once the table exists
4. **Use versioned migration scripts** for production

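One way to read "versioned migration scripts" is a list of numbered steps that run in order and are skipped once applied. The runner below is a minimal sketch under that assumption; `run_migrations` and the step names are hypothetical, and the applied versions are kept in an in-memory set here, whereas a real setup would persist them. In practice each step would wrap a call such as `Subject.alter(prompt=False)`.

```python
from typing import Callable

# (version number, description, step to run)
Migration = tuple[int, str, Callable[[], None]]


def run_migrations(migrations: list[Migration], applied: set[int]) -> list[int]:
    """Run pending migrations in version order; return the versions run now."""
    newly_applied = []
    for version, name, step in sorted(migrations, key=lambda m: m[0]):
        if version in applied:
            continue  # already applied on a previous run
        print(f"applying {version}: {name}")
        step()
        applied.add(version)
        newly_applied.append(version)
    return newly_applied


# Stand-in steps that only record their name
log = []
migrations = [
    (2, "widen species", lambda: log.append("widen species")),
    (1, "add weight", lambda: log.append("add weight")),
]
applied = {1}  # pretend version 1 ran earlier
ran = run_migrations(migrations, applied)
print(ran)  # [2]
```

Running the same list again does nothing, which makes the script safe to re-run after a partial deployment.
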
In [6]:
schema.drop(prompt=False)