
Conversation

@hyacinthus
Collaborator

This PR includes:

  • Adds a migration function for the LangGraph 2.0 checkpoint tables
    • Adds a checkpoint_ns column to all checkpoint tables
    • Drops unused columns (checkpoint_id and parent_checkpoint_id from checkpoints; version from checkpoint_blobs)
    • Updates primary keys to match the ShallowPostgresSaver schema
    • Cleans up duplicate data during the migration
  • Fixes lint issues in the Heurist image generation skills

Diff: main...hyacinthus

Copilot AI review requested due to automatic review settings November 26, 2025 15:22
@hyacinthus hyacinthus merged commit f28a42c into main Nov 26, 2025
5 checks passed
@hyacinthus hyacinthus deleted the hyacinthus branch November 26, 2025 15:22
Contributor

Copilot AI left a comment


Pull request overview

This PR migrates the database schema to support LangGraph 2.0's checkpoint system by updating checkpoint-related tables to match the ShallowPostgresSaver schema, and fixes redundant self-assignment lint issues in Heurist image generation skills.

Key Changes:

  • Adds automatic migration for checkpoint tables (checkpoints, checkpoint_blobs, checkpoint_writes) including adding checkpoint_ns column, dropping unused columns, and updating primary keys
  • Removes redundant skill_config = skill_config assignments from 7 Heurist image generation skill files
  • Updates error message in get_checkpointer() to be more accurate
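
The removed lint pattern is a no-op self-assignment. A minimal illustration (the function and names here are hypothetical, not the actual skill code):

```python
# Before: a redundant self-assignment, the kind of no-op a linter flags.
# `example_skill` is an illustrative stand-in for the real skill functions.
def example_skill_before(skill_config: dict) -> dict:
    skill_config = skill_config  # redundant: assigns a name to itself
    return skill_config


# After: the no-op line is simply deleted; behavior is unchanged.
def example_skill_after(skill_config: dict) -> dict:
    return skill_config
```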

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

  • intentkit/models/db_mig.py: Adds a migrate_checkpoints_table function to handle the LangGraph 2.0 schema migration, with column additions/drops, duplicate cleanup, and primary key updates
  • intentkit/models/db.py: Updates the error message in get_checkpointer() to reference "checkpointer" instead of "pool"
  • intentkit/skills/heurist/image_generation_*.py: Removes redundant self-assignment statements across 7 image generation skill files (sdxl, flux_1_dev, cyber_realistic_xl, braindance, arthemy_real, arthemy_comics, animagine_xl)


Comment on lines +109 to +220
async def migrate_checkpoints_table(conn) -> None:
    """Migrate checkpoints tables to support langgraph 2.0."""
    tables = ["checkpoints", "checkpoint_blobs", "checkpoint_writes"]

    def _get_tables(connection):
        insp = inspect(connection)
        return insp.get_table_names()

    existing_tables = await conn.run_sync(_get_tables)

    for table in tables:
        if table not in existing_tables:
            continue

        # 1. Add checkpoint_ns column
        await conn.execute(
            text(
                f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS checkpoint_ns TEXT DEFAULT ''"
            )
        )

        # 2. Drop columns that ShallowPostgresSaver doesn't use
        if table == "checkpoints":
            # ShallowPostgresSaver doesn't use checkpoint_id or parent_checkpoint_id
            await conn.execute(
                text("ALTER TABLE checkpoints DROP COLUMN IF EXISTS checkpoint_id")
            )
            await conn.execute(
                text(
                    "ALTER TABLE checkpoints DROP COLUMN IF EXISTS parent_checkpoint_id"
                )
            )
        elif table == "checkpoint_blobs":
            # ShallowPostgresSaver doesn't use version column
            await conn.execute(
                text("ALTER TABLE checkpoint_blobs DROP COLUMN IF EXISTS version")
            )

        # 3. Update Primary Key
        def _check_pk(connection, table_name=table):
            insp = inspect(connection)
            return insp.get_pk_constraint(table_name)

        pk = await conn.run_sync(_check_pk)
        current_cols = set(pk.get("constrained_columns", []))

        # Expected columns depend on table
        expected_cols = set()
        pk_cols = ""
        if table == "checkpoints":
            expected_cols = {"thread_id", "checkpoint_ns"}
            pk_cols = "thread_id, checkpoint_ns"
        elif table == "checkpoint_blobs":
            expected_cols = {"thread_id", "checkpoint_ns", "channel"}
            pk_cols = "thread_id, checkpoint_ns, channel"
        elif table == "checkpoint_writes":
            expected_cols = {
                "thread_id",
                "checkpoint_ns",
                "checkpoint_id",
                "task_id",
                "idx",
            }
            pk_cols = "thread_id, checkpoint_ns, checkpoint_id, task_id, idx"

        if current_cols != expected_cols:
            logger.info(f"Migrating {table} PK from {current_cols} to {expected_cols}")

            # If migrating checkpoints to (thread_id, checkpoint_ns), we need to handle duplicates
            if table == "checkpoints" and expected_cols == {
                "thread_id",
                "checkpoint_ns",
            }:
                # Keep only the latest checkpoint for each (thread_id, checkpoint_ns) based on checkpoint_id (time-ordered)
                await conn.execute(
                    text("""
                        DELETE FROM checkpoints
                        WHERE (thread_id, checkpoint_ns, checkpoint_id) NOT IN (
                            SELECT thread_id, checkpoint_ns, MAX(checkpoint_id)
                            FROM checkpoints
                            GROUP BY thread_id, checkpoint_ns
                        )
                    """)
                )

            # If migrating checkpoint_blobs to (thread_id, checkpoint_ns, channel), we need to handle duplicates
            elif table == "checkpoint_blobs" and expected_cols == {
                "thread_id",
                "checkpoint_ns",
                "channel",
            }:
                # Keep only blobs that are referenced by the remaining checkpoints
                # The relationship is: checkpoints.checkpoint -> 'channel_versions' ->> blob.channel = blob.version
                await conn.execute(
                    text("""
                        DELETE FROM checkpoint_blobs cb
                        WHERE NOT EXISTS (
                            SELECT 1
                            FROM checkpoints cp
                            WHERE cp.thread_id = cb.thread_id
                              AND cp.checkpoint_ns = cb.checkpoint_ns
                              AND (cp.checkpoint -> 'channel_versions' ->> cb.channel) = cb.version
                        )
                    """)
                )

            if pk.get("name"):
                await conn.execute(
                    text(f"ALTER TABLE {table} DROP CONSTRAINT {pk['name']}")
                )

            await conn.execute(text(f"ALTER TABLE {table} ADD PRIMARY KEY ({pk_cols})"))

Copilot AI Nov 26, 2025


The new migrate_checkpoints_table function lacks test coverage. Given that this migration performs destructive operations (dropping columns, deleting data), it's critical to have tests that verify:

  1. The migration works correctly on a fresh database
  2. The migration handles existing data properly
  3. The duplicate cleanup logic works as expected
  4. The primary key updates succeed

Consider adding tests similar to the existing model tests in tests/models/test_skills_schema.py.
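
One way to cover the duplicate-cleanup logic without a live Postgres instance is to exercise the cleanup DELETE against an in-memory SQLite database. A sketch under stated assumptions: the real migration runs against Postgres through SQLAlchemy, and SQLite's row-value support (3.15+) is close enough for this particular statement; the data values are invented for illustration.

```python
import sqlite3

# Sketch of a unit test for the duplicate-cleanup step: keep only the row
# with the highest checkpoint_id per (thread_id, checkpoint_ns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkpoints (thread_id TEXT, checkpoint_ns TEXT, checkpoint_id TEXT)"
)
rows = [
    ("t1", "", "0001"),
    ("t1", "", "0002"),  # latest for (t1, '') -- should survive
    ("t2", "", "0001"),  # only row for (t2, '') -- should survive
]
conn.executemany("INSERT INTO checkpoints VALUES (?, ?, ?)", rows)

# Same shape as the migration's cleanup DELETE.
conn.execute("""
    DELETE FROM checkpoints
    WHERE (thread_id, checkpoint_ns, checkpoint_id) NOT IN (
        SELECT thread_id, checkpoint_ns, MAX(checkpoint_id)
        FROM checkpoints
        GROUP BY thread_id, checkpoint_ns
    )
""")

remaining = sorted(conn.execute("SELECT thread_id, checkpoint_id FROM checkpoints"))
assert remaining == [("t1", "0002"), ("t2", "0001")]
```

A real test would wrap this in pytest and, ideally, run against a throwaway Postgres schema so the JSONB-based checkpoint_blobs cleanup is covered too.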

Comment on lines +119 to +220
(quotes the same migration loop shown above)

Copilot AI Nov 26, 2025


The migration logic has an ordering problem. The current structure processes each table sequentially:

  1. Add checkpoint_ns column
  2. Drop columns (checkpoint_id, parent_checkpoint_id, or version)
  3. Handle duplicates (but uses the dropped columns!)
  4. Update primary key

The duplicate cleanup (lines 183-192 for checkpoints, 202-213 for checkpoint_blobs) should happen BEFORE dropping columns in step 2. Consider restructuring to handle all pre-drop operations first, then drop columns, then update primary keys.
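
One possible restructuring, sketched below: perform the data cleanup while the old columns still exist, then drop columns, then swap the primary key. This is not the merged code; `migrate_checkpoints_reordered` and its fake `execute` callable are illustrative, and the real migration would also drop the existing primary-key constraint before adding the new one.

```python
# Sketch of the reordered migration steps for the checkpoints table.
# `execute` is any callable that runs one SQL statement; the test below
# just records the statements to check the ordering invariant.
def migrate_checkpoints_reordered(execute) -> None:
    # 1. Additive change first (safe at any point).
    execute(
        "ALTER TABLE checkpoints ADD COLUMN IF NOT EXISTS checkpoint_ns TEXT DEFAULT ''"
    )
    # 2. Duplicate cleanup BEFORE the drop: checkpoint_id still exists here.
    execute("""DELETE FROM checkpoints
               WHERE (thread_id, checkpoint_ns, checkpoint_id) NOT IN (
                   SELECT thread_id, checkpoint_ns, MAX(checkpoint_id)
                   FROM checkpoints GROUP BY thread_id, checkpoint_ns)""")
    # 3. Only now drop the columns ShallowPostgresSaver doesn't use.
    execute("ALTER TABLE checkpoints DROP COLUMN IF EXISTS checkpoint_id")
    execute("ALTER TABLE checkpoints DROP COLUMN IF EXISTS parent_checkpoint_id")
    # 4. Finally, install the new primary key (after dropping the old
    #    constraint, elided here).
    execute("ALTER TABLE checkpoints ADD PRIMARY KEY (thread_id, checkpoint_ns)")


# Record the statements and verify the DELETE precedes the column drop.
statements: list[str] = []
migrate_checkpoints_reordered(statements.append)
delete_idx = next(i for i, s in enumerate(statements) if s.lstrip().startswith("DELETE"))
drop_idx = next(i for i, s in enumerate(statements) if "DROP COLUMN" in s)
assert delete_idx < drop_idx
```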

