Skip to content

Conversation

@hyacinthus
Copy link
Collaborator

Remove unused EKS deployment steps from the CI build workflow.

Copilot AI review requested due to automatic review settings November 28, 2025 08:46
@hyacinthus hyacinthus merged commit 5fe9896 into main Nov 28, 2025
5 checks passed
@hyacinthus hyacinthus deleted the hyacinthus branch November 28, 2025 08:46
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request removes unused EKS deployment steps from the CI workflow while also introducing checkpoint cleanup functionality and updating several Python dependencies.

Key Changes:

  • Removes commented-out AWS EKS deployment steps from the GitHub Actions workflow
  • Adds new checkpoint cleanup functionality with both a scheduled job and manual script
  • Updates boto3 (1.41.4 → 1.41.5), botocore (1.41.4 → 1.41.5), langsmith (0.4.48 → 0.4.49), and rich-toolkit (0.16.0 → 0.17.0)

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
.github/workflows/build.yml Removes unused AWS EKS deployment configuration and credentials setup (45 lines removed)
intentkit/core/cleanup.py New module implementing checkpoint cleanup functionality with SQL-based deletion of old checkpoint data
scripts/cleanup_checkpoints.py New CLI script for manual checkpoint cleanup with dry-run and force modes
intentkit/core/scheduler.py Adds scheduled job to run checkpoint cleanup daily at 2:20 UTC
uv.lock Routine dependency updates for boto3, botocore, langsmith, and rich-toolkit packages

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +57 to +74
WITH old_threads AS (
SELECT thread_id, checkpoint_ns
FROM checkpoints
WHERE (checkpoint ->> 'ts')::timestamptz < :cutoff
),
deleted_writes AS (
DELETE FROM checkpoint_writes cw
USING old_threads ot
WHERE cw.thread_id = ot.thread_id AND cw.checkpoint_ns = ot.checkpoint_ns
),
deleted_blobs AS (
DELETE FROM checkpoint_blobs cb
USING old_threads ot
WHERE cb.thread_id = ot.thread_id AND cb.checkpoint_ns = ot.checkpoint_ns
)
DELETE FROM checkpoints cp
USING old_threads ot
WHERE cp.thread_id = ot.thread_id AND cp.checkpoint_ns = ot.checkpoint_ns
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL deletion logic may not correctly identify all old checkpoints. The query selects thread_id, checkpoint_ns from checkpoints where timestamp is old, but this may only match some checkpoints for a given thread_id. If a thread has both old and new checkpoints, this query will only delete the old ones, potentially leaving orphaned related records or creating an inconsistent state.

Consider whether the intent is to delete entire threads (all checkpoints for a thread_id where ANY checkpoint is old) or only specific old checkpoints. If it's the former, the logic should be adjusted to identify threads with old checkpoints, then delete ALL checkpoints for those threads.

Suggested change
WITH old_threads AS (
SELECT thread_id, checkpoint_ns
FROM checkpoints
WHERE (checkpoint ->> 'ts')::timestamptz < :cutoff
),
deleted_writes AS (
DELETE FROM checkpoint_writes cw
USING old_threads ot
WHERE cw.thread_id = ot.thread_id AND cw.checkpoint_ns = ot.checkpoint_ns
),
deleted_blobs AS (
DELETE FROM checkpoint_blobs cb
USING old_threads ot
WHERE cb.thread_id = ot.thread_id AND cb.checkpoint_ns = ot.checkpoint_ns
)
DELETE FROM checkpoints cp
USING old_threads ot
WHERE cp.thread_id = ot.thread_id AND cp.checkpoint_ns = ot.checkpoint_ns
WITH threads_to_delete AS (
SELECT DISTINCT thread_id
FROM checkpoints
WHERE (checkpoint ->> 'ts')::timestamptz < :cutoff
),
deleted_writes AS (
DELETE FROM checkpoint_writes cw
USING threads_to_delete ttd
WHERE cw.thread_id = ttd.thread_id
),
deleted_blobs AS (
DELETE FROM checkpoint_blobs cb
USING threads_to_delete ttd
WHERE cb.thread_id = ttd.thread_id
)
DELETE FROM checkpoints cp
USING threads_to_delete ttd
WHERE cp.thread_id = ttd.thread_id

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +43
count_query = text("""
SELECT count(DISTINCT thread_id)
FROM checkpoints
WHERE (checkpoint ->> 'ts')::timestamptz < :cutoff
""")

async with get_session() as session:
result = await session.execute(count_query, {"cutoff": cutoff_date})
thread_count = result.scalar()

logger.info(f"Found {thread_count} threads to delete.")
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mismatch between count and deletion logic. The count query (line 33-37) counts DISTINCT thread_id, suggesting it counts threads, but the deletion query deletes individual checkpoints matching the timestamp criteria. This means:

  1. The returned count represents the number of threads with old checkpoints, not the number of checkpoints being deleted
  2. The confirmation message in the script says "delete {count} threads and their history" but may only delete some checkpoints per thread
  3. The log message "Found {thread_count} threads to delete" is misleading if not all checkpoints for those threads are deleted

Either rename thread_count to clarify what it represents, or adjust the deletion logic to match the counting logic.

Copilot uses AI. Check for mistakes.
Comment on lines +12 to +80
async def cleanup_checkpoints(days: int = 90, dry_run: bool = False) -> int:
"""
Cleanup checkpoints older than the specified number of days.
Args:
days: Number of days to keep history for.
dry_run: If True, only count threads to be deleted without deleting.
Returns:
int: Number of threads deleted (or found, if dry_run is True).
"""
# Ensure DB is initialized (idempotent)
await init_db(**config.db)

# Use UTC for consistency with LangGraph timestamps
cutoff_date = datetime.now(timezone.utc) - timedelta(days=days)
logger.info(
f"Cleaning up threads older than {days} days (before {cutoff_date.isoformat()})"
)

# 1. Identify threads to delete
count_query = text("""
SELECT count(DISTINCT thread_id)
FROM checkpoints
WHERE (checkpoint ->> 'ts')::timestamptz < :cutoff
""")

async with get_session() as session:
result = await session.execute(count_query, {"cutoff": cutoff_date})
thread_count = result.scalar()

logger.info(f"Found {thread_count} threads to delete.")

if thread_count == 0:
return 0

if dry_run:
logger.info("Dry run enabled. No changes made.")
return thread_count

# 2. Perform deletion
logger.info("Deleting...")
async with get_session() as session:
async with session.begin():
delete_stmt = text("""
WITH old_threads AS (
SELECT thread_id, checkpoint_ns
FROM checkpoints
WHERE (checkpoint ->> 'ts')::timestamptz < :cutoff
),
deleted_writes AS (
DELETE FROM checkpoint_writes cw
USING old_threads ot
WHERE cw.thread_id = ot.thread_id AND cw.checkpoint_ns = ot.checkpoint_ns
),
deleted_blobs AS (
DELETE FROM checkpoint_blobs cb
USING old_threads ot
WHERE cb.thread_id = ot.thread_id AND cb.checkpoint_ns = ot.checkpoint_ns
)
DELETE FROM checkpoints cp
USING old_threads ot
WHERE cp.thread_id = ot.thread_id AND cp.checkpoint_ns = ot.checkpoint_ns
""")

await session.execute(delete_stmt, {"cutoff": cutoff_date})
logger.info("Deletion completed.")

return thread_count
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test coverage for the new cleanup_checkpoints function. The function performs critical database operations (deletion of checkpoint data) and should have tests covering:

  • Dry run mode behavior
  • Actual deletion behavior
  • Count accuracy
  • Edge cases (no checkpoints to delete, boundary dates)
  • Transaction rollback on errors

Similar functions in the codebase (e.g., tests/core/test_agent_asset.py, tests/models/test_user.py) have corresponding test files.

Copilot uses AI. Check for mistakes.
return

confirm = input(
f"Are you sure you want to delete {count} threads and their history? (y/N): "
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The confirmation message is misleading. It says "delete {count} threads and their history" but based on the cleanup logic, it may only delete some checkpoints per thread, not entire threads. This inconsistency could lead to user confusion about what will actually be deleted.

The message should accurately reflect what the cleanup operation does, or the cleanup logic should be adjusted to match the message.

Suggested change
f"Are you sure you want to delete {count} threads and their history? (y/N): "
f"Are you sure you want to delete {count} checkpoint{'s' if count != 1 else ''}? (y/N): "

Copilot uses AI. Check for mistakes.
)

# Cleanup old checkpoints daily at UTC 2:20
# Cleanup old checkpoints daily at UTC 2:20
Copy link

Copilot AI Nov 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate comment detected. The comment "Cleanup old checkpoints daily at UTC 2:20" appears twice in consecutive lines.

Suggested change
# Cleanup old checkpoints daily at UTC 2:20

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants