fix(integrations): clear stale task status after connection disconnect (CS-166) #2577

Merged
tofikwest merged 3 commits into main from tofik/cs-166-github-disconnect-cleanup
Apr 16, 2026

@tofikwest tofikwest commented Apr 16, 2026

Summary

Fixes CS-166 — after a customer disconnected their GitHub integration, related tasks were stuck on "failed" status and the task detail page kept showing historical failed automation runs.

Root cause

ConnectionService.disconnectConnection() soft-deletes the connection (sets status: 'disconnected', clears credentials) but does not touch Task.status or the IntegrationCheckRun audit rows. Three independent gaps result:

  • Defect A — stuck task status: when a check from the now-disconnected connection had set Task.status = 'failed', nothing re-evaluates it. The task is pinned to failed indefinitely, because the 12-hour scheduler only processes done tasks.
  • Defect B — stale "App Automations" UI: CheckRunRepository.findByTask returned runs regardless of connection status, so the task detail panel still surfaced old failed runs under the active-integrations summary.
  • Defect C — scheduler re-marks done→failed: apps/app/src/trigger/tasks/task/task-schedule.ts pulls integrationCheckRuns with no connection filter. For a done task whose review window has elapsed, a stale failed run from the disconnected connection would flip the task to failed on the next tick.

Fix

Minimal: two read-side filters plus one re-evaluation on disconnect. No schema changes. Check run rows are preserved (matches the existing soft-delete design for IntegrationConnection).

  1. ConnectionService.disconnectConnection and deleteConnection now call reevaluateFailedTasksAfterDisconnect(connectionId). For each failed task that had runs from this connection, it re-derives the target status from only active (non-disconnected) automations. Uses a local mirror of the scheduler's getTargetStatus — same 3-way logic (no automations → todo, all passing → done, else failed).
  2. CheckRunRepository.findByTask filters out runs whose connection is disconnected.
  3. task-schedule.ts adds the same filter on its integrationCheckRuns include.
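
The three-way rule from step 1 can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the `TaskStatus` and `RunStatus` types and the `deriveTargetStatus` name are assumptions, and the input is assumed to already hold the latest result per automation, with runs from disconnected connections removed.

```typescript
// Hypothetical types; the real models in the repository may differ.
type TaskStatus = "todo" | "done" | "failed";
type RunStatus = "success" | "failed";

// Derive a task's target status from the latest result of each automation
// that is still active (runs from disconnected connections already removed).
function deriveTargetStatus(latestResults: RunStatus[]): TaskStatus {
  // No automations left to evaluate: the task goes back to "todo".
  if (latestResults.length === 0) return "todo";

  // Every remaining automation passes: the task is "done".
  const allPassing = latestResults.every((status) => status === "success");
  return allPassing ? "done" : "failed";
}
```

Because only tasks currently at `failed` are re-evaluated, the `failed` branch leaves such a task unchanged in practice.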

Test plan

  • cd apps/api && npx jest src/integration-platform/services/connection.service → 7/7 new tests pass:
    • Task with no other active automations → todo
    • Task with other passing app automations → done
    • Task with another failing automation → stays failed
    • Latest-run-per-checkId selection works
    • Non-failed tasks untouched
    • No runs from connection → no re-evaluation
    • Failing custom automation keeps task failed
  • Full API typecheck: no new errors on changed files
  • Manual: connect GitHub, run a failing check, disconnect → task status returns to todo; task detail "App Automations" panel no longer shows old runs
  • Manual: with a second active integration passing, disconnect GitHub → task transitions to done

Notes

  • Only re-evaluates tasks currently at status: 'failed'. We don't demote done or todo tasks — those transitions already have dedicated flows.
  • The 3-way status helper is inlined in ConnectionService rather than extracted. The scheduler's getTargetStatus lives in apps/app and can't be imported from apps/api; a shared package move would be a larger refactor than this bug requires.

🤖 Generated with Claude Code


Summary by cubic

Clears stale “failed” task states when an integration (e.g., GitHub) is disconnected and hides runs from disconnected connections so the UI and scheduler ignore outdated failures. Fixes CS-166.

  • Bug Fixes
    • On disconnectConnection/deleteConnection, re-derive each affected failed task’s status from active automations only (todo if none, done if all passing; otherwise unchanged). Latest-per-checkId selection is order-independent, and the cleanup is best-effort (wrapped in try/catch) so disconnect/delete still succeed. Audit rows stay.
    • Exclude runs from disconnected connections in CheckRunRepository.findByTask and apps/app/src/trigger/tasks/task/task-schedule.ts so stale failures don’t affect the "App Automations" panel or scheduled status.
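
The effect of the read-side filter can be illustrated with an in-memory sketch. The `CheckRun` shape and the `excludeDisconnectedRuns` name are assumptions made for this example; in the PR the filter is presumably applied in the database query rather than after loading rows.

```typescript
// Hypothetical shapes, for illustration only.
interface CheckRun {
  id: string;
  checkId: string;
  status: "success" | "failed";
  connection: { id: string; status: "active" | "disconnected" };
}

// Keep only runs whose connection is not disconnected. The input array is
// left untouched, mirroring the fact that audit rows are preserved and
// merely hidden from readers.
function excludeDisconnectedRuns(runs: CheckRun[]): CheckRun[] {
  return runs.filter((run) => run.connection.status !== "disconnected");
}
```

Readers such as the task detail panel and the scheduler would then only ever see runs from connections that are still live.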

Written for commit 9f140e1. Summary will update on new commits.

…t (CS-166)

After a user disconnects an integration (e.g. GitHub), tasks that were
marked 'failed' by that connection's check runs stayed stuck on 'failed'
forever, because nothing re-evaluates the task's automation state on
disconnect. In addition, the task detail "App Automations" panel kept
showing historical failed runs from the disconnected connection, and the
12-hour task-schedule job would re-mark done-then-overdue tasks as failed
using those stale runs.

Three changes, two read-side filters plus one re-evaluation on disconnect:

1. ConnectionService.disconnectConnection and deleteConnection now call
   reevaluateFailedTasksAfterDisconnect, which re-derives each affected
   failed task's target status from its remaining non-disconnected
   automations. Uses a local mirror of the scheduler's getTargetStatus.

2. CheckRunRepository.findByTask excludes runs from disconnected
   connections so the UI no longer counts stale "failed" history.

3. task-schedule.ts includes the same connection-status filter on its
   integrationCheckRuns include.

Check run rows are preserved for audit trail (matching the existing
soft-delete design for IntegrationConnection itself).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

vercel Bot commented Apr 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
app | Ready | Preview, Comment | Apr 16, 2026 9:04pm
comp-framework-editor | Ready | Preview, Comment | Apr 16, 2026 9:04pm
portal | Ready | Preview, Comment | Apr 16, 2026 9:04pm


linear Bot commented Apr 16, 2026


@cubic-dev-ai cubic-dev-ai Bot left a comment

2 issues found across 4 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/api/src/integration-platform/services/connection.service.ts">

<violation number="1" location="apps/api/src/integration-platform/services/connection.service.ts:148">
P2: Wrap `reevaluateFailedTasksAfterDisconnect` in a try-catch in both call sites. The connection is already disconnected when re-evaluation runs; if it throws, the primary operation should still succeed rather than surfacing an error to the caller for a best-effort side effect.</violation>

<violation number="2" location="apps/api/src/integration-platform/services/connection.service.ts:255">
P2: The `latestByCheckId` logic drops the `run.createdAt > existing.createdAt` guard present in the original `getTargetStatus`. This makes correctness silently dependent on the query's `orderBy` clause. Add the timestamp comparison to match the original's order-independent behavior.</violation>
</file>
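
The timestamp guard requested in violation 2 could look roughly like the sketch below. The `TimedRun` shape and the `latestRunPerCheck` name are assumptions for illustration; only the `createdAt` comparison itself is taken from the review comment.

```typescript
// Hypothetical run shape; only checkId and createdAt matter here.
interface TimedRun {
  checkId: string;
  status: "success" | "failed";
  createdAt: Date;
}

// Pick the newest run per checkId by comparing timestamps, so the result
// does not depend on how the caller's query happened to sort the rows.
function latestRunPerCheck(runs: TimedRun[]): Map<string, TimedRun> {
  const latestByCheckId = new Map<string, TimedRun>();
  for (const run of runs) {
    const existing = latestByCheckId.get(run.checkId);
    if (!existing || run.createdAt.getTime() > existing.createdAt.getTime()) {
      latestByCheckId.set(run.checkId, run);
    }
  }
  return latestByCheckId;
}
```

Without the comparison, whichever run is visited first (or last) wins, which is only correct for one particular sort order.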

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread apps/api/src/integration-platform/services/connection.service.ts Outdated
Comment thread apps/api/src/integration-platform/services/connection.service.ts Outdated
tofikwest and others added 2 commits April 16, 2026 16:59
Two follow-ups to cubic's review on PR #2577:

1. Make the latest-run-per-checkId selection in deriveTargetStatusForTask
   order-independent: compare run.createdAt instead of trusting the
   caller's orderBy. Matches the scheduler's getTargetStatus.

2. Wrap reevaluateFailedTasksAfterDisconnect in try/catch at both call
   sites (disconnectConnection, deleteConnection). The primary disconnect
   has already persisted by the time re-evaluation runs; a transient DB
   error during this best-effort cleanup must not surface to the caller.

Adds regression tests: (a) reverse-sorted input still picks newest run
per checkId, (b) disconnect resolves successfully even when the runs
query throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
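
The best-effort pattern described in follow-up 2 of the commit above can be sketched as follows. All names here (`runBestEffort`, `disconnect`, `persistDisconnect`, `reevaluate`, `onError`) are hypothetical stand-ins, not the service's actual methods; the sketch only shows the ordering and the swallowed error.

```typescript
// Run a non-critical cleanup step without letting its failure propagate.
// `onError` is a hypothetical hook standing in for whatever logger is used.
async function runBestEffort(
  cleanup: () => Promise<void>,
  onError: (error: unknown) => void,
): Promise<boolean> {
  try {
    await cleanup();
    return true;
  } catch (error) {
    onError(error); // record the failure, but do not rethrow
    return false;
  }
}

// Hypothetical disconnect flow: the primary write happens first, then the
// task re-evaluation runs as best-effort cleanup.
async function disconnect(
  persistDisconnect: () => Promise<void>,
  reevaluate: () => Promise<void>,
  onError: (error: unknown) => void,
): Promise<void> {
  await persistDisconnect(); // errors here still surface to the caller
  await runBestEffort(reevaluate, onError);
}
```

The trade-off is that a failed re-evaluation leaves the task at `failed` until something else re-derives its status, which is why the error should at least be logged.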

@cubic-dev-ai cubic-dev-ai Bot left a comment

0 issues found across 2 files (changes from recent commits).

Requires human review: Auto-approval blocked by 2 unresolved issues from previous reviews.

@tofikwest tofikwest merged commit 5b9387d into main Apr 16, 2026
10 checks passed
@tofikwest tofikwest deleted the tofik/cs-166-github-disconnect-cleanup branch April 16, 2026 21:06
@claudfuen

🎉 This PR is included in version 3.23.2 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
