
refactor(self-heal): comp does zero classification — every dynamic failure is pending #3306

Merged
tofikwest merged 8 commits into main from tofik/dynamic-check-versioning
Jun 30, 2026


@tofikwest tofikwest commented Jun 30, 2026


The model

comp becomes pure plumbing: it never judges customer-vs-us. A dynamic check that doesn't succeed — a finding, a customer/transport error, or a thrown error — is held as inconclusive ("pending", hidden from the customer) and handed to the self-heal agent, the only decider (see comp-private PR).

Changes

  • Deleted the error-code classifier (check-failure-classifier.ts) + its spec — the customer/our_side/finding guessing is gone.
  • Removed failureSignalsFromEvidence (HTTP-status/error-text extraction), splitFailuresByDisposition, ClassifiableFailure, FailureDisposition. No signals, no patterns, no guessing anywhere.
  • decideRunStatus is now just: success → success; dynamic non-success → inconclusive; else failed.
  • Both run paths (scheduled + manual) record findings by identity only ({connectionId, checkId, resourceId}).
  • New /reveal internal endpoint — persists the real success/failed (never held) so the agent can surface a genuine fail to the customer.
  • Rewrote the run-status spec for the new rule; deleted the split/signals specs.
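
The three-way rule for decideRunStatus listed above can be sketched as follows. This is a minimal illustration, not the code from the PR: the `RunOutcome` shape and its field names are assumptions, since the description gives only the rule and the status names.

```typescript
// Status names come from the PR description; the input shape is hypothetical.
type RunStatus = "success" | "inconclusive" | "failed";

interface RunOutcome {
  succeeded: boolean; // assumed flag: the check ran and passed
  isDynamic: boolean; // assumed flag: dynamic check vs. static/AWS/GCP/Azure
}

function decideRunStatus(outcome: RunOutcome): RunStatus {
  if (outcome.succeeded) return "success";
  // Any dynamic non-success (finding, customer/transport error, thrown error)
  // is held for the self-heal agent. No error codes are inspected here.
  if (outcome.isDynamic) return "inconclusive";
  return "failed";
}
```

Note that the function takes no error code or evidence at all, which is the point of the refactor: there is nothing left on the comp side to classify.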

Verification

My changed files compile, and the task-check-evaluation tests pass (19). The pre-existing worktree typecheck failures (sync-gws / variables / credential-vault) come from an earlier merge of main plus an unbuilt generated Prisma client, not from this PR (CI builds the client).

Pairs with

comp-private PR: the agent that does all the deciding (fix / show / disable+ticket).

🤖 Generated with Claude Code


Summary by cubic

Make dynamic checks classification-free. Any non-success is stored as inconclusive (pending, hidden) and sent to the self-heal agent. Added POST /connections/:connectionId/reveal so the agent can persist the real result and update the task on genuine failures; static/AWS/GCP/Azure behavior is unchanged.

  • Refactors

    • Removed comp-side failure classification, signal extraction, and related specs.
    • Simplified decideRunStatus: success → success; dynamic non-success → inconclusive; else failed. decideTaskStatus blocks "done" when any check is held, including error-only runs.
    • Scheduled/manual runs store failing findings by identity only.
    • Secured rerun/reveal by asserting taskId belongs to the connection’s org.
  • New Features

    • Added internal POST /connections/:connectionId/reveal to persist the true outcome (never held). A failing reveal sets the linked task to failed; a passing reveal does not force done.
    • Reveal task sync only updates active statuses (todo, in_progress, done); never overrides not_relevant or in_review.
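
The decideTaskStatus behavior summarized under Refactors (a held check blocks "done", even when the run produced no findings) might look roughly like this. The `CheckRun` shape, the `null` return meaning "leave the task unchanged", and the precedence of failed over held are all assumptions made for the sketch.

```typescript
type RunStatus = "success" | "inconclusive" | "failed";

// Hypothetical per-check run summary.
interface CheckRun {
  status: RunStatus;
  findingCount: number;
}

// Returns the task status to write, or null to leave the task as it is.
function decideTaskStatus(runs: CheckRun[]): "done" | "failed" | null {
  const failedCount = runs.filter((r) => r.status === "failed").length;
  // Count held RUNS, not held findings: an error-only run has
  // findingCount === 0 but must still keep the task pending.
  const heldCount = runs.filter((r) => r.status === "inconclusive").length;

  if (failedCount > 0) return "failed"; // assumed precedence
  if (heldCount > 0) return null; // something is still undecided
  return runs.length > 0 ? "done" : null;
}
```

Counting by run status rather than by finding is what closes the gap where an error-only run would otherwise be invisible to the pending logic.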

Written for commit cf89c5c. Summary will update on new commits.


tofikwest and others added 3 commits June 30, 2026 14:18
…ure is pending

New model: comp is pure plumbing. A dynamic check that doesn't succeed — a real
finding, a customer/transport error, or a thrown execution error — is held as
'inconclusive' ("pending", hidden from the customer) and handed to the self-heal
agent, which is the ONLY thing that decides our-bug (fix) vs real fail (show).

- decideRunStatus: dynamic + non-success → 'inconclusive'; no error-code logic.
- splitFailuresByDisposition: returns all-held (nothing decided on the comp side).
- The error-code classifier is now unreferenced (to be deleted in cleanup).

WIP: reveal-real-fail endpoint + the agent decision rewrite + spec updates still
to come. Not deployed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
…er-side / finding)

The agent calls /reveal when it judges a held check to be a GENUINE fail (the
customer's creds/config are wrong, or a real compliance finding). Unlike /rerun
(which applies the dynamic hold rule and may re-hold as 'inconclusive'), /reveal
persists the TRUE status — success if it now passes, 'failed' with the real
findings shown (failedCount > 0) otherwise — so the customer sees the red instead
of a silent "pending". Mirrors rerunAndPersistCheck; never holds, never disables.

tsc clean. Not deployed.
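
To make the /rerun vs /reveal contrast concrete, here is a small sketch of how the persisted status could differ between the two paths. The function names and the `CheckResult` shape are hypothetical; only the behavior (rerun may re-hold, reveal never holds) is taken from the message above.

```typescript
type RunStatus = "success" | "inconclusive" | "failed";

// Hypothetical result of executing a check once.
interface CheckResult {
  passed: boolean;
  failedCount: number; // number of failing findings
}

// /rerun: applies the dynamic hold rule, so a failure may be re-held.
function statusForRerun(result: CheckResult, isDynamic: boolean): RunStatus {
  if (result.passed) return "success";
  return isDynamic ? "inconclusive" : "failed";
}

// /reveal: persists the true status and never returns 'inconclusive'.
function statusForReveal(result: CheckResult): "success" | "failed" {
  return result.passed ? "success" : "failed";
}
```

The narrower return type of the reveal function encodes the "never holds" guarantee at the type level, which is one possible design choice rather than something the commit states.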

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
…decides everything

Per the design: comp must do ZERO judging of customer-vs-us. Removed every bit of
pre-classification so the only place that decides is the self-heal agent.

- Deleted check-failure-classifier.ts (the error-code customer/our_side/finding
  judging) + its spec.
- Removed failureSignalsFromEvidence (HTTP-status/error-text extraction),
  splitFailuresByDisposition, ClassifiableFailure, FailureDisposition from
  task-check-evaluation.ts. decideRunStatus is now just:
  success → success; dynamic non-success → 'inconclusive' (pending); else failed.
- Run paths (scheduled + manual) and rerun/reveal now record findings by identity
  only ({connectionId, checkId, resourceId}); a dynamic failure is always held
  pending for the agent. No signals, no patterns, no guessing.
- Rewrote the run-status spec for the new rule; deleted the split/signals specs.

My changed files compile; task-check-evaluation tests pass (19). Pre-existing
worktree spec failures (sync-gws / variables / credential-vault) are unrelated.
Not deployed.
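
As an illustration of recording findings "by identity only", the sketch below strips a finding down to the three identity fields named in the message. The `RawFinding` extras (`httpStatus`, `errorText`) and both helper names are hypothetical stand-ins for whatever evidence the check produces.

```typescript
// The three identity fields are from the commit message.
interface FindingIdentity {
  connectionId: string;
  checkId: string;
  resourceId: string;
}

// Hypothetical raw finding carrying evidence that is no longer inspected.
interface RawFinding extends FindingIdentity {
  httpStatus?: number;
  errorText?: string;
}

// Keep only the identity; no signals are extracted from the evidence.
function toIdentity(finding: RawFinding): FindingIdentity {
  return {
    connectionId: finding.connectionId,
    checkId: finding.checkId,
    resourceId: finding.resourceId,
  };
}

// Stable string key, e.g. for de-duplicating held findings.
function identityKey(id: FindingIdentity): string {
  return JSON.stringify([id.connectionId, id.checkId, id.resourceId]);
}
```

Serializing the tuple with JSON.stringify avoids key collisions that a plain delimiter join could produce if an ID happened to contain the delimiter.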

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

vercel Bot commented Jun 30, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| app | Ready | Preview, Comment | Jun 30, 2026 8:00pm |
| comp-framework-editor | Ready | Preview, Comment | Jun 30, 2026 8:00pm |

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| portal | Skipped | | Jun 30, 2026 8:00pm |


@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 10 files

Confidence score: 2/5

  • In apps/api/src/integration-platform/services/internal-integration-debug.service.ts, persisting a reveal run without verifying taskId belongs to the same organization (and expected check/task) can let a bad internal call write into another task’s history/status, creating cross-tenant data integrity risk—add strict ownership/association validation before saving any reveal run.
  • In apps/api/src/integration-platform/utils/task-check-evaluation.ts, dynamic non-success runs with no findings being treated as inconclusive without blocking completion can allow tasks to move to done while error-only runs are still effectively unresolved—count these inconclusive error-only runs in the held/pending logic (or include pending run count) before merging.


Comment thread apps/api/src/integration-platform/utils/task-check-evaluation.ts
…eld error-runs keep task pending

- P1: reveal/rerun now assert the taskId belongs to the SAME org as the
  connection before persisting a run (shared assertTaskBelongsToOrg helper), so a
  wrong/forged internal call can't contaminate another tenant's task history.
- P2: count HELD runs (not held findings) toward heldCount — an error-only
  dynamic run (inconclusive, no findings) now keeps the task pending instead of
  letting it slip to 'done' while unresolved. Both run paths (scheduled + manual).

My files: tsc clean; task-check-evaluation tests pass (19).
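
The P1 ownership guard could be sketched as below. Only the helper name assertTaskBelongsToOrg comes from the commit message; its signature, the record shapes, and the error message are assumptions, and the real helper presumably looks the task up in the database rather than receiving it as an argument.

```typescript
// Hypothetical minimal records; the real models likely carry more fields.
interface TaskRecord {
  id: string;
  organizationId: string;
}

interface ConnectionRecord {
  id: string;
  organizationId: string;
}

// Throws unless the task exists and shares the connection's organization.
function assertTaskBelongsToOrg(
  task: TaskRecord | undefined,
  connection: ConnectionRecord,
): void {
  if (!task || task.organizationId !== connection.organizationId) {
    // Same error for "missing" and "wrong org" so the caller learns nothing
    // about tasks in other tenants (an assumed design choice).
    throw new Error("Task does not belong to the connection's organization");
  }
}
```

Calling the guard before any write means a wrong or forged taskId fails fast, before a run is persisted into another tenant's history.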

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
@vercel vercel Bot temporarily deployed to Preview – portal June 30, 2026 19:12 Inactive
@vercel vercel Bot temporarily deployed to Preview – app June 30, 2026 19:12 Inactive
@tofikwest


@cubic-dev-ai review it


cubic-dev-ai Bot commented Jun 30, 2026


@cubic-dev-ai review it

@tofikwest I have started the AI code review. It will take a few minutes to complete.

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 10 files

Confidence score: 3/5

  • In apps/api/src/integration-platform/services/internal-integration-debug.service.ts, the reveal path persists the real run but does not update the linked task status, so failed runs can appear in history while the task remains green/pending and operators get a false success signal. This creates a concrete workflow integrity risk—ensure reveal also resolves/synchronizes the associated task status before merging.


A revealed genuine fail persisted a 'failed' run but left the task green/pending —
a false success signal. Now a reveal that resolves to 'failed' sets the linked
task to 'failed' too (mirrors the run paths). A reveal that PASSES does not force
'done' (the task spans other checks; recomputed on the next scheduled run).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 1 file (changes from recent commits).


Comment thread apps/api/src/integration-platform/services/internal-integration-debug.service.ts Outdated
Previously the reveal flipped any non-failed task to failed, which would resurrect
a human-set not_relevant (dismissed) or in_review task. Restrict it to active
workflow statuses (todo / in_progress / done) so a reveal never overrides a
dismissed or under-review task.
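
The reveal-to-task sync rule described in this commit and the previous one can be sketched as a pure function. The status names are the ones mentioned in this thread; the function name and signature are hypothetical.

```typescript
// Task statuses named in this PR thread.
type TaskStatus =
  | "todo"
  | "in_progress"
  | "done"
  | "failed"
  | "not_relevant"
  | "in_review";

// Only these "active" statuses may be flipped to failed by a reveal.
const ACTIVE_STATUSES: readonly TaskStatus[] = ["todo", "in_progress", "done"];

function taskStatusAfterReveal(
  current: TaskStatus,
  revealed: "success" | "failed",
): TaskStatus {
  // A passing reveal never forces 'done'; the task spans other checks.
  if (revealed !== "failed") return current;
  // A failing reveal must not override a dismissed or under-review task.
  return ACTIVE_STATUSES.indexOf(current) !== -1 ? "failed" : current;
}
```

For example, a task in `in_review` stays `in_review` after a failing reveal, while a task in `done` becomes `failed`.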

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014PAsijjUQ1bMJuw8NuC1oR
@vercel vercel Bot temporarily deployed to Preview – portal June 30, 2026 19:56 Inactive
@tofikwest


@cubic-dev-ai review it


cubic-dev-ai Bot commented Jun 30, 2026


@cubic-dev-ai review it

@tofikwest I have started the AI code review. It will take a few minutes to complete.

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 10 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.


@tofikwest tofikwest merged commit 33d6ff0 into main Jun 30, 2026
11 checks passed
@tofikwest tofikwest deleted the tofik/dynamic-check-versioning branch June 30, 2026 20:04
@claudfuen


🎉 This PR is included in version 3.94.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
