Skip to content

fix(core): auto-disable Integration when credentials are invalidated#576

Merged
d-klotz merged 3 commits intonextfrom
fix/core-auto-disable-integration-on-invalid-auth
Apr 16, 2026
Merged

fix(core): auto-disable Integration when credentials are invalidated#576
d-klotz merged 3 commits intonextfrom
fix/core-auto-disable-integration-on-invalid-auth

Conversation

@d-klotz
Copy link
Copy Markdown
Contributor

@d-klotz d-klotz commented Apr 16, 2026

Summary

When OAuth2Requester.refreshAuth fails and Module.markCredentialsInvalid flips Credential.authIsValid to false, Integration.status was left as ENABLED. The queue worker only short-circuits on DISABLED, so every subsequent webhook kept getting processed, failed again, and re-entered the error loop -- producing 94k+ error lines/day for Attio integrations with dead tokens.

This PR uses Frigg's existing Delegate pattern to propagate the failure upward:

  • Module fires a new CREDENTIAL_INVALIDATED event from markCredentialsInvalid (best-effort try/catch so it never alters refreshAuth's documented return false contract)
  • IntegrationBase catches the event in a new receiveNotification method and calls updateIntegrationStatus.execute(this.id, 'ERROR')
  • The delegate wire (module.delegate = this) is installed in IntegrationBase._appendModules, covering all 7 code paths that construct Integration instances (HTTP reads, queue workers, create/update/delete flows)
  • Queue worker (backend-utils.js) now discards on both DISABLED and ERROR statuses (previously only DISABLED)

Status semantics (per Frigg team lead feedback)

Status Meaning Who sets it Queue worker Re-auth clears it
ENABLED Active, processing webhooks Initial state + Gap C re-auth Processes n/a
DISABLED User intentionally paused User action only Discards Yes (Reconnect = opt-in)
ERROR Credentials dead (system) This PR (auto-flip) Discards Yes (new creds fix it)

This is Gap B of 3 from the Attio dead-token RCA:

Recovery is handled by Gap C: ProcessAuthorizationCallback.restoreIntegrationsForEntity flips {ERROR, DISABLED} back to ENABLED on successful re-auth, so Gaps B + C are complementary.

Files changed

File Change
packages/core/modules/module.js New DLGT_CREDENTIAL_INVALIDATED constant + this.notify(...) at end of markCredentialsInvalid(), wrapped in try/catch
packages/core/integrations/integration-base.js module.delegate = this wiring in _appendModules loop + new receiveNotification() handler that flips status to ERROR
packages/core/handlers/backend-utils.js Queue worker status discard extended from === 'DISABLED' to ['DISABLED', 'ERROR'].includes(status) at both check points (lines 157, 173)

Test plan

7 unit tests across 4 files (all green, 19 suites / 171 tests full regression clean):

  • Module fires CREDENTIAL_INVALIDATED to its delegate after credential flip
  • Module with delegate=null completes silently (backward compat)
  • IntegrationBase flips to ERROR on CREDENTIAL_INVALIDATED
  • IntegrationBase ignores unknown delegate strings
  • IntegrationBase no-ops when this.id is falsy (not hydrated)
  • IntegrationBase._appendModules sets module.delegate = this for all modules
  • Queue worker discards message when integration status is ERROR

🤖 Generated with Claude Code

When OAuth2Requester.refreshAuth fails and Module.markCredentialsInvalid
flips Credential.authIsValid to false, Integration.status was left as
ENABLED. The queue worker only short-circuits on DISABLED, so every
subsequent webhook kept getting processed, failed again, and re-entered
the loop — producing 94k+ error lines/day for Attio integrations with
dead tokens.

This fix uses Frigg's existing Delegate pattern to propagate the failure
upward:

1. Module fires a new CREDENTIAL_INVALIDATED event from
   markCredentialsInvalid (best-effort, try/catch so it never alters
   refreshAuth's documented `return false` contract).

2. IntegrationBase.receiveNotification catches the event and calls
   updateIntegrationStatus.execute(this.id, 'DISABLED').

3. The delegate wire (module.delegate = this) is installed in
   IntegrationBase._appendModules, which covers all 7 code paths that
   construct Integration instances (HTTP reads, queue workers,
   create/update/delete flows).

Recovery is handled by the Gap C fix (PR #574):
ProcessAuthorizationCallback.restoreIntegrationsForEntity flips
{ERROR, DISABLED} back to ENABLED on successful re-auth.

This is Gap B of 3 from the Attio dead-token RCA. Gap A (refresh
short-circuit when refresh_token is null) is PR #575.

6 unit tests covering: delegate propagation, null-delegate backward
compat, status flip, unknown-delegate no-op, missing-id guard, and
module-delegate wiring at construction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@d-klotz d-klotz added release Create a release when this pr is merged prerelease This change is available in a prerelease. labels Apr 16, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1636cbaa51

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

console.log(
`[Frigg] Module ${notifier?.name || '?'} reported invalid credentials for integration ${this.id} — marking DISABLED`
);
await this.updateIntegrationStatus.execute(this.id, 'DISABLED');
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Restrict auto-disable to permanent auth failures

Disabling the integration unconditionally on every CREDENTIAL_INVALIDATED event will also disable it for transient refresh failures, because OAuth2Requester.refreshAuth() emits DLGT_INVALID_AUTH for any caught error (including network/5xx issues), and Module.receiveNotification() maps that directly to markCredentialsInvalid(). With this new status flip, a temporary outage in the token endpoint can now move healthy integrations to DISABLED and stop all webhook processing until manual reauth, which is a functional regression from this change path.

Useful? React with 👍 / 👎.

d-klotz and others added 2 commits April 16, 2026 12:45
…ker discard

Per Frigg team lead feedback: DISABLED is reserved for user-intent
("turn off integration, keep settings"). System-driven credential
failures should use ERROR to preserve the semantic distinction.

Changes:
- IntegrationBase.receiveNotification: DISABLED -> ERROR
- backend-utils.js queue worker: extend status discard check at
  both lines 157 and 173 from `=== 'DISABLED'` to
  `['DISABLED', 'ERROR'].includes(status)` so ERROR integrations
  are also short-circuited (no wasted webhook processing)
- Updated log messages to show actual status dynamically
- Added test for ERROR discard in integration-defined-workers

Gap C (PR #574) already resets both {ERROR, DISABLED} on re-auth
via STATUSES_RESET_ON_REAUTH — no change needed there.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gate-wiring comment

Cleans up an internal Quo-specific reference in IntegrationBase._appendModules
for upstream clarity. No behavior change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Receives notifications from modules (the Delegate pattern) when
* something integration-level needs attention. Today this catches the
* `CREDENTIAL_INVALIDATED` event Module fires from `markCredentialsInvalid`
* and flips this integration's status to DISABLED so the queue worker
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documentation inconsistency: Comment states the integration status will be flipped to "DISABLED" but the actual implementation sets it to "ERROR" (line 568). While both statuses are functionally handled the same way in backend-utils.js, this misleading comment could cause confusion during future maintenance.

Correct the comment to:

// and flips this integration's status to ERROR so the queue worker
Suggested change
* and flips this integration's status to DISABLED so the queue worker
* and flips this integration's status to ERROR so the queue worker

Spotted by Graphite

Fix in Graphite


Is this helpful? React 👍 or 👎 to let us know.

This comment came from an experimental review—please leave feedback if it was helpful/unhelpful. Learn more about experimental comments here.

@sonarqubecloud
Copy link
Copy Markdown

Quality Gate Failed Quality Gate failed

Failed conditions
20.1% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

@d-klotz d-klotz merged commit 4fbb93c into next Apr 16, 2026
5 of 8 checks passed
@d-klotz d-klotz deleted the fix/core-auto-disable-integration-on-invalid-auth branch April 16, 2026 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

prerelease This change is available in a prerelease. release Create a release when this pr is merged

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant