Added Sentry monitoring for Billing app load failures#27826
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds a billing app load monitor: BillingService gains timeout/retry constants, tracked state, lifecycle cleanup, start/timeout/retry/reload/mark-as-loaded helpers, and one-time Sentry failure reporting. GhBillingIframe starts the monitor after setting iframe src and marks the app loaded on postMessage. Unit tests cover timeout-to-retry-to-failure flow, iframe reload on retry, Sentry report contents and tagging, monitor clearing on success, and no reporting when sentry_dsn is null. 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ghost/admin/tests/unit/services/billing-test.js (1)
66-74: ⚡ Quick winAdd a regression test for “loaded once, then monitor again”
Please add a case that calls
markBillingAppLoaded(), then starts monitoring again for a new load attempt, and verifies a new timeout can be scheduled/fired.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ghost/admin/tests/unit/services/billing-test.js` around lines 66 - 74, Add a new unit test in billing-test.js that first calls service.markBillingAppLoaded(), then calls service.startBillingAppLoadMonitor() again to start a new load attempt, assert that service.billingAppLoadTimeout is set (non-null) after starting monitor, simulate/fire the timeout (or advance fake timers) and then assert that a report was created (testkit.reports() length increased) and that service.billingAppLoadTimeout is cleared (null) after the timeout handler runs; reference the existing helpers startBillingAppLoadMonitor and markBillingAppLoaded and assert on billingAppLoadTimeout and testkit.reports() to validate the regression.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ghost/admin/app/services/billing.js`:
- Around line 63-77: The current logic sets billingAppLoaded = true in
markBillingAppLoaded so startBillingAppLoadMonitor returns early on subsequent
iframe loads and never re-arms monitoring; fix by making the monitor re-armable:
either remove the billingAppLoaded early-return in startBillingAppLoadMonitor
(so it only checks billingAppLoadTimeout) or change markBillingAppLoaded to
avoid permanently setting billingAppLoaded (e.g., do not set billingAppLoaded =
true or explicitly reset it back to false after calling
clearBillingAppLoadMonitor); update references to startBillingAppLoadMonitor,
markBillingAppLoaded, clearBillingAppLoadMonitor, billingAppLoaded, and
billingAppLoadTimeout accordingly.
---
Nitpick comments:
In `@ghost/admin/tests/unit/services/billing-test.js`:
- Around line 66-74: Add a new unit test in billing-test.js that first calls
service.markBillingAppLoaded(), then calls service.startBillingAppLoadMonitor()
again to start a new load attempt, assert that service.billingAppLoadTimeout is
set (non-null) after starting monitor, simulate/fire the timeout (or advance
fake timers) and then assert that a report was created (testkit.reports() length
increased) and that service.billingAppLoadTimeout is cleared (null) after the
timeout handler runs; reference the existing helpers startBillingAppLoadMonitor
and markBillingAppLoaded and assert on billingAppLoadTimeout and
testkit.reports() to validate the regression.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 23a496b9-4329-4d08-8878-f0adf6ab29ad
📒 Files selected for processing (3)
ghost/admin/app/components/gh-billing-iframe.jsghost/admin/app/services/billing.jsghost/admin/tests/unit/services/billing-test.js
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ghost/admin/tests/unit/services/billing-test.js`:
- Around line 49-53: The test is flaky because it relies on a timing-sensitive
assertion that reportBillingAppLoadFailure is false immediately after awaiting
reloadBillingIframe; change this to assert call order instead: after awaiting
both waitUntil(...) for reloadBillingIframe and waitUntil(...) for
reportBillingAppLoadFailure, assert that reloadBillingIframe was called before
reportBillingAppLoadFailure (e.g. using
sinon.assert.callOrder(reloadBillingIframe, reportBillingAppLoadFailure) or by
comparing their firstCall.callId/timestamps) to make the expectation
deterministic; update the assertions around reloadBillingIframe,
reportBillingAppLoadFailure, and waitUntil to reflect this call-order check.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 44823a89-8349-4301-8f6b-237cac348b00
📒 Files selected for processing (2)
ghost/admin/app/services/billing.jsghost/admin/tests/unit/services/billing-test.js
d66096d to
a35b193
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@ghost/admin/app/services/billing.js`:
- Around line 68-76: The monitor currently stops being startable forever because
startBillingAppLoadMonitor() returns early when billingAppLoadFailureReported is
true; change the logic in startBillingAppLoadMonitor to not permanently block
retries by clearing or ignoring billingAppLoadFailureReported when initiating a
new monitor (e.g., reset billingAppLoadFailureReported = false before starting)
and ensure billingAppLoadTimeout and billingAppLoadAttempts are handled
consistently with billingAppLoaded; update references to
billingAppLoadFailureReported, billingAppLoadTimeout, billingAppLoaded, and
billingAppLoadAttempts in startBillingAppLoadMonitor and any failure-reporting
paths so a single reported failure does not disable future monitor attempts in
the same session.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f6eabd2f-08db-4ea5-8ed6-e62c6e5e29f1
📒 Files selected for processing (3)
ghost/admin/app/components/gh-billing-iframe.jsghost/admin/app/services/billing.jsghost/admin/tests/unit/services/billing-test.js
🚧 Files skipped from review as they are similar to previous changes (2)
- ghost/admin/app/components/gh-billing-iframe.js
- ghost/admin/tests/unit/services/billing-test.js
a34d32e to
5bf1f39
Compare
|
@aileen LGTM, added a few nits but nothing blocking |
ref [INC-284](https://app.incident.io/ghost/incidents/284) The Billing app iframe could fail to become ready without any telemetry, leaving reports of the Ghost(Pro) route not loading invisible to the team. Added a 10 second readiness monitor around `GhBillingIframe` startup that reports a standardized Sentry event when no valid Billing iframe message arrives, while clearing the monitor once the app communicates successfully.
ref [INC-284](https://app.incident.io/ghost/incidents/284) The Billing app readiness monitor now performs one iframe reload after a short backoff before reporting a failure to Sentry. This avoids reporting transient startup failures while still surfacing persistent cases where the Billing app never sends a valid message back to Admin.
ref [INC-284](https://app.incident.io/ghost/incidents/284) The Billing app load monitor now re-arms cleanly after a previous readiness message so a later iframe load can still be observed. The retry test now asserts call order between iframe reload and failure reporting, which avoids relying on timing-sensitive intermediate state.
ref [INC-284](https://app.incident.io/ghost/incidents/284) The Billing app load monitor now treats a previous failure report as completed state rather than a permanent block, so a later iframe load in the same session can be monitored again. Added a regression test covering the failure-to-new-load path.
5bf1f39 to
f40de34
Compare
ref [INC-284](https://app.incident.io/ghost/incidents/284) The Billing app load failure report now uses the standardized `billing_monitor` Sentry context and is captured as an exception so it matches the updated monitoring catalog. Updated the billing service test to assert the new context shape.
Summary
ref INC-284
Testing
corepack pnpm exec eslint app/services/billing.js app/components/gh-billing-iframe.js tests/unit/services/billing-test.js --cachecorepack pnpm exec ember exam --file-path tests/unit/services/billing-test.js --split 1 --parallel 1