feat(subscriptions): gate AI summaries on AI credit budget #59625
vdekrijger wants to merge 3 commits into
AI summary generation called the LLM with no budget check. Adopt the chat assistant's pattern (ee/api/conversation.py): is_team_limited(AI_CREDITS) via the billing quota-limiting cache. Enable-time gate in the serializer raises QuotaLimitExceeded when toggling a summary on while over budget; generation-time gate in the snapshot activity skips the summary and delivers the rest (graceful degradation) so an org over budget stops consuming credits mid-month.
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
ee/api/subscription.py:289-292
The `get_organization()` call (which may execute a DB query) happens before `_validate_summary_credit_budget()`, the cheaper Redis/cache check. If the budget check raises `QuotaLimitExceeded`, the org fetch was wasted. Placing the fast guard first avoids unnecessary work.
```suggestion
if self._is_becoming_active_summary(attrs):
    self._validate_summary_credit_budget()
    organization = self.context["get_organization"]()
    self._validate_summary_enabled_org_limit(organization)
```
Address review feedback on the AI-summary credit gate:
- Surface an over-budget skip to recipients: thread a summary_skipped_over_budget flag from the snapshot activity through the workflow and delivery activity into the email template and Slack message, rendering a notice only when the summary was skipped specifically due to the AI credit budget.
- Run the cheap quota cache check before the org DB fetch in the enable-time gate.
- Add SUBSCRIPTION_SUMMARY_CREDIT_CHECK_FAILED counter on the fail-open path.
- Parameterize the credit-budget tests; assert no notice on disabled/no-consent skips.
- Simplify comments to focus on the why.
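As a rough illustration of the notice rule described above, the sketch below renders a notice only when the skip reason is the AI credit budget. Only the summary_skipped_over_budget flag name comes from this PR; the SkipReason enum, render_message, and the notice wording are hypothetical stand-ins, not the actual template or Slack code.

```python
from enum import Enum
from typing import Optional


class SkipReason(Enum):
    # Hypothetical reasons a summary might be skipped; names are illustrative.
    DISABLED = "disabled"
    NO_CONSENT = "no_consent"
    OVER_BUDGET = "over_budget"


# Hypothetical notice wording, not the text used in the real templates.
OVER_BUDGET_NOTICE = "AI summary skipped: the AI credit budget has been reached."


def summary_skipped_over_budget(reason: Optional[SkipReason]) -> bool:
    """Derive the flag that is threaded from the snapshot step to delivery."""
    return reason is SkipReason.OVER_BUDGET


def render_message(title: str, summary: Optional[str], skipped_over_budget: bool) -> str:
    """Build the delivery body; the notice appears only for over-budget skips."""
    lines = [title]
    if summary:
        lines.append(summary)
    elif skipped_over_budget:
        lines.append(OVER_BUDGET_NOTICE)
    return "\n".join(lines)
```

Carrying a single boolean rather than the full reason keeps the delivery side unaware of the other skip cases, so disabled and no-consent skips produce no notice by construction.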

Problem
Subscription AI summaries call an LLM on every delivery with no budget check. Now that subscriptions are available on the free tier (#59624), an org could keep generating summaries while over its AI credit budget. We should stop generating once an org is over budget — the same way the chat assistant already behaves.
Changes
Adopt the chat assistant's enforcement pattern (ee/api/conversation.py): is_team_limited(QuotaResource.AI_CREDITS), which reads the billing quota-limiting cache (not the retrospective usage query).
- Enable-time gate (SubscriptionSerializer.validate): raises QuotaLimitExceeded (402) when a summary is being toggled on while the org is over budget. Reuses the existing "becoming active" check, so grandfathered summaries and unrelated edits are unaffected.
- Generation-time gate (snapshot_subscription_insights): when over budget, skips the summary and delivers the rest of the subscription (graceful degradation) rather than failing the delivery. Fail-open: if the quota lookup itself errors, the summary is generated rather than silently dropped.
How did you test this code?
I'm an agent (Claude Code) — automated suites I ran locally:
- ee/api/test/test_subscription.py: enable-time gate over/under budget, the patch-transition case, and the grandfathering case (unrelated edits while over budget aren't blocked).
- posthog/temporal/subscriptions/test_snapshot_ai_consent.py: generation-time skip when over budget, generate when under, and fail-open when the quota check raises.
Publish to changelog?
No — internal cost-control guard, no user-facing feature.
Docs update
n/a
🤖 Agent context
Authored with Claude Code (human-directed). Stacked on #59624.
Investigation that shaped this: the AI credit budget shown in /usage is computed by a retrospective ClickHouse query and is display-only. The actual enforcement signal is is_team_limited(AI_CREDITS), populated in a Redis set by the billing quota-limiting system and (until this PR) used only by the Max chat assistant. We reuse that chokepoint rather than introducing a bespoke check, so this is the second adopter of an established pattern, not new billing infrastructure. The lookup is cheap (in-process-cached set membership), so it's safe in the per-delivery path. Behaviour differs by context deliberately: a hard 402 at enable-time (API), graceful skip-and-deliver at generation-time (background workflow).
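The two behaviours can be sketched side by side. This is a minimal, self-contained illustration, not the code in this PR: QuotaResource and QuotaLimitExceeded are local stand-ins for the real classes, the quota check is injected as a callable with an assumed (team_token, resource) signature, and validate_summary_enable, snapshot_with_summary, and SnapshotResult are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class QuotaResource(Enum):
    # Stand-in for the billing enum; only the member used here.
    AI_CREDITS = "ai_credits"


class QuotaLimitExceeded(Exception):
    """Stand-in for the API exception that maps to HTTP 402."""

    status_code = 402


# Assumed shape of the quota lookup: (team_token, resource) -> is limited?
QuotaCheck = Callable[[str, QuotaResource], bool]


def validate_summary_enable(team_token: str, is_limited: QuotaCheck) -> None:
    """Enable-time gate: reject the API request outright when over budget."""
    if is_limited(team_token, QuotaResource.AI_CREDITS):
        raise QuotaLimitExceeded("AI credit budget exceeded")


@dataclass
class SnapshotResult:
    summary: Optional[str]
    summary_skipped_over_budget: bool


def snapshot_with_summary(
    team_token: str,
    is_limited: QuotaCheck,
    generate_summary: Callable[[], str],
) -> SnapshotResult:
    """Generation-time gate: skip the summary but keep delivering the rest."""
    try:
        over_budget = is_limited(team_token, QuotaResource.AI_CREDITS)
    except Exception:
        # Fail open: a broken quota lookup must not silently drop summaries.
        over_budget = False
    if over_budget:
        return SnapshotResult(summary=None, summary_skipped_over_budget=True)
    return SnapshotResult(summary=generate_summary(), summary_skipped_over_budget=False)
```

Injecting the check keeps both gates on the same signal while letting each caller choose its own failure mode: an exception for the synchronous API path, a flag on the result for the background path.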