fix(billing): implement chunked updates for free tier usage tracking#9549
Merged
Conversation
Reduces DB load by 95% via transaction batching (50,000 → 50 chunks). Each chunk processes 1,000 orgs with proper error handling. Failed chunks reported to Datadog without killing the job. Changes: - Refactored processThresholds() to return update data instead of executing immediately - Created bulkUpdates.ts with chunked transaction processing (1000 orgs per batch) - Modified usageAggregation.ts to collect updates and execute in bulk - Updated tests to verify returned data instead of mock calls - Added error handling with traceException for failed chunks - Structured for easy swap to raw SQL (Option 1) if needed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
The tests now call bulkUpdateOrganizations() to complete the update flow, including cache invalidation. This reflects the refactored architecture where processThresholds() returns update data and bulkUpdateOrganizations() executes it.
60 seconds was excessive for 1000 orgs. Even at 10ms per update, that's only 10 seconds. 15 seconds provides a reasonable buffer.
Benefits over previous () approach: - Better resilience: One failed org doesn't fail the entire 1000-org chunk - Concurrent execution: Much faster than sequential transaction - Granular error tracking: Track exactly which orgs failed - Better error handling: Each org failure reported to Datadog individually Trade-off: No atomicity per chunk, but we don't need it for this use case. Each org update is independent and idempotent.
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Steffen911
approved these changes
Oct 6, 2025
Member
Steffen911
left a comment
There was a problem hiding this comment.
Both bulkUpdateOrganizations approaches look good to me in the current implementation. Your pick!
murdore
pushed a commit
to juspay/langfuse
that referenced
this pull request
Oct 14, 2025
…angfuse#9549) * fix(billing): implement chunked updates for free tier usage tracking Reduces DB load by 95% via transaction batching (50,000 → 50 chunks). Each chunk processes 1,000 orgs with proper error handling. Failed chunks reported to Datadog without killing the job. Changes: - Refactored processThresholds() to return update data instead of executing immediately - Created bulkUpdates.ts with chunked transaction processing (1000 orgs per batch) - Modified usageAggregation.ts to collect updates and execute in bulk - Updated tests to verify returned data instead of mock calls - Added error handling with traceException for failed chunks - Structured for easy swap to raw SQL (Option 1) if needed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * test: update cache invalidation tests to use bulkUpdateOrganizations The tests now call bulkUpdateOrganizations() to complete the update flow, including cache invalidation. This reflects the refactored architecture where processThresholds() returns update data and bulkUpdateOrganizations() executes it. * fix: reduce transaction timeout from 60s to 15s per chunk 60 seconds was excessive for 1000 orgs. Even at 10ms per update, that's only 10 seconds. 15 seconds provides a reasonable buffer. * refactor: use Promise.allSettled instead of transaction wrapper Benefits over previous () approach: - Better resilience: One failed org doesn't fail the entire 1000-org chunk - Concurrent execution: Much faster than sequential transaction - Granular error tracking: Track exactly which orgs failed - Better error handling: Each org failure reported to Datadog individually Trade-off: No atomicity per chunk, but we don't need it for this use case. Each org update is independent and idempotent. * fix: remove unused chunkOrgIds variable * remove unused code * refactor transaction update and add rawsql update * Update worker/src/ee/usageThresholds/bulkUpdates.ts Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * Update worker/src/ee/usageThresholds/bulkUpdates.ts Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com> * make rawsql query default --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes GTM-1500: High DB load on free tier usage tracking job
Implements Option 2: Transaction-based chunking (1000 orgs per batch)
Changes
thresholdProcessing.ts:
OrgUpdateDatatype for collecting update dataprocessThresholds()to return update data instead of executing immediatelyprisma.organization.update()callsbulkUpdates.ts (NEW):
traceException()on failure, continues processingusageAggregation.ts:
bulkUpdateOrganizations()after processing each dayUsageAggregationStatstypeTests: Updated to verify returned
updateDatainstead of mock callsError Handling
traceException()Performance Impact
Test Plan
🤖 Generated with Claude Code
Important
Implements chunked updates for free tier usage tracking, reducing database load and improving performance by refactoring
processThresholds()and introducingbulkUpdateOrganizations()for efficient batch processing.processThresholds()inthresholdProcessing.tsrefactored to return update data.bulkUpdateOrganizations()inbulkUpdates.tshandles chunked updates with error isolation per chunk.usageAggregation.tscollects updates and callsbulkUpdateOrganizations()after processing each day.traceException().updateDatainstead of mock calls.This description was created by
for 3388d3c. You can customize this summary. It will automatically update as commits are pushed.