fix(transport): drop rate limited events #953
Open
+475
−35

Description
We've been seeing OOMs in our service pods when we receive larger-than-normal bursts of traffic.

We tracked the increased memory usage to the Sentry.Transport.Sender processes, whose mailboxes get backed up when the sender is rate limited by Sentry.

The current implementation of the sender ignores Sentry's rate limit headers until a 429 status is received. At that point, each sender process sleeps until the "Retry-After" period has ended. While the process is sleeping, its mailbox continues to fill up with more events to send.
This pull request creates a new Sentry.Transport.RateLimiter backed by ETS, which stores the current rate limits for the categories specified in Sentry's rate limit response header. While the category of a message is under a rate limit, the client now drops the message, as recommended by Sentry's rate limiting docs. This should prevent pileup in the process mailboxes. Because rate limits are now tracked per category, other categories can still get through while one is rate limited, so less data is lost going forward.

AI Disclosure: I used Claude Code for an initial pass on this work, before thoroughly reviewing and refining it manually.
I'm happy to make or receive any changes requested to help get this through.
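
For reviewers who want the idea at a glance, here is a minimal sketch of per-category rate limit tracking in ETS. It is not the code in this PR: the module name `RateLimiterSketch`, its function names, and the `:all` key are hypothetical, and it assumes the header value follows the `retry_after:categories:scope` layout documented for Sentry's `X-Sentry-Rate-Limits` header (limits separated by commas, categories by semicolons, an empty category list meaning every category).

```elixir
defmodule RateLimiterSketch do
  # Hypothetical illustration only; not the module added by this PR.
  @table :rate_limiter_sketch

  # Creates the named ETS table if it does not exist yet.
  def init do
    if :ets.whereis(@table) == :undefined do
      :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
    end

    :ok
  end

  # Parses a rate limit header value and stores an expiry timestamp
  # (in seconds) for each listed category. Malformed entries are skipped.
  def update(header, now \\ System.system_time(:second)) when is_binary(header) do
    for limit <- String.split(header, ",", trim: true),
        [retry_after, categories | _] <- [limit |> String.trim() |> String.split(":")],
        {seconds, ""} <- [Integer.parse(retry_after)],
        category <- parse_categories(categories) do
      :ets.insert(@table, {category, now + seconds})
    end

    :ok
  end

  # Returns true if the category, or every category, is currently limited.
  def rate_limited?(category, now \\ System.system_time(:second)) do
    Enum.any?([category, :all], fn key ->
      case :ets.lookup(@table, key) do
        [{^key, expires_at}] -> expires_at > now
        [] -> false
      end
    end)
  end

  # An empty category list applies the limit to all categories.
  defp parse_categories(""), do: [:all]
  defp parse_categories(categories), do: String.split(categories, ";", trim: true)
end

RateLimiterSketch.init()
RateLimiterSketch.update("60:transaction:key, 2700:default;error;security:organization", 1_000)

# "error" is limited until t = 3700, so an error event at t = 1500 is dropped.
RateLimiterSketch.rate_limited?("error", 1_500)
# "transaction" was limited only until t = 1060, so at t = 1100 it can be sent.
RateLimiterSketch.rate_limited?("transaction", 1_100)
```

With a check like this in front of the send path, a limited event is dropped before it reaches a sender's mailbox, while categories without an active limit are sent as usual.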
Issues