
Conversation


dsaxton-1password commented Nov 3, 2025

Summary

Currently, `Workers.flush` doesn't respect `flush_queue_size`, which can cause a large number of events to be sent to Amplitude in a single request when this method is called. This can then trigger errors such as HTTP 413 (Content Too Large). The change here updates the method to send the data in batches of at most `flush_queue_size` events in order to prevent this.

This modifies the return type of `Workers.flush` from `Future` to `List[Future]`, which I believe only impacts the behavior in `Timeline.flush`, but I could be wrong there.
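The batched flush described above might look roughly like the following sketch. This is illustrative only: `flush_in_batches`, the `send` callable, and the executor wiring are assumptions for the example, not the SDK's actual internals.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List, Sequence

def flush_in_batches(
    events: Sequence[dict],
    flush_queue_size: int,
    send: Callable[[Sequence[dict]], None],
    executor: ThreadPoolExecutor,
) -> List[Future]:
    # Split the queue into chunks of at most flush_queue_size events and
    # submit one send task per chunk, so no single request payload can
    # grow large enough to trigger a 413 from the server. Returning a
    # list of futures (one per batch) is what changes the return type.
    futures: List[Future] = []
    for start in range(0, len(events), flush_queue_size):
        batch = events[start:start + flush_queue_size]
        futures.append(executor.submit(send, batch))
    return futures
```

A caller that previously waited on the single `Future` would instead wait on each future in the returned list.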

Checklist

  • Does your PR title have the correct title format?
  • Does your PR have a breaking change?: Hopefully not

dsaxton-1password changed the title from "Respect flush_queue_size when calling Workers.flush" to "fix: Respect flush_queue_size when calling Workers.flush" on Nov 3, 2025
sojingle (Contributor) commented:

Hi @dsaxton-1password,
Flush is not intended to be called frequently. Normally, with the buffer consumer, events are sent automatically and periodically based on `flush_interval_millis` and `flush_queue_size`. Do you have scenarios where you need to call flush frequently?


dsaxton-1password (Author) commented Nov 14, 2025

> Hi @dsaxton-1password, Flush is not intended to be called frequently. Normally, with buffer_consumer, events are automatically sent periodically based on flush_interval_millis and flush_queue_size. Do you have scenarios where you need to call flush frequently?

Not frequently, but the way our event consumer logic works, there is a need to periodically "checkpoint" our progress before continuing. To do this, we explicitly call `Client.flush` to ensure all the events are processed before checkpointing.

However, since there is no guarantee that the total queue has a reasonable size when we do this (we could be ingesting events much faster than they are flushed in the background), we can end up sending an exceedingly large payload that then fails. It's also worth noting that we already set both the max flush queue size and the flush interval when we initialize our client, but the issue happens nonetheless.

The way I see it, there shouldn't be any disadvantage to sending the data in batches when someone calls flush (especially if the batches are the same size as what normally gets sent in the background), but there may be a benefit.
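The checkpoint pattern described above can be sketched in pure Python. Both `BufferingClient` and `process_with_checkpoints` are hypothetical stand-ins for this discussion, not the Amplitude SDK's API; the `flush` method shows the batching this PR proposes.

```python
from typing import Callable, List

class BufferingClient:
    """Hypothetical stand-in for an analytics client that buffers events."""

    def __init__(self, flush_queue_size: int):
        self.flush_queue_size = flush_queue_size
        self.queue: List[dict] = []
        self.sent_batches: List[List[dict]] = []

    def track(self, event: dict) -> None:
        self.queue.append(event)

    def flush(self) -> None:
        # Drain the entire queue, but send it in chunks of at most
        # flush_queue_size events, mirroring the batching in this PR,
        # rather than one arbitrarily large payload.
        while self.queue:
            batch = self.queue[:self.flush_queue_size]
            del self.queue[:self.flush_queue_size]
            self.sent_batches.append(batch)

def process_with_checkpoints(records: List[dict], client: BufferingClient,
                             checkpoint_every: int,
                             checkpoint: Callable[[int], None]) -> None:
    # Flush everything before recording progress, so the checkpoint never
    # claims events were handled while they still sit in the buffer.
    for i, record in enumerate(records, start=1):
        client.track(record)
        if i % checkpoint_every == 0:
            client.flush()
            checkpoint(i)
```

If ingestion outpaces background flushing, the queue at checkpoint time can be arbitrarily large, which is why bounding each batch matters here.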


sojingle commented Nov 19, 2025

Hi @dsaxton-1password,
Thank you for providing the additional information and this PR. Changing the API's return type would break compatibility, so I rewrote the fix based on your suggestion. Please help review and validate the PR in your service: #64. Thanks!
