feat(insights): add new batch worker for event processing #201
Conversation
* Make sure we are passing jsonl
* Log successful API request
* Fix ts (it was reporting seconds since epoch)
* Also fix bug in events worker
I forgot that configure is kinda lazy, so we should make sure the events_worker has the correct connection & config after it changes.
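A minimal sketch of what that re-sync might look like. Note that `Client`, `EventsWorker`, and `Config` here are illustrative stand-ins for this comment, not the package's actual classes:

```python
class Config:
    """Illustrative config holder (not the real honeybadger Config)."""

    def __init__(self, **opts):
        self.options = dict(opts)

    def update(self, opts):
        self.options.update(opts)


class EventsWorker:
    """Illustrative worker that caches its connection and config."""

    def __init__(self, connection, config):
        self.connection = connection
        self.config = config


class Client:
    def __init__(self, connection, config):
        self.connection = connection
        self.config = config
        self.events_worker = EventsWorker(connection, config)

    def configure(self, connection=None, **options):
        # configure() can run after the worker was created, so push the
        # fresh connection/config back into the worker instead of relying
        # on the values captured at construction time.
        if connection is not None:
            self.connection = connection
        self.config.update(options)
        self.events_worker.connection = self.connection
        self.events_worker.config = self.config
```

The point is only that a late `configure()` call must propagate into the already-constructed worker.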
Pull Request Overview
This PR adds a new asynchronous EventsWorker to batch and send events to Honeybadger Insights, and it makes minor refactorings and type improvements throughout the codebase. Key changes include:
- Introduction of the EventsWorker and associated types, configuration options, and connection protocol updates.
- Updates to tests and integrations (Flask, Django, ASGI, AWS Lambda) to support the new events workflow.
- Minor refactors in logging, naming, and type annotations across several modules.
Reviewed Changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated no comments.
File | Description
---|---
honeybadger/version.py | Updated version string quote style.
honeybadger/types.py | Added events-related types and enums.
honeybadger/tests/* | Updated tests to validate new events handling and add type ignores where needed.
honeybadger/protocols.py | Introduced the Connection protocol including send_events.
honeybadger/fake_connection.py | Renamed send_event to send_events and updated logging messages.
honeybadger/events_worker.py | New file implementing asynchronous batching and sending of events.
honeybadger/core.py | Updated event method to integrate the EventsWorker and use updated timestamp.
honeybadger/contrib/* | Minor adjustments and type ignores to support modern language features.
honeybadger/connection.py | Updated request sender to return structured events send results.
honeybadger/config.py | Added new events_worker configuration options with default values.
dev-requirements.txt | Added additional type checking dependencies.
README.md | Updated documentation to reflect the new events_worker configuration options.
Comments suppressed due to low confidence (1)
honeybadger/config.py:30
- The conversion of boolean options using bool(val) may yield unexpected results because non-empty strings (e.g., 'False') evaluate to True. Consider implementing a more robust boolean converter that explicitly checks for string values 'true' or 'false'.
`elif option_types[option] is bool:`
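One possible converter along the lines Copilot suggests (a sketch, not the PR's actual code) would treat the strings "false" and "0" as falsy instead of relying on Python's truthiness:

```python
def to_bool(val):
    """Convert a config value to bool, explicitly handling string values
    so that e.g. 'False' from an environment variable is not truthy."""
    if isinstance(val, str):
        return val.strip().lower() in ("true", "1", "yes", "on")
    return bool(val)
```

This avoids the `bool('False') == True` pitfall for values read from environment variables.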
This looks good to me!
I had a run through this, but not in great detail because I have little to no multi-threading experience in Python. I'll probably give it another go, but here's some initial feedback.
One issue that we've had in other languages (PHP, Node.js) is that our package may run inside an AWS Lambda function:
The process may exit while there are events in the queue. Therefore we should be able to hook into the shutdown flow and flush the queue if necessary. Do you think we should handle this scenario here as well?
Since Lambda uses TERM for the shutdown signal, and atexit does not cover the TERM signal, this change listens for TERM and does a proper exit (which then triggers atexit). It sounds like there is not really a perfect way to handle this, but if there are any other signal callbacks registered, then we assume that some other listener will handle a proper shutdown and delegate to that.
Thanks for the review @subzero10. This works for the few tests that I ran it through, but as I am finding with many things in the Python instrumentation space, despite PEP 20, there is not one easy way to detect an exiting process :)
This adds an EventsWorker, with similar logic to the one added for the Elixir client.

It gathers batches of events, and if it either crosses the `events_timeout` or the queue length passes `events_batch_size`, it will attempt to send the batch. Each batch gets 3 tries before being dropped, and the tries are counted independently per batch. If the backend returns a throttling response (429), the events worker will wait for `events_throttle_wait` before attempting to send a batch payload. Batches will be dropped in FIFO order after `events_max_batch_retries` attempts. This is also the case with throttling, to keep the set of events fresh for when the backend resets its throttle (or the user upgrades).

We drop events after `events_max_queue_size` has been reached (including the counts of events that are waiting in batches) and periodically log the dropped counts.

On top of adding the events worker, I added some small fixes, rewordings, and QoL things like types and protocols. I added a mypy check to pytest so types are checked as part of the test runs.