Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resource leaks, timeouts when using the default "thread" reporting mechanism: new thread created for each message #412

Closed
bsmedberg-xometry opened this issue Jul 21, 2022 · 8 comments · Fixed by #416
Assignees

Comments

@bsmedberg-xometry
Copy link

We tried adopting pyrollbar into our production applications yesterday, and had to disable it for causing timeouts and other resource issues.

After investigation, I believe that the problem is in the way the default thread reporting mechanism works:

  1. upon report, it launches a completely new thread for each item that is submitted:
    thread = threading.Thread(target=_send_payload, args=(payload_str, access_token))
  2. it stores references to these threads in a global Queue object _threads
  3. when the thread runs, it pulls something off the _threads Queue... but there's no guarantee that it pull itself off the queue; it could be pulling any thread off the queue!

The purpose of the global _threads queue is to allow the code to .join() the queue at shutdown (although this mechanism is only used for lambdas).

I'm pretty sure that this mechanism is leaking either actual OS threads or (more likely) python objects and resources like mutexes; I'm not 100% sure how, but the fact that threads don't pull themself out of the queue is likely the problem.

In any case, this threading mechanism is very inefficient: individual threads a heavyweight mechanism for queuing small tasks. It would be much better to have a single background thread or a small background thread pool that processes a queue of items to post; this work thread(s) poll/block on the queue and post items.

@danielmorell
Copy link
Collaborator

Let me investigate this a bit more. But initially, I think using a tread pool might be the best option.

@danielmorell danielmorell self-assigned this Jul 22, 2022
@ghost
Copy link

ghost commented Nov 4, 2022

The fix is on its way, but it takes some time as touching the thread pool involves some risk. Alternatively, you can try using our httpx or async handler. Using one of those would entirely bypass the new thread issue. The handler can be set by passing the handler argument to the init() function: rollbar.init(handler='async')

@danielmorell
Copy link
Collaborator

danielmorell commented Nov 28, 2022

Hey @bsmedberg-xometry, the beta release with the fix for this is out https://pypi.org/project/rollbar/0.16.4b0/. Please test it out and let us know how it performs.

@bsmedberg-xometry
Copy link
Author

Thank you! We can probably get this into our stage environment very quickly, but probably can't test under load until next week after month close.

@danielmorell
Copy link
Collaborator

That makes a lot of sense. If you have not read the write up in the PR, I would highly recommend it. It explains the impact of your Python version on the default number of worker threads in the pool.

If you have any issues, thoughts, or discoveries please let me know. I am happy to help!

@ghost
Copy link

ghost commented Jan 12, 2023

Hey @bsmedberg-xometry, We just had an internal discussion where I was informed that the threading error didn't get resolved with v0.16.4b0. Can you share more details on what went wrong exactly? We're happy to investigate this issue further.

@ghost ghost reopened this Jan 12, 2023
@danielmorell danielmorell mentioned this issue Mar 2, 2023
12 tasks
@ghost
Copy link

ghost commented Jun 27, 2023

We released a new batch payload transformer with v0.16.4b1, that should resolve this issue. @bsmedberg-xometry Cany you test and validate if it works for you? Let me know if you need any help.

@danielmorell
Copy link
Collaborator

This was resolved a while ago.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants