Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: memory leak when publishing messages #406

Merged
merged 3 commits into from
May 6, 2021
Merged

Conversation

plamut
Copy link
Contributor

@plamut plamut commented Apr 30, 2021

Fixes #395.

If publish threads are marked as daemonic, the leak seemingly disappears.

How to verify

Set up a topic and a subscription, install memoryprofiler. Add the @profile decorator to the main() function from the sample code and run it (with and without the fix):

$ mprof run sample.py

After each run, plot the memory usage graph:

$ mprof plot

The graphs should look similar to the ones posted here.

PR checklist

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

If publish threads are marked as daemonic, the leak seemingly
disappears.
@plamut plamut requested a review from pradn April 30, 2021 12:17
@plamut plamut requested a review from a team as a code owner April 30, 2021 12:17
@product-auto-label product-auto-label bot added the api: pubsub Issues related to the googleapis/python-pubsub API. label Apr 30, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Apr 30, 2021
@plamut
Copy link
Contributor Author

plamut commented Apr 30, 2021

Server 500 when running samples, re-starting.

@plamut plamut added kokoro:force-run Add this label to force Kokoro to re-run the tests. kokoro:run Add this label to force Kokoro to re-run the tests. labels Apr 30, 2021
@yoshi-kokoro yoshi-kokoro removed kokoro:run Add this label to force Kokoro to re-run the tests. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Apr 30, 2021
@plamut plamut changed the title fix: publisher memory leak fix: memory leak when publishing messages Apr 30, 2021
Copy link
Contributor Author

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a true cause of the issue, it's a bug in CPython.

google/cloud/pubsub_v1/publisher/_batch/thread.py Outdated Show resolved Hide resolved
google/cloud/pubsub_v1/publisher/client.py Outdated Show resolved Hide resolved
The additional linked comment explains which `CPython` issue is the root cause of this.
Copy link
Contributor Author

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix typo

google/cloud/pubsub_v1/publisher/_batch/thread.py Outdated Show resolved Hide resolved
google/cloud/pubsub_v1/publisher/client.py Outdated Show resolved Hide resolved
@plamut plamut added the release blocking Required feature/issue must be fixed prior to next release. label May 5, 2021
@pradn
Copy link
Contributor

pradn commented May 6, 2021

The fix is good, but I wonder if we'd run into issues by setting these threads as daemons.

The entire Python program exits when no alive non-daemon threads are left.

So, this would mean that scripts that may have created a publisher client, published a bunch of messages, and had only exited when the commit threads (non-daemon) finished running - these scripts could now terminate earlier. However, this may not be a problem, because if users want to guarantee publishes happened, they need to wait for the publish futures to finish with success.

Anyway, just wondering what your thoughts were here.

@pradn
Copy link
Contributor

pradn commented May 6, 2021

Peter and I spoke offline: possible premature script termination b/c the threads are no daemons doesn't break a customer guarantee, because as long as the user hasn't received a completed future, they can't assume the publish succeeded.

Copy link
Contributor

@pradn pradn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@plamut plamut merged commit 3976580 into googleapis:master May 6, 2021
@plamut plamut deleted the iss-395 branch May 6, 2021 20:12
@plamut
Copy link
Contributor Author

plamut commented May 6, 2021

Since it's already late into the workweek, I'll cut a release containing this on Monday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: pubsub Issues related to the googleapis/python-pubsub API. cla: yes This human has signed the Contributor License Agreement. release blocking Required feature/issue must be fixed prior to next release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When I publish a data to pubsub topic, I have a memory leak on kubernetes pod.
3 participants