Fix leak of iopub object in activity monitoring #3424

Merged: takluyver merged 1 commit into jupyter:master from kevin-bates:fix-leak-iopub on Apr 3, 2018

kevin-bates commented Mar 14, 2018

After analyzing various leaked items when running either Notebook or
Jupyter Kernel Gateway, one item that recurred across each kernel
startup and shutdown sequence was a zmq socket instance associated
with the iopub port. The leaked instance occurs because the activity
monitor and normal port creation logic both call create_iopub.

Although this ends up creating a wrapper around the same port, the
object (i.e., wrapper) created by the activity monitor is leaked. By
setting the kernel manager's _activity_stream member to None when
the kernel is shut down, cleanup of the iopub wrapper object will take
place.
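
A minimal sketch of the shape of this change, using toy classes rather than the real notebook code. `ToyStream`, `ToyKernelManager`, and `ToyMultiKernelManager` are hypothetical stand-ins; only the names `create_iopub` and `_activity_stream` come from the description above. It assumes CPython's reference counting, where an object is released as soon as its last reference goes away.

```python
import weakref


class ToyStream:
    """Hypothetical stand-in for the wrapper around the iopub port."""

    def __init__(self, port):
        self.port = port


class ToyKernelManager:
    """Hypothetical stand-in for a kernel manager."""

    def __init__(self, iopub_port):
        self.iopub_port = iopub_port
        self._activity_stream = None

    def create_iopub(self):
        # Every call builds a new wrapper object around the same port.
        return ToyStream(self.iopub_port)


class ToyMultiKernelManager:
    """Hypothetical stand-in for the manager that tracks kernels by id."""

    def __init__(self):
        self._kernels = {}

    def start_kernel(self, kernel_id, iopub_port):
        km = ToyKernelManager(iopub_port)
        # Activity monitoring creates its own wrapper.
        km._activity_stream = km.create_iopub()
        self._kernels[kernel_id] = km
        return km

    def shutdown_kernel(self, kernel_id):
        km = self._kernels.pop(kernel_id)
        # The shape of the fix: drop the monitor's wrapper explicitly.
        km._activity_stream = None


manager = ToyMultiKernelManager()
km = manager.start_kernel("k1", 5555)
normal_stream = km.create_iopub()  # the "normal port creation" path
monitor_ref = weakref.ref(km._activity_stream)

same_port = normal_stream.port == km._activity_stream.port
distinct_wrappers = normal_stream is not km._activity_stream

manager.shutdown_kernel("k1")
monitor_stream_released = monitor_ref() is None
```

The weak reference lets the sketch observe whether the monitor's wrapper has been released without itself keeping the wrapper alive.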

Fix leak of iopub object in activity monitoring
After analyzing various leaked items when running either Notebook or
Jupyter Kernel Gateway, one item that recurred across each kernel
startup and shutdown sequence was a zmq socket instance associated
with the iopub port.  The leaked instance occurs because the activity
monitor and normal port creation logic both call `create_iopub`.

Although this ends up creating a _wrapper_ around the same port, the
object (i.e., wrapper) created by the activity monitor is leaked.  By
setting the kernel manager's `_activity_stream` member to `None` when
the kernel is shut down, cleanup of the iopub wrapper object will take
place.

kevin-bates added a commit to kevin-bates/enterprise_gateway that referenced this pull request Mar 14, 2018

Trap exceptions during shutdown of comm port, fix process proxy leak
PR jupyter#279 introduced some fixes to address file descriptor leaks - one of
which was to shut down the communication port.  Since the kernel launcher
listening on the other side may have already terminated, the shutdown
method could throw an exception.  This change catches, logs, then ignores
such exceptions.
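
A rough sketch of the catch, log, then ignore pattern described above, using only the standard library. The helper name `shutdown_quietly` and the logger name are hypothetical; this is not the enterprise_gateway code, just one way the idea could look.

```python
import logging
import socket

log = logging.getLogger("comm_port_demo")  # hypothetical logger name


def shutdown_quietly(sock):
    """Try to shut down a socket; log and ignore failures.

    Returns True if the shutdown call succeeded, False if it raised.
    """
    try:
        sock.shutdown(socket.SHUT_WR)
        return True
    except OSError as exc:
        # The peer may already be gone, so this is logged, not re-raised.
        log.warning("Ignoring error during comm port shutdown: %s", exc)
        return False
    finally:
        sock.close()


# Simulate a socket whose shutdown fails: it has already been closed.
dead_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
dead_sock.close()
shutdown_ok = shutdown_quietly(dead_sock)
```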

While looking into other leaks (in this case memory), it was discovered
that the process proxy instance was being leaked across kernel cycles.
This change addresses that particular leak.

Note: Other PRs have also been submitted to address leaks in
`jupyter_client` and `notebook`.  These are:

[PR 360 - Fix memory leak of kernel Popen object](jupyter/jupyter_client#360)
[PR 361 - Fix memory leak of IOLoopKernelManager object](jupyter/jupyter_client#361)
[PR 3424 - Fix memory leak of iopub object in activity monitoring](jupyter/notebook#3424)

takluyver commented Mar 15, 2018

I don't immediately see why this would result in a leak; the kernel object itself is deleted shortly afterwards (the super(...).shutdown_kernel() call calls .remove_kernel(), which removes it from the {id: kernel} dictionary). The kernel object (which is a KernelManager instance) should be garbage collected, and its _activity_stream should be cleaned up.

What am I missing?
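
For reference, a small sketch of the general CPython behaviour behind this expectation. The `Kernel` and `Stream` classes here are hypothetical toys, and the reference cycle is an assumption made purely for illustration; it is not a diagnosis of what happens in notebook. Without a cycle, removing the object from the dictionary frees it immediately; with a cycle, it survives until the cyclic collector runs.

```python
import gc
import weakref


class Stream:
    """Hypothetical stream that can hold a callback."""

    def __init__(self):
        self.callback = None


class Kernel:
    """Hypothetical kernel object owning a stream."""

    def __init__(self, with_cycle):
        self._activity_stream = Stream()
        if with_cycle:
            # The bound method references self, so this forms a cycle:
            # kernel -> stream -> callback -> kernel
            self._activity_stream.callback = self.on_message

    def on_message(self, msg):
        pass


def remove_and_check(with_cycle):
    kernels = {"k1": Kernel(with_cycle)}
    ref = weakref.ref(kernels["k1"])
    del kernels["k1"]  # analogous to removing it from {id: kernel}
    freed_immediately = ref() is None
    gc.collect()  # run the cyclic collector explicitly
    freed_after_collect = ref() is None
    return freed_immediately, freed_after_collect


# Disable automatic collection so the result is deterministic.
was_enabled = gc.isenabled()
gc.disable()
try:
    no_cycle = remove_and_check(with_cycle=False)  # (True, True) on CPython
    cycle = remove_and_check(with_cycle=True)      # (False, True) on CPython
finally:
    if was_enabled:
        gc.enable()
```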

kevin-bates commented Mar 15, 2018

@takluyver - thanks for the response. I agree that it's quite non-obvious. However, without this statement, the iopub object wrapper created here will not be garbage collected, and a new (leaked) instance persists across each subsequent kernel cycle.

I found this leak, and the two addressed by PRs 360 and 361 in jupyter_client, by adding gc.collect, gc.garbage and gc.get_referrers calls to the start_kernel() and shutdown_kernel() methods of jupyter_client, notebook, kernel gateway, and enterprise gateway, along with resource.getrusage(resource.RUSAGE_SELF).ru_maxrss to monitor growth.
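
A rough sketch of this kind of instrumentation, using only standard-library calls. The helper names (`live_instances`, `referrer_types`, `peak_rss`) and the `LeakyThing` class are hypothetical, and the "leak" is simulated with a forgotten list so the snippet can run on its own.

```python
import gc

try:
    import resource  # Unix-only; not available on Windows
except ImportError:
    resource = None


class LeakyThing:
    """Hypothetical object standing in for a leaked wrapper."""


def live_instances(cls):
    """Collect garbage, then list instances of cls that are still alive."""
    gc.collect()
    return [obj for obj in gc.get_objects() if type(obj) is cls]


def referrer_types(obj):
    """Type names of the objects that still refer to obj."""
    return sorted({type(r).__name__ for r in gc.get_referrers(obj)})


def peak_rss():
    """Peak resident set size of this process, or None if unavailable."""
    if resource is None:
        return None
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


forgotten = []  # simulated leak: references that are never dropped


def kernel_cycle():
    # Stands in for one start/shutdown sequence that leaves an object behind.
    forgotten.append(LeakyThing())


before = len(live_instances(LeakyThing))
for _ in range(3):
    kernel_cycle()
after = len(live_instances(LeakyThing))

leaked_per_run = after - before
holders = referrer_types(forgotten[0])  # includes "list" for `forgotten`
uncollectable = len(gc.garbage)         # objects the collector could not free
rss = peak_rss()
```

Comparing instance counts before and after each cycle shows whether something accumulates, and the referrer types point at what is holding on to it.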

However, since this leak was found before the IOLoopKernelManager leak was resolved (via PR 361), I suppose the change may not be required, although in my opinion it is good practice. I didn't back out this change after breaking the circular references that were causing the leak in 361.

kevin-bates commented Mar 15, 2018

Hmm - I just commented out this change (while running with all other fixes in place) and see the iopub object and kernel manager instance leaks return. I don't see an obvious circular reference introduced by _activity_stream, but the fact that the km instance leak returns indicates this change is required.

lresende added a commit to jupyter/enterprise_gateway that referenced this pull request Mar 19, 2018

Fix process proxy leak
While looking into other leaks (in this case memory), it was discovered
that the process proxy instance was being leaked across kernel cycles.
This change addresses that particular leak.

Note: Other PRs have also been submitted to address leaks in
`jupyter_client` and `notebook`.  These are:

[PR 360 - Fix memory leak of kernel Popen object](jupyter/jupyter_client#360)
[PR 361 - Fix memory leak of IOLoopKernelManager object](jupyter/jupyter_client#361)
[PR 3424 - Fix memory leak of iopub object in activity monitoring](jupyter/notebook#3424)

kevin-bates commented Apr 2, 2018

@takluyver - is there anything else you need for this?

@takluyver takluyver added this to the 5.5 milestone Apr 3, 2018

takluyver commented Apr 3, 2018

No, I don't think so. I'll close and reopen it to rerun the tests.

@takluyver takluyver closed this Apr 3, 2018

@takluyver takluyver reopened this Apr 3, 2018

@takluyver takluyver merged commit bf173a8 into jupyter:master Apr 3, 2018

4 checks passed

codecov/patch 100% of diff hit (target 0%)
codecov/project 77.71% (+<.01%) compared to e321c80
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed

kevin-bates commented Apr 3, 2018

Thank you Thomas!

@kevin-bates kevin-bates deleted the kevin-bates:fix-leak-iopub branch May 3, 2018

@kevin-bates kevin-bates referenced this pull request May 3, 2018

Closed

Release 5.5 #3591

9 of 9 tasks complete