Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pybind/rados: fix object lifetime issues and other bugs in aio #7778

Merged
merged 2 commits into from Mar 1, 2016

Conversation

marcan
Copy link
Contributor

@marcan marcan commented Feb 24, 2016

This fixes some subtle and some not so subtle object lifetime issues in the aio completion implementation in rados.pyx.

WRT the comment at the end of the second commit, where we still have a case of completion objects living longer than they should if an exception arrives at the wrong time (not likely to be a problem in practice, but worth fixing): I'm working on aio support for the rbd bindings, and am trying out a different approach to keeping the completion objects alive. If that works out I'll switch the rados bindings to the same approach and that should fix the problem.

cc @jdurgin @sileht

aio_read:
The reference to ret_s begins existing at PyBytes_FromStringAndSize and
is handed over to the callback if rados_aio_read succeeds. This creates
a lot of subtle scenarios where it might not be XDECREFed (e.g. if
a KeyboardInterrupt arrives at the wrong time). Instead, store the pointer
to that buffer in the completion object, and hand over responsibility for
the XDECREF to it. This guarantees that the "special" reference to this
incomplete object will be released when the completion object is
deallocated.

Also make sure we don't try to _PyBytes_Resize with a negative length.

Add a failure case to the aio_read test in test_rados.py

Completion: the wrapper methods weren't being called, which prevents
the completion objects from being freed until the Ioctx is. Fix this
and add a refcount check to the aio_read test.

Signed-off-by: Hector Martin <marcan@marcan.st>
Tracking completions is critical for memory safety - if the
aio function succeeds, the completion must be tracked. However,
if a KeyboardInterrupt or similar arrives between the call and
the tracking, the completion will not be tracked.

Fix this by tracking the completion before the aio call, and
explicitly cleaning up in the failure case.

This leaves the opposite problem, where an unexpected exception
(other than simple error return from the aio function) will cause
the completion to not be freed until the Ioctx is destroyed, but
that is a relatively minor issue.

Signed-off-by: Hector Martin <marcan@marcan.st>
@sileht
Copy link
Contributor

sileht commented Feb 24, 2016

That looks good.

@jdurgin
Copy link
Member

jdurgin commented Mar 1, 2016

Looks good to me as well. Thanks for the detailed explanations, they make these sorts of fixes much easier to read.

jdurgin added a commit that referenced this pull request Mar 1, 2016
pybind/rados: fix object lifetime issues and other bugs in aio

Reviewed-by: Mehdi Abaakouk <sileht@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
@jdurgin jdurgin merged commit 4f98ba2 into ceph:master Mar 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants