Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leak in nfs_mount_8_cb() when called as a consequence of nfs_destroy_context() #200

Closed
earlchew opened this issue May 9, 2017 · 6 comments

Comments

@earlchew
Copy link
Contributor

earlchew commented May 9, 2017

A very interesting case. A call to nfs_destroy_context() causes PDUs to be dequeued with RPC_STATUS_CANCEL, but nfs_mount_8_cb attempts to reconnect:

finished:
         ...
        rpc_disconnect(rpc, "normal disconnect");

        if (rpc_connect_program_async(nfs->rpc, nfs->server, NFS_PROGRAM, NFS_V3, nfs_mount_9_cb, data) != 0) {
                data->cb(-ENOMEM, nfs, command_data, data->private_data);
                free_nfs_cb_data(data);
                return;
        }
==21843== 238 (32 direct, 206 indirect) bytes in 1 blocks are definitely lost in loss record 608 of 1,534
==21843==    at 0x4C2CC4B: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==21843==    by 0x916969F: rpc_connect_program_async (libnfs.c:796)
==21843==    by 0x9169DDF: nfs_mount_8_cb (libnfs.c:1090)
==21843==    by 0x9161C32: rpc_destroy_context (init.c:310)
==21843==    by 0x9168FAA: nfs_destroy_context (libnfs.c:540)
==21843==    by 0x8F3B740: ??? (in /usr/lib/python2.7/dist-packages/libnfs/_libnfs.so)
==21843==    by 0x49D189: call_function.9501 (ceval.c:4025)
==21843==    by 0x48BDDB: PyEval_EvalFrameEx.constprop.14 (ceval.c:2669)
==21843==    by 0x49D58C: call_function.9501 (ceval.c:4111)
==21843==    by 0x48BDDB: PyEval_EvalFrameEx.constprop.14 (ceval.c:2669)
==21843==    by 0x49C617: PyEval_EvalCodeEx (ceval.c:3257)
==21843==    by 0x4E6A0F: function_call (funcobject.c:526)
@earlchew
Copy link
Contributor Author

earlchew commented May 9, 2017

@sahlberg Is this the right approach?

diff --git a/lib/libnfs.c b/lib/libnfs.c
index af51a90..ae8fea2 100755
--- a/lib/libnfs.c
+++ b/lib/libnfs.c
@@ -1067,6 +1067,12 @@ finished:

        rpc_disconnect(rpc, "normal disconnect");

+        if (status == RPC_STATUS_CANCEL) {
+                data->cb(-EINTR, nfs, "Command was cancelled", data->private_data);
+                free_nfs_cb_data(data);
+                return;
+        }
+
        if (rpc_connect_program_async(nfs->rpc, nfs->server, NFS_PROGRAM, NFS_V3, nfs_mount_9_cb, data) != 0) {
                data->cb(-ENOMEM, nfs, command_data, data->private_data);
                free_nfs_cb_data(data);

@sahlberg
Copy link
Owner

sahlberg commented May 9, 2017

That looks right. Perhaps also add a check for RPC_STATUS_TiMEOUT

Awesome work and many thanks for helping this library get better.

@earlchew
Copy link
Contributor Author

earlchew commented May 9, 2017

@sahlberg RPC_STATUS_TiMEOUT ? Do you mean RPC_STATUS_ERROR ?

@sahlberg
Copy link
Owner

sahlberg commented May 9, 2017 via email

@earlchew
Copy link
Contributor Author

earlchew commented May 9, 2017

@sahlberg Hmm ... and what about RPC_STATUS_ERROR ?

@earlchew
Copy link
Contributor Author

@sahlberg I'm thinking this patch is not entirely correct because nfs_mount_12_cb() is called more than once.

If any one of the callbacks gets RPC_STATUS_ERROR or RPC_STATUS_CANCEL, the entire batch should probably fail with the same error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants