Fix memory leak in _cbson._element_to_dict #316

messa · 2016-10-20T13:25:56Z

The memory leak happens when I use document_class=RawBSONDocument, because its __inflated property calls bson._iterate_elements that in turn calls _element_to_dict. The memory leak is present only in the C extension; if it is disabled everything becomes OK.

When the default document_class is used (i.e. dict) there is no memory leak, because that uses different _cbson functions where the ref counts are handled properly (_elements_to_dict in _cbsonmodule.c).

behackett · 2016-10-20T16:08:33Z

Thanks for the bug report and patch. I've done a bit of triage and opened https://jira.mongodb.org/browse/PYTHON-1171.

behackett · 2016-10-20T20:39:43Z

@messa, can you change your PR to switch from OO to NN in the Py_BuildValue call instead of moving the Py_DECREF calls? That should be slightly more efficient, avoiding two calls to Py_INCREF and two calls to Py_DECREF in the common case.

Also, I'm curious why you are inflating the raw documents?
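For readers unfamiliar with the two format codes: in Py_BuildValue, O stores the object and increments its reference count, so the caller still has to release its own reference afterwards, while N stores the object without incrementing, handing the caller's reference over to the result. The sketch below is only a Python-level model of the O case, not the code from _cbsonmodule.c. It uses ctypes to call Py_IncRef and Py_DecRef directly, and the helper names (pair_with_forgotten_decref, pair_with_decref, leaked_references) are hypothetical.

```python
import ctypes
import sys

# C-level reference count helpers exposed through ctypes.
# Both take a PyObject* and return nothing.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def pair_with_forgotten_decref(name, value):
    """Model of the leak: the function owns one reference to each item,
    the tuple takes its own references (as "O" does), and the function's
    references are never released."""
    _incref(name)   # stands in for "the C function owns a reference"
    _incref(value)
    return (name, value)


def pair_with_decref(name, value):
    """Model of the balanced case: the owned references are released
    once the tuple holds its own."""
    _incref(name)
    _incref(value)
    result = (name, value)
    _decref(name)
    _decref(value)
    return result


def leaked_references(build, name, value):
    """Return how many references to name and value remain after the
    tuple produced by build() has been destroyed."""
    before = sys.getrefcount(name) + sys.getrefcount(value)
    result = build(name, value)
    del result  # the tuple releases the references it held
    after = sys.getrefcount(name) + sys.getrefcount(value)
    return after - before
```

Calling leaked_references with fresh objects (for example two object() instances, since interned strings and small integers can be immortal on recent CPython versions) gives a non-zero result when references were never released. With N there is nothing left for the function to release, which is why it avoids the extra increment and decrement pair.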

messa · 2016-10-20T21:48:19Z

Changed and force-pushed (hope that's OK; previous version here).

I'm not inflating the raw documents myself. The low-level cursor batch document is inflated when the result documents are retrieved from it:

mongo-python-driver/pymongo/cursor.py, line 992 in 8fdb581:

    documents = cursor['firstBatch']

It's just a "shallow inflate", but it copies all the document contents, so when that memory is leaked the process eats memory pretty fast.
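To illustrate why a single leaked reference per batch matters when each batch carries a copy of the document contents, here is a rough sketch that measures memory growth with tracemalloc. It does not touch pymongo: leaky_copy and balanced_copy are hypothetical stand-ins for "copy a payload and forget to release it" and "copy a payload and release it", and the 10,000-byte size is an arbitrary assumption.

```python
import ctypes
import gc
import tracemalloc

# C-level reference count helpers exposed through ctypes.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def leaky_copy(size=10_000):
    """Allocate a payload and leave one extra reference on it, so it is
    never freed (a stand-in for a missing Py_DECREF in C code)."""
    blob = bytes(size)
    _incref(blob)


def balanced_copy(size=10_000):
    """Allocate a payload and release every reference that was taken."""
    blob = bytes(size)
    _incref(blob)
    _decref(blob)


def traced_growth(func, iterations):
    """Return the change in traced memory (bytes) after calling func
    the given number of times."""
    already_tracing = tracemalloc.is_tracing()
    if not already_tracing:
        tracemalloc.start()
    gc.collect()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        func()
    gc.collect()
    after, _peak = tracemalloc.get_traced_memory()
    if not already_tracing:
        tracemalloc.stop()
    return after - before
```

With these stand-ins, traced_growth(leaky_copy, n) grows roughly in proportion to n times the payload size, while traced_growth(balanced_copy, n) stays close to zero. The same harness could be pointed at any callable suspected of leaking.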

messa · 2016-10-20T22:08:30Z

Btw. my use case for RawBSONDocument:

- Right now I just needed a quick pre-heat script that reads all documents in a collection without doing anything with them. I tried raw documents to see how much faster such a script would be, and that is how I discovered this issue.
- We have a data processing component implemented in Python, with the core parts in C++ via Cython. I'm thinking about using pymongo raw documents so I can move document loading from Python to pure C++ (with a C/C++ bson lib), so that the Python part just transfers byte blobs from pymongo to the Cython/C++ code.

behackett · 2016-10-20T22:21:27Z

I see, we leak two references for each cursor batch regardless of whether the application explicitly inflates the individual documents. Sigh. Sadly, Coverity doesn't appear to understand Python's reference counting at all, and I completely missed this in code review back in 3.2.

Our thinking for this feature was to allow various optional BSON decoders, possibly written in languages other than Python (but with slim Python wrappers). python-bsonjs was the first project of that type we wrote. Your use case is very similar.

behackett · 2016-10-20T22:53:07Z

I've confirmed the problem, and that your patch fixes it. Patch merged to master. Thanks again!

behackett · 2016-10-27T23:33:31Z

We've released PyMongo 3.3.1, with your fix for this issue.

https://pypi.python.org/pypi/pymongo/3.3.1

Fix ref count management when building _element_to_dict result tuple

0c70230

messa force-pushed the messa_fix_decref_after_tuple branch from 4f57152 to 0c70230 on October 20, 2016 21:44

behackett closed this Oct 20, 2016
