Memory leak in vector_read for pyxrootd #1400

Closed
nikoladze opened this issue Feb 15, 2021 · 4 comments · Fixed by #1440
@nikoladze
Contributor

There seems to be a memory leak in the vector_read function in the Python bindings. To reproduce:

# debug_vector_read.py
import gc
import pyxrootd.client

filename = "root://eospublic.cern.ch//eos/opendata/atlas/OutreachDatasets/2020-01-22/4lep/MC/mc_345060.ggH125_ZZ4lep.4lep.root"

def test():
    f = pyxrootd.client.File()
    res = f.open(filename)
    if not res[0]["ok"]:
        raise IOError(res)
    chunksize = 1024 * 256
    chunks = [(i * chunksize, chunksize) for i in range(5)]
    res = f.vector_read(chunks)
    print(res[0]["message"])
    f.close()

for i in range(20):
    print(i)
    test()
    gc.collect()

When running this through memory_profiler with

mprof run debug_vector_read.py
mprof plot

I get this plot:

[plot: memory_leak]

(Tested with xrootd 5.0.3 from conda-forge)

It seems the problem is these buffers, which are never deleted:

char *buffer = new char[length];

Not sure if or when it is allowed to delete them, but if I collect the buffers into an array and delete them after the call to self->file->VectorRead(...), the memory leak seems to be gone:

[plot: no_memory_leak]

@nikoladze
Contributor Author

Another suggestion of the "I don't know what I'm doing" variety: maybe the delete has to happen here, after this line in the conversion for VectorReadInfo

PyObject *buffer = PyBytes_FromStringAndSize( (const char *) chunk.buffer,
                                              chunk.length );

analogous to what is done here for ChunkInfo:

template<> struct PyDict<XrdCl::ChunkInfo>
{
  static PyObject* Convert( XrdCl::ChunkInfo *chunk )
  {
    PyObject *o = PyBytes_FromStringAndSize( (const char*)chunk->buffer,
                                             chunk->length );
    delete[] (char*) chunk->buffer;
    return o;
  }
};

@nsmith-
Contributor

nsmith- commented Mar 5, 2021

Would it be better to allocate using a Python allocator rather than malloc here? Also, there might be an opportunity for copy elision if we can pass a set of pointers into vector_read to be filled.

@nsmith-
Contributor

nsmith- commented Mar 5, 2021

But yeah, looking at https://docs.python.org/3/c-api/bytes.html#c.PyBytes_FromStringAndSize, that's a copy, and as far as I can see the original pointer gets forgotten. I wonder why it doesn't just use the existing ChunkInfo convert routine?

@simonmichal simonmichal self-assigned this Mar 8, 2021
@simonmichal
Contributor

Thanks guys for reporting it, I will have a look!
