New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

memory leak occurs when creating a new type #732

Closed
zaxrok opened this Issue Feb 5, 2018 · 8 comments

Comments

Projects
None yet
3 participants
@zaxrok

zaxrok commented Feb 5, 2018

When I use the following code, I observe that memory leak is very fast
MsgSchema = type('MsgSchema', (marshmallow.Schema,), {})

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import marshmallow
import time
import psutil
import os

if __name__ == '__main__':
    pid = os.getpid()
    process = psutil.Process(pid)
    while True:
        MsgSchema = type('MsgSchema', (marshmallow.Schema,), {})
        mem = process.get_memory_info()[0]
        print("memory usage(%s):%d" % (pid, mem))
        time.sleep(0.01)
@sloria

This comment has been minimized.

Member

sloria commented Feb 9, 2018

Ah, this is because creating a new Schema class adds it to a registry in memory. You can prevent dynamically-generated classes from getting entered into the registry by passing a blank name.

 MsgSchema = type('', (marshmallow.Schema,), {})

Now that you mention this, though, I realize that we should only add a class to the registry if an existing registry with the same fully-qualified module path doesn't already exist.

@asmodehn

This comment has been minimized.

Contributor

asmodehn commented Feb 9, 2018

The schema name is always the same here, so I would assume that the registry contains only one element, and that the other classes are deleted when needed. Is that a correct assumption ? Or is there anything preventing the memory to be freed in the schema class ?

@sloria

This comment has been minimized.

Member

sloria commented Feb 9, 2018

Looks like we always register the fully-qualified module path, even if it's already in the registry

_registry.setdefault(fullpath, []).append(cls)

This appears to be a bug. I'd gladly review and merge a PR fixing this.

@asmodehn

This comment has been minimized.

Contributor

asmodehn commented Feb 10, 2018

Just to be clear, are you saying that there is no need for one classname to hold a list of classes, and instead, we should have only one class per name in the registry, and that a registered class entry gets overwritten if a new class is registered with the same name ?

@sloria

This comment has been minimized.

Member

sloria commented Feb 10, 2018

No, we still need to store a list of classes for each name.

However, we don't need to store duplicate values in the lists.

@asmodehn

This comment has been minimized.

Contributor

asmodehn commented Apr 20, 2018

No, we still need to store a list of classes for each name.
However, we don't need to store duplicate values in the lists.

When storing serializer class in the registry, the module name is used to disambiguate between same classes, where same is defined by the fullpath (module+classname) there.

However when dealing with fullpath just under, that would mean that each new entry is understood as same as an existing one. Therefore we don't need a list there anymore... Or we do need a way to differentiate between different classes that does not rely on their fullpath (which might not be suitable for complexity cost).

@asmodehn

This comment has been minimized.

Contributor

asmodehn commented Apr 20, 2018

Found another related issue where storing a class with same name in the registry would erase classes with same name (but from different modules) from the list.

@sloria

This comment has been minimized.

Member

sloria commented Nov 2, 2018

This is fixed in 2.16.3 and 3.0.0b20.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment