Memory leak in EnumerateLibrary #3702

Closed
jose-mr opened this issue Jan 7, 2021 · 1 comment · Fixed by #3725

jose-mr commented Jan 7, 2021

Describe the bug
Calling Chem.rdChemReactions.EnumerateLibrary in a loop steadily increases memory usage, even though no reference to the EnumerateLibrary object is kept after each iteration.

To Reproduce

from rdkit import Chem
from rdkit.Chem import rdChemReactions
from memory_profiler import profile  # lets the script also run without `-m memory_profiler`

@profile
def run_once(rxn, mols):
    library = Chem.rdChemReactions.EnumerateLibrary(rxn, [mols, ])  # built once, then dropped

@profile
def run_10000(rxn, mols):
    for _ in range(10000):
        library = Chem.rdChemReactions.EnumerateLibrary(rxn, [mols, ])  # previous object is unreferenced

rxn = rdChemReactions.ReactionFromSmarts("[C:1]-[C:2]>>[C:1].[C:2]")
mols = [Chem.MolFromSmarts("[C]-[C]")]*10000
run_once(rxn, mols)
run_10000(rxn, mols)
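
For readers without memory_profiler, the same comparison can be sketched with the standard library alone. This is only a rough illustration, not part of the original report: the helper names `peak_rss_mib` and `peak_rss_growth` are hypothetical, the `resource` module is Unix-only, and peak RSS can only show growth, never memory being returned.

```python
import gc
import resource
import sys


def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def peak_rss_growth(make_object, iterations):
    """Call make_object() repeatedly, dropping each result, and return
    how much the peak RSS grew (in MiB) over the whole loop."""
    gc.collect()
    before = peak_rss_mib()
    for _ in range(iterations):
        make_object()  # the result is discarded immediately
    gc.collect()
    return peak_rss_mib() - before
```

With the `rxn` and `mols` from the script above, one would pass something like `lambda: rdChemReactions.EnumerateLibrary(rxn, [mols])` as `make_object`. Growth that scales with `iterations` would point to memory being retained between calls.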

Expected behavior
I would expect the memory footprint of both functions to be the same, but it is not. Here is the output of running this code with memory_profiler:

python -m memory_profiler minimal_example.py

Filename: minimal_example.py

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
     5   74.211 MiB   74.211 MiB           1   @profile
     6                                         def run_once(rxn, mols):
     7   75.699 MiB    1.488 MiB           1       library = Chem.rdChemReactions.EnumerateLibrary(rxn, [mols, ])


Filename: minimal_example.py

Line #    Mem usage    Increment  Occurences   Line Contents
============================================================
     9   75.699 MiB   75.699 MiB           1   @profile
    10                                         def run_10000(rxn, mols):
    11 6261.906 MiB    0.000 MiB       10001       for _ in range(10000):
    12 6261.906 MiB 6186.207 MiB       10000           library = Chem.rdChemReactions.EnumerateLibrary(rxn, [mols, ])

The system monitor also shows increased memory usage.
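
To put the profiler figures in per-call terms, the arithmetic can be sketched as follows. The numbers are copied from the memory_profiler output in this report; the variable names are only illustrative.

```python
# Figures copied from the memory_profiler output in this report.
loop_increment_mib = 6186.207  # increment on the loop body of run_10000
single_call_mib = 1.488        # increment for the single call in run_once
calls = 10000

# Average amount of memory that is not released per EnumerateLibrary call.
per_call_mib = loop_increment_mib / calls
print(f"average growth per call: {per_call_mib:.3f} MiB")

# How that compares with the increment seen for one isolated call.
ratio = per_call_mib / single_call_mib
print(f"share of a single call's increment: {ratio:.0%}")
```

On these numbers, roughly 0.62 MiB is retained per call on average, about 42% of what a single isolated call adds.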

Configuration (please complete the following information):

  • RDKit version: 2020.09.3
  • OS: Ubuntu 18.04
  • Python version (if relevant): Python 3.8.2
  • Are you using conda? yes
  • If you are using conda, which channel did you install the rdkit from? conda-forge

Additional context
I have noticed this in a much more complicated piece of code, but this minimal example seems to reproduce the problem.
Thank you for any help!

jose-mr added the bug label Jan 7, 2021

bp-kelley (Contributor) commented

@jose-mr Thanks for the bug report! I'll look into it.

bp-kelley self-assigned this Jan 7, 2021
bp-kelley pushed two commits to bp-kelley/rdkit that referenced this issue Jan 12, 2021
greglandrum added this to the 2020_09_4 milestone Jan 13, 2021
greglandrum pushed a commit that referenced this issue Jan 20, 2021
Co-authored-by: Brian Kelley <bkelley@relaytx.com>
greglandrum pushed a commit that referenced this issue Jan 21, 2021
Co-authored-by: Brian Kelley <bkelley@relaytx.com>
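
Since the issue was placed on the 2020_09_4 milestone, a quick way to tell whether an installation might still be affected is to compare its version against that release. The sketch below assumes the fix shipped in 2020.09.4 (the thread names only the milestone, not the release contents); the helper names are hypothetical, and any pre-release suffix in the version string is ignored.

```python
import re

# Assumption: the fix is included in the release named by the milestone.
FIXED_IN = (2020, 9, 4)


def parse_rdkit_version(version):
    """Turn a version string such as '2020.09.3' into (2020, 9, 3)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised RDKit version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_enumerate_library_fix(version):
    """Return True if the version is at or after the assumed fix release."""
    return parse_rdkit_version(version) >= FIXED_IN
```

In practice the argument would be `rdkit.__version__`. The version from this report, `"2020.09.3"`, evaluates to `False`.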