
RAM consumption #351

@chrisk280

Description

I'm having an issue with RAM consumption: when I process several thousand files, one after another, I run out of RAM, and I don't think that should happen since only one file is processed at a time.
When using the following code, my program eventually gets killed because the OS runs out of memory.
I also implemented a simple way to monitor memory usage. It shows that more and more dict objects are created (and apparently never freed), taking up more and more RAM.
If I run the xmltodict parser instead, everything works fine (though the data is harder to access).
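One quick diagnostic worth trying first (my suggestion, not something the library documents): force a garbage collection between files. If `gc.collect()` reclaims the dicts, they were only held in reference cycles rather than truly leaked. A minimal, self-contained illustration of the difference:

```python
# Sketch: objects in a reference cycle are not freed by reference
# counting alone, but gc.collect() can still reclaim them. If the
# STIX dicts behave the same way, a periodic collect would cap RAM.
import gc

class Node:
    pass

a, b = Node(), Node()
a.other, b.other = b, a   # create a reference cycle
del a, b                  # now unreachable, but refcounts are nonzero

reclaimed = gc.collect()  # cycle collector finds and frees them
print("unreachable objects reclaimed:", reclaimed)
```

If a periodic `gc.collect()` in the loop does not shrink the dict count, the objects are still reachable from somewhere (e.g. a module-level cache), which would point to a genuine leak in the parser.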

from stix.core import STIXPackage
from pympler import summary
from pympler import muppy
#import xmltodict

# define a list of paths
paths = ...

n = 0
for path in paths:
    if n % 10 == 0:
        # print a memory summary every 10 files
        mem_summary = summary.summarize(muppy.get_objects())
        summary.print_(mem_summary)
    # No issues when parsing with xmltodict instead:
    # with open(path, "rb") as f:
    #     data_dicts = xmltodict.parse(f)
    stix_package = STIXPackage.from_xml(path)
    data_dicts = stix_package.to_dict()
    n += 1
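For anyone who wants to reproduce the measurement without installing pympler, the same kind of growth check can be done with the standard-library tracemalloc module. This is a hedged sketch with a stand-in allocation, not the STIX parsing itself:

```python
# Stdlib-only alternative to pympler for spotting memory growth:
# snapshot before and after the work, then diff by source line.
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Stand-in for the accumulating parsed dicts; in the real loop this
# would be the STIXPackage.from_xml(path).to_dict() calls.
data = [{"i": i} for i in range(10_000)]

snapshot = tracemalloc.take_snapshot()
top = snapshot.compare_to(baseline, "lineno")
for stat in top[:3]:
    print(stat)  # biggest allocation diffs, with file and line number
```

The per-line attribution is handy here: it would show directly which line inside the STIX library is allocating the dicts that never go away.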

The short version of my output is something like this:

types            # objects   total size
<class 'dict'>       6680      3.45 MB
...                   ...        ...
<class 'dict'>      40947     11.29 MB
...                   ...        ...
<class 'dict'>     100243     28.67 MB
...                   ...        ...
<class 'dict'>     176329     50.51 MB

The number of dict objects and their total memory consumption keep rising.
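Until the leak itself is fixed, one workaround that bounds memory regardless of what the parser holds on to is to run each parse in a short-lived worker process, via `multiprocessing.Pool` with `maxtasksperchild=1`, so leaked memory dies with the worker. This is a generic sketch: `parse_one` is a placeholder, and the real body would call `STIXPackage.from_xml(path).to_dict()`.

```python
# Sketch: recycle worker processes so any memory leaked while parsing
# one file is returned to the OS when that worker exits.
from multiprocessing import Pool

def parse_one(path):
    # Placeholder; the real version would be:
    #     return STIXPackage.from_xml(path).to_dict()
    return len(path)

def parse_all(paths):
    # maxtasksperchild=1 replaces each worker after a single task,
    # so leaks cannot accumulate across files.
    with Pool(processes=2, maxtasksperchild=1) as pool:
        return pool.map(parse_one, paths)

if __name__ == "__main__":
    print(parse_all(["a.xml", "bb.xml"]))
```

The per-task process turnover costs some startup overhead, so for thousands of small files a larger `maxtasksperchild` (e.g. 50) may be a better trade-off.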

Thanks for your help!
