I'm having an issue with RAM consumption. When I process several thousand files one after another, my program eventually runs out of RAM, which shouldn't happen since only one file is handled at a time.
With the following code, my program eventually gets killed because the OS runs out of memory.
I also added a simple way to monitor memory usage with pympler. It shows that more and more dict objects are created (and apparently never freed, as far as I can tell), taking up more and more RAM.
If I parse the same files with xmltodict instead, everything works fine (although the data is harder to access).
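As an alternative to pympler, the standard-library `tracemalloc` module can point at the exact source line the growing allocations come from. This is only a minimal sketch: the dicts appended below are a stand-in for the real `stix_package.to_dict()` results, not part of my actual code.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

retained = []
for _ in range(1000):
    # stand-in for stix_package.to_dict(); substitute the real parse loop here
    retained.append({"id": "example", "data": "..."})

after = tracemalloc.take_snapshot()
# the top entries point at the source lines responsible for the growth
for stat in after.compare_to(before, "lineno")[:3]:
    print(stat)
```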
```python
from stix.core import STIXPackage
from pympler import summary
from pympler import muppy
#import xmltodict

# define a list of paths
paths = ...

n = 0
for path in paths:
    if n % 10 == 0:
        # print a summary of all live objects every 10 files
        mem_summary = summary.summarize(muppy.get_objects())
        summary.print_(mem_summary)
        n = 0
    # I've got no issues when parsing with xmltodict:
    #with open(path, "rb") as file:
    #    data_dicts = xmltodict.parse(file)
    stix_package = STIXPackage.from_xml(path)
    data_dicts = stix_package.to_dict()
    n += 1
```

The short version of my output is something like this:
| types | # objects | total size |
|---|---|---|
| <class 'dict'> | 6680 | 3.45 MB |
| ... | ... | ... |
| <class 'dict'> | 40947 | 11.29 MB |
| ... | ... | ... |
| <class 'dict'> | 100243 | 28.67 MB |
| ... | ... | ... |
| <class 'dict'> | 176329 | 50.51 MB |
The number of dict objects and their memory consumption keep rising with every batch of files.
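One way to check whether the parsed packages themselves are being kept alive (rather than the memory just fragmenting) is to hold a weak reference to one and force a collection after dropping all strong references. This is a minimal sketch with a hypothetical stand-in class instead of the real `STIXPackage`, since the behavior being tested is generic:

```python
import gc
import weakref

# Hypothetical stand-in for a parsed package; in the real code this would
# be the object returned by STIXPackage.from_xml(path).
class FakePackage:
    def to_dict(self):
        return {"id": "example"}

def parse_and_release(make_package):
    """Parse one document, convert it, and return a weak reference to the
    package so we can check whether it was actually freed."""
    package = make_package()
    data = package.to_dict()
    ref = weakref.ref(package)
    del package, data          # drop the only strong references
    gc.collect()               # collect any reference cycles
    return ref

ref = parse_and_release(FakePackage)
# If ref() is None, the package was freed; if not, something (a cache,
# a module-level registry, ...) still holds a strong reference to it.
print(ref() is None)  # → True for the stand-in class
```

Running the same check against a real `STIXPackage` instance would show whether the library keeps a reference somewhere after the loop iteration ends.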
Thanks for your help!