Huge memory usage by MISP objects #468

Closed
mback2k opened this issue Oct 2, 2019 · 4 comments

@mback2k
Contributor

mback2k commented Oct 2, 2019

Hello everyone,

We are running into issues because creating a large number of MISPObject instances (and other similar types) currently requires a lot of memory. In one example, our toolchain reached roughly 60 GB of Python memory usage.

This is caused by the fact that each instance of MISPObject loads the JSON template files for itself instead of using a global cache (based on filename and hash maybe):

PyMISP/pymisp/mispevent.py

Lines 1206 to 1218 in de6a64b

    def _load_template_path(self, template_path):
        if not os.path.exists(template_path):
            return False
        with open(template_path, 'rb') as f:
            if OLD_PY3:
                self._definition = json.loads(f.read().decode())
            else:
                self._definition = json.load(f)
        setattr(self, 'meta-category', self._definition['meta-category'])
        self.template_uuid = self._definition['uuid']
        self.description = self._definition['description']
        self.template_version = self._definition['version']
        return True

I can try to implement such caching, but would you be interested in that?
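A minimal sketch of what such a cache could look like, assuming Python 3 and a cache keyed by the template path only (no hash check, so a file changed on disk would not be reloaded). The names `load_template_definition` and `CachedTemplateObject` are hypothetical and are not part of PyMISP; the class is only a stand-in for the loading part of MISPObject.

```python
import json
import os
from functools import lru_cache


@lru_cache(maxsize=None)
def load_template_definition(template_path):
    """Parse a template file once per path (hypothetical helper, not PyMISP API)."""
    with open(template_path, 'rb') as f:
        return json.load(f)


class CachedTemplateObject:
    """Hypothetical stand-in for MISPObject; only template loading is shown."""

    def _load_template_path(self, template_path):
        if not os.path.exists(template_path):
            return False
        # Every instance shares the same parsed dict, so it must be
        # treated as read-only by callers.
        self._definition = load_template_definition(template_path)
        setattr(self, 'meta-category', self._definition['meta-category'])
        self.template_uuid = self._definition['uuid']
        self.description = self._definition['description']
        self.template_version = self._definition['version']
        return True
```

With this shape, the JSON file is parsed once per path and all instances hold a reference to the same definition instead of a private copy. The trade-off is that the shared dict must not be mutated per instance.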

Best regards,
Marc

@Rafiot
Member

Rafiot commented Oct 2, 2019

Hello Marc, yes, definitely, that would be very interesting.

The other thing I was thinking about was auto-generating enums for types and categories and using those in the attributes instead of repeating the strings every time. I'd assume that would be easier on memory, but your idea is probably also a good call, at least for the templates bundled in the package.
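A rough sketch of the enum idea, under stated assumptions: the type list below is a small illustrative subset, and `make_enum`, `AttributeType`, and `Attribute` are hypothetical names rather than PyMISP code. It only shows that attributes would then share one enum member per type; whether that actually saves memory compared to plain strings is not measured here.

```python
from enum import Enum

# Illustrative subset only; a real generator would read the complete
# lists of types and categories from the MISP definitions (assumption).
TYPE_NAMES = ['ip-src', 'ip-dst', 'md5', 'domain']


def make_enum(enum_name, values):
    """Build an Enum whose member names are identifier-safe forms of the values."""
    members = {v.replace('-', '_').replace(' ', '_').upper(): v for v in values}
    return Enum(enum_name, members)


AttributeType = make_enum('AttributeType', TYPE_NAMES)


class Attribute:
    """Hypothetical attribute that stores an enum member instead of a string."""
    __slots__ = ('type', 'value')

    def __init__(self, type_, value):
        # Lookup by value returns the single shared member for that type
        # and raises ValueError for an unknown type string.
        self.type = AttributeType(type_)
        self.value = value
```

Two attributes created with `'ip-src'` then reference the same `AttributeType.IP_SRC` object, and the original string stays available as `attribute.type.value`.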

@Rafiot
Member

Rafiot commented Oct 10, 2019

I merged an improvement. It might be only a partial fix, so please open a new issue if needed.

Rafiot closed this as completed Oct 10, 2019
@mback2k
Contributor Author

mback2k commented Oct 16, 2019

@Rafiot Just for your information: I found our additional memory issue in our code and was finally able to solve it. Under some circumstances a list was being appended to itself a few hundred times and that caused exponential memory usage. Thanks again for your support with the PyMISP improvements, those also helped a lot.
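The offending code is not shown in this thread, so the following is only an illustration of the growth pattern. It assumes "appended to itself" means the list was extended with its own contents (`extend` or `+=`); a plain `items.append(items)` would add just one reference. The function name `grow` is made up for this sketch.

```python
def grow(items, rounds):
    """Extend a list with its own contents `rounds` times."""
    for _ in range(rounds):
        # Each round copies every existing element, so the length doubles.
        items.extend(items)
    return items


# A one-element list reaches 2**10 elements after 10 rounds.
print(len(grow([0], 10)))  # prints 1024
```

Since the length doubles each round, even a few dozen rounds on a small list are enough to exhaust memory.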

@Rafiot
Member

Rafiot commented Oct 17, 2019

Thank you for opening the issue, it helped a lot to pinpoint the problem. Do not hesitate to get in touch again if you have another one.
