New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't cache exception values in the module cache #837
Conversation
426c922
to
910809e
Compare
Exceptions hold tracebacks, which can get very large (especially for highly recursive functions). Especially if the exceptions are being used for control flow instead of actual exception management, we should be throwing them away as fast as possible. This commit deals with this with a bit of a hacky solution, but really the way the module cache handles things ought to be reworked a bit.
910809e
to
1d9fdff
Compare
I think that using |
The changes as per @pstch comments above diff --git a/astroid/manager.py b/astroid/manager.py
index 82208ad..4f88798 100644
--- a/astroid/manager.py
+++ b/astroid/manager.py
@@ -236,11 +236,11 @@ class AstroidManager:
value = exceptions.AstroidImportError(
"Failed to import module {modname} with error:\n{error}.",
modname=modname,
- error=ex,
+ error=ex.with_traceback(None),
)
self._mod_file_cache[(modname, contextfile)] = value
if isinstance(value, exceptions.AstroidBuildingError):
- raise value
+ raise value.with_traceback(None)
return value
def ast_from_module(self, module, modname=None): |
Thank you for the PR, and also for the interesting article about the step you took to analyse the problem (I edited your issue with it). I don't know if we use this stack-trace in pylint, I'll try to find the time to check the pylint's test suite against your branch and @pstch proposed change this week. |
@Pierre-Sassoulas I haven'th ad time to pick this up (I'll have time next week), I believe that @pstch's patch would be the "correct" way to tackle this issue overall, although there might be a bit more of a memory cost. If next week you haven't gotten around to this I'll try the alternative and measure the memory effects as well to see what the result is. |
I have made an alternate PR (#847) that uses the |
Hello, thank you for making the alternate branch, I also think the simpler solution is better. I opened a pylint MR (pylint-dev/pylint#3922) in order to see the effect on pylint's test for all envs and everything is still passing which means we don't use those informations in pylint. And pylint is the main "client" for astroid so I think this is mergeable. There is no noticeable effects on execution time, but the memory usage is lower as expected. |
Closing in favor of #847 |
Steps
Description
Exceptions hold tracebacks, which can get very large (especially for highly recursive functions). Especially if the exceptions are being used for control flow instead of actual exception management, by throwing them away as fast as possible we can avoid generating a lot of memory pressure (though at a cost of losting information)
This commit decides to just store a simple
Error
object forAstroidImportError
s, instead of the full exception. While this reduces visibility in stuff and is a bit hacky, on a "real world" codebase this reduced memory usage by ~80% on a run over just 100 files.I haven't written a changelog entry yet because I would like to know if this is "mostly OK". If it looks good I'll try adding more documentation to the code itself + a changelog entry.
A picture is worth 1000 words, so here are two:
^ Basically "look at all the objects stored on an
AstroidImportError
! Do we need all these tracebacks for the lifetime of the program?^ Left: memory usage on 100 files before the patch. Right: memory usage on 100 files after the patch.
For more detail see : https://rtpg.co/2020/10/12/pylint-usage.html
Type of Changes