cache handler names #207
Conversation
TBH, I'm not quite convinced that this computation contributes to the runtime significantly, but I agree that caching it is a better practice than recomputing it every time. One comment for your consideration; LGTM otherwise.
m2cgen/interpreters/utils.py (Outdated)

```diff
@@ -7,3 +10,13 @@
 def get_file_content(path):
     with open(path) as f:
         return f.read()
+
+
+@lru_cache(maxsize=16)
```
Does it make sense to put a larger number here to accommodate future AST extensions? I'd even suggest making it unbounded if possible, since the total number of expression types in the AST will always be pretty limited. Otherwise, perhaps we can set it to something like 32.
I think that for the sake of efficiency it should be as small as possible; I even considered setting `maxsize=8`. For future AST extensions we can easily update the `maxsize` here while adding new expressions.
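To illustrate the trade-off being discussed: if `maxsize` is smaller than the number of distinct expression types actually seen, the LRU cache keeps evicting and recomputing. A minimal sketch (the function and expression names below are hypothetical, not m2cgen's actual code):

```python
from functools import lru_cache

# Hypothetical stand-in for the cached handler-name computation.
@lru_cache(maxsize=2)
def handler_name(expr_class_name):
    return "interpret_" + expr_class_name.lower()

# Three distinct keys with a cache that only holds two:
# the oldest entry is evicted, so revisiting it misses again.
for name in ["BinNumExpr", "IfExpr", "VectorVal", "BinNumExpr"]:
    handler_name(name)

info = handler_name.cache_info()
print(info.hits, info.misses)  # 0 hits, 4 misses: "BinNumExpr" was evicted
```

With `maxsize` at least 3, the final call would have been a hit instead.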
Yeah, my concern is that one would always have to keep this number in the back of their mind when adding a new expression. Why do you say it should be as small as possible, though? The cache is super cheap in this case, isn't it? It's as if we just had a map with 16 pointer-to-string pairs.
> Yeah, my concern is that one would always have to keep this number in the back of their mind when adding a new expression.

Agreed! It is not so convenient. Going in that direction, we could simply replace these two functions with a manual dict.
But according to the source code, `lru_cache` is based on a linked list, not just a hashmap:
https://github.com/python/cpython/blob/81a5fc38e81b424869f4710f48e9371dfa2d3b77/Lib/functools.py#L781

So I believe size matters here.
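A small clarification on the internals: in CPython's pure-Python `functools`, the bounded cache keeps a hashmap for O(1) lookup plus a doubly linked list that only tracks recency for eviction; with `maxsize=None` the linked-list bookkeeping is skipped entirely and only the dict remains. A quick check:

```python
from functools import lru_cache

# With maxsize=None, lru_cache degenerates into plain dict-based
# memoization: no eviction, no recency tracking.
@lru_cache(maxsize=None)
def cached_upper(s):
    return s.upper()

cached_upper("a")
cached_upper("a")
info = cached_upper.cache_info()
print(info)  # hits=1, misses=1, maxsize=None, currsize=1
```

So lookup cost does not grow with cache size either way; a larger `maxsize` mostly costs a little memory, not time.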
OK, maybe I'm overoptimizing things. I've proposed new changes to this PR so that we won't have to care about keeping `maxsize` up to date. Now it has enough capacity automatically. Please let me know what you think about it.
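One way the capacity could be made to track the AST automatically is to derive `maxsize` from the number of known expression classes. This is an assumed sketch of the idea with hypothetical names; the PR's actual change may differ:

```python
from functools import lru_cache

# Hypothetical registry of expression class names; deriving the
# cache size from it means adding a new expression type can never
# make the cache too small.
KNOWN_EXPRESSIONS = ["BinNumExpr", "IfExpr", "ComparisonExpr", "VectorVal"]

@lru_cache(maxsize=len(KNOWN_EXPRESSIONS))
def handler_name(expr_class_name):
    return "interpret_" + expr_class_name.lower()

print(handler_name("IfExpr"))  # interpret_ifexpr
```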
Looks good, thanks 👍
But it still contributes due to the number of calls 🙂
This is a pretty deep analysis, well done 👍
This method (now a function, to avoid overcomplicating the code with handling two decorators) is used very extensively, so the benefit of caching it is obvious.
For example,