New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement cache meachanism to speed up module search #175
Conversation
Works without any modifications necessary now, results seem about the same as I mentioned in there with the previous version - about 0.25s duration on the first test, and then it's difficult to tell if there's any change to other ones. Seems to just add about 0.25s to the run time in total, from a few unscientific runs. |
freezegun/api.py
Outdated
if _get_module_attributes_hash(module) == module_hash: | ||
return cached_attrs | ||
else: | ||
return _get_module_attributes(module) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to be setting these back on global_modules_cache
to update the cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
freezegun/api.py
Outdated
result = {} | ||
for mod_name, module in list(sys.modules.items()): | ||
# ignore this module | ||
if mod_name == __name__: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spulec: in the _freeze_time.start
method I saw a similar logic (if mod_name is None or module is None
) but I didn't understand the None
checks and didn't put them here. When are they None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
answering myself: module is None when it's a freezegun module.
1 similar comment
199c655
to
8d5193e
Compare
I will test this against |
Any update on merging this PR? I'm also interested in the speedups this provides, but I'm running into a strange bug with the module cache:
|
for attribute_name, attribute_value in all_module_attributes: | ||
if id(attribute_value) in _real_time_object_ids: | ||
date_attrs.append((attribute_name, attribute_value)) | ||
_GLOBAL_MODULES_CACHE[module.__name__] = (_get_module_attributes_hash(module), date_attrs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I encountered this bug:
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:557: in wrapper
with self:
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:434: in __enter__
return self.start()
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:490: in start
module_attrs = _get_cached_module_attributes(mod_name, module)
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:96: in _get_cached_module_attributes
global_modules_cache = _get_global_modules_cache()
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:53: in _get_global_modules_cache
_setup_modules_cache()
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:75: in _setup_modules_cache
_setup_module_cache(module)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
module = <AliasModule 'py.error' for 'py._error.error'>
def _setup_module_cache(module):
global _GLOBAL_MODULES_CACHE
#if not hasattr(module, "__name__"):
# return
date_attrs = []
all_module_attributes = _get_module_attributes(module)
for attribute_name, attribute_value in all_module_attributes:
if id(attribute_value) in _real_time_object_ids:
date_attrs.append((attribute_name, attribute_value))
> _GLOBAL_MODULES_CACHE[module.__name__] = (_get_module_attributes_hash(module), date_attrs)
E AttributeError: __name__
all_module_attributes = []
date_attrs = []
module = <AliasModule 'py.error' for 'py._error.error'>
local.virtualenv/lib/python2.7/site-packages/freezegun/api.py:88: AttributeError
It seems ok with the following code:
def _setup_modules_cache():
for mod_name, module in list(sys.modules.items()):
if mod_name == __name__ or not mod_name or not module or not hasattr(module, "__name__"):
continue
...
Fixed version of spulec#175
Thanks! |
|
||
|
||
def _get_module_attributes_hash(module): | ||
return '{0}-{1}'.format(id(module), hashlib.md5(','.join(dir(module)).encode('utf-8')).hexdigest()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In line 63, you catch a TypeError
when you call dir(module)
, here you don't. If a TypeError
is raised there, that doesn't seem to prevent this function from being called. Is that an issue?
Why don't you pass the already existing all_module_attributes
to this function and extract the names from that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, fixed with 689a7bb.
freezegun/api.py
Outdated
@@ -66,7 +66,7 @@ def _setup_modules_cache(): | |||
# FIXME: move this definition to be at the top-level | |||
real_time_object_ids = set(id(obj) for obj in real_date_objects) | |||
result = {} | |||
for mod_name, module in sys.modules.items(): | |||
for mod_name, module in list(sys.modules.items()): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's unnecessary - if sys.modules
is dict , then sys.modules.items()
is iterable containing pairs of key and value in both py2 and py3
casting it to list
will not break it, but slow down a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without creating the copy, I'm occasionally getting a RuntimeError: dictionary changed size during iteration
, which makes sense since additional modules can get loaded.
Ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume that the size of that collection is small enough so that casting to list once does not cause a relevant slowdown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so commit message is misleading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the PR title is about an architectural optimization that saves seconds of run time, not microoptimizations that make a difference of a few microseconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant this particular commit, not entire PullRequest
Reading Make sys.modules.items() compatible with python 3
I expect some py3 adjustment, but I get runtime error prevention.
That was my point, sir :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message is not misleading. In python 2, items()
returns a list (copy), which is what is needed here. But in python 3 it returns an iterator, which causes faulty behavior. So it is casted to list for python 3 support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
With 0.3.10 over 0.3.9 I'm seeing about a 16% speedup. Not bad. |
How do you measure speed improvement? |
I ran the tests and looked at the time that nose said it took to run them. Did that 4 times and took the average. Then uninstalled 0.3.9 & installed 0.3.10 and did the same. The difference between the two averages was about 16%. |
This should fix the performance issues reported in issue #69.
I'd like to hear from other people that experienced the slowdowns to see if this pull request will fix their issues.
cc @Deimos