New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Establish a uniform way to clear all caches in a given module #80666
Comments
Some modules have caches. There is a need to clear all tests before running tests. Brett proposed to add in all modules with caches a function with the standardized name which is responsible for clearing module related caches. [1] The proposed PR adds a new function clear_caches() in the sys module. It iterates all imported modules and calls function __clearcache__() if it is defined. def clear_caches():
for mod in reversed(list(sys.modules.values())):
if hasattr(mod, '__clearcache__'):
mod.__clearcache__() clear_caches() will be used in test.regrtest and can be used in user code. The PR defines also function __clearcache__ for modules which are cleared manually in the current code. This is a preliminary implementation, bikeshedding is welcome. [1] https://mail.python.org/pipermail/python-ideas/2019-March/056165.html |
Not sure that I agree there is a testing need to clear all caches regardless of what they do. Test code should explicitly state whether it relies on a particular cache being cleared at some particular point in time. Also, the concept of "need to clear all caches" isn't well-formed. Would you want sys.intern caches to be cleared? What about the internal caches in SQLite? And do you think polling for a new magic attribute is the right approach? We would get looser coupling and better control by letting modules register themselves as needed. --- re.py --- sys.register_cache_clear_function(callback=purge, doc='pattern cache and re cache') --- ipaddress.py -- sys.register(IPv4Address.is_private.is_getter.cache_clear, doc='check for private networks) |
An auto-magic cache clearing mechanism is really tempting. I tend to agree with Raymond though, if code needs and progress a cache clearing mechanism it should be treated and accessible. They're are probably some problematic caches still within unittest however. Do test results still keep alive all tracebacks until test reporting? |
Sorry.
|
Tests codify knowledge about the system under test, so it doesn't matter that the test suite has to know how to clear caches. It's specifically a good thing that the test writer knows which caches exist and need clearing, and how to do it. The harder thing mighty be determining what scope to do the clearing (per test, class or module) bit unittest exposes hooks for fixtures at those points for anything that needs doing automatically. |
I suggest the documentation be written in more detail. For example, in __clearcache__'s section, state explicitly that this magic function is for module-level cache, and it will be invoked by sys.clear_caches(). Maybe also introduce the background: some caches may grow unlimitedly, sys.clear_caches() gives the user a chance to empty them. |
My initial idea was to add a lightweight module cachesreg with two functions: register() and clear_caches(). PR 12639 implements it. But I like Brett's idea more, because it is simpler. The only disadvantage of it is that if you make a typo in __clearcache__, this function will be silently ignored. I thought also about different levels of cachesreg.cachesreg.register() could take two arguments -- the level and the clearing function. Then cachesreg.clear_caches() could allow to clear only caches of specified level and smaller/larger. Both PRs add clearing callbacks only for modules which already cleared in regrtests. There are more caches, and with implementing any of these ideas it will be easier to add clearing caches in other modules. |
I think this is a great idea. We would have needed this many times for tests over the years. |
If it only has two functions, it could be a sub-module sys.cachesreg Or a lifecycle module, as the name, dedicated to such kind of functions. Register callback functions for memory low, poweroff system, etc. |
RE: "And do you think polling for a new magic attribute is the right approach?": my thinking behind that idea is that by standardizing the function name it's easy to tell if there's a cache, but you can also do away with the registration with a 3 lines of code. To me, the priority is clearing caches on a per-module basici and having a clear-all mechanism can be beneficial, not the other way around. |
[Brett]
That makes more sense. I'm changing the title to match the actual feature request and intent: "Add a way to clear all caches" -> "Establish a uniform way to clear all caches in a given module" |
Quick question, would the existing sys.reload() logic suffice? -- mymodule.py -- cache = {} # On reload, this would clear the cache
def f(x):
if x in cache:
return cache[x]
y = x**2
cache[x] = y
return y |
Did you mean importlib.reload() instead of sys.reload()? And technically it would *if* you're okay with the other side-effects of reloading, e.g. making sure no one has a reference to any objects from the module's namespace which won't change in-place (e.g. if you stored a reference to the cache in some code then the reload wouldn't clear it for the stored reference). |
Yes.
If you want to go forward with this, go for it. I would like to be able to explain to another person why this is needed, but personally can't visualize a circumstance where a person is testing module, doesn't know how to use the existing cache clearing APIs, but needs to clear caches (not sure why), and doesn't either know or want what happens on import. I've never seen this situation arise, but if it's something you want, I won't stand it the way. BTW, if you're going to have some sort of clear_all(), perhaps it should cover the sys.intern() dictionary and string hashes as well. AFAICT, there's nothing special about a regex cache that gives it a greater need to be cleared |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: