Cache control #5136
@Makman2 with reference to the
I am also curious about
I think this issue can be a part of the caching/performance optimization project.
It's that we don't compress files, but binary data which might not have much redundancy (actually it's also files at the moment, but this shall change, as explained below). If data has no redundancy, compression is largely ineffective, and maintaining compression features would be useless.
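To illustrate the redundancy point outside of coala (a standalone sketch, not coala code): high-entropy data such as hashes barely shrinks under compression, while redundant data shrinks dramatically.

```python
import lzma
import os

# High-entropy data, similar to a cache consisting mostly of hashes:
# compression barely helps (the output can even be slightly larger).
random_blob = os.urandom(100_000)
print(len(random_blob), len(lzma.compress(random_blob)))

# Highly redundant data compresses very well.
redundant_blob = b'the same line over and over\n' * 4000
print(len(redundant_blob), len(lzma.compress(redundant_blob)))
```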
No no no :) It's just that I want to be able to do this. The idea with the CI is just a possible use-case (it also has to be investigated). Consider a very large project which generates a 100MB cache (that's already quite insane), where the coala analysis has taken 2h. So that new developers can speed up their runs, they would just download this file, which is offered on the CI. They do
Yes.
So about caching again: The new core caches the task objects emitted by the bear. These task objects are effectively just the arguments to

```python
def analyze(self, filename, file, ...):
    ...
```

The argument
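A rough sketch of what caching those task objects could look like; the names (`task_key`, `TaskCache`) are hypothetical and not part of the NextGen-Core, and it assumes the task arguments are picklable:

```python
import hashlib
import pickle


def task_key(args, kwargs):
    # Hypothetical helper: derive a stable key from the arguments that
    # would be passed to analyze(...).  Assumes they are picklable.
    payload = pickle.dumps((args, sorted(kwargs.items())))
    return hashlib.sha256(payload).hexdigest()


class TaskCache:
    """Hypothetical cache mapping task arguments to bear results."""

    def __init__(self):
        self._results = {}

    def lookup(self, args, kwargs):
        # Returns the cached result, or None if this task was never run.
        return self._results.get(task_key(args, kwargs))

    def store(self, args, kwargs, result):
        self._results[task_key(args, kwargs)] = result
```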
@Makman2 I wanted to know more about the design of this. Can all these flags reside in a separate module (
I don't quite understand that ^^ What do you want to cache, like
I'm sorry, that was poorly phrased. I meant to ask whether the implementations of these flags will reside in a separate module or as functions in Core.py.
Separate module, but could be located inside
Or even inside
Once the NextGen-Core is implemented, we have way more possibilities for different cache operating modes, which shall be available as CLI arguments in coala:
`--cache-strategy` / `--cache-protocol`

Controls how coala manages caches for the next run. The following modes could be implemented:

- `none`: Don't use a cache at all. A shortcut flag could additionally be implemented, `--no-cache`, effectively meaning `--cache-protocol=none`.
- `primitive`: Use a cache that grows infinitely. All cache entries are stored for all following runs and aren't removed. Effective when many recurrent changes happen in coafiles and settings. Fastest in storing.
- `lri` / `last-recently-used`: Cached items persist only until the next run. Stretch issue: implement count parameters that allow controlling when to discard items from the cache, e.g. discard a cached item after 3 runs of coala without it being used (see the sketch below).

My recommendation is to use `lri` as the default, as coala is mostly executed locally.
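A minimal sketch of how the `lri` strategy with such a count parameter could behave; the class and method names are hypothetical, not an existing coala API:

```python
class LastRecentlyUsedCache:
    """Hypothetical sketch of the ``lri`` strategy with a count parameter:
    entries that go unused for ``max_unused_runs`` consecutive coala runs
    are discarded at the end of a run.  ``max_unused_runs=1`` reproduces
    the plain behaviour of keeping items only until the next run."""

    def __init__(self, max_unused_runs=1):
        self.max_unused_runs = max_unused_runs
        self._entries = {}       # key -> cached result
        self._unused_runs = {}   # key -> consecutive runs without a hit

    def get(self, key, default=None):
        if key in self._entries:
            self._unused_runs[key] = 0  # hit: reset the unused counter
            return self._entries[key]
        return default

    def put(self, key, value):
        self._entries[key] = value
        self._unused_runs[key] = 0

    def finish_run(self):
        # Call once at the end of a coala run: bump every counter and drop
        # entries whose counter exceeds the limit.  Entries that were read
        # or written during this run were reset to 0 above, so they survive.
        for key in list(self._entries):
            self._unused_runs[key] += 1
            if self._unused_runs[key] > self.max_unused_runs:
                del self._entries[key]
                del self._unused_runs[key]
```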
`--clear-cache`

Clears the cache.
`--export-cache` / `--import-cache`

Maybe useful to share caches. For example, a CI server runs coala for a project, and you can download the cache from there as an artifact to speed up your builds / coala runs.
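What export/import could boil down to, sketched with plain `pickle`; `export_cache` / `import_cache` are hypothetical helper names, not an existing coala API:

```python
import pickle


def export_cache(cache, path):
    # Hypothetical export: serialize the in-memory cache to a file that a
    # CI job could publish as a build artifact.
    with open(path, 'wb') as outfile:
        pickle.dump(cache, outfile, protocol=pickle.HIGHEST_PROTOCOL)


def import_cache(path):
    # Hypothetical import: load a previously exported cache, e.g. one
    # downloaded from a CI server, to warm up a local run.
    with open(path, 'rb') as infile:
        return pickle.load(infile)
```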
`--cache-compression`

Accepts as arguments:

- `none`: No cache compression. This is the default.
- A compression algorithm (e.g. `lzma` or `gzip`).

Cache compression should be evaluated for effectiveness beforehand: because the cache will mainly store hashes, which usually aren't very redundant, the gain might be very low. The small performance penalty when loading the cache might not be worth a possibly very low reduction in cache size.
`--optimize-cache`

Accept a little performance penalty when storing the cache to make cache loading faster. Particularly, this feature shall utilize `pickletools.optimize`. But this is not exclusive to this flag.
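A small sketch of how `pickletools.optimize` could be applied when the cache is written; the helper name is hypothetical, not existing coala code:

```python
import pickle
import pickletools


def store_cache_optimized(cache, path):
    # Serialize first, then run pickletools.optimize over the raw pickle
    # bytes; this removes unused PUT opcodes, shrinking the payload so
    # later cache loads are a bit faster (at a small cost while storing).
    raw = pickle.dumps(cache, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, 'wb') as outfile:
        outfile.write(pickletools.optimize(raw))
```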