compiler: Cache Dependence instances #2634
base: main
Conversation
Isn't this an unbounded cache?
Even though it's using weak references, the size of this cache could grow arbitrarily, thus causing an increase in memory consumption over long runs.
Don't we need a mechanism to purge it from time to time or at least evict the entries with value None, or make it an LRU bounded cache?
FWIW, this function is called automatically by the devito runtime, "from time to time"
https://github.com/devitocodes/devito/blob/main/devito/types/caching.py#L165
(in essence, before allocating new data and based on heuristics)
@FabioLuporini technically unbounded, yes, but entries are evicted when their values are garbage collected. I considered using a bounded LRU cache of strong references, but then we'd be hardcoding a cache limit for something that should scale with the problem size.
Ah, I see! How much memory growth have you observed with a big Operator (ideally something high order)? You also need to rebase the branch on latest.
On that demo script of yours I've been using, memory usage was within the noise. I haven't tried anything huge (I don't really have any real-world operators to test with that take up a lot of memory in building), but at least theoretically it should be basically nothing. Will try some more contrived higher-order cases later.
Yeah, will rebase shortly. Also want to make a small change to make it obvious that
Pull Request Overview
Adds a weak-value cache and a class-level memoization decorator to speed up construction of `Dependence` instances, updates IR support to use immutable mappings, and expands tests to cover the new cache.
- Introduce `WeakValueCache` in `devito.tools.abc` and corresponding tests in `tests/test_tools.py`
- Add `_memoized_instances` decorator in `devito/tools/memoization.py` and apply it to cache `Relation` instances
- Update `TimedAccess` hashing and signature in `devito/ir/support/basic.py`; switch IR tests to use `frozendict` in `tests/test_ir.py`
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
File | Description
---|---
`tests/test_tools.py` | Add `WeakValueCache` tests, including concurrent use cases
`tests/test_ir.py` | Replace mutable dicts with `frozendict` for `_directions`
`devito/tools/memoization.py` | Implement `_memoized_instances` decorator for class caching
`devito/tools/abc.py` | Implement `WeakValueCache`
`devito/ir/support/basic.py` | Refine `TimedAccess` signature, override `__hash__`, cache `Relation`
Comments suppressed due to low confidence (3)
tests/test_tools.py:250
- The test uses `time.sleep` but `time` is not imported. Add `import time` at the top of the file.

`time.sleep(creation_time)`
devito/tools/abc.py:309
- The annotation uses `Future[ReferenceType[ValueType]]` but `Future` is imported from `concurrent.futures` and is not subscriptable. Import `Future` from `typing` or remove the subscription to avoid a runtime TypeError.

`self._futures: dict[KeyType, Future[ReferenceType[ValueType]]] = {}`
devito/tools/abc.py:347
- The parameter annotation `Future[ref[ValueType]]` will error at runtime since `Future` is not a generic class. Adjust the annotation (e.g., use `typing.Future`) or remove it.

`def on_obj_destroyed(k: KeyType = key, f: Future[ref[ValueType]] = future):`
Force-pushed from 4b3e65d to 2c7d6f7
Updates the requirements on [fsspec](https://github.com/fsspec/filesystem_spec) to permit the latest version. - [Commits](fsspec/filesystem_spec@0.0.1...2025.5.1) --- updated-dependencies: - dependency-name: fsspec dependency-version: 2025.5.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
Updates the requirements on [pytest-cov](https://github.com/pytest-dev/pytest-cov) to permit the latest version. - [Changelog](https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst) - [Commits](pytest-dev/pytest-cov@v1.0...v6.2.1) --- updated-dependencies: - dependency-name: pytest-cov dependency-version: 6.2.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com>
Force-pushed from 14c3e72 to 4317211
Codecov Report
Attention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2634 +/- ##
==========================================
- Coverage 91.30% 87.66% -3.65%
==========================================
Files 245 245
Lines 48752 48994 +242
Branches 4298 4307 +9
==========================================
- Hits 44515 42952 -1563
- Misses 3529 5311 +1782
- Partials 708 731 +23
Flags with carried forward coverage won't be shown.
Does it make sense to add a test that ensures we're not leaking memory and/or that memory consumption is bounded? This is mostly me being paranoid.
FWIW, you can get the running process' eaten memory via Process(pid).memory_info().rss / (1024*1024)
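As a sketch of such a check, assuming `Process` above refers to `psutil.Process` and that the third-party `psutil` package is installed (the helper name `rss_mb` is made up):

```python
import os

from psutil import Process  # third-party: `pip install psutil`


def rss_mb(pid=None):
    """Resident set size of the given process (default: current) in MiB."""
    return Process(pid if pid is not None else os.getpid()).memory_info().rss / (1024 * 1024)


before = rss_mb()
blob = bytearray(50 * 1024 * 1024)  # allocate roughly 50 MiB
after = rss_mb()
print(f"grew by roughly {after - before:.1f} MiB")
```

A memory-bound test could then assert that `rss_mb()` stays below some threshold across repeated Operator builds, though as noted later in this thread, any such threshold is inherently heuristic.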
I'd like to see a few more tests too, since this is such a crucial part of the code. Maybe some that exercise the deadlock safety by deliberately desynchronising threads. But this could be a discussion too, in case anybody else has ideas.
devito/tools/abc.py
Outdated
from weakref import ReferenceType, ref
import weakref
Strange that you do both of these!
To avoid squatting on `finalize`, I suspect.
`ReferenceType` is the idiomatic type hint for weak refs.
Oh, you mean that I also import the namespace, I guess. Yeah, I could also import `finalize` itself, but it felt better to be explicit (`weakref.finalize` does a better job of describing what that's doing). In fact, on second thought, I should probably be using `weakref.ref` explicitly as well instead of importing it by name.
That said, I still want `ReferenceType` imported by name because it's still explicit but much cleaner in the type hints than `weakref.ReferenceType`, imo.
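For illustration, the two styles under discussion might look like this side by side: `ReferenceType` imported by name for type hints, and `weakref.ref`/`weakref.finalize` spelled via the namespace at call sites (`Widget`, `registry`, and `cache_weakly` are hypothetical names, not devito code).

```python
from __future__ import annotations

import gc
import weakref
from weakref import ReferenceType


class Widget:
    pass


# `ReferenceType` by name keeps the hint readable
registry: dict[str, ReferenceType[Widget]] = {}


def cache_weakly(key: str, obj: Widget) -> None:
    # `weakref.ref(...)` via the namespace is explicit at the call site
    registry[key] = weakref.ref(obj)
    # `weakref.finalize(...)` makes the cleanup intent obvious where registered:
    # when `obj` is collected, pop its registry entry
    weakref.finalize(obj, registry.pop, key, None)


w = Widget()
cache_weakly("w", w)
assert registry["w"]() is w
del w
gc.collect()
assert "w" not in registry
```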
devito/tools/abc.py
Outdated
self._futures: dict[KeyType, Future[ReferenceType[ValueType]]] = {}
self._lock = RLock()

def get_or_create(self, key: KeyType,
What's the rationale behind not subclassing `defaultdict`?
It serves a different purpose. We don't want the actual storage machinery of `defaultdict`, and implementing the rest of the mapping interface to mirror that of `defaultdict` would just add unnecessary bloat.
devito/tools/abc.py
Outdated
Gets the value for `key`, or creates it with `supplier` if it doesn't exist.
If another thread is already creating the value, blocks until it's available.
"""
while True:
I'm not 100% sure about the logic in this method. Someone might overrule what I'm saying, but: this is getting close to spaghetti, a while loop that is only exited by returning (with multiple returns) or raising an exception. I feel like there must be a nicer way of implementing the same logic.
As far as I'm aware, while loops are the canonical approach to these kinds of thread clash-retry constructions. The alternatives are duplicating everything that happens within the lock, or using recursion (or otherwise adding unnecessary complexity). Though on second thought, I think a recursive solution might be a little cleaner, so I'll take a look at reimplementing it that way.
Changed my mind actually, I was right the first time lol. Best practice is definitely the while loop; the recursive version is just growing the stack for no reason.
Basically my argument is:
1. The while loop clearly communicates what we're doing (trying some threaded access until we get to a certain state), and
2. Any other solution is less idiomatic, since we otherwise can't use the cache lock as a context manager without blocking for the duration of construction, or
3. Barring (2), an idiomatic solution would require unnecessary code duplication.

I did rewrite this method to factor a bit of logic out of the loop, so maybe you'll be happier with the current implementation; either way, this is just the canonical approach.
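A minimal sketch of the clash-retry loop being debated (illustrative only, not devito's `WeakValueCache`: the `get_or_create`/`supplier` names follow the docstring quoted in the diff, `OnceMap` is made up, and error handling is elided). One thread wins the race and builds the value outside the lock; waiters block on a `Future`; a dead weak reference sends a waiter back around the loop to become the creator.

```python
from concurrent.futures import Future
from threading import RLock
from weakref import ref


class OnceMap:
    """Sketch: compute each key's value once, holding it only weakly."""

    def __init__(self):
        self._futures = {}
        self._lock = RLock()

    def get_or_create(self, key, supplier):
        while True:
            with self._lock:
                future = self._futures.get(key)
                if future is None:
                    # We won the race: publish a Future, then build outside
                    # the lock so other keys aren't blocked during construction
                    future = self._futures[key] = Future()
                    creating = True
                else:
                    creating = False
            if creating:
                obj = supplier()  # error handling elided for brevity
                future.set_result(ref(obj))
                return obj
            # Another thread created (or is creating) the value; wait for it.
            # The weak reference may have expired since, in which case we
            # evict the stale entry and loop to become the creator ourselves.
            obj = future.result()()
            if obj is not None:
                return obj
            with self._lock:
                if self._futures.get(key) is future:
                    del self._futures[key]
```

The loop exits only by returning, exactly as described above: each iteration either resolves the key or clears a stale entry and retries.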
devito/tools/abc.py
Outdated
obj = supplier()

# Listener for when the weak reference expires
def on_obj_destroyed(k: KeyType = key,
Do futures not support some sort of callback already?
Or could the `_futures` object itself be a weak dict?
You might be conflating the futures with weak references here. I'm not sure whether futures "support a callback", but if so, that would be something that runs when the future is resolved; that's what we use `future.result()` for on non-construction threads (to block until that result is ready).
Using a weak dict for the futures would be self-defeating; we want to weakly reference the cached instances, not the futures. If we stored weak references to futures, then as soon as all threads consuming a future were done with it, the entry would be evicted, even if the object that the future resolves to hadn't been garbage collected. So we'd fail to cache something that might still be persisting in memory, and would then reconstruct and cache it again for no reason.
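To make the distinction concrete, here is a sketch of that shape: strong references to futures, weak references to values, with eviction driven by the value's collection. It mirrors the `on_obj_destroyed` listener quoted in the diff, but `Obj`, `futures`, and `insert` are illustrative names, not devito code.

```python
import gc
from concurrent.futures import Future
from weakref import ref


class Obj:
    pass


futures = {}


def insert(key, obj):
    """Cache `obj` under `key`, evicting the entry when `obj` is collected."""
    future = futures[key] = Future()

    # Default arguments bind key/future at definition time. The callback fires
    # when *obj* is garbage collected, not when the future itself goes away.
    def on_obj_destroyed(_weakref, k=key, f=future):
        if futures.get(k) is f:  # only evict if we haven't been replaced
            del futures[k]

    future.set_result(ref(obj, on_obj_destroyed))
    return obj


o = insert("k", Obj())
assert futures["k"].result()() is o
del o
gc.collect()
assert "k" not in futures
```

Note the future stays cached for as long as its value is alive, regardless of how many threads still hold the future, which is exactly the property a weak dict of futures would lose.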
@FabioLuporini the memory the cache takes up is bounded by the number of instances of the object that exist. There is a test which checks that cache entries are evicted when their values go out of scope, so this is guaranteed. I can try to add a test specifically for memory consumption if you'd like, but imo that's just going to be a more heuristic test (is the cache using "too much" memory? how much is too much?) of something that is already tested concretely (i.e. that the cache memory is bounded by the actual instance memory, which in turn ensures we're not leaking memory).
Adds a `@_memoized_instances` decorator for caching instances returned by `__new__`. This decorator is applied to `Relation` and `Dependence` for a somewhat substantial speedup in build times.

Tested locally (with Python 3.10 on dewback), this yields a pretty consistent performance improvement on an old profiling script with some modifications (removed inline calls to cProfile and ran it against the script up to the first `Operator` construction). This gist highlights performance of a couple of hotspots as well as the top consumers from cProfile; overall, the operator in that script builds roughly 15% faster in most of my runs with cached `Dependence` instances. I'm seeing a similar improvement in most of the examples as well.

Waiting on #2631 to be merged, as there are 3.10 type hints in a couple of places.
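For readers unfamiliar with the pattern, a `__new__`-level instance-memoization decorator might be sketched like this. This is an illustration under simplifying assumptions, not devito's `_memoized_instances`: it assumes hashable constructor arguments, and it does not guard against `__init__` re-running on cache hits.

```python
from weakref import WeakValueDictionary


def memoized_instances(cls):
    """Sketch of a class decorator caching instances by constructor arguments.

    Instances are held weakly, so the cache never outlives them (it is
    bounded by the number of live instances, as discussed in this PR).
    """
    cache = WeakValueDictionary()
    original_new = cls.__new__

    def cached_new(klass, *args, **kwargs):
        key = (klass, args, tuple(sorted(kwargs.items())))
        obj = cache.get(key)
        if obj is None:
            obj = original_new(klass)
            cache[key] = obj
        # Note: __init__ still re-runs on cache hits; a production version
        # would need to guard against that
        return obj

    cls.__new__ = cached_new
    return cls


@memoized_instances
class Pair:
    def __init__(self, a, b):
        self.a, self.b = a, b


assert Pair(1, 2) is Pair(1, 2)
assert Pair(1, 2) is not Pair(3, 4)
```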