-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use slots in common classes #1434
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1434 +/- ##
==========================================
+ Coverage 91.21% 91.22% +0.01%
==========================================
Files 52 52
Lines 7237 7248 +11
Branches 1018 1021 +3
==========================================
+ Hits 6601 6612 +11
Misses 457 457
Partials 179 179 ☔ View full report in Codecov by Sentry. |
@gadomski This is ready for review. Some of the changes necessary to get slots working are a bit subtle (happy to explain anything that looks unnecessary), but the surface area of changes is relatively small. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Thanks for this ... I ran the benchmark suite and while most stuff was unchanged, there is a (pretty significant) speedup when walking a large-ish catalog:
$ asv compare main feature/use-slots --only-changed
| Change | Before [f23af862] <main> | After [880b6e94] <feature/use-slots> | Ratio | Benchmark (Parameter) |
|----------|----------------------------|----------------------------------------|---------|---------------------------------------------------------------------------------------|
| - | 78.2±0.7μs | 52.2±0.8μs | 0.67 | catalog.WalkCatalogBench.time_walk [new-terrain/virtualenv-py3.12] |
| - | 78.4±0.6μs | 50.5±1μs | 0.64 | catalog.WalkCatalogBench.time_walk [new-terrain/virtualenv-py3.12-orjson] |
Couple of questions:
- Do you reckon that this is a breaking change? My understanding of
__slots__
is not deep, but the inability to set non-slot attributes at runtime does feel impactful. - Is there a benchmark that we could cook up to test Collections with many items saving time issue #1207? It'd be nice to give ourselves a target to aim for.
Here are the top offenders in terms of cumulative time for smallish (1000 item) catalog saves:
And here are the top offenders with 100x more items:
OK, but how do we write a neat little test that isn't completely ad hoc?... I'm not sure. Ad hoc might be the smart play here? So maybe we empirically determine how much we can shave off of this thing with quick/dirty changes on a testing branch and use the results from that plus whatever disgust it elicits in terms of readability and code smell to decide exactly where to aim Sorry if those were both non-answers but in fairness those are two very hard questions |
👍🏼 on breaking, I'll put this in the v2.0 milestone and hold off on merging until
Stop it, you're making me blush 😊 |
Related Issue(s):
__slots__
for attributes #1229Description:
Because it isn't unexpected that even tens of thousands of
Item
andLink
classes and a fortioriStacObject
classes are regularly in memory, there should be some small but appreciable performance improvements from using__slots__
rather than__dict__
to hold class attributes. The downside of this change is that new attributes cannot be set at runtime that aren't expected in__slots__
definitions. In general, the existence ofextra_fields
onItem
classes makes that a non-issue for the cases we care about.Beyond simply adding
__slots__
definitions, this PR also updates the serialization logic to properly use them and adds some__init__
logic, where appropriate, to avoid non-initialization of expected__slots__
values.#1207 is a related issue but it should be noted that this PR does not fully address the performance issues indicated there, as many duplicated href lookups due to eager mutation of objects remain an issue. IF spilling memory to disk is the cause of the observed exponential increase in runtimes during
.save
is to blame, this PR should provide a bit more breathing room.PR Checklist:
scripts/test
)