-
-
Notifications
You must be signed in to change notification settings - Fork 718
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove sortedcontainers #7245
Remove sortedcontainers #7245
Conversation
Not sure if it's necessary to actually sort anywhere
`Computation.code` was the only other place it was used. Doesn't seem worth the dependency for that.
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ±0 15 suites ±0 6h 25m 56s ⏱️ - 2m 53s For more details on these failures, see this check. Results for commit aa0cde0. ± Comparison against base commit 02b9430. |
worker_pool = self.idle or self.workers | ||
# FIXME idle and workers are SortedDict's declared as dicts | ||
# because sortedcontainers is not annotated | ||
wp_vals = cast("Sequence[WorkerState]", worker_pool.values()) | ||
n_workers: int = len(wp_vals) | ||
if n_workers < 20: # smart but linear in small case | ||
ws = min(wp_vals, key=operator.attrgetter("occupancy")) | ||
assert ws | ||
if ws.occupancy == 0: | ||
# special case to use round-robin; linear search | ||
# for next worker with zero occupancy (or just | ||
# land back where we started). | ||
wp_i: WorkerState | ||
start: int = self.n_tasks % n_workers | ||
i: int | ||
for i in range(n_workers): | ||
wp_i = wp_vals[(i + start) % n_workers] | ||
if wp_i.occupancy == 0: | ||
ws = wp_i | ||
break | ||
else: # dumb but fast in large case | ||
ws = wp_vals[self.n_tasks % n_workers] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW if it's just about this code path, we can drop the sortedset entirely by putting workers in a list as well as a set. that'd be a LogN operation whenever a worker leaves the cluster to remove that worker from the list. That's much better than a NlogN every time we iterate over the workers
There are non-trivial merge conflicts - could you resolve them? |
This depends on #7221, which it doesn't seem is happening any time soon. #7280 is kinda similar to that, without changing the rootish definition. I think the order we'd merge things is
So I'll close this for now, will re-open if we're going to work on this. |
This makes
Scheduler.workers
andScheduler.idle
no longer sorted. Sorted containers are a bit slower than their plain equivalents. Most importantly, iterating over them (which we do in plenty of places) is O(nlogn) instead of O(n).In practice, actual performance is probably worse; I found iteration to be 10x slower in microbenchmarks: #4925 (comment).
This would only be done after #7221. The only code relying on these collections being sorted is the old no-deps
decide_worker
logic, which would be unreachable after that PR.I took this to the extreme and entirely removed the
sortedcontainers
dependency. If we wanted to be more conservative, we could only do 2b8ebf1 (I think we should absolutely do this at least if we merge #7221).@fjetter @crusaderky
pre-commit run --all-files