Partial thread-safety#82
Conversation
There was a problem hiding this comment.
Overall I think this is good. I'm only not convinced that ignoring KeyErrors caused by thread unsafe operations, is a way to make it thread-safe. If there exists a condition when we care about a KeyError other than when caused by threads, then it's not clear we would know how to determine its cause.
I made a prototype PR using threading.RLock and based on some basic benchmarking the added test here test_getitem_isthreasafe it seems to be negligible in performance impact, if just a smidgen faster but ya, basically the same:
# Ignoring KeyErrors
Benchmark 1: python zict/tests/test_lru.py
Time (mean ± σ): 1.251 s ± 0.054 s [User: 1.142 s, System: 0.148 s]
Range (min … max): 1.168 s … 1.380 s 25 runs
# Re-entrant locking
Benchmark 1: python zict/tests/test_lru.py
Time (mean ± σ): 1.214 s ± 0.121 s [User: 1.105 s, System: 0.160 s]
Range (min … max): 1.080 s … 1.531 s 25 runs
|
@milesgranger that's because I modified the test: def test_getitem_is_threasafe():
lru = LRU(100, {})
lru["x"] = 1
def f(_):
barrier.wait()
for _ in range(5_000_000):
assert lru["x"] == 1
n = cpu_count()
barrier = Barrier(n)
with ThreadPoolExecutor(n) as ex:
for _ in ex.map(f, range(n)):
passOn my 12-core Linux box, this PR takes 9.9s to run, while your version takes 18.9s. |
|
Sorry, my comment wasn't about either implementation being faster or not, but that the alternative, given a test which was assumed to be representative of the use case would be unaffected in terms of time while improving the thread-safety aspect. If the original test was not representative to how it would be used in practice, and discerning if a Apologies if I've misunderstood anything here. |
The original test was representative of how it would be used, functionally. |
Make the library partially thread-safe.
Crucially, this makes
distributed.Worker.data.fast.__getitem__()thread-safe, which means thatWorker.executeandget_datawill only need to offload to a thread if any of the keys are actually spilled.Actual thread-offloadling mechanics will be in later PRs (both here and in distributed).