Partial thread-safety by crusaderky · Pull Request #82 · dask/zict

crusaderky · 2023-02-28T12:51:22Z

Partially closes distributed#4424 (Asynchronous Disk Access in Workers)

Make the library partially thread-safe.
Crucially, this makes distributed.Worker.data.fast.__getitem__() thread-safe, which means that Worker.execute and get_data will only need to offload to a thread if any of the keys are actually spilled.

Actual thread-offloading mechanics will be in later PRs (both here and in distributed).
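
A rough sketch of the fast-path idea described above, assuming a buffer with an in-memory `fast` mapping that is safe to read from any thread and a disk-backed `slow` mapping. `ToyBuffer`, `get_many`, and the use of `asyncio.to_thread` are illustrative assumptions, not the code in distributed or zict.

```python
import asyncio


class ToyBuffer:
    """Hypothetical stand-in for Worker.data: `fast` in memory, `slow` spilled."""

    def __init__(self, fast, slow):
        self.fast = fast  # assumed safe to read from any thread
        self.slow = slow  # assumed to block on disk I/O

    def __getitem__(self, key):
        try:
            return self.fast[key]
        except KeyError:
            return self.slow[key]


async def get_many(data, keys):
    """Return (values, offloaded); `offloaded` tells whether a thread was used."""
    try:
        # Fast path: every key is in memory, so read directly on the event loop.
        return {k: data.fast[k] for k in keys}, False
    except KeyError:
        pass
    # Slow path: at least one key is spilled, so do the blocking reads in a thread.
    values = await asyncio.to_thread(lambda: {k: data[k] for k in keys})
    return values, True
```

Calling `asyncio.run(get_many(ToyBuffer({"x": 1}, {"y": 2}), ["x"]))` stays on the event loop, while asking for `["x", "y"]` takes the thread path because `"y"` is only in `slow`.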

milesgranger

Overall I think this is good. My one reservation is that ignoring KeyErrors caused by thread-unsafe operations doesn't strike me as a way to make it thread-safe. If there is ever a case where we care about a KeyError that wasn't caused by threading, it's not clear how we would determine its cause.

I made a prototype PR using threading.RLock. Based on some basic benchmarking with the test added here, test_getitem_isthreasafe, the performance impact seems negligible; if anything it's a smidgen faster, but basically the same:

# Ignoring KeyErrors
Benchmark 1: python zict/tests/test_lru.py
  Time (mean ± σ):      1.251 s ±  0.054 s    [User: 1.142 s, System: 0.148 s]
  Range (min … max):    1.168 s …  1.380 s    25 runs

# Re-entrant locking
Benchmark 1: python zict/tests/test_lru.py
  Time (mean ± σ):      1.214 s ±  0.121 s    [User: 1.105 s, System: 0.160 s]
  Range (min … max):    1.080 s …  1.531 s    25 runs
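
To make the two approaches being compared concrete, here is a toy sketch. It is not the code from this PR or from the prototype. `TolerantLRU` reads without a lock and ignores a KeyError raised during bookkeeping, on the assumption that another thread evicted the key; `LockedLRU` serialises every access with a threading.RLock. Both class names and their internals are assumptions made for illustration.

```python
import threading
from collections import OrderedDict


class TolerantLRU:
    """Toy: lock-free access; bookkeeping tolerates keys vanishing underneath it."""

    def __init__(self, n):
        self.n = n
        self.d = OrderedDict()

    def __getitem__(self, key):
        value = self.d[key]  # a genuine miss raises KeyError here
        try:
            self.d.move_to_end(key)  # mark as most recently used
        except KeyError:
            pass  # assumed cause: evicted by another thread after the read
        return value

    def __setitem__(self, key, value):
        self.d.pop(key, None)
        self.d[key] = value  # (re)insert at the most-recently-used end
        while len(self.d) > self.n:
            try:
                self.d.popitem(last=False)  # evict least recently used
            except KeyError:
                break  # another thread emptied the mapping first


class LockedLRU:
    """Toy: same behaviour, but every access holds a re-entrant lock."""

    def __init__(self, n):
        self.n = n
        self.d = OrderedDict()
        self.lock = threading.RLock()

    def __getitem__(self, key):
        with self.lock:
            value = self.d[key]
            self.d.move_to_end(key)
            return value

    def __setitem__(self, key, value):
        with self.lock:
            self.d.pop(key, None)
            self.d[key] = value
            while len(self.d) > self.n:
                self.d.popitem(last=False)
```

In the tolerant variant only the bookkeeping step swallows KeyError, so a lookup of a missing key still raises. The locked variant avoids the question of where a KeyError came from, at the cost of acquiring the lock on every read.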

crusaderky · 2023-03-10T12:11:05Z

@milesgranger that's because test_getitem_isthreasafe spends most of its time sending messages across the thread pool, which is itself wrapped in locks.

I modified the test:

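# Imports assumed; the snippet as posted omitted them.
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import cpu_count
from threading import Barrier

from zict import LRU


def test_getitem_is_threasafe():
    lru = LRU(100, {})
    lru["x"] = 1

    def f(_):
        # Start all threads at once, then hammer __getitem__ in a tight loop
        barrier.wait()
        for _ in range(5_000_000):
            assert lru["x"] == 1

    n = cpu_count()
    barrier = Barrier(n)
    with ThreadPoolExecutor(n) as ex:
        # Consume the iterator so that exceptions raised in f propagate
        for _ in ex.map(f, range(n)):
            pass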

On my 12-core Linux box, this PR takes 9.9s to run, while your version takes 18.9s.
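
The per-access cost of taking a lock can also be measured in isolation, without a thread pool in the way. The following is a single-threaded sketch only: `bench`, `read_plain`, and `read_locked` are hypothetical helpers, it does not reproduce the contended 12-core numbers above, and results will vary by machine and Python version.

```python
import threading
import timeit

d = {"x": 1}
lock = threading.RLock()


def read_plain():
    # Plain dict lookup, no synchronisation
    return d["x"]


def read_locked():
    # Same lookup, but acquire and release an (uncontended) RLock around it
    with lock:
        return d["x"]


def bench(fn, number=1_000_000):
    """Hypothetical helper: seconds per call, best of 3 runs."""
    return min(timeit.repeat(fn, number=number, repeat=3)) / number


if __name__ == "__main__":
    print(f"plain : {bench(read_plain) * 1e9:.0f} ns/read")
    print(f"locked: {bench(read_locked) * 1e9:.0f} ns/read")
```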

milesgranger · 2023-03-10T12:20:18Z

Sorry, my comment wasn't about which implementation is faster. My point was that the alternative, measured with a test assumed to be representative of the use case, would leave run time unaffected while improving thread safety.

If the original test was not representative of how it would be used in practice, and discerning whether a KeyError comes from legitimate misuse or from threading is not important, then making it 'thread-safe' by ignoring KeyErrors is perfectly fine by me.

Apologies if I've misunderstood anything here.

crusaderky · 2023-03-10T12:26:28Z

> If the original test was not representative of how it would be used in practice

The original test was representative of how it would be used, functionally.
But unit tests are never designed to be performance-efficient.

crusaderky force-pushed the threads branch from 7b7b0e0 to e4e43b9 on February 28, 2023 13:04

crusaderky self-assigned this Feb 28, 2023

crusaderky marked this pull request as ready for review February 28, 2023 14:46

Commit c01ecd5: threads

crusaderky force-pushed the threads branch from 2f71ee5 to c01ecd5 on March 1, 2023 12:22

milesgranger reviewed Mar 10, 2023

Commit fd9e233: Tweak test

crusaderky force-pushed the threads branch from 4d692ee to fd9e233 on March 10, 2023 12:44

crusaderky merged commit 0c7f704 into dask:main Mar 10, 2023

crusaderky deleted the threads branch March 10, 2023 13:30

crusaderky mentioned this pull request Mar 14, 2023

[RFC] LRU thread-safety w/ RLock #85

Closed

crusaderky mentioned this pull request Mar 25, 2023

Lock-based thread safety #92

Merged

crusaderky merged 2 commits into dask:main from crusaderky:threads

2 participants