Hang in tokio multi-threaded runtime #27

Closed
bin opened this issue Jan 15, 2022 · 2 comments · Fixed by #28

Comments

bin commented Jan 15, 2022

I suspect this may be some sort of deadlock, since the hang goes away when the same code runs in a single-threaded runtime. You will find minimal reproducible examples of the multi-threaded and single-threaded behaviors at https://github.com/bin/fred_broken.git

aembke (Owner) commented Jan 15, 2022

Hi @bin,

Good find. I took a look at this and I don't think it's a deadlock, but it is clearly a race condition.

I have a similar setup in the pipeline_test, but with one key difference - I'm spawning a new task to run each concurrent command there instead of issuing them from a single loop.

The deadlock-detection logic in parking_lot hasn't picked anything up, and there should only be one lock in play in the code paths your example uses. That usage is limited to single-line push_back and pop_front calls on a VecDeque, so I doubt the lock is the issue.
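
For reference, a minimal sketch of that locking pattern (the types and names here are hypothetical, not fred's actual internals). With every guard scoped to a single statement and no other lock acquired while it's held, this shape shouldn't be able to deadlock on its own:

```rust
use std::collections::VecDeque;
use parking_lot::Mutex;

// Hypothetical stand-in for the client's queued command frames.
struct Command;

struct CommandQueue {
    inner: Mutex<VecDeque<Command>>,
}

impl CommandQueue {
    fn push(&self, cmd: Command) {
        // The guard only lives for this one statement.
        self.inner.lock().push_back(cmd);
    }

    fn pop(&self) -> Option<Command> {
        // Same here: no await points and no nested locks while the
        // guard is alive.
        self.inner.lock().pop_front()
    }
}
```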

Looking at the network traffic, I noticed that when this does repro it always happens when the client decides not to flush the socket after writing some bytes. That flush-skipping behavior is just an optimization, and callers can disable it by setting the pipeline flag in the config to false. When I disable pipelining this no longer repros for me either, which makes sense: if the server never receives all the bytes in a frame it won't respond.
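
To illustrate why a missing flush turns into a hang (this is plain tokio with a buffered writer, not fred's write path): bytes passed to write_all can sit in the client-side buffer until flush is called, so a skipped flush can leave the server holding a partial frame that it will never answer.

```rust
use tokio::io::{AsyncWriteExt, BufWriter};
use tokio::net::TcpStream;

async fn send_frame(stream: TcpStream, frame: &[u8]) -> std::io::Result<()> {
    let mut writer = BufWriter::new(stream);

    // For a frame smaller than the buffer's capacity, this only copies
    // the bytes into the in-memory buffer.
    writer.write_all(frame).await?;

    // Without this flush the frame may never reach the server, which is
    // exactly the "server never received all the bytes" situation above.
    writer.flush().await?;
    Ok(())
}
```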

I also found an issue in the logic that decides whether to flush the socket, so I think that's the root cause. My suspicion is that it's an atomic ordering issue in the counter that determines whether to flush some bytes, which would explain why it doesn't repro on a single thread.
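
A rough sketch of the mechanism I suspect (illustrative only, not the actual code): the writer defers flushing while a counter says more commands are pending, so if that counter can be observed out of step with the queue from another thread, the writer can skip the final flush and nothing afterwards will ever push the remaining bytes out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical bookkeeping for the write path.
struct Counters {
    // Commands submitted to the write path but not yet written.
    pending: AtomicUsize,
}

impl Counters {
    /// Writer side: called after one command's bytes have been written.
    /// Returns true if the socket should be flushed now.
    fn should_flush(&self) -> bool {
        // Only the write of the last pending command flushes; earlier
        // writes rely on a later one to push their bytes out. If the
        // increment for a command lands in the wrong spot relative to
        // this check, that "later write" may never come.
        self.pending.fetch_sub(1, Ordering::SeqCst) == 1
    }
}
```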

I'll need to spend some time with loom, but if that's the issue I'll hopefully get an update out this weekend.
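
In case it's useful, here's the rough shape of a loom model for that kind of counter placement (test-only, assuming loom as a dev-dependency; the types and assertion are hypothetical). loom runs the closure under every legal interleaving of the threads, so a placement bug surfaces as a failed assertion instead of a rare hang.

```rust
#[cfg(test)]
mod loom_tests {
    use loom::sync::atomic::{AtomicUsize, Ordering};
    use loom::sync::{Arc, Mutex};
    use loom::thread;
    use std::collections::VecDeque;

    #[test]
    fn counted_before_visible() {
        loom::model(|| {
            let pending = Arc::new(AtomicUsize::new(0));
            let queue = Arc::new(Mutex::new(VecDeque::new()));

            let producer = {
                let (pending, queue) = (Arc::clone(&pending), Arc::clone(&queue));
                thread::spawn(move || {
                    // Count the command first, then publish it.
                    pending.fetch_add(1, Ordering::SeqCst);
                    queue.lock().unwrap().push_back(1u8);
                })
            };

            // Writer side: anything we can pop must already be counted.
            if queue.lock().unwrap().pop_front().is_some() {
                assert!(pending.load(Ordering::SeqCst) >= 1);
            }

            producer.join().unwrap();
        });
    }
}
```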

aembke (Owner) commented Jan 15, 2022

Yeah, that is the issue. It's not a matter of using the wrong Ordering - I just had the incr line in the wrong location. After patching this locally I can no longer repro the issue.
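
For anyone following along, the shape of the fix looks roughly like this (illustrative, reusing the hypothetical counter from my earlier comment rather than the real diff): the increment has to land before the command becomes visible to the writer task, so the writer's flush check always accounts for the frame it just wrote.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use parking_lot::Mutex;

// Hypothetical write path, not the actual patch.
struct WritePath {
    queue: Mutex<VecDeque<Vec<u8>>>,
    pending: AtomicUsize,
}

impl WritePath {
    fn submit(&self, frame: Vec<u8>) {
        // Count the command first...
        self.pending.fetch_add(1, Ordering::SeqCst);
        // ...then make it visible to the writer. In the opposite order
        // the writer can pop and write the frame before it is counted,
        // conclude it wasn't the last pending command, and skip the
        // flush that no later write will ever make up for.
        self.queue.lock().push_back(frame);
    }
}
```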

I'll get a patch up for 4.x this weekend.

aembke mentioned this issue Jan 15, 2022