
Ridiculous corner case on AtomicCell reads #345

Closed
mtak- opened this issue Mar 18, 2019 · 6 comments · Fixed by #359

Comments


mtak- commented Mar 18, 2019

For AtomicCells that fall back to a seqlock, the usize sequence counter can, in an extreme scenario, wrap around in the middle of a read, causing read validation to succeed even though the read is torn.

Copying the repro from here: Amanieu/seqlock#3

Thread 1:
Starts a read, getting a sequence number, and reads some of the data.

Other threads:
Successfully perform exactly 2^(8 * mem::size_of::<usize>() - 1) writes (i.e. 2^31 on a 32-bit target), leaving the sequence number effectively unchanged.

Thread 1:
Finishes reading the data, and sees the sequence number unchanged. It then returns the torn read.

On 64-bit machines this would take a few lifetimes to occur; on 32-bit machines it might only require the reader thread to be inactive for around one second.

I'm not sure if this is worth caring about, but I thought I'd post it anyway.


ghost commented Mar 18, 2019

on 32-bit machines it might require the reader thread to be inactive for around one second.

A single update operation takes somewhere on the order of 10 ns, which means 2^32 updates in a loop would take about 43 seconds. So for a torn read to happen, a reader thread would have to be paused for 43 seconds and then resume at just the right moment between two update operations.

Considering how OS schedulers work and how unlikely that scenario is to happen, I wouldn't worry about it.

Do my calculations seem correct?


mtak- commented Mar 18, 2019

One bit is reserved as the lock bit, so 2^31 updates is the correct number. I benchmarked fetch_adds at around 4 ns on my (64-bit) machine, and conservatively dropped that to about 0.5 ns (one cycle at 2 GHz), giving the 1 s number. Using 10 ns, it would be about 21.5 s.

Full disclosure: I have a similar issue in swym's STM implementation, and I'm trying to get a feel for what, if anything, there is to do about it.


ghost commented Mar 18, 2019

Perhaps we should just use AtomicU64 (will be stable in Rust 1.34, to be released on April 11th)?


mtak- commented Mar 20, 2019

AtomicU64::load on i686 compiles to a lock cmpxchg8b (see Godbolt), which would be a substantial performance hit.

Having thought about this more: using AtomicU64 makes a torn read practically impossible, but it is still theoretically unsound. That is only slightly better than AtomicUsize, which already has a practically zero probability of a torn read even when the worst-case window recurs every few seconds.

Doing nothing looks like the best course of action.

Thanks for checking this issue out, and sorry for the noise!

mtak- closed this as completed Mar 20, 2019
Vtec234 (member) commented Mar 21, 2019

Having thought about this more: using AtomicU64 makes a torn read practically impossible, but it is still theoretically unsound. That is only slightly better than AtomicUsize, which already has a practically zero probability of a torn read even when the worst-case window recurs every few seconds.

Well, an extremely low probability is not zero, and using crossbeam should never result in memory unsafety. While a 64-bit counter would still be theoretically unsound, a thread would have to be preempted for several hundred years for the counter to wrap, whereas with a 32-bit counter the probability of wrapping back to exactly the same value is closer to 2^(-32). Since switching to AtomicU64 should have no impact on 64-bit platforms, I'd say doing that would be sensible.

Vtec234 reopened this Mar 21, 2019

mtak- commented Mar 30, 2019

It might be better to use an rwlock on 32-bit platforms; the lock cmpxchg8b on loads defeats the purpose of the seqlock.
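One way the platform split could look (a hypothetical sketch; the type alias and structure are assumptions, not crossbeam's implementation):

```rust
// On 64-bit targets a 64-bit counter is free: it is the native word.
#[cfg(target_pointer_width = "64")]
type Seq = std::sync::atomic::AtomicU64;

// On 32-bit targets AtomicU64::load compiles to `lock cmpxchg8b`,
// so a native-width counter (or an rwlock scheme) stays preferable.
#[cfg(not(target_pointer_width = "64"))]
type Seq = std::sync::atomic::AtomicUsize;

fn main() {
    // On either branch the counter is at least machine-word-sized.
    assert!(std::mem::size_of::<Seq>() >= std::mem::size_of::<usize>());
}
```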
