
CAS and volatile approach to fix delta concurrency bug #5976

Merged — 5 commits, Nov 13, 2023

Conversation

@jack-berg (Member) commented Nov 10, 2023

After merging #5932, @trask pointed out that the approach isn't actually non-blocking because it doesn't use readLock().tryLock(), and the sequence below can result in the call to readLock().lock() blocking:

Record thread:
time=1 - Lock readLock = aggregatorHolder.lock.readLock();
time=6 - readLock.lock();

Collect thread:
time=2 - AggregatorHolder<T, U> holder = this.aggregatorHolder;
time=3 - this.aggregatorHolder = new AggregatorHolder<>();
time=4 - Lock writeLock = holder.lock.writeLock();
time=5 - writeLock.lock();
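
For context, a minimal sketch of the read/write-lock scheme from #5932 that this interleaving breaks (hypothetical class and method names, not the actual SDK code): record threads take the read lock of whichever holder they saw, while collect swaps the holder and takes the write lock of the old one.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical simplified holder; the real class is an inner
// AggregatorHolder in DefaultSynchronousMetricStorage.
class Holder {
  final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
}

public class LockSketch {
  volatile Holder holder = new Holder();

  void record() {
    // Read the volatile once; a context switch right here is the hazard:
    Holder h = holder;
    h.lock.readLock().lock(); // blocks if collect already write-locked h
    try {
      // ... aggregate the measurement into h ...
    } finally {
      h.lock.readLock().unlock();
    }
  }

  void collect() {
    Holder old = holder;
    holder = new Holder();       // new recordings go to the fresh holder
    old.lock.writeLock().lock(); // wait for in-flight recordings on old
    old.lock.writeLock().unlock();
    // ... read aggregated state out of old ...
  }

  public static void main(String[] args) {
    LockSketch s = new LockSketch();
    s.record();
    s.collect();
    System.out.println("ok");
  }
}
```

With the timeline above, the record thread captures a reference to the old holder at time=1, collect write-locks it at time=5, and the record thread's lock() at time=6 then blocks.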

I wrote a test and verified this is indeed the case, so back to the drawing board. I adjusted the code to repeatedly try to read the volatile AggregatorHolder, with a tryLock as the loop condition. Something like:

    AggregatorHolder<T, U> aggregatorHolder;
    do {
      aggregatorHolder = this.aggregatorHolder;
    } while (!aggregatorHolder.lock.readLock().tryLock());

But after running the unit test I wrote to verify correctness 100 times, this occasionally fails. The problem with this code is that it's subject to (essentially) the same sequence outlined above: a record thread's call to readLock().tryLock() can succeed after a collect thread has already unlocked its writeLock(), causing lost writes.

After thinking about this more, I came up with a solution where I use AtomicInteger to track the number of outstanding record operations, incrementing and decrementing as they start and complete. The collect thread does a CAS loop waiting for 0 outstanding record operations, and setting the AtomicInteger to -1, which acts as a signal for record threads to re-read the volatile AggregatorHolder.

The performance of this is better than the readWriteLock solution. And it appears to be correct in all cases (I re-ran the unit test 100 times without failure).
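
A minimal sketch of the counter idea described above (the names and the toy sum field are illustrative, not the merged implementation; see the actual diff for the real code): record threads increment an AtomicInteger on entry and decrement on exit, and collect CASes it from 0 to a negative sentinel that tells late-arriving record threads to re-read the volatile holder.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterHolder {
  // Negative value signals "this holder has been swapped out by collect".
  final AtomicInteger activeRecordingThreads = new AtomicInteger();
  long sum; // toy stand-in for the aggregated state
}

public class CasSketch {
  volatile CounterHolder holder = new CounterHolder();

  void record(long value) {
    CounterHolder h;
    while (true) {
      h = holder;
      int active = h.activeRecordingThreads.get();
      // Only enter while the count is non-negative; a negative count means
      // collect owns this holder, so re-read the volatile and retry.
      if (active >= 0
          && h.activeRecordingThreads.compareAndSet(active, active + 1)) {
        break;
      }
    }
    try {
      h.sum += value; // toy aggregator update
    } finally {
      h.activeRecordingThreads.decrementAndGet();
    }
  }

  long collect() {
    CounterHolder old = holder;
    holder = new CounterHolder();
    // CAS loop: wait for 0 outstanding record operations, then park the
    // counter at -1 so stale record threads re-read the volatile.
    while (!old.activeRecordingThreads.compareAndSet(0, -1)) {
      // busy-wait; collect only runs once per export interval
    }
    return old.sum;
  }

  public static void main(String[] args) {
    CasSketch s = new CasSketch();
    s.record(2);
    s.record(3);
    System.out.println(s.collect()); // 5 in this single-threaded run
  }
}
```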

codecov bot commented Nov 10, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
...nternal/state/DefaultSynchronousMetricStorage.java 96.59% <100.00%> (+0.24%) ⬆️

... and 2 files with indirect coverage changes


@jack-berg mentioned this pull request Nov 11, 2023
@trask (Member) left a comment

very nice 👍

I think better comments and naming could make it easier to follow but that could be done in a follow-up.

@asafm (Contributor) commented Nov 13, 2023

@jack-berg I wrote that in my original comment design :)

When we record, we call readLock().tryLock(). If we get false, it means the write lock has been taken and we need to refresh the value from activeAggregatorHandles (explained below why); hence, upon false, we re-read the value at activeAggregatorHandles and call tryLock() again. This should never fail; if it does, fail.

But it doesn't matter, since it's flawed, as you wrote. Very nice catch! I hadn't considered the possibility that the thread reading the aggregatorHandles from the volatile would be context-switched between reading the value and the write lock being released, leaving the map stale.

The currently proposed odd/even AtomicInteger approach has 2 issues:

  1. It busy-waits during collect(), which is not a good approach as it keeps a CPU core at 100% while waiting. Compare this with the writeLock().lock() approach, which to the best of my knowledge doesn't busy-wait.
  2. When reading this you're a bit puzzled; it takes a few "hops" to "get it".

I have a suggestion:

  1. Use read/write lock as before, but with the change of using tryLock as outlined in the description of this PR.
  2. Add a boolean volatile openForRecordings to the holder, with initial value of true.
  3. After taking write lock, you set openForRecordings to false.
  4. Upon readLock().tryLock() returning true, you read openForRecordings; if it is false, this is the signal to re-read the holder volatile and re-obtain the read lock, since you're using a stale value. It will succeed on the 2nd attempt.

This solves the scenario in which you obtain a read lock on a "finished-write" holder (one whose write lock has already been released).
It doesn't busy-wait in collect(), but uses the existing writeLock().lock() approach to wait for all outstanding read locks.
I think it might be a bit easier to understand, although it still requires docs to explain this weird piece of code to the first-time reader.
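
A rough sketch of the suggestion above (not tested against the SDK; the holder shape, the toy sum field, and the openForRecordings name follow the naming proposed in this comment):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FlagHolder {
  final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  volatile boolean openForRecordings = true;
  long sum; // toy stand-in for the aggregated state
}

public class FlagSketch {
  volatile FlagHolder holder = new FlagHolder();

  void record(long value) {
    while (true) {
      FlagHolder h = holder;
      if (!h.lock.readLock().tryLock()) {
        continue; // write lock held: re-read the volatile and retry
      }
      try {
        if (h.openForRecordings) {
          h.sum += value; // toy aggregator update
          return;
        }
        // "finished-write" holder: collect already closed it; retry on
        // the fresh holder re-read from the volatile.
      } finally {
        h.lock.readLock().unlock();
      }
    }
  }

  long collect() {
    FlagHolder old = holder;
    holder = new FlagHolder();
    old.lock.writeLock().lock();   // waits, without busy-looping, for readers
    old.openForRecordings = false; // set after taking the write lock
    old.lock.writeLock().unlock();
    return old.sum;
  }

  public static void main(String[] args) {
    FlagSketch s = new FlagSketch();
    s.record(4);
    s.record(6);
    System.out.println(s.collect()); // 10 in this single-threaded run
  }
}
```

The flag closes the gap in the plain tryLock loop: even if tryLock succeeds on a holder whose write lock was already released, openForRecordings is false by then, so the record thread discards the stale holder and retries.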

WDYT?

@jack-berg (Member, Author)

I think there are improvements that could be made to this approach, but we're out of time. The release intended for Friday is 3 days late, and going through another round of development / performance testing / review would likely push it at least another day (based on the schedule I have). Our options are essentially: merge this PR, or revert the original fix #5932 (there's no point including code which is both blocking and not correct all the time).

I'm going to push some minor updates to improve the naming / comments, merge this PR, and cut the release. I'm happy to continue iterating on this for the December release.

@trask (Member) commented Nov 13, 2023

> It uses busy wait during collect()

The key point to me is that it is effectively only an "at most 2 iterations" loop, since collect() is only called every OTEL_METRIC_EXPORT_INTERVAL.

An option could be to codify that in the algorithm (and avoid some super weird degenerate thread prioritization) by no-op'ing record() on the third or fourth loop iteration.

ugh, I mixed up collect() and record() in my head, please disregard this comment

@jack-berg merged commit 72a5bb1 into open-telemetry:main Nov 13, 2023
18 checks passed
@jack-berg (Member, Author)

> It uses busy wait during collect()

> the key to me is that it is effectively only an "at most 2 times" loop, since the collect() thread is only called every OTEL_METRIC_EXPORT_INTERVAL

The record threads loop at most 2 times, but the collect thread could loop more, since its loop waits for the record threads to finish and decrement before continuing:

    int recordsInProgress = holder.activeRecordingThreads.addAndGet(1);
    while (recordsInProgress > 1) {
      recordsInProgress = holder.activeRecordingThreads.get();
    }

@trask (Member) commented Nov 13, 2023

oh I mixed up collect() and record() in my head (again?) 🤦‍♂️
