Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[C++] Fix race condition in BlockingQueue #8765

Merged
merged 2 commits into from
Dec 8, 2020

Conversation

erobot
Copy link
Contributor

@erobot erobot commented Dec 1, 2020

Motivation

BlockingQueue has race condition that can cause threads waiting forever in multithreading environment. ProducerImpl uses BlockingQueue as pendingMessagesQueue_ and can be blocked forever at it. This PR fixes race condition in BlockingQueue.

Race condition details

void pop() {
Lock lock(mutex_);
// If the queue is empty, wait until an element is available to be popped
queueEmptyCondition.wait(lock, QueueNotEmpty<BlockingQueue<T> >(*this));
bool wasFull = isFullNoMutex();
queue_.pop_front();
lock.unlock();
if (wasFull) {
// Notify that an element is popped
queueFullCondition.notify_one();
}
}

Use BlockingQueue::Pop as example, its procedure is:

  1. lock
  2. check wasFull and then change queue state
  3. unlock
  4. if wasFull, notify one thread waiting at queueFullCondition

Race condition sequence:

  1. queue is full and there are multiple threads waiting on queueFullCondition
  2. thread A call Pop, lock, wasFull is true, unlock -> queue has one free space
  3. thread B call Pop, lock, wasFull is false, unlock -> queue has two free spaces
  4. thread A notify one thread waiting at queueFullCondition
  5. queue is no loger full again
  6. result: except one thread is notified by A, other threads waiting on queueFullCondition are waiting forever

Modifications

  • Use notify_all instead of notify_one to notify threads waiting on condition variables
    Reason: Currently only notify threads when queue is full or empty. After unlock, other threads may change queue state, so thread to notify condition can not determine how queue state changed and should use notify_all in case of more then one change occured.

Verifying this change

  • Add a test case BlockingQueueTest.testPushPopRace to test concurrent push and pop

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

  • Does this pull request introduce a new feature? (no)

@merlimat merlimat added this to the 2.8.0 milestone Dec 1, 2020
@merlimat merlimat added the type/bug The PR fixed a bug or issue reported a bug label Dec 1, 2020
Copy link
Contributor

@merlimat merlimat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great find!

@zymap
Copy link
Member

zymap commented Dec 3, 2020

We have a fix of the integration tests. Please rebase your code. Thanks.

@erobot
Copy link
Contributor Author

erobot commented Dec 3, 2020

We have a fix of the integration tests. Please rebase your code. Thanks.

Rebase and force push without code changed.

@BewareMyPower
Copy link
Contributor

You need to rebase to the latest master again because a spotbugs check issue was fixed recently by #8819

@erobot
Copy link
Contributor Author

erobot commented Dec 7, 2020

You need to rebase to the latest master again because a spotbugs check issue was fixed recently by #8819

Rebased again.

@sijie sijie merged commit 18b3876 into apache:master Dec 8, 2020
codelipenghui pushed a commit that referenced this pull request Dec 21, 2020
### Motivation

BlockingQueue has race condition that can cause threads waiting forever in multithreading environment. ProducerImpl uses BlockingQueue as pendingMessagesQueue_ and can be blocked forever at it. This PR fixes race condition in BlockingQueue.

#### Race condition details
https://github.com/apache/pulsar/blob/91e2f832178d9ffd5d78161145d895910296c2d9/pulsar-client-cpp/lib/BlockingQueue.h#L172-L185
Use BlockingQueue::Pop as example, its procedure is:
1. lock
2. check wasFull and then change queue state
3. unlock
4. if wasFull, notify one thread waiting at queueFullCondition

Race condition sequence:
1. queue is full and there are multiple threads waiting on queueFullCondition
2. thread A call Pop, lock, wasFull is true, unlock -> queue has one free space
3. thread B call Pop, lock, wasFull is false, unlock -> queue has two free spaces
4. thread A notify one thread waiting at queueFullCondition
5. queue is no loger full again
6. result: except one thread is notified by A, other threads waiting on queueFullCondition are waiting forever

### Modifications

* Use notify_all instead of notify_one to notify threads waiting on condition variables
  Reason: Currently only notify threads when queue is full or empty. After unlock, other threads may change queue state, so thread to notify condition can not determine how queue state changed and should use notify_all in case of more then one  change occured.

### Verifying this change

  - Add a test case BlockingQueueTest.testPushPopRace to test concurrent push and pop

(cherry picked from commit 18b3876)
codelipenghui pushed a commit that referenced this pull request Dec 21, 2020
### Motivation

BlockingQueue has race condition that can cause threads waiting forever in multithreading environment. ProducerImpl uses BlockingQueue as pendingMessagesQueue_ and can be blocked forever at it. This PR fixes race condition in BlockingQueue.

#### Race condition details
https://github.com/apache/pulsar/blob/91e2f832178d9ffd5d78161145d895910296c2d9/pulsar-client-cpp/lib/BlockingQueue.h#L172-L185
Use BlockingQueue::Pop as example, its procedure is:
1. lock
2. check wasFull and then change queue state
3. unlock
4. if wasFull, notify one thread waiting at queueFullCondition

Race condition sequence:
1. queue is full and there are multiple threads waiting on queueFullCondition
2. thread A call Pop, lock, wasFull is true, unlock -> queue has one free space
3. thread B call Pop, lock, wasFull is false, unlock -> queue has two free spaces
4. thread A notify one thread waiting at queueFullCondition
5. queue is no loger full again
6. result: except one thread is notified by A, other threads waiting on queueFullCondition are waiting forever

### Modifications

* Use notify_all instead of notify_one to notify threads waiting on condition variables
  Reason: Currently only notify threads when queue is full or empty. After unlock, other threads may change queue state, so thread to notify condition can not determine how queue state changed and should use notify_all in case of more then one  change occured.

### Verifying this change

  - Add a test case BlockingQueueTest.testPushPopRace to test concurrent push and pop

(cherry picked from commit 18b3876)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-picked/branch-2.7 Archived: 2.7 is end of life release/2.6.3 release/2.7.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants