Skip to content

Conversation

@dconeybe
Copy link
Contributor

@dconeybe dconeybe commented Apr 9, 2021

TestSnapshotsInSyncListenerFiresAfterListenersInSync and TestListenCanBeCalledMultipleTimes were flaky for three reasons.

Reason 1: The Semaphore was incorrectly initialized to zero, causing the subsequent call to Wait() to return immediately (instead of blocking, waiting for the call to Post()). The Semaphore should have been initialized to 1 to achieve this desired behavior.

Reason 2: The std::vector<std::string> events object was being accessed concurrently from multiple threads without any protection (e.g. from a Mutex), causing a race condition. I don't think that this race condition contributed to crashes; however, it is technically undefined behavior and is prudent to fix.

Reason 3: Both the Semaphore and the vector could potentially be used by the listener even after these variables went out of scope, causing the call to Semaphore::Post() to fail and crash the application.

The fix was to create function-local TestData classes that encapsulate the vector and protect it from concurrent access with a Mutex.

There was a comment in the code, which I have since deleted, stating that "the implementation of Semaphore::Post() on Apple platforms has a data race". I believe that this apparent data race was actually due to "reason 1" listed above, where the Semaphore was incorrectly being initialized to zero.

TestSnapshotsInSyncListenerFiresAfterListenersInSync and TestListenCanBeCalledMultipleTimes were flaky for three reasons.

Reason 1: The Semaphore was incorrectly initialized to zero, causing the subsequent call to Wait() to return immediately (instead of blocking, waiting for the call to Post()). The Semaphore should have been initialized to 1 to achieve this desired behavior.

Reason 2: The `std::vector<std::string> events` object was being accessed concurrently from multiple threads without any protection (e.g. from a Mutex), causing a race condition. I don't think that this race condition contributed to crashes; however, it is technically undefined behavior and is prudent to fix.

Reason 3: Both the Semaphore and the vector could potentially be used by the listener even after these variables went out of scope, causing the call to Semaphore::Post() to fail and crash the application.

The fix was to create function-local `TestData` classes that encapsulate the vector and protect it from concurrent access with a `Mutex`.

There was a comment in the code, which I have since deleted, stating that "the implementation of Semaphore::Post() on Apple platforms has a data race". I believe that this apparent data race was actually due to "reason #1" listed above, where the Semaphore was incorrectly being initialized to zero.
@google-cla google-cla bot added the cla: yes label Apr 9, 2021
@dconeybe dconeybe self-assigned this Apr 9, 2021
@google-cla google-cla bot added the cla: yes label Apr 9, 2021
@dconeybe dconeybe added the tests-requested: quick Trigger a quick set of integration tests. label Apr 9, 2021
@github-actions github-actions bot added tests: in-progress This PR's integration tests are in progress. and removed tests-requested: quick Trigger a quick set of integration tests. labels Apr 9, 2021
@github-actions
Copy link

github-actions bot commented Apr 9, 2021

✅  Integration test succeeded!

Requested by @dconeybe on commit 1eaa038
Last updated: Fri Apr 9 15:10:02 PDT 2021
View integration test results

@firebase-workflow-trigger firebase-workflow-trigger bot removed the tests: in-progress This PR's integration tests are in progress. label Apr 9, 2021
@github-actions github-actions bot added the tests: succeeded This PR's integration tests succeeded. label Apr 9, 2021
@dconeybe dconeybe requested review from jonsimantov and schmidt-sebastian and removed request for jonsimantov April 10, 2021 00:19
@dconeybe dconeybe enabled auto-merge (squash) April 10, 2021 00:20
@dconeybe dconeybe disabled auto-merge April 10, 2021 00:20
@dconeybe dconeybe enabled auto-merge (squash) April 10, 2021 00:21
@dconeybe dconeybe merged commit 59ba273 into main Apr 12, 2021
@dconeybe dconeybe deleted the dconeybe/PortChangelist365835301 branch April 12, 2021 15:23
@firebase firebase locked and limited conversation to collaborators May 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

api: firestore cla: yes tests: succeeded This PR's integration tests succeeded.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants