Fix race condition(s) while unloading/loading tracks #2305

uklotzde · 2019-09-29T14:25:53Z

I'm not able to reproduce the bug. Awaiting confirmation that this PR fixes the bug!

https://bugs.launchpad.net/mixxx/+bug/1845695

This is not a single race condition, but multiple potential race conditions plus wrong ordering of responses while unloading and loading tracks!

Pending read requests are discarded when unloading a track, but their responses might still arrive after the new track has already been loaded. This would reset the readable frame index range to empty, so the new track plays silence.

Also, the FIFO between the worker and the reader needs to be drained entirely when unloading a track. No new read requests must be accepted until the new track has been loaded.

I've added many debug assertions to detect any inconsistencies and wrong assumptions.

uklotzde · 2019-09-29T18:54:54Z

Naming is still a big mess. But I have extracted the state of the reader into its own enum class and thereby simplified the CachingReader::process() function. Debug assertions now verify that no discarded chunks are processed. According to my reasoning, this state machine should be safe now.

The main cause for the bug was the invalid ordering of updates by the worker thread, i.e. first inserting the track loaded message, followed by invalid, discarded chunks from pending read requests for the previous track.
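
To illustrate the ordering problem, here is a minimal sketch, not the actual Mixxx code. The types `State`, `Update`, and `ReaderModel` are hypothetical stand-ins, and a plain `std::deque` replaces the real FIFO between worker and reader. The sketch assumes that the reader only trusts chunk results while a track is loaded and drops them while a load is pending.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// Hypothetical simplified model of the reader side; not Mixxx's real classes.
enum class State { Idle, TrackLoading, TrackLoaded };

struct Update {
    enum class Kind { ChunkRead, TrackLoaded, TrackUnloaded };
    Kind kind;
    int readableFrames; // upper bound of the readable range, 0 = empty
};

class ReaderModel {
  public:
    // Called when a new track is requested: stop trusting old results.
    void newTrack() {
        m_state = State::TrackLoading;
        m_readableFrames = 0;
    }

    // Stand-in for the worker pushing an update into the FIFO.
    void push(Update update) {
        m_fifo.push_back(update);
    }

    // Polls the queue for updates, similar in spirit to CachingReader::process().
    void process() {
        while (!m_fifo.empty()) {
            const Update update = m_fifo.front();
            m_fifo.pop_front();
            switch (update.kind) {
            case Update::Kind::ChunkRead:
                if (m_state == State::TrackLoaded) {
                    // Shrink the readable range if the read came up short.
                    m_readableFrames =
                            std::min(m_readableFrames, update.readableFrames);
                }
                // Otherwise the result belongs to the previous track: discard it.
                break;
            case Update::Kind::TrackLoaded:
                m_state = State::TrackLoaded;
                m_readableFrames = update.readableFrames;
                break;
            case Update::Kind::TrackUnloaded:
                m_state = State::Idle;
                m_readableFrames = 0;
                break;
            }
        }
    }

    State state() const { return m_state; }
    int readableFrames() const { return m_readableFrames; }

  private:
    State m_state = State::Idle;
    int m_readableFrames = 0;
    std::deque<Update> m_fifo;
};
```

In this model, a stale chunk result that is queued before the track-loaded message is dropped harmlessly. If the same stale result is queued after the track-loaded message, it shrinks the readable range of the new track to empty, which matches the silent playback described in the bug report.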

daschuer

Thank you for wrapping your brain around it. I have left some comments.

daschuer · 2019-09-29T20:46:38Z

src/engine/cachingreader.cpp

+    // Don't accept any new read requests until the current
+    // track has been unloaded and the new track has been
+    // loaded!
+    m_state = State::TrackLoading;


Should we set this bit before touching the worker?

Not needed. The worker thread does not interrupt the reader; the reader polls for updates. But if it helps to understand what is happening, I will reorder the instructions.

...it is even desired to inform the worker asap. A new comment explains why.

daschuer · 2019-09-29T21:03:39Z

src/engine/cachingreader.cpp

+                // All chunks have been freed before loading the next track!
+                DEBUG_ASSERT(!m_mruCachingReaderChunk);
+                DEBUG_ASSERT(!m_lruCachingReaderChunk);
+                // Discard all pending read requests for the previous track


// Discard all results from pending read requests

daschuer · 2019-09-29T21:18:38Z

src/engine/cachingreader.cpp

+                m_readableFrameIndexRange = intersect(
+                        m_readableFrameIndexRange,
+                        update.readableFrameIndexRange());
+            }
        } else {


Are we sure that we always hit the else branch? For that, we need to receive a status update without a chunk.
I think it would be better to have that decoupled.

Yes. This is caused by the coupled design. But I won't change that. The debug assertions clearly indicate that there are two kinds of updates:

Read result with a chunk

State change track loaded/unloaded without a chunk
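
The two kinds of updates can be told apart by whether a chunk is attached. The following is only a sketch of that invariant: `Chunk`, `Status`, `StatusUpdate`, and `classify()` are hypothetical names, and the real `ReaderStatusUpdate` carries more fields than shown here.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real update type has more fields.
struct Chunk {
    int index;
};

enum class Status { ChunkRead, TrackLoaded, TrackUnloaded };

struct StatusUpdate {
    Status status;
    Chunk* chunk; // non-null only for read results
};

enum class UpdateKind { ReadResult, StateChange };

// Classifies an update and checks the invariant that couples
// the status with the presence of a chunk.
UpdateKind classify(const StatusUpdate& update) {
    if (update.chunk != nullptr) {
        assert(update.status == Status::ChunkRead);
        return UpdateKind::ReadResult;
    }
    assert(update.status == Status::TrackLoaded ||
            update.status == Status::TrackUnloaded);
    return UpdateKind::StateChange;
}
```

The assertions play the role of the debug assertions mentioned above: they document the coupling instead of removing it.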

daschuer · 2019-09-29T21:20:41Z

src/engine/cachingreaderworker.cpp

-    update.init(TRACK_NOT_LOADED);
+    // Discard all pending read requests
+    CachingReaderChunkReadRequest request;
+    while (m_pChunkReadRequestFIFO->read(&request, 1) == 1) {


Is the FIFO multi-consumer aware? I think we need to move this into the run() method, inside the if (m_newTrackAvailable) { branch.

No and it doesn't have to be. loadTrack() is private and only invoked from run() within the same thread.

Ah yes, I misread and took it for newTrack()
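
The drain loop from the diff can be tried out in isolation with the sketch below. `FifoModel`, `ReadRequest`, and `discardPendingRequests()` are hypothetical and single-threaded; only the `read(pointer, count)` call shape is taken from the diff. The real FIFO is shared between threads, and the drain is safe only because the worker thread is the sole consumer.

```cpp
#include <cassert>
#include <deque>

// Hypothetical single-threaded stand-in for the request FIFO.
template<typename T>
class FifoModel {
  public:
    void write(const T& item) {
        m_items.push_back(item);
    }

    // Reads up to count items into pItems and returns the number read.
    int read(T* pItems, int count) {
        int numRead = 0;
        while (numRead < count && !m_items.empty()) {
            pItems[numRead++] = m_items.front();
            m_items.pop_front();
        }
        return numRead;
    }

    bool empty() const {
        return m_items.empty();
    }

  private:
    std::deque<T> m_items;
};

struct ReadRequest {
    int chunkIndex;
};

// Drains all pending requests before the next track is opened.
// Returns the number of discarded requests.
int discardPendingRequests(FifoModel<ReadRequest>* pFifo) {
    int numDiscarded = 0;
    ReadRequest request;
    while (pFifo->read(&request, 1) == 1) {
        ++numDiscarded;
    }
    return numDiscarded;
}
```

Reading one item at a time until `read()` returns 0 leaves the queue empty without needing a separate size query, which is why the loop condition compares against 1.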

uklotzde · 2019-09-29T22:28:28Z

I have only added or modified comments, no code changes needed IMHO.

daschuer · 2019-09-30T06:03:06Z

LGTM, thanks.

daschuer · 2019-09-30T06:07:21Z

I think this PR justifies a point release NOW.
Thoughts?

uklotzde · 2019-09-30T06:28:51Z

I don't think so. It was probably introduced after 2.2.2, but I need to check.

Be-ing · 2019-09-30T12:10:47Z

Fortunately we have not done a bugfix release since the regression was introduced in #2265. The bug was only reported by people using master.

#2253 is now ready for review; I'd like to get that in 2.2.3.

Thank you for the speedy fix @uklotzde.

uklotzde · 2019-09-30T14:01:06Z

I agree, let's first finish the 2 outstanding PRs.

uklotzde · 2019-09-30T20:41:30Z

I got a debug assertion in CachingReader caused by the latest commit...

uklotzde · 2019-09-30T21:12:22Z

OMG.

I didn't notice the enforced direct connections between worker and reader that both live on separate threads. These are wrong!! Or am I missing something???

Moreover, the reader, not the worker, should be responsible for emitting signals. Otherwise, the state of the reader and the emitted signals are inconsistent due to race conditions.

I should quit and stop working on this evil and cursed code, it doesn't get any better.

daschuer · 2019-10-01T20:43:26Z

src/engine/cachingreader.cpp

 }

 void CachingReader::process() {
    ReaderStatusUpdate update;
-    while (m_readerStatusFIFO.read(&update, 1) == 1) {
+    while (m_stateFIFO.read(&update, 1) == 1) {
+        DEBUG_ASSERT(m_state != State::Idle);


This one always fails after ejecting a track and loading a new one.
This is normal for every update.status == TRACK_LOADED message.
So it can be removed.

After TRACK_LOADED the state becomes State::TrackLoaded

After TRACK_UNLOADED the state becomes State::Idle and no new updates are expected.

While the reader is idle no update messages are expected!

The debug assertion is there for a reason. Idle means don't bother me with update messages or something is seriously wrong.
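
The transitions described in this reply can be written down as a small function. This is a sketch of the stated intent only, with hypothetical names (`expectsUpdates()`, `nextState()`); it is not the code from the PR and does not settle whether the assertion holds in practice.

```cpp
#include <cassert>

enum class State { Idle, TrackLoading, TrackLoaded };
enum class Status { TrackLoaded, TrackUnloaded };

// Whether the reader expects any update message in the given state.
bool expectsUpdates(State state) {
    return state != State::Idle;
}

// Applies a state-change message as described above.
State nextState(State current, Status status) {
    // Mirrors the debug assertion: an idle reader must not receive updates.
    assert(expectsUpdates(current));
    switch (status) {
    case Status::TrackLoaded:
        return State::TrackLoaded;
    case Status::TrackUnloaded:
        return State::Idle;
    }
    return current; // unreachable for valid enum values
}
```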

This might be true after your recent changes in the other PR, but here this assumption is wrong.
Try it out. I am still wrapping my head around your new code, but first I want to understand the original issue.

Was this your assertion as well?

uklotzde added this to the 2.2.3 milestone Sep 29, 2019

uklotzde added the critical bug label Sep 29, 2019

uklotzde requested a review from daschuer September 29, 2019 14:43

uklotzde mentioned this pull request Sep 29, 2019

Analysis: Adjust audio stream length/range if inaccurate or on errors #2252

Merged

Simplify and secure state management of CachingReader

dc1302a

uklotzde changed the title ~~[WiP] Fix race condition(s) while unloading/loading tracks~~ Fix race condition(s) while unloading/loading tracks Sep 29, 2019

daschuer reviewed Sep 29, 2019

uklotzde added 3 commits September 30, 2019 00:27

Keep the ordering of operations and explain why

e0b94c3

Reword comment

89b5b40

Add comments two if/else branches

16655dc

daschuer merged commit cd871db into mixxxdj:2.2 Sep 30, 2019

uklotzde deleted the 2.2_cachingreader_reload_tracks branch September 30, 2019 07:03

daschuer reviewed Oct 1, 2019

uklotzde mentioned this pull request Oct 1, 2019

Fix race conditions between caching reader and worker (again) #2308

Closed

mixxxbot mentioned this pull request Aug 23, 2022

track plays but without sound #9755

Closed
