
priority inversion in the engine #8688

Open
mixxxbot opened this issue Aug 23, 2022 · 21 comments

Reported by: Be-ing
Date: 2016-11-12T22:53:22Z
Status: In Progress
Importance: High
Launchpad Issue: lp1641360


When a deck has a track loaded to it for the first time since opening Mixxx, CPU usage spikes. This can cause an xrun when loading the second track to be played in a set. It does not occur when loading subsequent tracks to the deck. This occurs with tracks that are already analyzed, so it is not an issue of analyses being run. I think whatever initialization is done on first track load should be done as Mixxx is starting.

mixxxbot added the bug label Aug 23, 2022

Commented by: Be-ing
Date: 2017-01-11T19:18:18Z


This occurs with sampler decks too, so it is likely an issue with BaseTrackPlayer.


Commented by: daschuer
Date: 2017-01-11T20:16:45Z


Yesterday I did some investigation into whether the emit of valueChanged and the mutex are an issue.
I used my netbook with a 5 ms buffer and the LR EQ, one deck playing and the other loading + analyzing.
I get an underflow very reproducibly when the analysis is finished.

We see a significant delay inside EngineBuffer::process, at the pause lock, if the underrun occurs. Inside this block the delay can happen at different positions.
I have not actually caught a single valueChanged signal that causes the underflow, but they can take up to 69 ms. This includes the direct connections as well, of course.

Every time a second thread writes a CO that is connected to the main thread while the engine tries to write as well, there is a priority inversion. Shame on Qt that there is no priority inheritance.

I will experiment to see whether an additional emit-worker thread can help here. We can collect all pending value change signals in a linked list, and the thread can take over the job of actually emitting them.

This has some effects:

  • The overflow situation is still there, but instead of the engine, the emit worker is blocked.
  • I think we can omit value change signals when there is still a signal pending from the same CO, and always send the current value. The worst case would be that all possible COs are chained up.
  • This means we lose the synced nature of the COs, where we can track short changes.
    If this is a problem, we may introduce a sync type which does not omit signals.
  • All engine value change callbacks are moved from the engine thread to the emit worker.
    This is a nice gain for multicore CPUs, but may introduce new race conditions because the code is not aware of a second thread.

Implementation (see the sketch after this list):

  • Each CO gets a "next" atomic pointer to build the linked list.
  • If the pointer holds a value or an end marker, the CO is already chained up.
  • Before emitting, the emitting thread is checked, and in the case of the engine, the CO is chained.
  • The emit worker empties the chain as a FIFO, without locking.
  • It is scheduled after the callback, like the sidechain.
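
A minimal sketch of how this chaining could work, assuming C++11 atomics. ControlObjectPrivate, chainForEmit, and drainChain here are illustrative stand-ins, not Mixxx's actual classes, and a production version would still have to close the small window where a value written while the chain is being drained could miss its notification:

```cpp
#include <atomic>

struct ControlObjectPrivate {
    std::atomic<double> value{0.0};
    // nullptr => not chained; kEnd or another node => already chained.
    std::atomic<ControlObjectPrivate*> next{nullptr};
    void emitValueChanged(double) { /* forward to subscribers on the worker thread */ }
};

ControlObjectPrivate kEndSentinel;                 // the end marker
ControlObjectPrivate* const kEnd = &kEndSentinel;
std::atomic<ControlObjectPrivate*> g_chainHead{kEnd};

// Engine thread: store the value, then chain the CO if it is not already
// chained. A second write before the drain is coalesced; the worker emits
// the newest value once.
void chainForEmit(ControlObjectPrivate* co, double newValue) {
    co->value.store(newValue, std::memory_order_release);
    ControlObjectPrivate* expected = nullptr;
    if (!co->next.compare_exchange_strong(expected, kEnd)) {
        return; // already chained; updating the value was enough
    }
    ControlObjectPrivate* head = g_chainHead.load(std::memory_order_relaxed);
    do {
        co->next.store(head, std::memory_order_relaxed);
    } while (!g_chainHead.compare_exchange_weak(head, co,
            std::memory_order_release, std::memory_order_relaxed));
}

// Emit worker: take the whole chain at once, then emit without locks.
// (The push above is LIFO, so a real version would reverse the list
// first to emit in FIFO order.)
void drainChain() {
    ControlObjectPrivate* node =
            g_chainHead.exchange(kEnd, std::memory_order_acquire);
    while (node != kEnd) {
        ControlObjectPrivate* nextNode = node->next.exchange(nullptr);
        node->emitValueChanged(node->value.load(std::memory_order_acquire));
        node = nextNode;
    }
}
```

The compare-and-swap on `next` is what provides the coalescing described above: a CO that is already chained only gets its value updated, so redundant engine updates collapse into a single emit.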

What do you think?


Commented by: rryan
Date: 2017-01-11T20:47:24Z


Daniel -- I like your conclusion (that we need to get rid of priority inversion in the engine), but I don't think this is a good direction for the implementation. It seems extremely complicated to me and will not help us simplify the engine code (which is really what it needs, given how complex it is -- for example, very simple changes like #651 are hard due to the concurrency involved).

I think a better solution is to remove Controls from the engine entirely. Instead, we should move all the COs and handlers into src/mixer/ classes -- BaseTrackPlayer, etc. Then, we communicate back and forth between the engine and the src/mixer classes with a pair of FIFOs for bidirectional communication (sketched below).

This has worked well in 2.0 for the effect system. The CO API is entirely implemented in the Qt main thread and the Effect* / EngineEffect* classes enforce the boundary between the engine and the rest of Mixxx.
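
A rough sketch of the FIFO-pair idea, assuming a fixed-size single-producer/single-consumer ring buffer; the class and message names are made up for illustration and are not the actual MessagePipe API:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

template <typename T, size_t N>
class SpscFifo {
  public:
    bool tryWrite(const T& msg) {
        const size_t w = m_write.load(std::memory_order_relaxed);
        const size_t next = (w + 1) % N;
        if (next == m_read.load(std::memory_order_acquire)) {
            return false; // full -- the caller decides whether to drop or retry
        }
        m_buffer[w] = msg;
        m_write.store(next, std::memory_order_release);
        return true;
    }
    std::optional<T> tryRead() {
        const size_t r = m_read.load(std::memory_order_relaxed);
        if (r == m_write.load(std::memory_order_acquire)) {
            return std::nullopt; // empty
        }
        T msg = m_buffer[r];
        m_read.store((r + 1) % N, std::memory_order_release);
        return msg;
    }
  private:
    std::array<T, N> m_buffer{};
    std::atomic<size_t> m_read{0};
    std::atomic<size_t> m_write{0};
};

struct EngineMessage { int controlId; double value; };

// One FIFO per direction: the engine drains its inbox at the start of
// each process() callback and posts results to its outbox, so it never
// touches a mutex that a lower-priority thread could be holding.
struct EngineChannel2Way {
    SpscFifo<EngineMessage, 1024> toEngine;
    SpscFifo<EngineMessage, 1024> fromEngine;
};
```

Because each FIFO has exactly one producer and one consumer, two atomic indices suffice and no locking is needed in either direction.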


Commented by: rryan
Date: 2017-01-11T20:49:31Z


This has some effects:

  • I think we can omit value change signals when there is still a signal pending from the same CO, and always send the current value. The worst case would be that all possible COs are chained up.

Note: I don't think this is a safe assumption. The value-based COs are probably OK, but for COs that trigger actions, I think it is unsafe in the general case.


Commented by: rryan
Date: 2017-01-11T20:50:27Z


(to clarify -- I mean unsafe as in it would produce a logic error, not unsafe as in would produce a crash or race condition or something)


Commented by: daschuer
Date: 2017-01-12T00:09:48Z


I considered an independent FIFO solution first, but our engine code is simply not aware of it. It is currently based on COs, and I think it is a good idea to stick with them for a while so as not to overcomplicate the solution.

In the ideal case we do not need to touch the engine code at all. All proposed changes are done inside ControlObjectPrivate.

It is a nice fact that the issue currently matters only for the signals from the engine. The #651 issue is just the other way round.

If you look closely at the linked list solution, it is ultimately very similar to an external FIFO solution, but with the benefit that it makes use of the Qt signal dispatcher and can reach any thread.

The trigger-type COs are really an issue with this chained list. I have not had a close look, but I hope that we only have one CO of this type (play), which can receive special treatment.
Maybe we can use a fixed-size lock-free FIFO for it, like you proposed.

On the other hand, we have the value-type COs, which especially benefit from omitting redundant value changes. If you consider my 69 ms lock, this would save us the CO updates of 14 engine calls that are outdated anyway. It would be much better to process only the last engine call, which would also help to avoid FIFO overflows, which are much worse since we lose random values.


Commented by: Be-ing
Date: 2017-01-12T16:02:45Z


Thanks for looking into this issue.


Commented by: Be-ing
Date: 2018-02-07T08:39:44Z


This issue has happened a few times even on my powerful new laptop (Intel Core i7 8550U, 16 GB RAM, 1 TB M.2 SSD), so there is something wrong in Mixxx. Originally when I reported this I encountered the issue most often when loading a track to a deck for the first time, but sometimes it happens loading a track to a deck at a random time. The discussion above is interesting, but if that general code architecture was causing the problem I do not understand why this would not be happening much more frequently than at the specific time that a track is loaded. Uwe, do you have any ideas what may be going on here? It is difficult to get a grasp of everything that goes on when a track is loaded with the rampant use of Qt's signals and slots.


Commented by: daschuer
Date: 2018-02-07T20:08:47Z


This can be explained by the CO locking.

The engine thread sends a lot of updates to the main thread. This works well as long as no other thread also sends changes to the main thread.
If a low-priority thread is sending updates and is inhibited by another thread, then the engine thread is locked, waiting for both the higher- and the low-priority thread.

Loading a track sends a lot of changes to the GUI.
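
A toy illustration (not Mixxx code) of the inversion described above: a low-priority thread takes the CO mutex, gets preempted by a medium-priority thread, and the real-time engine callback then stalls on the same mutex:

```cpp
#include <mutex>

std::mutex g_coMutex; // guards the CO's value and its signal emission

void lowPriorityThread() {
    std::lock_guard<std::mutex> lock(g_coMutex);
    // ... preempted here by a medium-priority thread: the lock stays held ...
}

void engineCallback() { // real-time priority, hard deadline of a few ms
    std::lock_guard<std::mutex> lock(g_coMutex); // blocks -> xrun
    // update the CO value
}
```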


Commented by: Be-ing
Date: 2018-02-08T13:25:12Z


Maybe we can limit the number of signals sent when a track is loaded to alleviate this issue before spending the effort to overhaul the architecture.


Commented by: Be-ing
Date: 2018-06-09T21:25:25Z


I don't entirely understand the details of Daniel's proposal, but I think it is on the right track. I think the MessagePipe FIFO is appropriate for the effects system because it does more than simply pass numeric values back and forth. It is also used for transferring pointers to objects allocated on the heap in the main thread to the engine thread. However, I don't think this model should be followed throughout the engine because it requires a lot of overhead (in terms of code complexity, not computational overhead). With that approach, every message between the main thread and engine thread needs its own message type and special code to send the message from the main thread and process the received message in the engine thread. I agree with Daniel that rewriting the whole engine to use this approach would not be a great solution to the problem. Instead, I think working on the ControlObject system would be a better path forward.

If I understand Daniel's proposal correctly, it would be somewhat like the MessagePipe FIFO approach but each ControlObject would have its own private FIFO that is transparent to the rest of Mixxx. Do I understand that correctly?


Commented by: daschuer
Date: 2018-06-10T12:12:31Z


Yes, I am trying to add a global linked list of ControlObjectPrivates.
If the engine thread updates a CO, it is linked into this list instead of signaling the change immediately. I am currently considering how to handle direct connections here. They should stay direct.


Commented by: Be-ing
Date: 2018-06-10T13:21:47Z


Should we target this for 2.3.0?


Commented by: daschuer
Date: 2018-06-10T15:13:39Z


Let's decide that once we have a working solution. Currently we have some serious blocking issues with Qt 5. We have to find out how, and whether, this is related.


Commented by: Be-ing
Date: 2018-06-10T15:23:18Z


I don't understand what this issue has to do with moving to Qt5. Could you explain?


Commented by: daschuer
Date: 2018-06-10T16:42:12Z


I don't know either.
It is only a suspicion.


Commented by: Be-ing
Date: 2018-06-10T17:54:41Z


This happened with Qt4, so I don't think it's related to whatever other problems you're referring to.


Commented by: rryan
Date: 2018-09-20T18:36:02Z


RE: Be,

I think the MessagePipe FIFO is appropriate for the effects system because it does more than simply pass numeric values back and forth.

The engine does pass much more than numeric values back and forth; we either ignore the thread safety issues or put a mutex on the shared state (which triggers this bug). Bug #1759637 is an instance of this. We also pass TrackPointer instances back and forth, which can cause race conditions and priority inversion (since TrackPointer requires a mutex to access). The Engine should ideally get a copy of the Track object (and its Beats and Key objects) instead of touching the one used by the rest of Mixxx. That copy ought to be delivered via a message pipe "TrackLoadRequest" message.
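
As a sketch of that last idea, the message could carry an immutable snapshot that the engine then owns outright; TrackLoadRequest and its fields are assumptions for illustration, not existing Mixxx code:

```cpp
#include <vector>

// Immutable snapshot, built on the main thread from the shared Track
// while its lock is held briefly.
struct BeatsSnapshot {
    std::vector<double> beatPositionsFrames;
};

struct TrackLoadRequest {
    // Raw pointer because lock-free pipes typically carry trivially
    // copyable messages; the engine takes ownership, and the main
    // thread deletes the snapshot again via a response message.
    BeatsSnapshot* beats = nullptr;
    double sampleRateHz = 0.0;
    double durationSeconds = 0.0;
};
```

Inside the audio callback the engine then only ever reads its own copy, so no Track mutex is taken on the real-time path.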

RE: Daniel

If the Engine thread updates a CO, it is linked into this list instead of signal this message immediately.

This is funny, because it goes back to how COs were implemented in Mixxx 1.0 by Tue and Ken :). There was a global list of COs that were updated by the engine, which the GUI thread polled later.

This does not solve the bug, unfortunately -- because we still have shared mutable state that requires mutexes to access, shared between the engine and the rest of Mixxx. We also have race conditions and logic errors caused by the design of COs -- we "flatten" conceptually grouped controls into multiple COs, and as you know, groups of COs cannot be atomically updated -- leading to race conditions (see the example below). It would be better to move the CO API out of the engine (the rest of Mixxx can still use the existing CO API, so no major code rewrites would be needed other than within the engine).
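
A small example of that flattening problem (the control names are invented): two COs that conceptually form one value cannot be updated as a unit, so a concurrent reader can observe a torn state:

```cpp
#include <atomic>

std::atomic<double> loopStart{0.0}; // e.g. a "loop_start_position" CO
std::atomic<double> loopEnd{4.0};   // e.g. a "loop_end_position" CO

void moveLoop() { // conceptually ONE operation
    loopStart.store(16.0);
    // a reader scheduled here sees the new start with the old end:
    // a torn, inverted loop that never existed logically
    loopEnd.store(20.0);
}
```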

I am currently consider how to handle direct connections here. They should be still direct.

Direct connections to COs updated by the engine are one of the main causes of this bug! :)

The effects message FIFO approach has worked reliably with no thread safety or priority inversion issues for the effects system -- when I wrote it, I thought of it as a test-bed for a broader fix in the engine. I still think this is the right way forward.


Commented by: daschuer
Date: 2018-09-20T20:24:28Z


First steps are in progress for any solution.
I would be happy to receive a review for:
#1700
#1713
#1717


Commented by: Be-ing
Date: 2018-09-20T23:37:26Z


We also pass TrackPointer instances back and forth, which can cause race conditions and priority inversion (since TrackPointer requires a mutex to access).

Interesting. I have not looked at the code for track loading in detail. Do you think this explains why audible artifacts seem to occur more often when loading tracks? Is a message pipe needed here though? Do we just need to make a copy of the object before passing it to the engine?


Commented by: rryan
Date: 2018-10-22T16:09:17Z


Bug #1799256 is a potential mitigation for this.


mixxxbot transferred this issue from another repository Aug 24, 2022