
priority inversion in the engine #8688

Open
mixxxbot opened this issue Aug 23, 2022 · 21 comments

Reported by: Be-ing
Date: 2016-11-12T22:53:22Z
Status: In Progress
Importance: High
Launchpad Issue: lp1641360


When a deck has a track loaded to it for the first time since opening Mixxx, CPU usage spikes. This can cause an xrun when loading the second track to be played in a set. It does not occur when loading subsequent tracks to the deck. This occurs with tracks that are already analyzed, so it is not an issue of analyses being run. I think whatever initialization is done on first track load should be done as Mixxx is starting.

mixxxbot added the bug label Aug 23, 2022

Commented by: Be-ing
Date: 2017-01-11T19:18:18Z


This occurs with sampler decks too, so it is likely an issue with BaseTrackPlayer.


Commented by: daschuer
Date: 2017-01-11T20:16:45Z


Yesterday I did some investigation into whether the emit of valueChanged and the mutex are an issue.
I used my netbook with a 5 ms buffer and the LR EQ, one deck playing and the other loading + analyzing.
I get an underflow very reproducibly when the analysis is finished.

We see a significant delay inside EngineBuffer::process, at the pause lock, if the underrun occurs. Inside this block the delay can happen at different positions.
I have not actually caught a single valueChanged signal that causes the underflow, but they can take up to 69 ms. This includes the direct connections as well, of course.

Every time a second thread writes a CO that is connected to the main thread while the engine tries to write as well, there is a priority inversion. Shame on Qt that there is no priority inheritance.

I will experiment to see whether an additional emit-worker thread can help here. We can collect all pending value change signals in a linked list, and the thread can take over the job of actually emitting them.

This has some effects:

  • The overflow situation is still there, but instead of the engine, the emit worker is blocked.
  • I think we can omit value change signals when there is still a signal pending from the same CO, and always send the current value. The worst case would be that all possible COs are chained up.
  • This means we lose the synced nature of the COs, where we can track short changes.
    If this is a problem, we may introduce a sync type which does not omit signals.
  • All engine value change callbacks are moved from the engine thread to the emit worker.
    This is a nice gain for multicore CPUs, but may introduce new race conditions because the code is not aware of a second thread.

Implementation (see the sketch after this list):

  • Each CO gets a "next" atomic pointer to build the linked list.
  • If the pointer holds a value or an end marker, the CO is already chained up.
  • Before emitting, the emitting thread is checked, and in the case of the engine, the CO is chained.
  • The emit worker empties the chain as a FIFO, without locking.
  • It is scheduled after the callback, like the sidechain.
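
A minimal sketch of how this chaining could work, assuming C++11 atomics. ControlObjectPrivate, chainForEmit, and drainChain here are illustrative stand-ins, not Mixxx's actual classes, and a production version would still have to close the small window where a value written while the chain is being drained could miss its notification:

```cpp
#include <atomic>

struct ControlObjectPrivate {
    std::atomic<double> value{0.0};
    // nullptr => not chained; kEnd or another node => already chained.
    std::atomic<ControlObjectPrivate*> next{nullptr};
    void emitValueChanged(double) { /* forward to subscribers on the worker thread */ }
};

ControlObjectPrivate kEndSentinel;                 // the end marker
ControlObjectPrivate* const kEnd = &kEndSentinel;
std::atomic<ControlObjectPrivate*> g_chainHead{kEnd};

// Engine thread: store the value, then chain the CO if it is not already
// chained. A second write before the drain is coalesced; the worker emits
// the newest value once.
void chainForEmit(ControlObjectPrivate* co, double newValue) {
    co->value.store(newValue, std::memory_order_release);
    ControlObjectPrivate* expected = nullptr;
    if (!co->next.compare_exchange_strong(expected, kEnd)) {
        return; // already chained; updating the value was enough
    }
    ControlObjectPrivate* head = g_chainHead.load(std::memory_order_relaxed);
    do {
        co->next.store(head, std::memory_order_relaxed);
    } while (!g_chainHead.compare_exchange_weak(head, co,
            std::memory_order_release, std::memory_order_relaxed));
}

// Emit worker: take the whole chain at once, then emit without locks.
// (The push above is LIFO, so a real version would reverse the list
// first to emit in FIFO order.)
void drainChain() {
    ControlObjectPrivate* node =
            g_chainHead.exchange(kEnd, std::memory_order_acquire);
    while (node != kEnd) {
        ControlObjectPrivate* nextNode = node->next.exchange(nullptr);
        node->emitValueChanged(node->value.load(std::memory_order_acquire));
        node = nextNode;
    }
}
```

The compare-and-swap on `next` is what provides the coalescing described above: a CO that is already chained only gets its value updated, so redundant engine updates collapse into a single emit.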

What do you think?


Commented by: rryan
Date: 2017-01-11T20:47:24Z


Daniel -- I like your conclusion (that we need to get rid of priority inversion in the engine), but I don't think this is a good direction for the implementation. It seems extremely complicated to me and will not help us simplify the engine code (which is really what it needs, given how complex it is -- for example, very simple changes like #651 are hard due to the concurrency involved).

I think a better solution is to remove Controls from the engine entirely. Instead, we should move all the COs and handlers into src/mixer/ classes -- BaseTrackPlayer, etc. Then, we communicate back and forth between the engine and the src/mixer classes with a pair of FIFOs for bidirectional communication (sketched below).

This has worked well in 2.0 for the effect system. The CO API is entirely implemented in the Qt main thread and the Effect* / EngineEffect* classes enforce the boundary between the engine and the rest of Mixxx.
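
A rough sketch of the FIFO-pair idea, assuming a fixed-size single-producer/single-consumer ring buffer; the class and message names are made up for illustration and are not the actual MessagePipe API:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <optional>

template <typename T, size_t N>
class SpscFifo {
  public:
    bool tryWrite(const T& msg) {
        const size_t w = m_write.load(std::memory_order_relaxed);
        const size_t next = (w + 1) % N;
        if (next == m_read.load(std::memory_order_acquire)) {
            return false; // full -- the caller decides whether to drop or retry
        }
        m_buffer[w] = msg;
        m_write.store(next, std::memory_order_release);
        return true;
    }
    std::optional<T> tryRead() {
        const size_t r = m_read.load(std::memory_order_relaxed);
        if (r == m_write.load(std::memory_order_acquire)) {
            return std::nullopt; // empty
        }
        T msg = m_buffer[r];
        m_read.store((r + 1) % N, std::memory_order_release);
        return msg;
    }
  private:
    std::array<T, N> m_buffer{};
    std::atomic<size_t> m_read{0};
    std::atomic<size_t> m_write{0};
};

struct EngineMessage { int controlId; double value; };

// One FIFO per direction: the engine drains its inbox at the start of
// each process() callback and posts results to its outbox, so it never
// touches a mutex that a lower-priority thread could be holding.
struct EngineChannel2Way {
    SpscFifo<EngineMessage, 1024> toEngine;
    SpscFifo<EngineMessage, 1024> fromEngine;
};
```

Because each FIFO has exactly one producer and one consumer, two atomic indices suffice and no locking is needed in either direction.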


Commented by: rryan
Date: 2017-01-11T20:49:31Z


This has some effects:

  • I think we can omit value change signals when there is still a signal pending from the same CO, and always send the current value. The worst case would be that all possible COs are chained up.

Note: I don't think this is a safe assumption. The value-based COs are probably OK, but for COs that trigger actions, I think it is unsafe in the general case.


Commented by: rryan
Date: 2017-01-11T20:50:27Z


(to clarify -- I mean unsafe as in it would produce a logic error, not unsafe as in would produce a crash or race condition or something)


Commented by: daschuer
Date: 2017-01-12T00:09:48Z


I considered an independent FIFO solution first, but our engine code is simply not aware of it. It is currently based on COs, and I think it is a good idea to stick with them for a while so as not to overcomplicate the solution.

In the ideal case we do not need to touch the engine code at all. All proposed changes are done inside ControlObjectPrivate.

It is a nice fact that the issue currently matters only for the signals from the engine. The #651 issue is just the other way round.

If you look closely at the linked list solution, it is ultimately very similar to an external FIFO solution, but with the benefit that it makes use of the Qt signal dispatcher and can reach any thread.

The trigger-type COs are really an issue with this chained list. I have not had a close look, but I hope that we only have one CO of this type (play), which can receive special treatment.
Maybe we can use a fixed-size lock-free FIFO for it, like you proposed.

On the other hand, we have the value-type COs, which especially benefit from omitting redundant value changes. If you consider my 69 ms lock, this would save us the CO updates of 14 engine calls that are outdated anyway. It would be much better to process only the last engine call, which would also help to avoid FIFO overflows, which are much worse since we lose random values.


Commented by: Be-ing
Date: 2017-01-12T16:02:45Z


Thanks for looking into this issue.


Commented by: Be-ing
Date: 2018-02-07T08:39:44Z


This issue has happened a few times even on my powerful new laptop (Intel Core i7 8550U, 16 GB RAM, 1 TB M.2 SSD), so there is something wrong in Mixxx. Originally when I reported this I encountered the issue most often when loading a track to a deck for the first time, but sometimes it happens loading a track to a deck at a random time. The discussion above is interesting, but if that general code architecture was causing the problem I do not understand why this would not be happening much more frequently than at the specific time that a track is loaded. Uwe, do you have any ideas what may be going on here? It is difficult to get a grasp of everything that goes on when a track is loaded with the rampant use of Qt's signals and slots.


Commented by: daschuer
Date: 2018-02-07T20:08:47Z


This can be explained by the CO locking.

The engine thread sends a lot of updates to the main thread. This works well as long as no other thread also sends changes to the main thread.
If a low-priority thread is sending updates and is inhibited by another thread, then the engine thread is locked, waiting for both the higher- and the low-priority thread.

Loading a track sends a lot of changes to the GUI.
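
A toy illustration (not Mixxx code) of the inversion described above: a low-priority thread takes the CO mutex, gets preempted by a medium-priority thread, and the real-time engine callback then stalls on the same mutex:

```cpp
#include <mutex>

std::mutex g_coMutex; // guards the CO's value and its signal emission

void lowPriorityThread() {
    std::lock_guard<std::mutex> lock(g_coMutex);
    // ... preempted here by a medium-priority thread: the lock stays held ...
}

void engineCallback() { // real-time priority, hard deadline of a few ms
    std::lock_guard<std::mutex> lock(g_coMutex); // blocks -> xrun
    // update the CO value
}
```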


Commented by: Be-ing
Date: 2018-02-08T13:25:12Z


Maybe we can limit the number of signals sent when a track is loaded to alleviate this issue before spending the effort to overhaul the architecture.


Commented by: Be-ing
Date: 2018-06-09T21:25:25Z


I don't entirely understand the details of Daniel's proposal, but I think it is on the right track. I think the MessagePipe FIFO is appropriate for the effects system because it does more than simply pass numeric values back and forth. It is also used for transferring pointers to objects allocated on the heap in the main thread to the engine thread. However, I don't think this model should be followed throughout the engine because it requires a lot of overhead (in terms of code complexity, not computational overhead). With that approach, every message between the main thread and engine thread needs its own message type and special code to send the message from the main thread and process the received message in the engine thread. I agree with Daniel that rewriting the whole engine to use this approach would not be a great solution to the problem. Instead, I think working on the ControlObject system would be a better path forward.

If I understand Daniel's proposal correctly, it would be somewhat like the MessagePipe FIFO approach but each ControlObject would have its own private FIFO that is transparent to the rest of Mixxx. Do I understand that correctly?


Commented by: daschuer
Date: 2018-06-10T12:12:31Z


Yes, I am trying to add a global linked list of ControlObjectPrivates.
If the engine thread updates a CO, it is linked into this list instead of signaling the change immediately. I am currently considering how to handle direct connections here. They should stay direct.


Commented by: Be-ing
Date: 2018-06-10T13:21:47Z


Should we target this for 2.3.0?


Commented by: daschuer
Date: 2018-06-10T15:13:39Z


Let's decide that once we have a working solution. Currently we have some serious blocking issues with Qt 5. We have to find out how, and whether, this is related.


Commented by: Be-ing
Date: 2018-06-10T15:23:18Z


I don't understand what this issue has to do with moving to Qt5. Could you explain?


Commented by: daschuer
Date: 2018-06-10T16:42:12Z


I don't know either.
It is only a suspicion.


Commented by: Be-ing
Date: 2018-06-10T17:54:41Z


This happened with Qt4, so I don't think it's related to whatever other problems you're referring to.


Commented by: rryan
Date: 2018-09-20T18:36:02Z


RE: Be,

I think the MessagePipe FIFO is appropriate for the effects system because it does more than simply pass numeric values back and forth.

The engine does pass much more than numeric values back and forth; we either ignore the thread safety issues or put a mutex on the shared state (which triggers this bug). Bug #1759637 is an instance of this. We also pass TrackPointer instances back and forth, which can cause race conditions and priority inversion (since TrackPointer requires a mutex to access). The Engine should ideally get a copy of the Track object (and its Beats and Key objects) instead of touching the one used by the rest of Mixxx. That copy ought to be delivered via a message pipe "TrackLoadRequest" message.
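
As a sketch of that last idea, the message could carry an immutable snapshot that the engine then owns outright; TrackLoadRequest and its fields are assumptions for illustration, not existing Mixxx code:

```cpp
#include <vector>

// Immutable snapshot, built on the main thread from the shared Track
// while its lock is held briefly.
struct BeatsSnapshot {
    std::vector<double> beatPositionsFrames;
};

struct TrackLoadRequest {
    // Raw pointer because lock-free pipes typically carry trivially
    // copyable messages; the engine takes ownership, and the main
    // thread deletes the snapshot again via a response message.
    BeatsSnapshot* beats = nullptr;
    double sampleRateHz = 0.0;
    double durationSeconds = 0.0;
};
```

Inside the audio callback the engine then only ever reads its own copy, so no Track mutex is taken on the real-time path.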

RE: Daniel

If the Engine thread updates a CO, it is linked into this list instead of signal this message immediately.

This is funny, because it goes back to how COs were implemented in Mixxx 1.0 by Tue and Ken :). There was a global list of COs that were updated by the engine, which the GUI thread polled later.

This does not solve the bug, unfortunately -- because we still have shared mutable state that requires mutexes to access, shared between the engine and the rest of Mixxx. We also have race conditions and logic errors caused by the design of COs -- we "flatten" conceptually grouped controls into multiple COs, and as you know, groups of COs cannot be atomically updated -- leading to race conditions (see the example below). It would be better to move the CO API out of the engine (the rest of Mixxx can still use the existing CO API, so no major code rewrites would be needed other than within the engine).
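
A small example of that flattening problem (the control names are invented): two COs that conceptually form one value cannot be updated as a unit, so a concurrent reader can observe a torn state:

```cpp
#include <atomic>

std::atomic<double> loopStart{0.0}; // e.g. a "loop_start_position" CO
std::atomic<double> loopEnd{4.0};   // e.g. a "loop_end_position" CO

void moveLoop() { // conceptually ONE operation
    loopStart.store(16.0);
    // a reader scheduled here sees the new start with the old end:
    // a torn, inverted loop that never existed logically
    loopEnd.store(20.0);
}
```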

I am currently consider how to handle direct connections here. They should be still direct.

Direct connections to COs updated by the engine are one of the main causes of this bug! :)

The effects message FIFO approach has worked reliably with no thread safety or priority inversion issues for the effects system -- when I wrote it, I thought of it as a test-bed for a broader fix in the engine. I still think this is the right way forward.


Commented by: daschuer
Date: 2018-09-20T20:24:28Z


First steps are in progress for any solution.
I would be happy to receive a review for:
#1700
#1713
#1717


Commented by: Be-ing
Date: 2018-09-20T23:37:26Z


We also pass TrackPointer instances back and forth, which can cause race conditions and priority inversion (since TrackPointer requires a mutex to access).

Interesting. I have not looked at the code for track loading in detail. Do you think this explains why audible artifacts seem to occur more often when loading tracks? Is a message pipe needed here though? Do we just need to make a copy of the object before passing it to the engine?


Commented by: rryan
Date: 2018-10-22T16:09:17Z


Bug #1799256 is a potential mitigation for this.


mixxxbot transferred this issue from another repository Aug 24, 2022