New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Analyser queue signal rework #6734

Closed

mixxxbot opened this issue Aug 22, 2022 · 8 comments

Labels

bug

Milestone

Collaborator

mixxxbot commented Aug 22, 2022

Reported by: daschuer
Date: 2012-12-02T11:56:11Z
Status: Fix Released
Importance: Medium
Launchpad Issue: lp1085613
Attachments: analyser_queue_signals.patch

The attached patch is a rework of the signals emitted from Analyser queue patch.

Removes unused Data from the signals
It puts only one signal on the Qt event Queue of the GUI thread each progress step, instead of two.
It ensures that only one process signal is on the GUI Event queue a time. Before the patch, the queue could get poluted with progress updates in case of a long process in the GUI thread.
No priority inversion
Small analysis speed up from 7.5 s to 7 s for a 3:30 track on my system.

In general we should prevent to fire timer driven signals between threads, to prevent the GUI Thread from process outdated data when there is high load anyway.
This patch is an improved solution from what was rejected in Bug #⁠1008721.

mixxxbot added the bug label

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: daschuer
Date: 2012-12-02T11:56:11Z
Attachments: analyser_queue_signals.patch

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: rryan
Date: 2012-12-04T21:39:17Z

Thanks for putting the patch together. I am trying to understand why it gives a speed improvement.

The code in lp:mixxx/1.11 emits 2 signals per progress update:
AnalyserQueue trackProgress(TrackPointer, int) and
TrackInfoObject analyserProgress(int)

Both are queued in the event queue so their slots are invoked by the main thread.

The code in your patch emits:
AnalyserQueue updateProgress()
AnalyserQueue trackProgress(int) and
TrackInfoObject analyserProgress(int)

But only updateProgress should be queued in the event queue while the latter 2 are direct invoked.
Just making sure I understand the intended benefit of doing it. Correct me if I said something wrong.

The progressUpdateInhibitTimer delays at least 60ms between progress updates so we are really talking about the difference between 2 queued events every 60 ms and 1 queued event every 60ms.

My question is does this really have a measurable benefit on analysis time and Qt event queue length? The semaphore signalling increases the complexity of this section of code by a lot so I'd rather not do it if we don't get a benefit.

You measured 7.5s -> 7s. Did you measure it many times and take an average or is this just a one-off? I wonder what the average and variance of the distribution of analysis times for a single track is if you analyse it over and over. It's possible the 0.5s speed increase was just noise or within the variance of the measurement. I wonder how we could easily test if this is a statistically significant change.

Also there are many other factors at play. The engine is always calculating while you run these tests so it's adding to the measurement noise. We could make a standalone binary that instantiates the analyser queue and tries to analyze a track over and over.

It seems very odd that doing 1 extra queued event every 60ms is enough to cause 0.5s of overall analysis time difference.

Some other minor code comments:

How does WOverview::slotTrackLoaded get called? I see no connections.
How is WOverview::m_trackLoaded different from checking if WOverview::m_pCurrentTrack is not null?
In DlgPrepare, it looks like slotTrackAnalysisProgress and trackAnalysisFinished are just rebroadcasts so they could just be a SIGNAL to SIGNAL connection instead of having a SLOT.

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: daschuer
Date: 2012-12-04T23:42:49Z

Hi RJ

Thank you for your review. I Think you read the code right.

The speed improvement is not the cause for this patch. I just put the argument in to show that the patch does not put additional delay to the analyser process. I measure it about five time with a manual stopwatch, so not that reliable.
Also for me it is not that odd, because an additional benefit is that the new updateProgress() signal does not put any additional data on. Transferring data by the QT event queue leads finally would to a malloc call, which is not a high performance task.

The main benefit of this patch is that it stops the Analyser from overload the QT event Queue, if the GUI thread is already busy.
The semaphore stops emitting and discards the signals and allows it again until the previous emit is processes.
In my tests on my Atom hardware I can see, depending on the Waveform settings extreme jerking when the Analyser is running.
GUI Control like waveform scratch is hard to do in this case.

With patch I have still jerking but not that extreme. The original solution with the priority inversion was slightly better but it puts additional delay on the analyser process for all users.

This patch is a tiny part of the whole performance issue. I can live without it, but ...

WOverview::slotTrackLoaded is connected in legacyskinparser to bastrackplayer, this is missing in the patch.

WOverview::m_trackLoaded is needed for my upcoming patch which will show "Ready to play, analyzing .." when the track is loaded but the analysis has not started.

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: rryan
Date: 2012-12-05T00:22:55Z

OK -- if you see it improve jerking repeatably on your Atom then I can't argue with that :). Go ahead and commit it.

But I still don't understand. It's true malloc isn't free and the reference counting from copy-constructing TrackPointers isn't cheap but when we talk about 'expensive' and 'cheap' we're talking about the nanosecond timescale.

For something like jerking to happen I think we're having a much more serious problem than an extra malloc here and there every 60ms.

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: daschuer
Date: 2012-12-05T07:15:27Z

Hi RJ,
Yes you are right. We will have a timing improvement by not transferring data though the Qt events but that will not gain 500 ms improvement. In my test I have about 125 progress events. Saving 4 ms per progress call due to that is unrealistic.

There is no measurable Timing improvement on an idle System when doing mass analysis with this patch.
The improvement was measured on deck analysis with the other deck playing. So the benefit is due to not processed progress updates or simply a measurement fault.

I will check that.

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: daschuer
Date: 2012-12-05T21:51:19Z

Ah, here it is:
On my Netbook tryAquire returns 27 times false. So this should be the main performance gain.

Collaborator Author

mixxxbot commented Aug 22, 2022

Commented by: daschuer
Date: 2012-12-05T23:14:04Z

Committed to lp:mixxx/1.11 #⁠3572

Collaborator Author

mixxxbot commented Aug 22, 2022

Issue closed with status Fix Released.

mixxxbot closed this as completed

mixxxbot transferred this issue from another repository

mixxxbot added this to the 1.11.0 milestone

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment