Clicks and pops on VST tracks that are assigned to FX channels #1662

Closed
caLRo opened this Issue Jan 20, 2015 · 95 comments

Projects

None yet

10 participants

@caLRo
caLRo commented Jan 20, 2015

Loading up a VST through VeSTige and assigning it to an FX channel causes clicks and pops to be audible when playing notes or the entire song. They are also present in .wav and .ogg renders. When unassigned, the problem is gone. This problem does not occur when assigning non-VST tracks to FX channels.

LMMS 1.1.0 64 bit

@tresf
Member
tresf commented Jan 20, 2015

@caLRo thanks for the bug report. I installed mda Delay VST effect to reproduce this.

Unfortunately, I'm having a hard time reproducing. Can you please supply some additional information:

  1. Does this occur when the VST is assigned to the instrument and not the FX channel?
  2. Can you link a VST that is freely available that we can reproduce the bug with?

Thanks.

-Tres

@caLRo
caLRo commented Jan 20, 2015

@tresf

  1. I'm not sure what you mean by assigning the VST to the instrument. The bug occurs when the VST is assigned to a channel on the FX mixer. After loading a VST as an instrument track on the song editor, I click on the plus sign on the FX mixer to add an unused channel (FX 1), then click the VST instrument track in the song editor and then set its FX from 0 to 1. The effects chains are empty, no effect plugins (native or VST) are used.
  2. I have used Synth1 and DSK Darkness Theory VSTs to reproduce the bug. Both are instrument plugins, not effect plugins.

http://www.geocities.jp/daichi1969/softsynth/#down
http://www.dskmusic.com/dsk-darkness-theory-3/

@tresf
Member
tresf commented Jan 20, 2015

Oh these are VST instruments? We support VST effects too, I thought this was separate than #1662 but it appears to be a duplicate or very similar.

Are you saying VST instruments work fine unless they are assigned to an FX channel? Does it play properly when routed through 0 (master) ?

@tresf
Member
tresf commented Jan 20, 2015

I just assigned a VST instrument called "DSK musicbox" to an empty FX channel 1 and could not reproduce. You'll have to provide some more details as I cannot reproduce what you are experiencing.

@tresf
Member
tresf commented Jan 20, 2015

Tested with Darkness Theory as well. Still cannot reproduce.

@caLRo
caLRo commented Jan 20, 2015

The clicks and pops appear inconsistently when only 1 VST instrument is assigned to 1 FX channel. When multiple instances of a VST instrument are loaded and each of them are assigned to a separate channel (FX 1 for instrument a, FX 2 for b, etc.) there are more clicks and pops.

And yes, they work fine again when they are unassigned and are routed through to the master.

Some additional details that may help:
OS - Windows 8.1 x64
Audio interface - SDL

@tresf
Member
tresf commented Jan 20, 2015

Thanks. The VBScript mentioned in the other bug report changes CPU priority on all RemoteVSTPlugin processes. Have you confirmed this does not remedy the issue?

@Spekular
Contributor

Ate we sure this isn't the same/simiar to bug that was fixed by setting
high priority? @calro What OS are you using? Have you tried setting
VSTsomething.exe to high?

@caLRo
caLRo commented Jan 20, 2015

@Spekular I have tested with a project with 6 VST instances assigned to 6 FX channels. I have manually changed the 6 RemoteVstPlugin.exe processes to high, and the clicks and pops are still there. They are gone again when all 6 are unassigned from their channels.

@Spekular
Contributor

@caLro hmm. I guess it isn't the same. Thanks for the thorough testing.

@caLRo
caLRo commented Jan 20, 2015

You're welcome, hopefully I've given you guys enough info to squash this bug. Squash it real good.

@tresf
Member
tresf commented Jan 21, 2015

Whoops, bad post there.. Fixed.

So @diizy this is what I'm seeing... Do you have an idea why the FX channel would be so much higher lower than master in this case? I believe this may be part of the cause of this bug (rather than the performance issues we've noted in other bugs).

image

@tresf
Member
tresf commented Jan 22, 2015

@caLRo I just tried to reproduce this again to no avail.

Can you please click through your FX channels and copy/paste screenshots. We need more info here. 😄

@caLRo
caLRo commented Jan 23, 2015

@tresf I too have trouble reproducing this. It seems the more instances of VeSTige instrument tracks there are (and thus more FX channels on the mixer are used) the more clicks and pops there are. To me this is strange, as I have a pretty decent core i7 cpu and 6 gb ram which should be sufficient to handle larger projects with multiple VSTs.

I have tested with up to 6 instances of VeSTige, each loaded with Synth1 VST and each assigned to a separate channel as shown below:
bug project 01
bug project 02
bug project 03

I have set the master channel volume to 50% in order to prevent clipping and to make it easier to spot the clicks and pops.

@tresf
Member
tresf commented Jan 23, 2015

@caLRo thanks for the superb testing.

I have the feeling this is a performance issue with the mixer in general and the VSTs are taxing the CPU in a way that makes it more apparent.

We'll leave this open for now in the hopes that future performance improvements make this go away.

@badosu
Contributor
badosu commented Jan 23, 2015

@caLRo Can you see the CPU monitor inside LMMS and tell us if it's accusing high CPU usage when you hear the pops?

@caLRo
caLRo commented Jan 24, 2015

@badosu There are no CPU spikes and usage stays very low.

@tresf tresf added the bug label Jan 29, 2015
@tresf tresf added this to the 1.1.0 milestone Jan 29, 2015
@badosu
Contributor
badosu commented Jan 30, 2015

It's difficult for to reproduce your issue @caLRo, I wonder if the fact that you use Windows and SDL may be a factor here.

@tresf Do you know if SDL on windows is known to present issues? And VST maybe is just making them more clear?

@tresf
Member
tresf commented Jan 30, 2015

@tresf Do you know if SDL on windows is known to present issues? And VST maybe is just making them more clear?

SDL is all I've ever composed with on Windows. I've never had reliability with any other backends on Windows. If it does present additional performance issues, I wouldn't necessarily consider it a direct cause.

@curlymorphic
Contributor

Has anyone reproduced this on linux? i've got 60 Darkness theories on 60 fx channels, with my cpu 70-80% and im starting to hear pops. I would consider this normal. I fear im having trouble reporducing this.

@tresf
Member
tresf commented Jan 31, 2015

I would consider this normal. I fear im having trouble reporducing this.

My concern is the amount of CPU our mixer may be adding here to escalate the issue. Do you see a considerable drop in CPU when they are all routed to master?

@curlymorphic
Contributor

I have done some testing with screen shots

60 Darkness theories , all routed to master, with no fx channels while idle
image

60 Darkness theories , all routed to master, with no fx channels while playing
image

60 Darkness theories , all routed to master, with 60 empty fx channels while playing.
image

60 Darkness theories , with 60 fx channels while idle.
image

60 Darkness theories , with 60 fx channels while playing.
image

My concern is the amount of CPU our mixer may be adding here to escalate the issue. Do you see a considerable drop in CPU when they are all routed to master?

On my machine with ubuntu 14.10 no.

Some additional details that may help:
OS - Windows 8.1 x64
Audio interface - SDL

I have a pretty decent core i7 cpu and 6 gb ram

There is obviously something very wrong if pops can be reproduced regularly with only 6 instances. I have a feeling is may be os dependant.

I would be interested to see what os's this can be reproduced in.

@badosu im unsure if you ever managed to reproduce this? if so on what os

@musikBear

For "60 Darkness theories , with 60 fx channels while idle."
What is that from 60sec to 50sec?
Also @curlymorphic, you write '6 instances' but also '60 Darkness theories'
What is correct ...surely, you do not have 60 vst's in the test?!

((you have 8.. cpu's ... 💎 wow..

@curlymorphic
Contributor

What is that from 60sec to 50sec?

Probably the project loading maybe.

also the top image for the idle, only really the last few seconds is idle.

you do not have 60 vst's in the test?

i do. the 6 instances was from the description on how to recreate.

With 6 instances the cpu usage was low with poping on the bug report.
I had to put my cpu under load to get occasional pops, maybe 1 - 2 a min.

@musikBear

i do. the 6 instances was from the description on how to recreate.

..60! That is amazing -You have a serious monster running there! ⚡️

@badosu
Contributor
badosu commented Jan 31, 2015

@curlymorphic Never was tried to reproduce this one :-(. Gonna try later with some synth1 instances. But I hardly believe it's gonna present some problem as well.

@musikBear

1.1 win32
Here is mine cpu-usage, with resp 5 and 10 instances of darknes-theories. I wont say that the click / pop occourences is significant different when the individual instruments are allocated to own FX-mixer channels
That is in fact also difficult to access, just by ear. -It is way to biased, and could in no way be reconized as data. Some reproduceable and reliable way to detect 'pop/clicks' would be needed. This whole investigation also lack consistency in respect to what actually is played -eg
*what notes
*what register
*what volume
i would expect those paramters to be most important
pastenot

(the jaggeds are from adding instances)

@Sti2nd
Contributor
Sti2nd commented Feb 1, 2015

I just talked to a LMMS user who leaved LMMS (not Evan), and he says clicks and pops in his song were the reason for him leaving... So this issue seems fairly important. I think the user said that he didn't use any VSTs (or external stuff), so I am trying to get a project file from him where he have the issue.

@curlymorphic
Contributor

So this issue seems fairly important.

I would say very important. pops and clicks at normal cpu usage are unacceptable. at high cpu uses bigger sample buffers can be used but clicks are unavoidable when processing power is limited.

I am trying to get a project file from him where he have the issue.

That would be really useful, any information about his setup, cpu / memory / sound card / os etc.

I wont say that the click / pop occourences is significant different when the individual instruments are allocated to own FX-mixer channels

@musikBear so you cant confirm the original bug report?

That is in fact also difficult to access, just by ear. -It is way to biased, and could in no way be reconized as data. Some reproduceable and reliable way to detect 'pop/clicks' would be needed.

I do agree on reliable. but im not sure how to do this. We cant simply test for this in lmms because this could be happening further down the chain, maybe audio driver or the sound card. using the same machine to record audio also introduces loads on the system that are not present in normal use.

This whole investigation also lack consistency in respect to what actually is played

Has there ever been a survey done on lmms users about performance. maybe now would be a good time?

If we could get some data, then maybe we can see what the problems could be.

If this sounds like a good idea I can work on some suggested questions, and some benchmark projects.

I'm not sure how this should be delivered to the users (forum page, asked to fill out on github, a dedicated page on the lmms.io site facebook etc) and when (only when reporting problems, on download, generaly on social media etc).

This could be a big task, but could give us benifits in a lot of issues present and future

@Fastigium
Contributor

Ok, here goes :P. I'm the user Sti-Jay mentioned, and let's start with saying that the clicks and pops are not the reason for me to leave, they just royally piss me off (so they don't help, let's keep it at that).

I have found a reliable way to reproduce the clicks and pops on my laptop (Win7 x64 but with 32-bit LMMS 1.1 stable, SDL audio backend, Intel i7-2670QM @ 2.20Ghz, 6GB RAM, Realtek HD Audio Vendor 10EC Device 0269). Download Reproduce.mmpz, open it, press play in the Song-Editor to start looping. No cracks and pops. Now start adding FX channels by repeatedly clicking the big + bar in the mixer panel. The pops and clicks start almost immediately, and after adding like 30 or 40 mixer channels, they occur regularly even if I stop adding new channels. Here's a screenshot of a pop zoomed in on in Audacity:

pop2

As you can see, the sound is distorted in a peculiar way. This screenshot is of a recording of the "live" playback made using SoundLeech. But, hold your breath, the pops are there when exporting, too! Examine this wav export of Reproduce.mmpz with 45 FX channels. That was exported using File -> Export with all the default settings. Crazy, right?

My question to you now is: what would you have me try? Should I try the 64-bit version? Different audio backends (note that I've never had any luck with all non-SDL ones)? Is there some way to debug the mixer on Win 7?

Edit: An extra discovery now that I plugged in my laptop's power supply: the pops are way more frequent when the laptop is in "Power saver" mode than when it is in "High performance" mode. Still, they do not entirely disappear, even with the laptop plugged in.

@tresf
Member
tresf commented Feb 1, 2015

Thanks for the detailed write up.

In terms of doing a debug on Windows, our capabilities are extremely limited. For now we'll assume there are no debug method available. This may change down the road but for now, what we see is what we get on Windows.

What I find interestong is the fact that you are running 32-bit version. Can you try and reproduce on 64-bit please? That may be the missing link.

Lastly, the Core i7 is a hefty machine so that is good to know as well. Thanks again.

When you refer to "reason for me to leave" what is that in regards to? We don't hold prejudice against those who leave our software for others, but we do require a certain amount of involvement to resolve these things so if you can at least see this bug report through that would be a tremendous help. 👍

@Umcaruje
Member
Umcaruje commented Feb 1, 2015

Well, I just opened your project on Ubuntu 14.10 64bit. I managed to get a small glitch only when I added a ridiculous amount of channels(200+) and it was a one-time thing only. I will boot into my windows 8 now, to test it there.

@caLRo
caLRo commented Feb 1, 2015

This is strange. I cannot reproduce this bug anymore. @Fastigium I have opened your project and encountered no problems. I switched TripleOscillator for Synth1, made 2 clones, added 30 fx channels and was assign-scrolling through the channels. No clicks and pops.

Edit: I just tested with the project setup shown in the screenshots I posted earlier, and the clicks and pops were there. I still have no idea why one project has them while another doesn't.

@Fastigium
Contributor

@tresf The steps I described also result in pops and clicks on the 64-bit edition of LMMS, with about the same intensity as with the 32-bit edition. What still puzzles me most is that the bug occurs in the export function. I mean, it seems clear that it is closely related to CPU load and/or responsivity, but during exportation no kinds of underruns should be possible, right?
Would it be helpful to verbosely dump the complete sample calculations to a file during exportation? That might disclose whether the erroneous values are generated within the mixing process itself or if they originate from the instrument plugin. If somebody can build me an exe that dumps out extra data, I'll be happy to reproduce the bug with it and upload the resulting file.

As for my reasons to leave LMMS: I want to get a track signed, and I became convinced that using professional tools would make that a whole lot easier, so I bought FL Studio. So far, I have found that Maximus seems to be the missing link between the best I can create using native LMMS and the sound of artists I admire. That said, I hope to be able to see this bug report through. LMMS has been my stepping stone into the world of EDM creation, and I'd love to see it be the same for others. Be advised, though, that I am still recovering from a burnout, so that I may get overwhelmed from time to time and not respond to inquiries in a timely manner.

@caLRo Would you happen to work on a laptop? If so, could you try switching to the "Power saver" power plan? That might help trigger the bug. Heck, it might even help on a desktop computer.

@Umcaruje
Member
Umcaruje commented Feb 1, 2015

Ok here is my testing:

On windows 8 Pro x64 I added 500 channels to the Reproduce.mmpz @Fastigium shared.
500 channels
When using the 64 bit version of LMMS I had absolutely no clicks.

When I used the 32 bit version I got quite a few clicks, and they started appearing after I added ~100 channels and they got more frequent as I added more channels.

Here is how one of them looks:
click

Audio backend I used was SDL, and latency was 512 samples.

I have a pretty old-ish machine with AMD Athlon II X2 240 CPU and 4GB of RAM.

@Sti2nd
Contributor
Sti2nd commented Feb 1, 2015

I did not manage to reproduce with Fastigum's project. 500 FX channels.

I can reproduce with this project (I stole and modified) https://drive.google.com/uc?export=download&id=0B-1kciTMrTBZUzRSWE5fMVdiRFU The clicks here happens from bar 5 and onward. If you enable the effects chain I tried to boost the clicks so they are more noticeable. The orignial from the lsp https://lmms.io/lsp/?action=show&file=6349

@caLRo
caLRo commented Feb 1, 2015

@Fastigium I'm on a laptop with similar specs as yours, and on power saving mode there are indeed clicks and pops in your project file. They are also present in the exported wav files.

When I opened this issue I thought the bug had something to do with VeSTige and the FX mixer. Now it seems I can reproduce the bug even without VeSTige.

@curlymorphic
Contributor

@Fastigium Thanks for your time. I am going to try out your project now.

@Umcaruje could you upload the wav from the screenshot please, is this rendered, and what buffer size are you using. (Edit, Settings, buffersize)

I can reproduce this on ubuntu 14.10 64 bit with the use of @Fastigium Reproduce.mmpz. with as few a 3 fx channels and some paitence, I would get a pop every munite or so. When increased to 200+ it would nearly always pop atleast once per 16 bars, often more.

This happens on render.
image

looks like a change in gain for 256 samples. , my sample buffer is 64 samples long.

When looking at @Fastigium reproduce.wav i can see a gain for 64 samples.

I'm on a laptop with similar specs as yours, and on power saving mode there are indeed clicks and pops in your project file. They are also present in the exported wav files.

have you tried with the power saving turned off.

@Umcaruje
Member
Umcaruje commented Feb 1, 2015

@Umcaruje could you upload the wav from the screenshot please, is this rendered, and what buffer size are you using. (Edit, Settings, buffersize)

I'm sorry I disposed of it, it was a audacity recording of my output. I can record again and send you the wav.

As I mentioned, my buffer is 512 samples.

@caLRo
caLRo commented Feb 1, 2015

I have tested with "Balance" (my default) and "High performance" power plans as well, and the clicks and pops appear when 5 FX channels are added.

@curlymorphic
Contributor

but during exportation no kinds of underruns should be possible, right?

@Fastigium

I initially thought that this was an unde run, but i am not so sure, because of the export being effected, And that fact that the wave, while distorted, does follow the pattern of the wave, just at a different amplitude, as opposed to trash data left from the last period.

I'm sorry I disposed of it, it was a audacity recording of my output. I can record again and send you the wav.

@Umcaruje That would be really helpful, an exported one would be really good as well 👍
in fact a few more of these from various users would help :) stating how many fx channels it takes to click, your os, 32/64 would be really useful in tracking this down.

@caLRo so the power settings seem not to effect this issue, thats good info :)

@Fastigium
Contributor

@curlymorphic Power settings do affect the issue on my laptop, but sometimes only after I restart LMMS (gotta love glitches like this). Also, my buffer size has been set at 64 frames in all this testing.

@curlymorphic
Contributor

Power settings do affect the issue on my laptop

thanks.

I cant reproduce this issue using the command line to render

image

same project rendered with GUI
image

lmms -r Reproduce.mmpz May need lmms on the path

can anyone confirm this please?

@Fastigium
Contributor

Say, it just occurred to me: on my screen, whenever a pop occurs, the
master dB meter has a higher peak, but the instrument FX channel dB meter
does not. Can anyone confirm this? If it's true, it suggests that the
problem is indeed in the mixer and not in the instrument, and it happens
somewhere between the output of the FX channel and the measuring of the
master channel.

@Umcaruje
Member
Umcaruje commented Feb 1, 2015

Ok here is a wav of the recorded output:
https://drive.google.com/file/d/0B49dcRGtm8vJUlZXWm1wRk5vRVk/view?usp=sharing

LMMS 1.1 32bit

Bug appeared with 5-6 FX channels, I added up to 45 and the glitches increased.

OS: Windows 8 Pro x64

Buffer size: 512 samples.

I will try to reproduce on linux now.

@curlymorphic
Contributor

it happens somewhere between the output of the FX channel and the measuring of the master channel.

sounds like a good place to start looking.

@Umcaruje using your win8 64 with 512 sample, the glitch is only 256 bytes long

image

@musikBear

Guys! do you have autosave enabled -its evil i say! -And it will occour in regular intervals!

In will do tests on reproduce.mmpz on win32 later and post my results
It is a good thing that this file now can be a 'test-standard', that is a tiny step towards reliable reproduce tests -But the issue of 'user-chosen-optimation' - f.i. autosave, is also very important.
In fact, we should establish a test-setting std, and work from that, sp at least lmms instances are comparable -right?

@curlymorphic
Contributor

Guys! do you have autosave enabled -its evil i say! -And it will occour in regular intervals!

Nice thinking, im going to try that now. Ive been working on this for about 12 hours now. The good news is i have a better understand of the mixer now. The bad news, i cant seem to remove them clicks. well not yet. I have 1 more thing to try,then i need to start thinking out of the box,

In fact, we should establish a test-setting std, and work from that

👍

@curlymorphic
Contributor

Guys! do you have autosave enabled -its evil i say!

just tested, thats not making any difference on my machine

@Fastigium
Contributor

Alright, so I've been poking around the source a bit (checked out the v1.1.0 tag, managed to reproduce the bug on Linux Mint 17) and tried printing gain values in FxMixer.cpp to see if they would be != 1. While doing so, I noticed a strange side-effect. Adding a qWarning() call in the right place stopped the pops from being visible on the master channel dB meter! Huh?

Normally, every time I hear a pop I see that the peak lines on the master channel dB meter are briefly higher than those on FX 1's dB meter. When this qWarning() call is added, the peak lines of the master channel and FX 1 always keep the same height. The qWarning() call in FxMixer.cpp:

149     const float v = m_volumeModel.value();
150
151     // DEBUG print the v level
152     qWarning( "%f", v );
153
154     if( m_hasInput )

That's in the FxChannel::doProcessing() method, right after the v of the channel itself is set (i.e. not inside the foreach loop). I also tried not printing the value of v but just a constant number -- qWarning( "%f", 1337.0 ); --, and it had the same effect.

Which leads me to think that this is a threading issue of some kind. I feel a bit out of my league here :P. Can anyone reproduce that adding such a qWarning removes the peak line difference?

@musikBear

win32 xp 1.1
This is interesting
These are exported clips with defaults wave settings in reproduce.mmpz project, with the number of fx-channels indicated.
artefactsclicks
Not only is the clicks independent of numbers of fx-mixer channels, they are *identical and they occour in same posistions!
(I also have the wavefiles in a zip, but that wont upload here)
so
Clicks was here

  • not related to number of fx-channels
  • highly reprocuceable
  • 100% uniform in expression

That changes the issue away from # of fx-mixer channels in a project -it ofcause not exclude that fx-mixer has something to do with it..


i have now tried different waveforms
100% identical output (not attached)

@curlymorphic
Contributor

@Fastigium Nice input

Ia am slowly coming to the conclusion that this is related to the gui and some threading issue, as oposed to being a pure mixer problem. I will investigate a bit further.

What you have said about qWarning is still true in master branch, progress :)

checked out the v1.1.0 tag

any pull requests should go to master for inclusion in the next release.

@musikBear

are you refering to the points I have marked in red?

image

@Fastigium
Contributor

@curlymorphic You might be right about the gui relation, I'll try if I can produce pops with command-line rendering. Edit: Seems like I cannot, time to investigate the gui relation :).

Good to hear the qWarning works for you, too! We have a clue, at least, though it might be some complicated side-effect that keeps us looking in the wrong direction :P.

I'll remember to make any actual pull requests for the master branch. Working with that tag now because that keeps my Windows and Linux builds equal.

@musikBear Are you sure those points in the audio are clicks, i.e. do they sound like clicks? The testing sound is a melody with slightly overlapping notes (due to attack/release), so the peaks you see might just be the melody.

@Fastigium
Contributor

Some new information: I tried regression testing, reasoning that this bug wasn't there in 1.0.3 so it must have been introduced somewhere, and found that 1.0.93 and 1.0.96 do not pop and click but become unresponsive on Linux Mint when following the steps to reproduce I posted earlier*. Versions 1.0.98, 1.0.99 and 1.0.100 exhibit the same behavior as 1.1.0.

Now, between version 1.0.96 and 1.0.98, @diizy did some work on mixer multithreading, which may be the cause of this difference. I wonder if the pops and clicks were introduced between 1.0.96 and 1.0.98, or if they were there before but masked by the unresponsiveness. Maybe becoming unresponsive is 1.0.96's way of popping and clicking, so to speak?

*: Unresponsive means that the sound stops abruptly, the playback bar freezes in the Song-Editor, the dB meters drop to nothing, and none of the buttons of the UI seem to have any effect when clicked. CPU usage of the LMMS process sticks near 200%, meaning that it keeps using multiple cores to do whatever it is that's blocking the UI. This happens after adding between, say, 20 and 100 mixer channels.

@Animtim
Animtim commented Feb 3, 2015

+1!
I also experience clicks'n'pops in 1.1.0 and 1.1.1 as soon as I send some instruments to mixer tracks. If I only use master channels, clicks don't happen (but ...meeeh! I want to use the mixer and sends :p)

I'm on Mageia 5 beta (cauldron), with custom built package (I just updated the spec today from 1.0.3 to 1.1.1 and noticed this... )

I can provide more infos and test some patch if needed.

@Fastigium
Contributor

1.0.96 on Windows also becomes unresponsive when following the steps to reproduce I posted earlier, and I've also failed to make it pop or click thus far. What's interesting: in actual music projects of mine that exhibited occasional pops and clicks in 1.1.0, 1.0.96 stays pop-free, too. This observation is confirmed by Darius on the forums.

My hypothesis is therefore that the pops and clicks bug was introduced by one of the 67 commits that separate 1.0.98 from 1.0.96. I'll see if I can do a proper regression test today and pinpoint the commit that introduced the bug. That should give us a clue on how to fix this.

@Fastigium
Contributor

Alright, so I have kind of pinpointed the bug to ca06a10. I say kind of, because I managed to produce one or two pops and clicks with the commit before it, but only after following the steps to reproduce many times. It seems that the bug was there before, but that the added mutex lock/unlock of ca06a10 makes it much more likely to occur.

This leads me to believe that we are dealing with a race condition. I also noticed how the sample values of the pops and clicks in the recorded and rendered audio seem to be exactly 2 times the value they should be. In other words, it is likely that the instrument sample values are added to the mixer buffer twice, which is something that might very well be caused by some kind of race condition. I'm just at a loss as to where in the code that race condition is to be found. I'll see if I can keep digging.

Sorry to keep double posting, btw. I just hope that if someone subscribed to this bug sees my progress, he or she will get an idea that will speed up this bug hunt. If I should stop double posting, just say the word :).

@Fastigium
Contributor

Okay... I think I've found a way to fix this bug. The pops disappear pretty much completely when I put a global QMutex lock/unlock around all operations in FxChannel::doProcessing(). I just can't understand why. No matter how the operations in that method interleave, they should not influence each other between threads. But apparently, they do.

This "fix" serializes all mixer processing, which totally eliminates the speed gain obtained by parallelizing it. In other words, it is a roundabout way of returning to a single-thread mixer. Properly fixing the bug would require identifying the way in which certain interleavings can cause incorrect data and fixing that. I don't know if I'll be able to do that, though.

At any rate, here is the way I got 1.1.0 back to the poppiness (or lack thereof) of 1.0.96:
(In FxMixer.cpp)

  1. Add the following declaration at the top of the file, after the #include statements:
    QMutex uglyFixForPops;
  2. Add the following statement right after the opening bracket of FxChannel::doProcessing():
    uglyFixForPops.lock();
  3. Add the following statement right before the closing bracket of FxChannel::doProcessing():
    uglyFixForPops.unlock();

@curlymorphic Can you confirm that this fixes the bug? I can still reproduce a rare pop while adding channels, but none at all just playing the song with lots of channels added.

@musikBear

@curlymorphic wrote

are you refering to the points I have marked in red?

yes, but meanwhile it looks like the issue has been isolated
@Fastigium Nicely done!
hope @diizy chimes in with respect to the threading

@curlymorphic
Contributor

@Fastigium I will check when i get back in front of my pc. It is possible that the click when adding channels is unrelated.

as for mutexing the whole process i never tried that, did try mutexing chunks of code to no avail. Nice work for sticking with it. We will have to wait and see what @diizy has to say about the threading issue.

@Animtim
Animtim commented Feb 3, 2015

Thanks a lot for the temporary fix, it works perfectly here.

@badosu
Contributor
badosu commented Feb 3, 2015

Wow, really impressed by the teamwork here.

Props to everyone involved! 👍

Just for the reference, on Arch Linux, master build, 64 bit with JACK output I could not reproduce the issue when playing Reproduce.mmpz.

@tresf
Member
tresf commented Feb 4, 2015

Wow, really impressed by the teamwork here.
Props to everyone involved! 👍

Agreed. This is very good conversation indeed and I have been following it right along.

As already mentioned, a proper approach really needs to be settled on here which would be a good reason to submit a 1.1 patch as this bug seems to be the most pressing issue that has been reported (even the mixer channel moving/deleting crashes seem to be much less commonly reported from the users perspective).

I'm a bit confused how the tryLock() works in this context, but I tend to agree with @Fastigium's findings and it seems to be accurate with how locks work in general. In this case, is seems to skip the JobQueue if it cannot get a lock, it returns false immediately per http://qt-project.org/doc/qt-4.8/qmutex.html#tryLock.

I notice there is also a tryLock(int timeout) which accepts a timeout value... I'm not sure if that is a good idea to try here or not? I simply don't know enough about the DSP code.

Usually I'd defer to Vesa on these things, but his connectivity has been limited so we'll have to either wait or spend some more time investigating the best approach. 👍

-Tres

@Fastigium
Contributor

@tresf I tried replacing that tryLock() with a normal lock(), and it didn't seem to make any difference at all. The mutex that's being locked/unlocked there is only used when adding or removing channels, as far as I can see, so it may result in a buffer filled with zeroes sent to the sound output when a channel is added or removed. This might actually be intentional, as the alternative is that the mixing has to wait until the lock is released, in which case a delay occurs that might cause an avalanche effect on slow computers. But that's just guesswork; it would have been nice if a comment was added explaining why tryLock() is being used instead of lock(). See message below this one

That said, I'll see if I can isolate the race condition further by varying the lock/unlock locations; maybe I'll find the exact spot where things go wrong. Do we have an ETA on Vesa's return to connectivity?

@Fastigium
Contributor

@tresf Ok, scratch that, when FxChannel::doProcessing() is globally locked, it does matter when I replace tryLock() with lock() :P. Ugh, these thread issues are so elusive!

@Fastigium
Contributor

I found a race condition! Look at this method from FxMixer.cpp:

void FxChannel::incrementDeps()
{
    m_dependenciesMet.ref();
    if( m_dependenciesMet >= m_receives.size() && ! m_queued )
    {
        m_queued = true;
        MixerWorkerThread::addJob( this );
    }
}

This method may be called from multiple worker threads simultaneously and is not synchronized in any way. There is a possible interleaving that results in incorrect data:

  1. Thread 1 increments m_dependenciesMet (the ref() method increments the value atomatically)
  2. Thread 2 increments m_dependenciesMet
  3. Thread 1 checks if( m_dependenciesMet >= m_receives.size() && ! m_queued )
  4. Thread 2 checks if( m_dependenciesMet >= m_receives.size() && ! m_queued )
  5. Both threads branch into the if statement, adding a job for this channel
  6. The channel is processed twice!

This would explain why the pops look like the sample values are doubled, they are simply added twice because the whole master channel is added to the job queue twice. I tried putting the ugly global locks around this method, and it reduced the pops to their 1.0.96 frequency. They are still not 100% gone, however, which leads me to think that there must be another race condition somewhere, though it is triggered far less frequently. Which probably means that it's even more obscure :P.

Anyway, a proper fix for this would be to put a channel-wide mutex lock around all the operations in FxChannel::incrementDeps(). There conveniently is a channel-wide mutex already defined, namely m_lock, so that just adding m_lock.lock(); and m_lock.unlock(), respectively, would be enough (tested and works). @tresf @curlymorphic Do you agree and should I create a pull request to the master branch for this?

@curlymorphic
Contributor

@curlymorphic Can you confirm that this fixes the bug? I can still reproduce a rare pop while adding channels, but none at all just playing the song with lots of channels added.

I'm really sorry it has taken me so long to respond, when you have put so much effort in.

I can conform on my machine with your fixes described , wrapping void FxChannel::doProcessing() in a mutex lock removes the pops while playing. Before this patch with 200 channels I dont recall flawless playback in a 16 bar loop. Ive just had this playing for about 15 mins and did not detect any clips or pops, audibly or visually.

There is a performance hit, but was expecting that. IMO glitch free audio with a lower performance is still much preferable to distorted audio with good performance.

I have some thoughts, that I am going to leave here, that maybe of use but I could not solve the issue.

MixerWorkerThread.cpp

void MixerWorkerThread::JobQueue::wait()
{
    while( (int) m_itemsDone < (int) m_queueSize )
    {
#if defined(LMMS_HOST_X86) || defined(LMMS_HOST_X86_64)
        asm( "pause" );
#endif
    }
}

Im not sure if i properly understand the line while( (int) m_itemsDone < (int) m_queueSize )

While getting the value of m_itemsDone and m_queueSize are both atomic operations my understanding is that the thread could be interrupted in between the operations and the values changed before the compare takes place,( or just one value changed).

I id try wrapping the comparison inside a mutex, but there was no improvement. using code similar to this.

bool moreToDo
m_myMutex.lock();
moreToDo =  (int) m_itemsDone < (int) m_queueSize;
m_myMutex.unlock(); 

Another unrelated thought, could the processing of the last job in the queue, which should be mix all to master, being processed twice. There is a comment bock in the following code that i cant fully understand the implications of, no matter how many times I have stepped thru the code.

MixerWorkerThread::MixerWorkerThread( Mixer* mixer ) :
    QThread( mixer ),
    m_quit( false )
{
    // initialize global static data
    if( queueReadyWaitCond == NULL )
    {
        queueReadyWaitCond = new QWaitCondition;
    }

    // keep track of all instantiated worker threads - this is used for
    // processing the last worker thread "inline", see comments in
    // MixerWorkerThread::startAndWaitForJobs() for details
    workerThreads << this;

    resetJobQueue();
}

My final and unrelated though for now is all the channel processing threads are being set to QThread::TimeCriticalPriority, could this possibly be causing another thread to miss out so to speak.

As I said all these thoughts have lead me no where. If anyone can can conform or deny any of this, it would certainly improve my understanding of the issue.

@curlymorphic
Contributor

@Fastigium

I did not see you last post, before i posted above. I will test this out right now

@curlymorphic
Contributor

Anyway, a proper fix for this would be to put a channel-wide mutex lock around all the operations in FxChannel::incrementDeps(). There conveniently is a channel-wide mutex already defined, namely m_lock, so that just adding m_lock.lock(); and m_lock.unlock(), respectively, would be enough (tested and works). @tresf @curlymorphic Do you agree and should I create a pull request to the master branch for this?

Very nice work, I really admire your persistence. I can confirm this works, and the performance issue I mentioned previously is now gone. I can't see any visual difference in the Lmms Cpu meter, or system monitor on my machine.

Do you agree and should I create a pull request to the master branch for this?

I would put in a pull request to master. It can easily be cherry picked as a bug release for 1.1 if desired, and a pull request may get more attention.

@Fastigium
Contributor
@badosu
Contributor
badosu commented Feb 4, 2015

Really Impressive work @Fastigium! Thanks :-)

@Fastigium
Contributor

Ok, I submitted a pull request. I couldn't find any guidelines on how you like your pull requests and/or commit messages; if I did something wrong, please tell me what and where I can find guidelines if they exist.

Also please note that this patch does not fix all clicks and pops on my computer, it just greatly reduces their frequency. More race condition hunting is in order. However, since my master is picking up the pace again, I have less free time than I had earlier this week. If anyone can join in the fray (you were doing great, @curlymorphic!) it would probably help :).

And thanks for all the compliments, they really get me going and make my days 😊!

@musikBear

👍 ! very good!

@curlymorphic
Contributor

@Fastigium

I Nice to see the pull request:), I should have some more time again from sunday. If you get a chance could you please look at my above post about the following code please, I keep getting drawn back to this.

MixerWorkerThread.cpp

void MixerWorkerThread::JobQueue::wait()
{
    while( (int) m_itemsDone < (int) m_queueSize )
    {
#if defined(LMMS_HOST_X86) || defined(LMMS_HOST_X86_64)
        asm( "pause" );
#endif
    }
}
@tresf
Member
tresf commented Feb 6, 2015

@Fastigium the patch was denied from our master branch due to conflicts with some future design efforts. We may still stage this for stable-1.1 though... waiting to hear back from Vesa.

-Tres

@Fastigium
Contributor

@curlymorphic As far as I understand that part now, the only way in which this comparison would cause problems is if it returns false (i.e. stops waiting) while there are still jobs being done. That can only happen if either m_itemsDone tests too large or m_queueSize tests too small.

Now, m_itemsDone is only incremented after a job is actually done (with fetchAndAddOrdered( 1 );), so it shouldn't be possible for it to become greather than or equal to the amount of jobs in the queue before all those jobs are actually done.

However, m_queueSize may be increased by adding new jobs. Grepping the code showed me that new jobs are only added from within jobs that are already being run. That means that m_queueSize is always increased, if at all, before the m_itemsDone increase that would make the condition true. This, in turn, means that as long as the value of m_itemsDone is retrieved before that of m_queueSize, this comparison will never go wrong. There is no moment in time where m_itemsDone is equal to m_queueSize and a job still gets added.

However, I don't know if gcc guarantees the order of value retrieval in a comparison, especially when optimization is enabled. I'll see if I can read up on that, it's an interesting subject :P. And if you can see any flaw in my reasoning, please point it out. I love indulging in this kind of reasoning every once in a while :).

@tresf: Yeah, I noticed. RT-safety is a valid reason to reject the pull request in my eyes, but implementing an atomic way to check if the job can be added requires getting rid of the m_queue check, and I don't know if I dare burn my hands on that (is that an English idiom? :P).

@Fastigium Fastigium added a commit to Fastigium/lmms that referenced this issue Feb 7, 2015
@Fastigium Fastigium RT-safe fix for race condition causing #1662 b0ff7d1
@Fastigium
Contributor

Woohoo, I think I just managed to fix the issue even better and in an RT-safe way! Can anyone check if #1745 fixes the clicks and pops for him/her, too?

@musikBear

Can anyone check if #1745 fixes the clicks and pops for him/her, too?

asa there is a win-install, raise a big flag and wave it :D i will jump in

@Animtim
Animtim commented Feb 7, 2015

RT-safe fix seems to work here, good work 👍

@Umcaruje
Member
Umcaruje commented Feb 7, 2015
@tresf
Member
tresf commented Feb 7, 2015

I'll make some windows installers for people to test, when I come home.

Or if we get good tests, we can just release 1.1.1. :)

@Fastigium Fastigium added a commit to Fastigium/lmms that referenced this issue Feb 7, 2015
@Fastigium Fastigium RT-safe fix for race condition causing #1662 d64e93b
@Fastigium
Contributor

@tresf Things seem to have stalled a bit now. Are more tests needed or is a release already upcoming?

@tresf
Member
tresf commented Feb 10, 2015

@tresf Things seem to have stalled a bit now. Are more tests needed or is a release already upcoming?

Yes, sorry. I need to make a stable-1.1.2 release (we'll have to skip 1.1.1 since we already bumped the version). @curlymorphic @badosu in your memory are there any commits to master that need to be backported/cherrypicked prior to release?

The current plan is to offer stable-1.1.2 as a beta and simply switch the flag to stable once we know it is good.

@curlymorphic
Contributor

in your memory are there any commits to master that need to be backported/cherrypicked prior to release?

not that i can recall, I have just looked over the commit list, and nothing stood out.

@tresf
Member
tresf commented Feb 10, 2015

I've reviewed too... the language updates probably should have targeted stable-1.1 but they're better left as-is since they can change quite a bit between versions.

But... What about this mute flag? 5a92243

@curlymorphic
Contributor

But... What about this mute flag? 5a92243

Im not sure that the mute flag was a big a show stopper as the poping audio

@tresf
Member
tresf commented Feb 10, 2015

Agreed. I'll try to publish the win32/win64 builds now and post back when we have installers to try. They will show as beta versions on our download page as well, but I'll link the installers here directly.

If they seem OK, I'll follow-up with an OSX version and then myself, Stian or Uros will publish the link to social media.

-Tres

@tresf
Member
tresf commented Feb 10, 2015

They're ready. If anyone has the time to test this out, please do and post feedback here.

Here are direct links to the installers:

releases/download/v1.1.2/lmms-1.1.2-win32.exe
releases/download/v1.1.2/lmms-1.1.2-win64.exe

Or via the beta buttons on our download page:
https://lmms.io/download/#windows

@Fastigium
Contributor

@tresf Both versions tested on my Win7 with power saving enabled. 32-bit seems to be 100% pop-free. 64-bit has occasional pops while adding channels (could be the tryLock() failing), none whatsoever during normal playback with lots (i.e. >150) of channels added :).

@Sti2nd
Contributor
Sti2nd commented Feb 10, 2015

But... What about this mute flag? 5a92243

As I have understood it, all bug fixes could be applied. And since this is a bug introduced with 1.1.0, I don't see why one would keep a newly introduced bug unfixed.

Im not sure that the mute flag was a big a show stopper as the poping audio

Agreed. Does importance have anything to say when they are both fixed? You seem to suggest that, I thought not :p Could be me misunderstanding what you put in the words bug fix release :)

@caLRo
caLRo commented Feb 10, 2015

Tested 64 bit version on win 8.1 and heard no clicks or pops. :) I used 6 Synth1 VSTs and scrolled them across 100 FX channels. While playing, I removed some channels and added a few, but could not reproduce those pops @Fastigium described.

@tresf
Member
tresf commented Feb 10, 2015

Agreed. Does importance have anything to say when they are both fixed? You seem to suggest that, I thought not :p Could be me misunderstanding what you put in the words bug fix release :)

This is true, a bugfix is a bugfix and this is a valid candidate, but this isn't the first we discussed this #1631. If you want it in, this might be the time for you to download the GitHub client and issues the git cherrypick command and just do it yourself, but it would be staged for a hypothetical 1.1.3 now, since 1.1.2 is tagged, milestone'd and ready for release.

-Tres

@tresf
Member
tresf commented Feb 10, 2015

@caLRo @Fastigium thanks for the feedback and quick testing. I'm considering this bug closed. Let's release 1.1.2. 👍

Fixed via #1747, thanks @Fastigium!

@tresf tresf closed this Feb 10, 2015
@Sti2nd
Contributor
Sti2nd commented Feb 10, 2015

If you want it in, this might be the time for you to download the GitHub client and issues the git cherrypick command and just do it yourself

I actually managed to. Glad you told me to do it 😃

@tresf
Member
tresf commented Feb 11, 2015

@Sti2nd here's an interim build until we sort the mingw stuff...

https://github.com/tresf/lmms/releases/tag/v1.1.2-1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment