Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix crashes/hangs when previewing instrument presets #2497

Merged
merged 4 commits into from
Feb 16, 2016

Conversation

Fastigium
Copy link
Contributor

The commits included in this pull request fix the crashes and hangs occurring when previewing many instrument presets in a short time for me. Please note, however, that I do not advise to merge this pull request unaltered, since I am not particularly happy with the implementation of some of the fixes. I'd like to discuss them with someone more at home in the code base, and perhaps aid with a more thorough cleanup of potential deadlocks and race conditions. Please view the comments on the commits as well.

Edit: commit comments disappeared due to rebasing; they can still be found at Fastigium@940d1be, Fastigium@a5beca8 and Fastigium@4820532.

@Fastigium
Copy link
Contributor Author

This pull request is meant to fix #2315. However, as mentioned in the description, I think it needs to be improved before it is ready to be merged. I'd be grateful for any testing, however.

@Fastigium Fastigium mentioned this pull request Dec 30, 2015
@tresf
Copy link
Member

tresf commented Dec 30, 2015

Thanks @Fastigium for the research on all of this as well as the work to nudge things along.

I am ignorant, however, when it comes to choosing a proper way of passing multiple types as a method parameter

An array would by my first instinct, but I've never done this in C++ myself. Since it's an enum and not an int, stacking it's value in a single variable isn't as easy (unless the enum receives an underlying mask that can be used for bitwise stacking, i.e. 0001, 0010, 0100, 1000 i.e. 0x01, 0x02, 0x04, 0x08 -- which would require a value for each Type, probably overkill when compared to an array).

Edit: Example: http://stackoverflow.com/questions/19267843/type-safe-enum-bit-flags

Edit 2: Perhaps I'm reading this wrong... I've re-read your question. You're more concerned with re-using the same function for both the Enum as well as the PlayHandle types, right?

@Fastigium
Copy link
Contributor Author

@tresf You're welcome :). Ah, what I meant is that I need a method to remove NotePlayHandles and InstrumentPlayHandles without removing other types of PlayHandles. The existing Mixer::removePlayHandles() can only remove either all types of PlayHandles or all types of PlayHandles except InstrumentPlayHandles.

Therefore, I want a Mixer::removePlayHandlesOfTypes() to which I can pass some kind of a list of types and it will remove only PlayHandles of the types contained in that list. That seems like a nice generalized solution. Changing the Enum is only needed if I choose to use a bitwise stacking approach for this, otherwise it isn't involved at all.

An array was my first instinct as well, but I was afraid of the old C problem of "who is responsible for the memory being freed". Turns out C++/Qt has classes for that that clean up after themselves. Yay for progress! I'd say a QSet is the semantically correct way to pass a list of PlayHandle types to remove then. Plus it's fast. So that would take care of that problem.

In general, however, I do not feel comfortable with the code changes I propose. I like to do things right, and I feel a bit adrift here. In lieu of an extensive code style document, I'd like the opinion of at least 1 person who carries a long-term responsibility for the quality of the code. Just like diizy gave me on the pops/clicks fix (#1744). Browsing around this GitHub repo, though, I am not sure any person of that description is active anymore. Is that correct?

@softrabbit
Copy link
Member

I'd say a QSet is the semantically correct way to pass a list of PlayHandle types to remove then. Plus it's fast.

Some might scream it's not realtime safe!. I'd lean towards the enum (or something equivalent) side to keep those people silent.

@Fastigium
Copy link
Contributor Author

Some might scream it's not realtime safe!. I'd lean towards the enum (or something equivalent) side to keep those people silent.

An excellent point! That definitely makes bitwise stacking appear to be the way to go (I assume that's what you mean with "the enum"?).

@grejppi
Copy link
Contributor

grejppi commented Dec 31, 2015

Bit flags are used pretty much everywhere else for these purposes, so I'd say go for it.

@Fastigium
Copy link
Contributor Author

Alright, I'll rebase it soon, probably tomorrow. In the meanwhile, if anyone could test and confirm this pull request actually fixing the crashes/hangs, that'd be much appreciated :)

@tresf
Copy link
Member

tresf commented Jan 1, 2016

I tried to test this fix to see if it fixed the stability issues while playing presets and my results are inconclusive.

Testing results:

  • lmms/master branch
    • Navigate to Presets, OpulenZ, Click-jack the presets, cause deadlock.
      screen shot 2015-12-31 at 8 10 52 pm
  • fastigium/preview-fix branch
    • Navigate to Presets, OpulenZ, Click-jack the presets, cause deadlock.
      screen shot 2015-12-31 at 8 11 55 pm

@Fastigium
Copy link
Contributor Author

Happy new year to all! 🎉
@tresf Thank you! I was afraid I was missing at least one possible crash/deadlock, and you found one :). Turns out that destructing opl2instrument calls Engine::mixer()->removePlayHandles() as well, making this deadlock very similar to the one fixed by 4820532. Replacing removePlayHandles() with removePlayHandlesOfTypes() everywhere should fix this properly. I'll get to work on that after I finish something else that has popped up irl in the meanwhile.

@Fastigium
Copy link
Contributor Author

Ok, rebased to make a better removePlayHandles() fix (23d2824), incidentally deleting my old commit comments 😕. The better fix prevents the OpulenZ deadlock for me, as well as some other potential deadlocks. Two things I'm not sure about:

  1. Should I use QFLAGS for the PlayHandle::Types? It doesn't seem semantically correct, as a PlayHandle has 1 and only 1 type, so I opted against it for now.
  2. Am I removing the right types of PlayHandles everywhere now? Some instruments do not seem to use NotePlayHandles but Midi notes instead; does that mean I don't have to remove NotePlayHandles for those?

I also ran into some as yet unfixed crashes when clicking lots of preset previews:

  • A SIGSEGV 4 levels deep in SDL audio. Haven't investigated that one yet, and it doesn't occur very often.
  • A SIGSEGV in BufferManager::clear(). This seems to be caused by a race condition somehow, but if I uncomment the qDebug() in BufferManager::acquire() I can't reproduce it anymore 😠.
  • A SIGSEGV in m_playHandles.erase( it ) in AudioPort.cpp. I'm not sure why this occurs, but it only happens when quickly previewing lots of ZASF presets.

In general, fixing #2315 is quickly becoming much more of a headache than I imagined 😝. It seems that the code is riddled with potential deadlocks and race conditions. What I would really like to do is examine the current multithreading design, read up on proper practices and fix this fundamentally. I feel like I'm just fighting symptoms now. Has such an effort already been done in the recent past? Is there some kind of design document?

Edit: the lost commit comments can still be found at Fastigium@a5beca8 and Fastigium@4820532, respectively.

@grejppi
Copy link
Contributor

grejppi commented Jan 1, 2016

Is there some kind of design document?

I'm afraid there is none.

@tresf
Copy link
Member

tresf commented Jan 6, 2016

@Fastigium we don't have anyone dedicated to rewriting our core currently, just a few dedicated maintainers that help where we can, so you may quickly find yourself becoming the expert on the topic if you choose to stick around, sorry if this sounds preposterous.

So in risk of taking this a bit off-topic, I'll introduce myself to put this preposterous state into context... I joined the dev community in 2012 so that two of my friends could eventually run LMMS on Mac as -- at the time -- we couldn't afford 5 separate Ableton licenses. I pinged the dev channel and had positive response but after several empty promises from a particular community member, I decided to start working on Mac support myself. Since then I can finally afford a paid DAW, but I now favor LMMS and it's community, despite it's obvious shortcomings. Furthermore, my Mac friends no longer work on music. I guess I stick around now because the community needs so much help. I think most people arrived here in a similar way, to scratch an itch. But because of this, we have a great community combined with a codebase largely lacking core developers. 😕

Since 2012, I was able to help with a lot of things, but sadly I've helped very little with the core logic. ⛵ We've had some great developers (diizy, badosu, curlymorphic) come and go, but what you see in terms of activity on the tracker is -- sadly -- what we get. <EOF> </TLDR>

@Fastigium
Copy link
Contributor Author

@tresf Thanks for the info and the background, that does clear things up a lot. Joining in the off-topic fray: the reason I started working on this bug is that I needed to make some people happy to beat my holiday blues. Fixing #1662 back in the day was like a dream: so much participation and gratefulness, and a challenging problem to boot! I was hoping that would happen again, but things are different in many respects this time.

So that leaves me to acknowledge things as they are, and decide what I want to do with them. I think I have the competence to become the expert on the topic you're talking about, and maybe even more, but I'm not ready for such a commitment right now. I was hoping to squash this bug, see 1.2.0 RC1 released, produce an awesome track with it and maybe stick around for the occasional bugfix. Instead, it would seem I removed the tip of an iceberg of potential instability. Perhaps getting rid of the most frequently occurring deadlocks and crashes will make things stable enough for a release candidate, but I fear someone is going to have to put in the big work one day or another. Hm. Maybe I'll be ready to do so somewhere in the future. Not sure.

@tresf In any case I'd like to say: thank you for sticking around. You've been giving a lot, and I hope the community has been giving back in return. Have my thumbs up 👍 😊!

But what to do with this for now... should I try to fix crashes until things are stable enough for an RC? Or should a discussion be started about getting a rewrite done?

@tresf
Copy link
Member

tresf commented Jan 7, 2016

But what to do with this for now... should I try to fix crashes until things are stable enough for an RC?

If you can continue to help, yes, please.

Or should a discussion be started about getting a rewrite done?

Lesser of two evils I think. @grejppi has expressed some concern in our efforts to get a stable release out of our master branch versus reverting to 1.1 and backporting the non-core changes. I think we're close to a stable RC and I'd also like to add that if any master bug is reproducible in 1.1 and the experience doesn't get worse, we should consider releasing with the bug.

This means that if the Zyn preset crashes you are able to reproduce are just as volatile in our stable branch, we'd be silly to hold up a 1.2 release over them.

Edit: Yes, thanks for linking #1662 to put things in to context. Great teamwork indeed. 👍

@Fastigium
Copy link
Contributor Author

This means that if the Zyn preset crashes you are able to reproduce are just as volatile in our stable branch, we'd be silly to hold up a 1.2 release over them.

That is an excellent point, I'll be sure to test that. I'm neck-deep chipping away at a Chromium bug at the moment, though, so I'll schedule this for when a break occurs in that effort.

@Fastigium
Copy link
Contributor Author

Well, how do you know? I can't seem to reproduce the Zyn preset crashes anymore, nor any of the other crashes. Race conditions are elusive sometimes. Can anyone else give this another whirl and try to produce a crash or deadlock?

@musikBear
Copy link

@Fastigium If you have a windows binary, i will.
Otherwise, you could stress the test by running resource demanding processes that utilize sound and graphics. I have often used 2-3 instances of vlc video-player, so the load was around 60% of cpu-usage, and then executed the instance i wanted to stress-test. There is a tendency to more pronounced cpu usage readouts, when the system is under stress, and the same is true for pesty random bugs. Besides that, you have done fine work for all here! 👍

@tresf
Copy link
Member

tresf commented Jan 13, 2016

@musikBear. We've written build tutorials for Linux, Windows (on linux), Mac and Windows (on Windows). Can't you step up and get a build fired by now? We need the help.

@Fastigium
Copy link
Contributor Author

@musikBear Thanks for that suggestion! With more stress on the computer I can generate SIGSEGV signals again while quickly clicking ZASF presets, though I have to keep clicking for 40-60 seconds sometimes. Also, the segmentation faults appear at seemingly random locations in the code. This looks like some kind of general memory corruption. I'll try next if I can reproduce it on a different computer; the one I'm on now is quite old and may have faulty memory.

In the meanwhile, test results from other people are very helpful. So indeed, if you can manage a build @musikBear, that'd be awesome 👍. An easy way to get the code including my fixes so far is this link: https://github.com/Fastigium/lmms/archive/preview-fix.zip

@tresf
Copy link
Member

tresf commented Jan 13, 2016

@musikBear
Copy link

@Fastigium I will report back sometimes tomorrow
@tresf Thanks! 👍 -I totally understand your heads-up regarding XP, but then again. There are quite many that list xp as their os, in the forum help-desk. I have a feeling that xp still is very popular in far east, china and russia.. A tad strange since those areas also are those with most pirate-software :p

@tresf
Copy link
Member

tresf commented Jan 14, 2016

There are quite many that list xp as their os, in the forum help-desk.

Off-topic, but for sanity's sake, upgrade and virtualize it. Microsoft offers VirtualBox images for free, so if your argument is "people still use it", then just fire up the VM at testing time and get out of the stone ages.

Back on topic.... Thanks for volunteering to test. Please benchmark crashing results against 1.1.3 so we know the severity.

@musikBear
Copy link

@Fastigium Very good things you have done here! 🎆
I have really tried hard to make your RC 'preview' crash, but i could not!
This is with no stress from other apps.
nocrashinpreviewnostress

And this
nocrashinpreviewstress
is with 3 video-players running
'1' is very fast previewing of 3oc presets.
'2' is very fast previewing of zasfx presets.
I have chosen the most resource demanding zasfx' from the 'Collection' folder and even tried the 'Dual'. I could not crash lmms!
I am especially impressed that previewing of VSTs now is blocked, so that orphan problem is no more! very good.
@tresf imo, this could be a RC for 1.2.
There is one real problem, but thats the shutdown event, and there is already a ticket on shutdown-crashing. That happens every time, and it will eat RAM, because it leaves remoteVSTplugins running.
I will now try to use Fastigium's release this weekend, and note down everything i find as problems or real issues, but this looks very very promising indeed!

@Fastigium
Copy link
Contributor Author

@musikBear Awesome, thanks :). Though I can't take credit for blocking VST previews, unless that is an unintended side-effect of my fixes 😋

I tried reproducing the SIGSEGV signals on a different computer but it just wouldn't crash. However, on the computer on which it does crash, stable-1.1 is crash-free. That makes me wonder if there is a memory corruption bug in master that only surfaces under very specific circumstances. More testing still welcome therefore, though I wouldn't be opposed to just releasing the RC and seeing if we get crash reports.

As for the shutdown crash, I wouldn't mind taking a look at that as well. Though to prevent starting much and finishing nothing, I'd like to be done with this PR before I start on that one.

@tresf
Copy link
Member

tresf commented Jan 15, 2016

previewing of VSTs now is blocked

Yes, via #1587.

though to prevent starting much and finishing nothing

Understood.

@musikBear
Copy link

Is it correct that this 2497/ 2315 is the last critical-bug ?
A search with filter critical only flags #2315
If there is anything that should be tested, let me know.

@zonkmachine
Copy link
Member

Much better. One thing I notice with this is that, both with and without this patch, sometimes zasfx previews will get stuck. With the patch the sticky is carried on to the next zasfx preview and this is not broken until you preview another instrument. In master previewing another zasfx patch will also break the sticky.

@Fastigium
Copy link
Contributor Author

@zonkmachine Woohoo, new input :)! That's an interesting difference, something funky going on there. What do you do / how do you interact with the program to trigger zasfx previews getting stuck?

@zonkmachine
Copy link
Member

I just click fast on the previews to try and torment lmms as bad as I can. >;)

@Fastigium
Copy link
Contributor Author

@zonkmachine Thanks, I just managed to reproduce it. Hm. It appears that the thread affinity fix commit makes the difference between the two described behaviors. That commit prevents the mixer from messing with PlayHandles created by the GUI thread. So in master, the mixer can clean up if the GUI thread leaves a note dangling, but in my branch, it can't. The real problem here is that there is a race condition that leaves a note dangling at all. I haven't been able to find it yet, but I'll keep looking.

@Fastigium
Copy link
Contributor Author

@zonkmachine Ok, I think I've fixed it. The problem was that new PlayHandles are not directly added to the list, but first kept in a queue to be added at the next mixer iteration. However, the Mixer::removePlayHandle() method did not check this queue, leaving a PresetPreviewPlayHandle dangling if it hadn't been added to the list yet. Can you test and check if this fix works for you, too?

@zonkmachine
Copy link
Member

Can you test and check if this fix works for you, too?

Testing...

@zonkmachine
Copy link
Member

👍 Totally fixed it!

@tresf
Copy link
Member

tresf commented Feb 15, 2016

🎉

@Fastigium
Copy link
Contributor Author

@zonkmachine @tresf Awesome! I've also been thinking about my reluctance to merge my code. Having seen some more of the existing code base, I don't think it makes sense to strive for a "perfect" addition to code that is imperfect. The goal isn't to make all the code beautiful, it's to enable people to make awesome music. Who cares if there are tens of hidden race conditions (and boy did I spot two gems yesterday 😁), as long as the program doesn't crash or hang too often. Let's do the best we can and end up with code that incites a WTF every now and then, but a product that triggers a WOW!

In other words: if nobody else has problems with my code, I'm game for merging it 👍 😊. Though when that time comes, ask me to rebase the commits one more time because I just realized I mixed up spaces and tabs 😵. And if anyone starts an effort to clean up the multithreaded code, ping me!

@Umcaruje
Copy link
Member

Though when that time comes, ask me to rebase the commits one more time

Well, this is a solution to a crtitical issue, that is stopping 1.2 RC's from getting published, so I think we want to merge this one ASAP.

@musikBear
Copy link

@Fastigium AWESOME ! 🎆

@Fastigium
Copy link
Contributor Author

Alright, rebased and ready to merge!

@Fastigium
Copy link
Contributor Author

Another rebase based on Tres's comments (I decided against altering src/tracks/InstrumentTrack.cpp because I want to keep it readable). @tresf Better? now including src/tracks/InstrumentTrack.cpp.

This fixes a few deadlocks where a PresetPreviewPlayHandle would be removed by
the creation of a new PresetPreviewPlayHandle.
This fixes a problem where a PresetPreviewPlayHandle would be put in
m_newPlayHandles to be added, then "removed" before it was actually added,
leaving it dangling.
@tresf
Copy link
Member

tresf commented Feb 16, 2016

@Fastigium ready to merge as far as I'm concerned although my comments about the 80-char limit still stand. You can wrap those as cosmetically pleasing because

some
 ? stuff
 : just doesn't look right on its own line

@Fastigium
Copy link
Contributor Author

@tresf Not sure what you mean with "You can wrap those as cosmetically pleasing". Would you like to see it all on one line or just skip the extra line for | PlayHandle::TypeInstrumentPlayHandle?

@tresf
Copy link
Member

tresf commented Feb 16, 2016

@Fastigium your call. What looks good to you is what will be committed. For example:

https://github.com/LMMS/lmms/pull/2497/files#diff-39ebb2eece4517af2a230e4737733950R191

vs.

https://github.com/LMMS/lmms/pull/2497/files#diff-e63e52cfc7a4ac16027b43b8a0d0e23aR160

I prefer the first, but we'll merge regardless. 👍

@Fastigium
Copy link
Contributor Author

I like it fine the way it is now, getting a bit tired of rebasing 😁. Anything else I can do to help 1.2.0?

@tresf
Copy link
Member

tresf commented Feb 16, 2016

I like it fine the way it is now, getting a bit tired of rebasing 😁. Anything else I can do to help 1.2.0?

👍

tresf added a commit that referenced this pull request Feb 16, 2016
Fix crashes/hangs when previewing instrument presets
@tresf tresf merged commit 47606d7 into LMMS:master Feb 16, 2016
@tresf
Copy link
Member

tresf commented Feb 16, 2016

Anything else I can do to help 1.2.0?

First, thanks for fixing this issue... secondly, to answer your question... #2384, if interested.

@Fastigium
Copy link
Contributor Author

You're most welcome 😃! And sure I'm interested! But isn't that mohamed--abdel-maksoud's brainchild? At any rate, I'll dive in and try to understand what's going on. Some more understanding can't hurt.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants