Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash/segfault when enabling/disabling chatty MIDI controllers #6680

Closed
mixxxbot opened this issue Aug 22, 2022 · 37 comments
Closed

Crash/segfault when enabling/disabling chatty MIDI controllers #6680

mixxxbot opened this issue Aug 22, 2022 · 37 comments

Comments

@mixxxbot
Copy link
Collaborator

Reported by: Pegasus-RPG
Date: 2012-10-31T09:16:14Z
Status: Fix Released
Importance: Critical
Launchpad Issue: lp1073484
Tags: crash, load, midi, patch, verification-done
Attachments: [Back trace of crash](https://bugs.launchpad.net/bugs/1073484/+attachment/3419575/+files/Back trace of crash), [Another back-trace](https://bugs.launchpad.net/bugs/1073484/+attachment/3419605/+files/Another back-trace), [back trace of the problem on 1.11 r3450](https://bugs.launchpad.net/bugs/1073484/+attachment/3420679/+files/back trace of the problem on 1.11 r3450), [Back-trace and log snippet with debug PortMIDI](https://bugs.launchpad.net/bugs/1073484/+attachment/3421425/+files/Back-trace and log snippet with debug PortMIDI), [Partial log & back-trace with PM Verbose on](https://bugs.launchpad.net/bugs/1073484/+attachment/3421463/+files/Partial log & back-trace with PM Verbose on), [pmlinuxalsa.c patch to prevent using null pointer in handle_event()](https://bugs.launchpad.net/bugs/1073484/+attachment/3421482/+files/pmlinuxalsa.c patch to prevent using null pointer in handle_event())


When disabling a very chatty controller (moving platter) and enabling another (non-chatty, doesn't matter which) I get a segfault. This happens with no tracks loaded. (The SCS.1d is the controller in question in this case and it constantly sends timestamp messages even when it's stopped. FWIW, these are Sysex messages that are 18 bytes long.)

Steps to reproduce:

  1. Connect and turn on a very chatty controller
  2. Start Mixxx.
  3. Go to prefs and enable said controller, disabling all others. Click OK.
  4. Go back to prefs, un-check Enable on the chatty controller, check Enable on another (like Midi-Through) and click OK.
  5. Observe the segfault in Pm_Poll().

Happens in 1.11 r3447 & 3450.

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-10-31T09:16:14Z
Attachments: [Back trace of crash](https://bugs.launchpad.net/mixxx/+bug/1073484/+attachment/3419575/+files/Back trace of crash)

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-10-31T09:38:03Z
Attachments: [Another back-trace](https://bugs.launchpad.net/mixxx/+bug/1073484/+attachment/3419605/+files/Another back-trace)


Attached another back trace where the crash happened right after disabling one controller while enabling another.

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-10-31T15:26:25Z


Thanks for the backtraces. It looks like PM is segfaulting with no other Mixxx thread doing anything suspicious.

Do you compile PortMIDI by hand? There are no debug symbols for PortMIDI in your backtrace. Having those would be useful to track it down.

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-10-31T15:26:39Z


I assume this doesn't happen in 1.10.x?

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-10-31T19:50:38Z


Ah, I think I see the issue. When you close a device the m_pInputStream and m_pOutputStreams are not NULL'd so we are pointing to free'd memory. If it goes like:

uncheck activate -> PortMidiController::close() -> m_pInputStream is now pointing to invalid memory
poll timer activates -> ControllerManager -> PortMidiController::poll() -> Pm_Poll with a freed input stream -> segfault

Added a potential fix in lp:mixxx/1.11 r3450. Sean -- can you try it and confirm?

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-11-01T13:50:29Z
Attachments: [back trace of the problem on 1.11 r3450](https://bugs.launchpad.net/mixxx/+bug/1073484/+attachment/3420679/+files/back trace of the problem on 1.11 r3450)


No, I use the version of PortMIDI packaged by Debian Squeeze, 184. I don't see a package that would include debug symbols.
Your change didn't seem to help the situation much. The attached backtrace looks much the same.

I can't reproduce in 1.10, but it has numerous other problems with buffering MIDI messages. (It will segfault when PortMidi overflows.)

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-11-01T14:53:24Z


Did you test my fix?

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-11-01T14:54:08Z


Sorry, didn't read the title of your latest attachment.

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-11-01T17:27:39Z


When does the segfault happen? Right after clicking 'ok' on the preferences? Can you isolate it to just using a single midi controller or does it require multiple controllers?

I can't reproduce on a mac (to simulate the motorized controller I just moved a jog constantly).

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-11-01T17:33:33Z


Could you remover the debian portmidi and build by hand with debug symbols? Also, it doesn't look like a corruption backtrace so when you get a crash in gdb w/ PM debug symbols, poke around in the various stack frames to see what various variables are and if everything looks right. You may need to compile with no optimizations otherwise a lot of temporaries will be optimized out.

Also, it would help if you could provide some more of the log before the backtrace.

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-11-02T07:56:44Z
Attachments: [Back-trace and log snippet with debug PortMIDI](https://bugs.launchpad.net/mixxx/+bug/1073484/+attachment/3421425/+files/Back-trace and log snippet with debug PortMIDI)


Not sure what to look for here. It seems like maybe we're still polling when we ought not to be. Race condition? The attached crash happened when I disabled the SCS.1d (which is very chatty) and enabled the .1m at the same time.

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-11-02T09:11:07Z
Attachments: [Partial log & back-trace with PM Verbose on](https://bugs.launchpad.net/mixxx/+bug/1073484/+attachment/3421463/+files/Partial log & back-trace with PM Verbose on)


Here's one more for you. I rebuilt PortMIDI with its Verbose flag on. The log is from the point that I un-checked Enabled on the SCS.1d and checked it on the MIDI-Through device. PM sends some messages to the device on shutdown which is expected, and it appears to close fine. The segfault happens when the newly-opened device is polled, so maybe this is a bug in PortMIDI?

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-11-02T09:26:57Z


Also interesting is that the crash still happens even if you reverse the order of opening and closing devices: I tried enabling a device beneath the SCS.1d (the .1m in this case) and even though the controller polling is stopped then restarted, the segfault still occurs. It also happens if I disable two devices and enable one, as long as the .1d is one of the disabled devices.

More interestingly, it also happens every time if I have two devices enabled (SCS.1d and another) and disable only the SCS.1d. (Back trace looks the same though.)

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-11-02T09:33:46Z
Attachments: [pmlinuxalsa.c patch to prevent using null pointer in handle_event()](https://bugs.launchpad.net/mixxx/+bug/1073484/+attachment/3421482/+files/pmlinuxalsa.c patch to prevent using null pointer in handle_event())


And this seems to fix it!

@mixxxbot
Copy link
Collaborator Author

Commented by: Pegasus-RPG
Date: 2012-11-02T09:59:46Z


Cross-reference: https://sourceforge.net/apps/trac/portmedia/ticket/3

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-11-02T15:15:10Z


Nice job!

Since our ControllerManager thread inherently prevents any parallelism in how we talk to PortMIDI it isn't possible that we are calling Pm_Poll() before we have finished a Pm_Close on the controller in question.

After reading PortMIDI for a little bit I know why it requires 2 devices to trigger now:

After closing one device and opening another:

  1. Chatty MIDI device sends new messages on the wire, ALSA queues them up for that MIDI port.
  2. Pm_Close() on the chatty device marks the descriptor closed, NULLs internalDescriptor (the pointer you check for NULL in your patch). Messages for chatty-device's port are still queued in ALSA since we haven't received them yet.
  3. Mixxx starts polling for the newly opened device,
  4. Pm_Poll() called on new device. Pm_Poll would exit early if we called it on a closed device so Mixxx is definitely not calling it on the recently closed device. This is why 2 devices are required.
  5. Even though you call Pm_Poll on a specific device, Pm_Poll gets messages for all MIDI ports with pending messages.

As you pointed out in your patch, handle_event assumes we will only receive messages for devices that are open so it doesn't check internalDescriptor for NULL.

I think PortMIDI's ALSA module has a variety of NULL-pointer issues but they are all guarded against by the common API since the common API will generally bail on any operation that you try to do when the device is not open.

I noticed that in alsa_in_close, after pm_free'ing midi->descriptor, it does not clear midi->descriptor. That means that if we were to somehow be able to call any method in PortMIDI that uses midi->descriptor on the input port, it would cause a segfault. The corresponding alsa_out_close method does clear midi->descriptor. The fix for this is simple:

--- pmlinuxalsa.c      2012-11-02 10:46:48.481183991 -0400
+++ pmlinuxalsa.c       2012-11-02 10:42:45.341228735 -0400
@@ -339,6 +339,7 @@
         pm_hosterror = snd_seq_delete_port(seq, desc->this_port);
     }
     alsa_unuse_queue();
+    midi->descriptor = NULL;
     pm_free(desc);
     if (pm_hosterror) {
         get_alsa_error_text(pm_hosterror_text, PM_HOST_ERROR_MSG_LEN,

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-11-20T04:12:52Z


The state of affairs here is that we are waiting for PortMIDI to commit the fix and then we can try to get the Debian maintainer to repackage it.

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-12-06T19:43:03Z


The fix is now in PortMIDI r226 so I've gotten in touch with piem (the portmidi package maintainer) to update it.

@mixxxbot
Copy link
Collaborator Author

Commented by: crichton
Date: 2012-12-06T20:19:53Z


The attachment "pmlinuxalsa.c patch to prevent using null pointer in handle_event()" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are member of the ubuntu-reviewers team please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

@mixxxbot
Copy link
Collaborator Author

Commented by: quadrispro
Date: 2012-12-23T22:32:48Z


[Impact]

  • It affects users who rely on Portmidi.

  • When disabling a very chatty controller (moving platter) and enabling another (non-chatty, doesn't matter which) you get a segfault. This happens with no tracks loaded. (The SCS.1d is the controller in question in this case and it constantly sends timestamp messages even when it's stopped. FWIW, these are Sysex messages that are 18 bytes long.)

[Test Case]

  1. Connect and turn on a very chatty controller
  2. Start Mixxx.
  3. Go to prefs and enable said controller, disabling all others. Click OK.
  4. Go back to prefs, un-check Enable on the chatty controller, check Enable on another (like Midi-Through) and click OK.
  5. Observe the segfault in Pm_Poll().

[Regression Potential]

The patch is minimalistic, there could be no regression at all.

[Other Info]

The patch was already uploaded to Debian unstable to fix an RC bug.

@mixxxbot
Copy link
Collaborator Author

Commented by: rryan
Date: 2012-12-24T21:20:43Z


Thank you Alessio! I'm marking Fix Released in Mixxx since this is fixed in Debian and Ubuntu Raring w/ backports on the way for Quantal and below.

@mixxxbot
Copy link
Collaborator Author

Commented by: quadrispro
Date: 2012-12-26T13:59:45Z


BTW, the patch has been accepted upstream too.

@mixxxbot
Copy link
Collaborator Author

Commented by: brian-murray
Date: 2013-01-10T21:18:24Z


Hello Sean, or anyone else affected,

Accepted portmidi into quantal-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/portmidi/1:200-0ubuntu1.12.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

@mixxxbot
Copy link
Collaborator Author

Commented by: brian-murray
Date: 2013-01-10T21:22:24Z


Hello Sean, or anyone else affected,

Accepted portmidi into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/portmidi/1:200-0ubuntu1.12.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

@mixxxbot
Copy link
Collaborator Author

Commented by: brian-murray
Date: 2013-01-10T21:26:30Z


Hello Sean, or anyone else affected,

Accepted portmidi into oneiric-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/portmidi/1:200-0ubuntu1.11.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

@mixxxbot
Copy link
Collaborator Author

Commented by: brian-murray
Date: 2013-01-10T21:29:57Z


Hello Sean, or anyone else affected,

Accepted portmidi into lucid-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/portmidi/1:200-0ubuntu1.10.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

@mixxxbot
Copy link
Collaborator Author

Commented by: quadrispro
Date: 2013-01-25T11:35:38Z


Confirmed working on Precise and Quantal.

Thanks.

@mixxxbot
Copy link
Collaborator Author

Commented by: cjwatson
Date: 2013-01-25T12:08:08Z


The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regresssions.

@mixxxbot
Copy link
Collaborator Author

Commented by: janitor
Date: 2013-01-25T12:08:09Z


This bug was fixed in the package portmidi - 1:200-0ubuntu1.12.04.1

---------------
portmidi (1:200-0ubuntu1.12.04.1) precise-proposed; urgency=low

  * debian/patches/11-pmlinuxalsa.patch:
    - Avoid SIGSEGV when it receives data for devices which
      might have already been closed. (LP: #1073484)
    - Fix some other pointer issues:
      + alsa_in_close() didn't clear midi-descriptor.
      + Some other uses of midi->descriptor didn't do NULL-check of
        the pointer.
 -- Alessio Treglia <email address hidden>   Sun, 23 Dec 2012 22:14:44 +0000

@mixxxbot
Copy link
Collaborator Author

Commented by: cjwatson
Date: 2013-01-25T12:09:03Z


Any chance of verification on lucid and oneiric too?

@mixxxbot
Copy link
Collaborator Author

Commented by: janitor
Date: 2013-01-25T12:09:14Z


This bug was fixed in the package portmidi - 1:200-0ubuntu1.12.10.1

---------------
portmidi (1:200-0ubuntu1.12.10.1) quantal-proposed; urgency=low

  * debian/patches/11-pmlinuxalsa.patch:
    - Avoid SIGSEGV when it receives data for devices which
      might have already been closed. (LP: #1073484)
    - Fix some other pointer issues:
      + alsa_in_close() didn't clear midi-descriptor.
      + Some other uses of midi->descriptor didn't do NULL-check of
        the pointer.
 -- Alessio Treglia <email address hidden>   Sun, 23 Dec 2012 22:14:44 +0000

@mixxxbot
Copy link
Collaborator Author

Commented by: quadrispro
Date: 2013-01-26T12:48:59Z


Just to confirm the patch works also on Oneiric and Lucid.

@mixxxbot
Copy link
Collaborator Author

Commented by: cjwatson
Date: 2013-01-28T12:51:52Z


Thanks!

@mixxxbot
Copy link
Collaborator Author

Commented by: janitor
Date: 2013-01-28T12:52:13Z


This bug was fixed in the package portmidi - 1:200-0ubuntu1.10.04.1

---------------
portmidi (1:200-0ubuntu1.10.04.1) lucid-proposed; urgency=low

  * debian/patches/11-pmlinuxalsa.patch:
    - Avoid SIGSEGV when it receives data for devices which
      might have already been closed. (LP: #1073484)
    - Fix some other pointer issues:
      + alsa_in_close() didn't clear midi-descriptor.
      + Some other uses of midi->descriptor didn't do NULL-check of
        the pointer.
 -- Alessio Treglia <email address hidden>   Sun, 23 Dec 2012 22:25:35 +0000

@mixxxbot
Copy link
Collaborator Author

Commented by: janitor
Date: 2013-01-28T12:52:17Z


This bug was fixed in the package portmidi - 1:200-0ubuntu1.11.10.1

---------------
portmidi (1:200-0ubuntu1.11.10.1) oneiric-proposed; urgency=low

  * debian/patches/11-pmlinuxalsa.patch:
    - Avoid SIGSEGV when it receives data for devices which
      might have already been closed. (LP: #1073484)
    - Fix some other pointer issues:
      + alsa_in_close() didn't clear midi-descriptor.
      + Some other uses of midi->descriptor didn't do NULL-check of
        the pointer.
 -- Alessio Treglia <email address hidden>   Sun, 23 Dec 2012 22:25:35 +0000

@mixxxbot
Copy link
Collaborator Author

Commented by: dr-graef
Date: 2013-02-02T19:19:35Z


Alessio, could you please revisit the portmidi update that you released a few days ago? It's a minor glitch, but the libportmidi.so from the 12.04.1 update isn't properly linked and breaks Python+PortMidi applications such as Frescobaldi. Full bug report with suggested patch here: https://bugs.launchpad.net/ubuntu/+source/portmidi/+bug/1110326

@mixxxbot
Copy link
Collaborator Author

Issue closed with status Fix Released.

@mixxxbot mixxxbot transferred this issue from another repository Aug 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant