Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Linux] Close/reopen editor several times crashes the host #249

Closed
ryukau opened this issue May 30, 2022 · 44 comments
Closed

[Linux] Close/reopen editor several times crashes the host #249

ryukau opened this issue May 30, 2022 · 44 comments

Comments

@ryukau
Copy link

ryukau commented May 30, 2022

Hi.

I'm experiencing a bug that close/reopen editor several times crashes the host. This bug was fixed on commit 1fe9a38, but introduced again on a8b5c81.

Diff on below fixes this bug on VSTGUI 4.11.

diff --git a/vstgui/lib/platform/linux/x11frame.cpp b/vstgui/lib/platform/linux/x11frame.cpp
index 99393735..5a549e40 100644
--- a/vstgui/lib/platform/linux/x11frame.cpp
+++ b/vstgui/lib/platform/linux/x11frame.cpp
@@ -177,6 +177,7 @@ struct DrawHandler

        ~DrawHandler ()
        {
+               cairo_device_finish (device);
                cairo_device_destroy (device);
        }

This is a regression. I believe that failing on multiple editor is better than crashing on single editor, because the later means plugin is unusable.

For reference, stacktraces on the crash are on the links below.

@scheffle
Copy link
Collaborator

Hi @andreas56, do you have an opinion on this as you are the author of this regression?

@andreas56
Copy link
Contributor

No, sorry, I don't really. My fix solved the problem I had as described in the commit comment, and I didn't get any problem with closing/reopening after the fix, but I didn't use Reaper, and my plugin was actually not a VST plugin either.

@scheffle
Copy link
Collaborator

Hi @rehans, can you have a look?

@andreas56
Copy link
Contributor

I just tested my plugin again (it was a long time since last time), and I was able to reproduce the error. Adding cairo_device_finish line seems to solve it, yes, but it also reintroduces the problem with using two plugin instances at the same time.

@ryukau
Copy link
Author

ryukau commented May 31, 2022

I performed git bisect and a8b5c81 is the exact commit which re-introduced this bug. Also this issue is not only about my plugins, but VST 3 SDK example plugins are crashing.

@andreas56 Did you change your opinion after the second comment? I'd like to know if we agree that preventing crash is better for now.

@andreas56
Copy link
Contributor

Yes, sorry for not being clear, I agree that my commit is doing more harm than good and should be reverted.

@ryukau
Copy link
Author

ryukau commented May 31, 2022

@andreas56 Thank you. It helps a lot.

@scheffle We reached a consensus. Would you merge the patch in next release?

@scheffle
Copy link
Collaborator

Sure.

@rehans
Copy link
Contributor

rehans commented Jun 4, 2022

I can reproduce the crash while stress testing the plug-in window (open/close rapidly). But I cannot reproduce any bug with multiple plug-in instances.

Nevertheless I have implemented a prototype of DrawHandler using shared memory, xcb pixmap and cairo image surface. This fixes the crash at least and I do not see anything suspicious with multiple instances either.

@ryukau @andreas56 Please have a look at my fork of VSTGUI https://github.com/rehans/vstgui.git and test it. I only changed the vstgui/lib/platform/linux/x11frame.cpp file.

If this is working for all of us I will clean it up, make it pretty and talk to @scheffle for a pull request.

@andreas56
Copy link
Contributor

andreas56 commented Jun 4, 2022

I did a quick test with you fork, but with it, my GUI isn't visible at all. It just doesn't show anything in the frame. Same thing with the Minesweeper example for example: just a black window.

@rehans
Copy link
Contributor

rehans commented Jun 4, 2022

Please make sure to use the latest commit of my repository. I forgot to commit the CMakeLists.txt (missing xcb-image lib). So actually you should have received a compile error ;) It is fixed now of course.

I quick tested with reaper v6.53 and debug plug-ins

  • Latest Manjaro: my plug-in and AGain.vst3 open up and no crashes
  • Ubuntu 22.04 LTS: my plug-in and AGain.vst3 open up and no crashes

I could not get any GUI with Mandelbrot and Mindsweeper, neither with the old nor the new implementation. Seems to be broken anyway.

If I find the time the next days I will do some more testing.

@andreas56
Copy link
Contributor

Yes, I got a compile error, but I modified CMakeList.txt to solve it. (I added xcb-shm, not xcb-image, but it doesn't seem to matter, it still doesn't work for me when I changed it to xcb-image.) Minesweeper and mandelbrot both work with the old implementation (but I have to set GDK_BACKEND=x11), but not with the new.

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

@rehans Many thanks for the proper patch! I tested the fork on Fedora 36 and it's working here.

@andreas56 Make sure to:

  • Replace entire directory of vst3sdk/vstgui4 to rehans' fork.
  • Use release build.

I experienced crash when I only copied vstgui/lib/platform/linux/x11frame.cpp to the existing VST 3 SDK directory.

If you are on debug build, VST3 inline UI editor might be opening instead of your own GUI. If you right click on black screen and it opens some context menu, then this is probably the case. This happened on my environment when I added -DDEVELOPMENT=1.

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

rehans' fork is also working on Ubuntu 20.04.

Just in case, I only tested on plugins (AGain and NoteExpressionSynth). I'm not sure how to build standalone.

@andreas56
Copy link
Contributor

Ok, I've now tried using the entire rehans fork instead of just x11frame.cpp. No difference, I still just get a black frame. I also tried using Release build, also no difference. I'm not using VST SDK, only VSTGUI, that's why I'm testing with the Minesweeper example instead of AGain.

This is how I build VSTGUI, including the standalone examples:

mkdir build
cd build
CXXFLAGS=-fPIC cmake -DCMAKE_BUILD_TYPE=Release -GNinja ..
ninja

And this is how I run Minesweeper:

cd Release/Minesweeper
GDK_BACKEND=x11 ./Minesweeper

I'm using Magiea 8 linux and Gnome with Wayland.

Minesweeper works fine this way if I use the old implementation by changing DrawHandler2 to DrawHandler on line 524 in x11frame.cpp.

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

On my environment (Fedora 36 and Ubuntu 20.04), standalones are working with both current master and rehans' fork.

Below are the cmake configure output. Not sure if this is relevant, but I put them to show the compiler and library versions. Let me know if there's something I can help.

Fedora 36
$ CXXFLAGS=-fPIC cmake -DCMAKE_BUILD_TYPE=Release -GNinja ..
-- The C compiler identification is GNU 12.1.1
-- The CXX compiler identification is GNU 12.1.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/lib64/ccache/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/lib64/ccache/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found X11: /usr/include
-- Looking for XOpenDisplay in /usr/lib64/libX11.so;/usr/lib64/libXext.so
-- Looking for XOpenDisplay in /usr/lib64/libX11.so;/usr/lib64/libXext.so - found
-- Looking for gethostbyname
-- Looking for gethostbyname - found
-- Looking for connect
-- Looking for connect - found
-- Looking for remove
-- Looking for remove - found
-- Looking for shmat
-- Looking for shmat - found
-- Found Freetype: /usr/lib64/libfreetype.so (found version "2.12.1")
-- Found PkgConfig: /usr/bin/pkg-config (found version "1.8.0")
-- Checking for module 'xcb'
--   Found xcb, version 1.13.1
-- Checking for module 'xcb-util'
--   Found xcb-util, version 0.4.0
-- Checking for module 'xcb-cursor'
--   Found xcb-cursor, version 0.1.3
-- Checking for module 'xcb-keysyms'
--   Found xcb-keysyms, version 0.4.0
-- Checking for module 'xcb-xkb'
--   Found xcb-xkb, version 1.13.1
-- Checking for module 'xcb-image'
--   Found xcb-image, version 0.4.0
-- Checking for module 'xkbcommon'
--   Found xkbcommon, version 1.4.0
-- Checking for module 'xkbcommon-x11'
--   Found xkbcommon-x11, version 1.4.0
-- Checking for module 'glib-2.0'
--   Found glib-2.0, version 2.72.1
-- Checking for module 'cairo'
--   Found cairo, version 1.17.6
-- Checking for modules 'pangocairo;pangoft2'
--   Found pangocairo, version 1.50.7
--   Found pangoft2, version 1.50.7
-- Checking for module 'fontconfig'
--   Found fontconfig, version 2.14.0
-- Build type: Release
-- Building only vstgui
/home/cu/Desktop/vstgui_solo/rehans/vstgui4/vstgui/uidescription/editing
-- Found EXPAT: /usr/lib64/libexpat.so (found version "2.4.7")
-- Checking for module 'gtkmm-3.0'
--   Found gtkmm-3.0, version 3.24.6
-- Checking for module 'sqlite3'
--   Found sqlite3, version 3.36.0
-- Configuring done
-- Generating done
-- Build files have been written to: /home/cu/Desktop/vstgui_solo/rehans/vstgui4/build
Ubuntu 20.04
$ CXXFLAGS=-fPIC cmake -DCMAKE_BUILD_TYPE=Release -GNinja ..
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found X11: /usr/include
-- Looking for XOpenDisplay in /usr/lib/x86_64-linux-gnu/libX11.so;/usr/lib/x86_64-linux-gnu/libXext.so
-- Looking for XOpenDisplay in /usr/lib/x86_64-linux-gnu/libX11.so;/usr/lib/x86_64-linux-gnu/libXext.so - found
-- Looking for gethostbyname
-- Looking for gethostbyname - found
-- Looking for connect
-- Looking for connect - found
-- Looking for remove
-- Looking for remove - found
-- Looking for shmat
-- Looking for shmat - found
-- Looking for IceConnectionNumber in ICE
-- Looking for IceConnectionNumber in ICE - found
-- Found Freetype: /usr/lib/x86_64-linux-gnu/libfreetype.so (found version "2.10.1")
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Checking for module 'xcb'
--   Found xcb, version 1.14
-- Checking for module 'xcb-util'
--   Found xcb-util, version 0.4.0
-- Checking for module 'xcb-cursor'
--   Found xcb-cursor, version 0.1.1
-- Checking for module 'xcb-keysyms'
--   Found xcb-keysyms, version 0.4.0
-- Checking for module 'xcb-xkb'
--   Found xcb-xkb, version 1.14
-- Checking for module 'xcb-image'
--   Found xcb-image, version 0.4.0
-- Checking for module 'xkbcommon'
--   Found xkbcommon, version 0.10.0
-- Checking for module 'xkbcommon-x11'
--   Found xkbcommon-x11, version 0.10.0
-- Checking for module 'glib-2.0'
--   Found glib-2.0, version 2.64.6
-- Checking for module 'cairo'
--   Found cairo, version 1.16.0
-- Checking for modules 'pangocairo;pangoft2'
--   Found pangocairo, version 1.44.7
--   Found pangoft2, version 1.44.7
-- Checking for module 'fontconfig'
--   Found fontconfig, version 2.13.1
-- Build type: Release
-- Building only vstgui
/home/cu/Desktop/vstgui_solo/rehans/vstgui4/vstgui/uidescription/editing
-- Found EXPAT: /usr/lib/x86_64-linux-gnu/libexpat.so (found version "2.2.9")
-- Checking for module 'gtkmm-3.0'
--   Found gtkmm-3.0, version 3.24.2
-- Checking for module 'sqlite3'
--   Found sqlite3, version 3.31.1
-- Configuring done
-- Generating done
-- Build files have been written to: /home/cu/Desktop/vstgui_solo/rehans/vstgui4/build

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

It appears that some scaling on standalone is not working correctly on Ubuntu.

Below is a screen shot of Minesweeper on Ubuntu.

vstgui_minesweeper_ubuntu

Below is a screen shot of Minesweeper on Fedora. The red flag seems like triangular flag emoji. Font scaling on top part (Mines, Time, New Game) is correct however.

vstgui_minesweeper_fedora

@rehans
Copy link
Contributor

rehans commented Jun 5, 2022

@andreas56 I did the exact same thing on Manjaro and Ubuntu 22.04 and with the GDK_BACKEND being activated Mindsweeper is working nicely on both distros with DrawHandler and DrawHandler2.

So we got working results on:

  • Ubuntu 22.04, 20.04
  • Manjaro
  • Fedora 36

I have never heard of "Magiea 8" though ;) It is not even listed on distrowatch.com. I could not find out if it is
Debian/Ubuntu or Arch based (or independant?) either. I think it is almost impossible to support all distros out there. This is why the VST 3 SDK is focused on Ubuntu. My development environment is Manjaro, an Arch based distro. So this is covered as well.
Maybe you can debug this line https://github.com/rehans/vstgui/blob/develop/vstgui/lib/platform/linux/x11frame.cpp#L409 and check if at least all fields in xcbData are valid.

@ryukau I assume (I might be wrong though) that the wrong scaling is a different issue. All other controls are scaled correctly in Mindsweeper. It is only the flag icon which is not.

@andreas56
Copy link
Contributor

Sorry, I meant Mageia (previously known as Mandriva, and before that Mandrake). It's an rpm based distro, independant, even if it's probably similar to Fedora.

I'm not goot at xcb, so I don't know if these values are valid or not, but here are the values from line 409:
window=33554433 gcontext=33554441 pixmap=33554442 w=1504 h=1604

@rehans
Copy link
Contributor

rehans commented Jun 5, 2022

Sorry, I meant Mageia (previously known as Mandriva, and before that Mandrake). It's an rpm based distro, independant, even if it's probably similar to Fedora.

Ah ok, Mandrake sounds familiar ;)

I'm not goot at xcb, so I don't know if these values are valid or not, but here are the values from line 409: window=33554433 gcontext=33554441 pixmap=33554442 w=1504 h=1604

Neither am I ;) But the values look similar to what I got. Please also check the shared memory segment, which is xcbData.segmentInfo, espacially the address xcb_shm_segment_info_t::shmaddr of the segment. Does it show a potential valid address resp. is not nullptr?

Another thing you could try is to exchange 0600 with 0777 in https://github.com/rehans/vstgui/blob/develop/vstgui/lib/platform/linux/x11frame.cpp#L246 I assume this sets the access rights of the shared memory segment.

@andreas56
Copy link
Contributor

segmentInfo looks OK I think:

shmseg=33554440 shmdid=32806 shmaddr=0x7fcd9b150000

0777 didn't help.

I see the shm segment with ipcs:

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status      
...
0x00000000 32806      andreas    777        9649664    2                       

@rehans
Copy link
Contributor

rehans commented Jun 5, 2022

Good! Well...it depends. Not so good, because I am running out of ideas ;)

I have just created https://github.com/rehans/xcb-shm-test.git which is the example I took as a base for the VSTGUI implementation. Run that please and check if you see a blue/purple square in the top left corner on a black background.

If this does not work, well, ...no idea.

@andreas56
Copy link
Contributor

andreas56 commented Jun 5, 2022

Ok, xcb-shm-test fails with "Shm error...". It's reply->shared_pixmaps that's 0, not reply being null. If I comment the 0 check, I get a black window, no blue/purple square.

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

@rehans I agree that scaling is a separate issue. It happens on current master too. Sorry for being off topic. I thought sharing any information is valuable.


Perhaps the issue is related to Wayland? I found following issue in regard of shared pixmap.

https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/659

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

I found another issue that says "shared memory pixmaps may not be available on NVIDIA by default."

jaelpark/chamferwm#10

To paraphrase, adding Option "AllowSHMPixmaps" "1" to somewhere in Section "Device", Section "ServerFlags" or Section "Screen" inside of xorg configuration might solve the problem, if using NVIDIA device.

@rehans
Copy link
Contributor

rehans commented Jun 5, 2022

@andreas56
My reply looks like this:

response_type:1
shared_pixmaps:1
sequence:5
length:0
major_version:1
minor_version:2
uid:1000
gid:1000
pixmap_format:2
pad0

What's yours looking like?

@ryukau Thanks for the wayland link. Looks a bit like our problem.

@andreas56
Copy link
Contributor

shared_pixmaps:0
sequence:5
length:0
major_version:1
minor_version:2
uid=1000
gid=1000
pixmap_format:0

The version of my XWayland is 1.20.14, which I think means that I don't have the change linked by ryukau.

@andreas56
Copy link
Contributor

According to distrowatch, Mageia is not the only one of the mentioned distros that's using 1.20:

Ubuntu 22.04 21.1.3
Ubuntu 20.04 1.20.8
Manjaro stable 21.1.3
Fedora 36 1.20.14
Mageia 8 1.20.10

@rehans
Copy link
Contributor

rehans commented Jun 5, 2022

My current installation of Manjaro shows 22.1.1-1 for XWayland. And @ryukau successfully tested on Ubuntu 20.04 and Fedora 36, with Ubuntu 20.04 even having a slightly older version like Mageia 8.

@andreas56
Copy link
Contributor

Yes, but did @ryukau use Wayland when testing on Ubuntu 20.04 or Fedora 36?

By the way, what is the reason for using shared memory? I guess the new DrawHandler implementation is not just for fixing the crash bug and the multiple instance bug?

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

My test environment was both using X11. I checked it using the instruction in the link below.

https://unix.stackexchange.com/a/325972

@rehans
Copy link
Contributor

rehans commented Jun 5, 2022

By the way, what is the reason for using shared memory? I guess the new DrawHandler implementation is not just for fixing the crash bug and the multiple instance bug?

Yes, it is just an attempt to fix the crash and the multiple instances bug. See also:

So I just tried that without knowing the obstacles. But I am more than happy if someone else comes up with a better and easier solution. The cairo_xcb_surface_create stuff seems to be buggy somehow.

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

I tested Minesweeper on Gnome on Wayland on Fedora 36, and both current master and rehans' fork are working with GDK_BACKEND=x11.

Screenshot from 2022-06-05 19-38-06

If I remove GDK_BACKEND=x11, then it shows empty window.

Screenshot from 2022-06-05 19-29-11

@ryukau
Copy link
Author

ryukau commented Jun 5, 2022

I'm also not familiar with X11, so can't provide proper fix.

For now, the options seem like following:

  1. Merge the rehans' fork.
  2. Apply the quick patch on the opening comment of this issue.
  3. Stay on current master.

I'm OK with 1 or 2. I prefer 1 if possible. I'd like to avoid 3.

@andreas56
Copy link
Contributor

My distrowatch research above wasn't very good it seems, here it instead says Fedora 36 is using XWayland 22.1.1. So, I still think the shm version doesn't work when you're using wayland and XWayland 1.20.

I've tried to debug the crash today, but I didn't get any wiser really. I added a copy of the _get_screen_index function from cairo and called i right before the call to cairo_xcb_surface_create in x11frame.cpp (I also had to add a _cairo_xcb_screen_from_visual to go from a visual to a screen), but the copied function had no problems finding the screen index, even in those cases where the one in cairo fails and causes the crash.

Option 1: I have no idea how unusual my setup is (I mean using Wayland + having a 1.20 version of XWayland). Maybe it's OK to ignore that combination and go with the shm solution.

Option 2: The problem caused by this (the multiple instances bug) has only been observed by me, right? It could very well be because of my experimental LV2+VSTGUI plugin, and nothing to worry about.

Option 3: Random crashes in all tested environments, all caused by me. No thanks. :)

@andreas56
Copy link
Contributor

I think I have a possible solution now. If I put the call to cairo_device_finish inside RunLoop::Impl::exit after the useCount check, instead of in the DrawHandler destructor, I both get rid of the random crashes and also have multiple plugin instances working. I have to cleanup the code a bit before letting you test it. I'll do that tomorrow.

@andreas56
Copy link
Contributor

I have pushed the fix now to my fork. Please try it out and/or review the commit.

@rehans
Copy link
Contributor

rehans commented Jun 6, 2022

I quick tested the fork and did not see any crashes or weird behavior, good! So you only call cairo_device_finish when the useCounton the Runloop becomes 0...as far as I can see. I did not understand all the cairo_device_destroy calls yet. I will have a look at that later.

@ryukau
Copy link
Author

ryukau commented Jun 6, 2022

@andreas56 The fork is working nicely on Ubuntu 20.04 and Fedora 36.


My understanding is that:

  • When CFrame::open() is called, it reaches RunLoop::init() via Frame::Frame() in x11frame.cpp.
  • When CFrame::close() is called, then Frame::~Frame() will be called. And it calls DrawHander::~DrawHander() then RunLoop::exit ().

So useCount is not likely cause a problem in usual situation, I guess.

I considered follwoing 2 situation for andreas56's fork, and I think it won't break.

  1. All device is equal.
  2. All device is different.

In case 1, it only goes into if (impl->device != device) branch in RunLoop::setDevice once, when a window is closed at first time. I guess this case was causing a problem on multiple windows? It might be possible to open absurd number of windows and overflow device, but no one will do that.

In case 2, it reaches if (impl->device != device) for every time a window (CFrame) is closed.

@andreas56
Copy link
Contributor

Thanks!

The move of the cairo_device_finish is the main thing. It fixes the case I had where I have two instances of a plugin open and then close one of them. The remaining one was then broken because cairo cleaned up resources needed.

The cairo_device_destroy calls are more theoretical, I just wanted the RunLoop to have a proper reference to the device, so it does a cairo_device_reference when it gets a new device object in setDevice. I say it's theoretical because the ref_count never seems to get close to zero anyway - that's why we need to call cairo_device_finish.

When opening a second frame, the xcb_connection will be the same as for the first running frame, which also means cairo will return the same device - so of your two situations, it's number one that happens.

@rehans
Copy link
Contributor

rehans commented Jun 6, 2022

Good! Then we have a way more simple solution than the shared memory one. And if there is anything going wrong in the future we still have the option to switch to shared memory if really necessary. But I am crossing fingers, that this will not be the case ;)

@andreas56
Copy link
Contributor

I just did a minor cleanup of my patch, which otherwise is waiting for merge in #251 .

@ryukau
Copy link
Author

ryukau commented Jun 8, 2022

I tested and confirmed cleanup patch is working. It makes sense when all the device is the same.

@myrrc
Copy link

myrrc commented Apr 26, 2024

Related: sfztools/sfizz-ui#136

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants