grc: Fix cairo assertion failure by not storing reference to context #6352
The connection flowgraph element previously kept a reference to the cairo context passed to the draw function in order to be able to use cairo's `in_stroke` function to determine if the mouse cursor was on the path of the curved connection line. As it turns out, this is dangerous because GTK is constantly destroying and creating new cairo contexts and surfaces.
This avoids keeping a reference to the cairo context by initializing a local cairo context with the connection class for the sole purpose of storing the curved path and calculating `in_stroke`. The local context and surface can be very basic because not much is needed for `in_stroke`, and the context can persist and just have its path replaced when it needs to be updated.
On Windows, resizing the GRC window (and particularly the flowgraph canvas) to a larger size than its initial size would cause the underlying cairo surface to be destroyed and a new one created. The cairo context stored for the curved connection line had a reference to that destroyed surface, and closing that context (upon replacing it with a new one) would attempt to decrement the reference count on the surface. This would produce an assertion failure that crashed GRC stating `Assertion failed: CAIRO_REFERENCE_COUNT_HAS_REFERENCE (&surface->ref_count)`.
On macOS with the Quartz backend, a related crash and assertion error would manifest when connecting blocks.
Signed-off-by: Ryan Volz <ryan.volz@gmail.com>
All this did was request a larger size for the cairo surface of the DrawingArea on which this is set, which very roughly worked around the cairo context reference bug for users with high DPI displays. Now that this bug is fixed by the previous commit, this is not necessary. It's important to note that GTK takes care of the DPI scaling for high DPI displays behind the scenes, so this doesn't even help to scale the flowgraph properly. If there are further bugs with scaling, don't look here. Signed-off-by: Ryan Volz <ryan.volz@gmail.com>
I now also think that #5111 was a partial, circumstantial workaround for the same bug that this addresses, so I've added a commit to undo that one. Now that I understand better how cairo and GTK work, it's clear that it wasn't fixing any scaling issues and only had the effect of making the flowgraph canvas larger on high DPI displays, which just so happened to prevent triggering this bug for some of those users. That analysis agrees with user experience with actual scaling bugs like #5841, where having that commit or reversing it made no difference.
Anyone able to test this out on mac? (tagging names from #4174)
Thanks! This fixed my crash connecting blocks. #5431 GNU Radio Companion 3.10.4.0 MacOS 10.15.7
WRT
Thank you! This bug has been plaguing me for years! I replaced the two python files on my Windows installation, and all is fine (and the fix makes sense, too.)
Right, it only changes the size of the
We have confirmation from a few users now in the various bug reports that this fixes those issues (linked by the PR). I haven't heard back yet about a few of the other issues that are potentially related (e.g. GRC 1/4 window size on macOS), but the reports now cover everything I thought should be fixed.
Based on the confirmation of fixes, let's go ahead and merge this one! Very nice!
This is huge. This could be the most impactful bugfix of the year, as the combination of issues has been a total blocker for huge groups of potential users. Ryan, thank you so much for tracking this down.
I am running GNU Radio v3.10.4.0 installed from Homebrew on Ventura 13.0.1 on an M1 and am getting segfaults when using GNU Radio Companion. It does not segfault right away, but at random times, even when just moving around the widget menu. How can I tell if this particular fix (#6352) is included in the version that I am running? Should I build from source?
@cryptik This fix won't be in your 3.10.4.0 installation, but it's easy enough to modify it without even reinstalling. Just find where Homebrew has installed the particular Python files that this changes and edit them manually as in this PR.
Hi @ryanvolz, thanks for the quick response. I was able to modify the files indicated in the PR. I am not sure if this is the correct place, but in my Homebrew install, it appears the Python files are located here (/opt/homebrew/Cellar/gnuradio/3.10.4.0_1/lib/python3.10/site-packages/gnuradio/grc/gui). Hopefully they are not also copied somewhere else. After making the changes, GRC is hugely more stable. I do still get segfaults, but they are very infrequent now.
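One way to check which copy of the GRC GUI files a given Python interpreter actually imports is to ask the import system directly. The following is a small sketch using only the standard library; the helper name `find_package_dir` is hypothetical, and it assumes the package to patch is `gnuradio.grc.gui`, as suggested by the path above. It needs to be run with the same Python interpreter that GRC uses, otherwise it may report a different installation or none at all.

```python
import importlib.util
from pathlib import Path


def find_package_dir(name):
    """Return the directories a package is imported from, or None if not found."""
    try:
        # For a dotted name, find_spec imports the parent packages first and
        # raises ModuleNotFoundError if one of them is missing.
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return [Path(p) for p in spec.submodule_search_locations]


# Assumption: the files changed by the PR live in the gnuradio.grc.gui package.
print(find_package_dir("gnuradio.grc.gui"))
```

If this prints a single directory, that is the copy the interpreter loads, and any other copies on disk are not the ones in use for that interpreter.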
I might have a different problem. It seems to still segfault whenever I try and load a .grc file from disk. It loads the file but as soon as I move the mouse, it segfaults. Running a .grc file seems to do the same thing, but on a more random basis. This Mac M1 has been more trouble than it's worth. I so wish I could return it.
Windows users have reported Deluge crashes when resizing the window with Piecesbar or Stats plugins enabled: Expression: CAIRO_REFERENCE_COUNT_HAS_REFERENCE(&surface->ref_count) This is similar to issues fixed in GNU Radio which is a problem due to storing the current cairo context which is then being destroyed and recreated within GTK causing a reference count error Fixes: https://dev.deluge-torrent.org/ticket/3339 Refs: gnuradio/gnuradio#6352
Windows users have reported Deluge crashes when resizing the window with Piecesbar or Stats plugins enabled: Expression: CAIRO_REFERENCE_COUNT_HAS_REFERENCE(&surface->ref_count) This is similar to issues fixed in GNU Radio which is a problem due to storing the current cairo context which is then being destroyed and recreated within GTK causing a reference count error Fixes: https://dev.deluge-torrent.org/ticket/3339 Refs: gnuradio/gnuradio#6352 Closes: #431
Description
The connection flowgraph element previously kept a reference to the cairo context passed to the draw function in order to be able to use cairo's `in_stroke` function to determine if the mouse cursor was on the path of the curved connection line. As it turns out, this is dangerous because GTK is constantly destroying and creating new cairo contexts and surfaces.
This avoids keeping a reference to the cairo context by initializing a local cairo context with the connection class for the sole purpose of storing the curved path and calculating `in_stroke`. The local context and surface can be very basic because not much is needed for `in_stroke`, and the context can persist and just have its path replaced when it needs to be updated.
On Windows, resizing the GRC window (and particularly the flowgraph canvas) to a larger size than its initial size would cause the underlying cairo surface to be destroyed and a new one created. The cairo context stored for the curved connection line had a reference to that destroyed surface, and closing that context (upon replacing it with a new one) would attempt to decrement the reference count on the surface. This would produce an assertion failure that crashed GRC stating `Assertion failed: CAIRO_REFERENCE_COUNT_HAS_REFERENCE (&surface->ref_count)`.
On macOS with the Quartz backend, a related crash and assertion error would manifest when connecting blocks. I only figured out that these two were essentially the same bug when I saw that my fix to the Windows crash worked around the same piece of code as an older patch carried by MacPorts and, until it was recently dropped because of another bug, by the conda package. Considering the other bug that patch triggered with the conda-forge `cairo` (which has since been fixed), I think this approach is lighter weight and has less potential for future problems.
Related Issue
Fixes #2726.
Fixes #2938.
Fixes #5431.
Fixes #5734.
Which blocks/areas does this affect?
GRC
Testing Done
I found a reliable way to trigger #2938 on Windows, and with this fix I don't see any crashes. Selecting connection lines still works as well.
Checklist
I have updated the documentation where necessary.
I have added tests to cover my changes, and all previous tests pass.