Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QGIS PyQGIS bug/crash regression #273561

Closed
dvdkon opened this issue Dec 11, 2023 · 30 comments · Fixed by #274408
Closed

QGIS PyQGIS bug/crash regression #273561

dvdkon opened this issue Dec 11, 2023 · 30 comments · Fixed by #274408
Assignees

Comments

@dvdkon
Copy link

dvdkon commented Dec 11, 2023

Describe the bug

I ran into a bug in QGIS which I described in an upstream bug report. To summarise, constructing some objects in the embedded Python is not possible as usual, and trying to work around the bug results in a crash.

Importantly, this only happens in the current (unstable or 23.11) NixOS build of QGIS, not in the upstream Flatpak or in a 22.05 NixOS build.

Steps To Reproduce

  1. Run recent NixOS QGIS build with nix run github:NixOS/nixpkgs/nixos-23.11#qgis.
  2. Open Python Console (Ctrl-Alt-P) and run QgsRuleBasedRenderer.Rule(QgsSymbol.defaultSymbol(QgsWkbTypes.LineGeometry))
    You should get this error: Rule(): not enough arguments
  3. Run QgsRuleBasedRenderer.Rule(QgsSymbol.defaultSymbol(QgsWkbTypes.LineGeometry), None, label="")
    QGIS should crash with SIGSEGV.
  4. Run the same on an older package (nix run github:NixOS/nixpkgs/nixos-22.05#qgis) or from the Flatpak. The first example should work without errors and the second should throw an error (invalid args).

Expected behavior

Same as the official Flatpak: no crashes and a success for the first code snippet.

Notify maintainers

@imincik @lsix

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.5.7, NixOS, 24.05 (Uakari), 24.05pre555097.91050ea1e57e`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.18.1`
 - channels(root): `"nixos"`
 - nixpkgs: `/nix/var/nix/profiles/per-user/root/channels/nixos`

Add a 👍 reaction to issues you find important.

@imincik
Copy link
Contributor

imincik commented Dec 11, 2023

@dvkon , thanks for reporting. I confirm that QGIS is crashing on my system as well

@imincik
Copy link
Contributor

imincik commented Dec 11, 2023

@dvdkon , are you able to check which Python version is used in your Flatpak QGIS package ?

@dvdkon
Copy link
Author

dvdkon commented Dec 11, 2023

@dvdkon , are you able to check which Python version is used in your Flatpak QGIS package ?

Sure, here's sys.version: "3.11.5 (main, Nov 10 2011, 15:00:00) [GCC 13.2.0]"

@imincik
Copy link
Contributor

imincik commented Dec 11, 2023

@imincik
Copy link
Contributor

imincik commented Dec 11, 2023

I tried to build QGIS with Python 3.10 and it can be crashed with the same command.

@imincik
Copy link
Contributor

imincik commented Dec 11, 2023

I did some investigation, but for now, I can't see any potential nixpkgs packaging issue related to this problem.

@dvdkon
Copy link
Author

dvdkon commented Dec 12, 2023

Thanks, @imincik. I did some digging of my own, and I think it's a problem with SIP. In one of the generated files (python/core/build/_core/sip_corepart5.cpp) I found the generated interop function for the faulty constructor, init_type_QgsRuleBasedRenderer_Rule. It contains a call to sipParseKwdArgs() with a string "pattern" which defines how arguments should be extracted. Now it's "#J:|iiJ1J1J1b". I looked at both the parser and generator code, and I can't find what the colon should be doing, so given that the crash is in this function's descendant, I think this pattern string is the problem.

I'm trying to build QGIS with a different version of SIP, but the latest 6.8.0 doesn't seem to work right now.

EDIT: SIP 6.8.1 was just released, so I'll try with that.

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

Thank you very much for debugging @dvdkon . I very much appreciate it and hope it will help us to move forward with this issue.
I immediately expected the problem with sip upgrade, but there is only very small version difference between nixos 23.05 and unstable (6.7.7 -> 6.7.11).

If this is really silent regression which we didn't notice at all, then we pay a big price for not running unit tests (#267919).

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

Thanks, @imincik. I did some digging of my own, and I think it's a problem with SIP. In one of the generated files (python/core/build/_core/sip_corepart5.cpp) I found the generated interop function for the faulty constructor, init_type_QgsRuleBasedRenderer_Rule. It contains a call to sipParseKwdArgs() with a string "pattern" which defines how arguments should be extracted. Now it's "#J:|iiJ1J1J1b". I looked at both the parser and generator code, and I can't find what the colon should be doing, so given that the crash is in this function's descendant, I think this pattern string is the problem.

@timlinux , any ideas ?

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

Looking at failed tests from #267919 , many of them fail on not enough arguments.

Like this one

ERROR: testSimpleLineWithOffset (__main__.TestQgsFillSymbolLayers.testSimpleLineWithOffset)
test that rendering a polygon with simple line symbol with offset results in closed line
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/imincik/Projects/geonix/code/dev/source/tests/src/python/test_qgsfillsymbollayers.py", line 59, in testSimpleLineWithOffset
    symbol = QgsFillSymbol()
             ^^^^^^^^^^^^^^^
TypeError: QgsFillSymbol(): not enough arguments

@dvdkon
Copy link
Author

dvdkon commented Dec 12, 2023

Sadly my build with SIP 6.8.1 crashes just as with 6.7.11. I'm not entirely sure I built it correctly. It might also be a good idea to try downgrading SIP to 6.7.7, but full builds take an hour+ on my computer, so I won't be doing that today.

I finally found what the : means: ':' - '0' is 0xA or FMT_AP_NO_CONVERTORS | FMT_AP_TRANSFER, so it's a regular flag that's just read by an "ungrepable" process.

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

Upstream is installing python3-sip in their testing container. python3-sip in Ubuntu 22.04 or even in Debian Sid is SIP version 4 . Maybe we should try to build with sip_4 package. I'll try that now.

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

To build with sip_4, I had to remove pyqt-builder from pythonBuildInputs because it was dragging sip 6. Then build failed on 'QtXml/QtXmlmod.sip' could not be found. I am now trying to build with patch mentioned in qgis/QGIS#54184
`

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

diff --git a/cmake/SIPMacros.cmake b/cmake/SIPMacros.cmake
index a0bfb1e6e22..7f3022a6ccd 100644
--- a/cmake/SIPMacros.cmake
+++ b/cmake/SIPMacros.cmake
@@ -101,7 +101,7 @@ MACRO(GENERATE_SIP_PYTHON_MODULE_CODE MODULE_NAME MODULE_SIP SIP_FILES CPP_FILES
       ENDIF( ${CONCAT_NUM} LESS ${SIP_CONCAT_PARTS} )
     ENDFOREACH(CONCAT_NUM RANGE 0 ${SIP_CONCAT_PARTS} )
 
-    SET(SIPCMD ${SIP_BUILD_EXECUTABLE} --no-protected-is-public --pep484-pyi --no-make --concatenate=${SIP_CONCAT_PARTS} --qmake=${QMAKE_EXECUTABLE} --include-dir=${CMAKE_CURRENT_BINARY_DIR} --include-dir=${PYQT_SIP_DIR} --api-dir ${CMAKE_BINARY_DIR}/python ${SIP_BUILD_EXTRA_OPTIONS})
+    SET(SIPCMD ${SIP_BUILD_EXECUTABLE} --no-protected-is-public --pep484-pyi --no-make --concatenate=${SIP_CONCAT_PARTS} --qmake=${QMAKE_EXECUTABLE} --include-dir=${CMAKE_CURRENT_BINARY_DIR} --include-dir=@pyQt5PackageDir@/PyQt5/bindings --api-dir ${CMAKE_BINARY_DIR}/python ${SIP_BUILD_EXTRA_OPTIONS})
 
     ADD_CUSTOM_COMMAND(
       OUTPUT ${_sip_output_files}

didn't help with sip: Unable to find file "QtXml/QtXmlmod.sip"

@dvdkon
Copy link
Author

dvdkon commented Dec 12, 2023

I just noticed that the backtrace from the crash contains pyqt5-sip, not sip: /nix/store/4f6hnd85mk2s49g64bjlbdwmav9px2ay-python3.11-pyqt5-sip-12.11.0/lib/python3.11/site-packages/PyQt5/sip.cpython-311-x86_64-linux-gnu.so.

I don't know what the difference is actually, but it could be that we have a problem with using different sip packages to build and to run the resulting binary. Running python3 -c 'import sipbuild; print(sipbuild.__file__)' prints /nix/store/2qcdslc5g6p56ldrqfyid7liydylz31j-python3.11-sip-6.7.11/lib/python3.11/site-packages/sipbuild/__init__.py, which is in a different package. (Don't mind the version, it's actually 6.8.1, I forgot to update package metadata).

@dvdkon
Copy link
Author

dvdkon commented Dec 12, 2023

It could be that we have a problem with using different sip packages to build and to run the resulting binary.

This is most likely the cause. PyQt5-sip contains its own copy of siplib.c, the runtime that, among other things, parses arguments. NixOS' version 12.11.0 is old and doesn't support the # specifier introduced after sip 6.7.7. The code then popped off an argument, didn't understand the specifier, and skipped it, mangling the parsing state.

The issue can be fixed by updating pyqt5_sip in nixpkgs, but I think it's worth investigating how/why there are two sip versions and how to sync the versions. Should QGIS' own bindings even be using PyQt5-sip?

I also think upstream should add more runtime checks here. They version ABIs, but apparently never check them. Also, throwing an error on an unknown arg specifier would have saved us both some hours of debugging.

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

In latest nixpkgs master:

nix why-depends .#qgis .#python3Packages.pyqt5_sip

nix/store/q0sbp8brz7j6frpkc9gs8xqyknf5wkah-qgis-3.34.1
└───/nix/store/d46yqf5f2cmjz8dm3y26hwqdqc54zkih-python3.11-pyqt5-sip-12.11.0

@dvdkon
Copy link
Author

dvdkon commented Dec 12, 2023

I can now confirm upgrading pyqt5_sip fixes the issue. @imincik could you make an update PR?

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

I can now confirm upgrading pyqt5_sip fixes the issue. @imincik could you make an update PR?

Great thanks once again ! I'll make a PR tomorrow. I am just building QGIS with updated pyqt5_sip to version 12.13.0.

@imincik
Copy link
Contributor

imincik commented Dec 12, 2023

@dvdkon , you are very welcome to join Nix Geospatial team if you are interested.

@imincik
Copy link
Contributor

imincik commented Dec 13, 2023

PyQT5_SIP PR created - #273914

@imincik
Copy link
Contributor

imincik commented Dec 14, 2023

@dvdkon , update of pyqt5_sip was merged to python-updates branch and will hopefully appear in master soon.

I am not sure what is the best approach to fix QGIS in stable nixos-23.11 branch. Backporting of pyqt5_sip 12.13.0 can cause regressions for other packages (or maybe it will fix bunch of packages, we really don't know).

What I need before starting some changes in stable branch is to know what are the compatible options. I was trying to find some compatibility matrix between python-sip and python-pyqt5_sip but I haven't found any information (https://www.riverbankcomputing.com/ is was not very useful). How did you know which versions are compatible ?

@imincik
Copy link
Contributor

imincik commented Dec 14, 2023

Or maybe do it other way around and write some code which will test compatibility. Then we can use it as a package test for both of them.

@imincik
Copy link
Contributor

imincik commented Dec 14, 2023

Checking Flatpak com.riverbankcomputing.PyQt.BaseApp which is used to build QGIS Flatpak package:

  • in branch 5.15-21.08 (older one) they build sip-6.7.9 with PyQt5_sip-12.12.1
  • in branch 5.15-23.08 (latest one) they build sip-6.8.1 with PyQt5_sip-12.13.0

In Nixpkgs we have

  • 23.11: sip 6.7.12 + pyqt5_sip 12.11.0
  • unstable: sip 6.7.12 + 12.13.0 (after our PR) (EDIT: sip 6.8.0 is now in python-updates branch)

@imincik
Copy link
Contributor

imincik commented Dec 14, 2023

Found some sip tests in QGIS.

@imincik
Copy link
Contributor

imincik commented Dec 14, 2023

Found some sip tests in QGIS.

Unfortunately, all sip tests passed with both pyqt5_sip 12.13.0 and 12.11.0 .

@dvdkon
Copy link
Author

dvdkon commented Dec 14, 2023

What I need before starting some changes in stable branch is to know what are the compatible options. I was trying to find some compatibility matrix between python-sip and python-pyqt5_sip but I haven't found any information (https://www.riverbankcomputing.com/ is was not very useful). How did you know which versions are compatible ?

I'm about as confused as you are, so I sent a message to the PyQt mailing list, hopefully someone will clarify.

My guess is that the version of PyQt5-sip corresponds to a SIP ABI version. Right now, the latest is v12.14, but we only have PyQt5-sip v12.13.0. The latest compatible pair then seem to be SIP v6.7.12 and PyQt5-sip v12.13.0.

@imincik
Copy link
Contributor

imincik commented Dec 15, 2023

My guess is that the version of PyQt5-sip corresponds to a SIP ABI version. Right now, the latest is v12.14, but we only have PyQt5-sip v12.13.0. The latest compatible pair then seem to be SIP v6.7.12 and PyQt5-sip v12.13.0.

Yes, this makes a sense.

@imincik
Copy link
Contributor

imincik commented Dec 15, 2023

PR for stable branch created #274408

imincik added a commit to imincik/geospatial-nix that referenced this issue Dec 15, 2023
@imincik imincik linked a pull request Dec 15, 2023 that will close this issue
13 tasks
@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/week-in-geospatial-team-11-17-dec-2023/37035/1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants