can't open file '/lib/python3.9/site-packages/STATmain.py' #42

Closed
antonl321 opened this issue Apr 28, 2022 · 16 comments · Fixed by spack/spack#30505 or spack/spack#30601

Comments

@antonl321

Hi,

I installed the latest STAT with spack, and the launch fails with the message below:

/lus/h2resw01/hpcperm/la/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/python-3.9.12-5ac5g7upuz2pwa7x6oxpxvx7leq32a2q/bin/python3.9: can't open file '/lib/python3.9/site-packages/STATmain.py': [Errno 2] No such file or directory

An install that I did on a similar system in February worked fine. The only difference I spotted is that the new install uses a slightly newer version of Python: 3.9.12 vs 3.9.9.

Any idea of what's going on here? Python should be looking in the spack dirs for the STATmain.py.

Lucian Anton

@lee218llnl
Collaborator

The stat-cl and stat-gui wrapper scripts use the following command to run the actual executable:

exec /usr/bin/python ${prefix}/lib/python2.7/site-packages/STATmain.py gui $@

And earlier in those scripts, ${prefix} should be set like this for example:

export prefix=/usr/tce/packages/stat/stat-4.0.2

Can you see if your stat-cl and stat-gui scripts are setting prefix properly?
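
The path in the reported error starts at `/lib/...`, which is what a wrapper of this shape produces when its prefix variable expands to an empty string. A minimal sketch of that effect, assuming a wrapper that builds the path from a prefix the way the lines above do (`stat_main_path` and `stat_main_path_checked` are hypothetical helpers, not part of STAT):

```shell
# Hypothetical helper: build the STATmain.py path the way the wrapper does.
stat_main_path() {
    # $1 is the install prefix; python3.9 matches the path in the error message.
    printf '%s/lib/python3.9/site-packages/STATmain.py\n' "$1"
}

# With an empty prefix the path collapses to /lib/..., as in the reported error.
stat_main_path ""

# Guarded variant: ${1:?...} aborts with a message when the prefix is unset or empty.
stat_main_path_checked() {
    printf '%s/lib/python3.9/site-packages/STATmain.py\n' "${1:?prefix is empty}"
}
```

A guard like `${prefix:?}` in a wrapper script would turn a silent empty expansion into an immediate, readable failure.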

@antonl321
Author

OK, following your lead I found that PYTHON_PREFIX was not defined. Spack should have fixed it, anyway.
I set it to the correct value, but now I get this symbol error at launch. Any hints?

2711156: /lus/h2resw01/hpcperm/la/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/gdk-pixbuf-2.42.6-lm3clg77j3q3dpmhogldhwgpctus7fv5/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-svg.so: error: symbol lookup error: undefined symbol: g_module_check_init (fatal)
2711156: /lus/h2resw01/hpcperm/la/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/gdk-pixbuf-2.42.6-lm3clg77j3q3dpmhogldhwgpctus7fv5/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-svg.so: error: symbol lookup error: undefined symbol: g_module_unload (fatal)

Cheers,

Lucian

@lee218llnl
Collaborator

Can you try running:

ldd /lus/h2resw01/hpcperm/la/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/gdk-pixbuf-2.42.6-lm3clg77j3q3dpmhogldhwgpctus7fv5/lib/gdk-pixbuf-2.0/2.10.0/loaders/libpixbufloader-svg.so

That symbol should be defined in the libgtk library:

$ nm gtkplus-3.24.29-avzxhxmi47tnyxenhob6shys4nbfk5eb/lib/libgtk-3.so.0.2404.25 | grep g_module_unload
00000000003d9a50 t gtk_theming_module_unload
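
Note that the `grep` above matches `g_module_unload` only as a substring of `gtk_theming_module_unload`, which is a different symbol (and the lowercase `t` marks it as local). A minimal sketch of an exact-match check, assuming GNU `nm`; `defines_symbol` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if the library exports exactly this symbol.
# Usage: defines_symbol /path/to/lib.so g_module_unload
defines_symbol() {
    # -D lists dynamic symbols; --defined-only hides undefined references.
    # awk takes the last field (the symbol name), drops any @VERSION suffix,
    # and compares it for an exact match.
    nm -D --defined-only "$1" |
        awk -v sym="$2" '{ name = $NF; sub(/@.*/, "", name); if (name == sym) found = 1 }
                         END { exit !found }'
}
```

The function returns a zero exit status when the symbol is defined and non-zero otherwise, so it can be used directly in an `if`.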

@antonl321
Author

antonl321 commented Apr 28, 2022

ldd output looks fine but some of the symbols are not found by nm.

But I think that the missing symbols messages are a red herring. I see them also when I run stat-gui installed previously. This one works.

I should perhaps mention that when I launch stat-gui, the STAT start window appears for a fraction of a second and then it crashes.

@antonl321
Author

I got a core file from the crash. Do you know which executable I have to pass to gdb together with the core file?

@lee218llnl
Collaborator

You can pass it the python executable.
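
A minimal sketch of that invocation, assuming `gdb` is on the PATH and that the interpreter passed in is the same one that ran STAT (`core_backtrace` is a hypothetical helper name):

```shell
# Hypothetical helper: print a backtrace of all threads from a core file.
# $1: the Python interpreter that ran STAT, $2: the core file.
core_backtrace() {
    # -batch exits after running the commands; -ex supplies a gdb command.
    gdb -batch -ex "thread apply all bt" "$1" "$2"
}
```

For the install in this thread, the first argument would be the spack-built `python3.9` binary named in the original error message.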

@antonl321
Author

antonl321 commented May 4, 2022

With export PYTHONDEVMODE=1 I get the following:

label = gtk.Label("Please Wait...")
/lus/h2resw01/hpcperm/atosla/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/stat-develop-rv3h57xami2zet7347sasplfvntpesq3/lib/python3.9/site-packages/STATGUI.py:839: DeprecationWarning: Gtk.ScrolledWindow.add_with_viewport is deprecated
self.sw.add_with_viewport(vbox)
Fatal Python error: Segmentation fault

Current thread 0x000015330427f740 (most recent call first):
File "/lus/h2resw01/hpcperm/atosla/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/python-3.9.12-5ac5g7upuz2pwa7x6oxpxvx7leq32a2q/lib/python3.9/site-packages/gi/overrides/Gtk.py", line 580 in run
File "/lus/h2resw01/hpcperm/atosla/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/stat-develop-rv3h57xami2zet7347sasplfvntpesq3/lib/python3.9/site-packages/STATGUI.py", line 1260 in on_attach
File "/lus/h2resw01/hpcperm/atosla/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/stat-develop-rv3h57xami2zet7347sasplfvntpesq3/lib/python3.9/site-packages/STATGUI.py", line 424 in __init__
File "/lus/h2resw01/hpcperm/atosla/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/stat-develop-rv3h57xami2zet7347sasplfvntpesq3/lib/python3.9/site-packages/STATGUI.py", line 2481 in STATGUI_main
File "/lus/h2resw01/hpcperm/atosla/Tools/spack/opt/spack/linux-rhel8-zen/gcc-8.4.1/stat-develop-rv3h57xami2zet7347sasplfvntpesq3/lib/python3.9/site-packages/STATmain.py", line 138 in <module>
Segmentation fault (core dumped)

The source code around Gtk.py:580 is:

578         with register_sigint_fallback(self.destroy):
579             with wakeup_on_signal():
580                 return Gtk.Dialog.run(self, *args, **kwargs)

@lee218llnl
Collaborator

It might help if you run gdb on the core file or run STAT under gdb so we can see where in the GTK implementation it is seg faulting. Based on the python trace, though, this looks like it is not a STAT bug, but rather a GTK bug.

@antonl321
Author

I don't understand this. I built with Python 3.9.9, which works on another system (same software stack), and I still get the crash. There are not many other version differences in the spack dependencies between the two installations.

The first problem with the new install is that PYTHON_PREFIX is not defined, hence the first failure. Do you know whether this variable comes from the spack environment or belongs to the STAT package?

@lee218llnl
Collaborator

I was able to poke at this a bit, but didn't reach any conclusions. First, I will caution you to be careful when attributing the cause of the problem. It is not the Python version that is causing this: I found that the problem occurs with both 3.9.9 and 3.9.12. There may be another dependency in the spack dependency chain that is contributing to this issue. The PYTHON_PREFIX variable is a macro that should be expanded during the configure step. I will have to do some more debugging/research to figure out why it isn't being expanded properly.

@lee218llnl
Collaborator

I was able to trace the change to the move from automake 1.15 to automake 1.16. With that in mind, can you run `spack uninstall -a stat` to remove your current stat installations and then reinstall stat with spack, adding ^automake@1.15? Let me know if that fixes it for you. If so, I will update the spack package to reflect this until I can fix the issue in the STAT build system itself.
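
As a sketch, the two steps could be wrapped like this, assuming `spack` is already set up in the shell (`reinstall_stat_with_automake_115` is a hypothetical helper name, and `-y` is added here only to skip the confirmation prompt):

```shell
# Hypothetical helper wrapping the uninstall and constrained reinstall.
reinstall_stat_with_automake_115() {
    # -a removes all matching installations; -y answers yes to the prompt.
    spack uninstall -a -y stat
    # ^automake@1.15 constrains the automake dependency for this build.
    # The spec is quoted because some shells treat ^ specially.
    spack install stat "^automake@1.15"
}
```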

@lee218llnl
Collaborator

The spack PR was merged. If you do a git pull in your spack directory, you should get an updated stat package.py that forces automake version 1.15 (and also dyninst 11.X). Please let me know if this works for you.

@antonl321
Author

Hi,
The new build solves the PYTHON_PREFIX issue, but the gtk segfault still happens in exactly the same place.
I attach two files with the output of spack find -d for the working and broken installation. The top line in each file is the spack version.

I also googled the gtk segfault and found this issue, which could be relevant:
msys2/MINGW-packages#7992

stat-broken.txt
stat-works.txt
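
The two attached listings can be compared mechanically. A minimal sketch, assuming both files hold `spack find -d` output with one `package@version` entry per line (`diff_spack_listings` is a hypothetical helper name):

```shell
# Hypothetical helper: show entries that differ between two `spack find -d`
# listings, e.g. stat-works.txt and stat-broken.txt.
diff_spack_listings() {
    a=$(mktemp)
    b=$(mktemp)
    # Strip indentation so that tree depth does not affect the comparison,
    # then sort and de-duplicate so that ordering does not either.
    sed 's/^[[:space:]]*//' "$1" | sort -u > "$a"
    sed 's/^[[:space:]]*//' "$2" | sort -u > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    # Like diff itself: 0 when the listings match, 1 when they differ.
    return "$status"
}
```

Lines prefixed with `<` appear only in the first listing and lines prefixed with `>` only in the second.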

@lee218llnl
Collaborator

OK, I'm glad the PYTHON_PREFIX issue is fixed. I too see seg faults with stat-gui. It's not clear if the crash in the link you sent is related. FWIW, here is part of the trace for the STAT GUI crash:

_pygi_invoke_closure_free,                  FP=7fffffffa190
g_source_callback_unref,                    FP=7fffffffa1a0
g_source_destroy_internal,                  FP=7fffffffa1d0
g_main_context_dispatch,                    FP=7fffffffa240
g_main_context_iterate.constprop.0,         FP=7fffffffa2a0
g_main_loop_run,                            FP=7fffffffa2c0
gtk_dialog_run,                             FP=7fffffffa330
ffi_call_unix64,                            FP=7fffffffa350
ffi_call_int,                               FP=7fffffffa400
pygi_invoke_c_callable,                     FP=7fffffffa4b0
pygi_function_cache_invoke,                 FP=7fffffffa520
_PyObject_Call,                             FP=7fffffffa560
do_call_core,                               FP=7fffffffa690

I am going to try building a few more ways to see if we can work around this.

@lee218llnl
Collaborator

Based on a diff of your outputs and a google search of the stack trace, I found https://gitlab.gnome.org/GNOME/gobject-introspection/-/merge_requests/283 and figured out that this issue is with the move from libffi@3.3 to libffi@3.4.2. If you add ^libffi@3.3, then you will not get the seg fault. I will work on a PR for spack and py-xdot.
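
Before rebuilding, the constraint can be checked against what spack would actually concretize. A minimal sketch, assuming `spack` is set up in the shell (`show_pinned_libffi` is a hypothetical helper name):

```shell
# Hypothetical helper: show which libffi the concretizer selects for STAT
# when the build is constrained to libffi 3.3.
show_pinned_libffi() {
    # `spack spec` prints the concretized dependency tree without installing.
    spack spec stat "^libffi@3.3" | grep libffi
}
```

If the output shows libffi@3.3, the same spec can be passed to `spack install`.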

@antonl321
Author

Tested on my system. Works fine.
Thanks for your prompt help!
