segfault in certain programs when used with shim library #69
Comments
Modified the original scripts: ksh was noise, and the variables all just use $HOME now (convert.sh.txt). It is now reproducible on 14.04 and on my 18.04 laptop. Real bug. |
Well... it has nothing to do with AMQP or log files. I removed the SR_POST_CONFIG setting so that the library does not initialize its configuration or attempt any sort of connection to the broker. I also replaced the log file handling with plain fprintf( stderr. (by defining macro... fb5550169bdb56207197b168fdb637d4a5d7a856 |
This should actually be in the sarrac project. Oops. |
If I add gdb to the beginning of the line that fails (compiled with SR_POST_CONFIG, and with SR_SHIMDEBUG set), the listing ends with: SR_SHIMDEBUG fclose NO POST read-only. |
Something the library calls is affecting the exit status when it should not. This is for MetPX/sarracenia#69. It helps in the sense that, with the previous patch and this one, gdb no longer segfaults, but the tclsh case still does.
Whatever the problem is, it is in libsrshim.c itself... in the working file I have, it doesn't even call any of the rest of the project. It has something to do with stderr, errno, exit status and/or logging. |
I replaced all occurrences of stderr with an srlog file descriptor, and it then runs fine. |
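For anyone following along, the idea of logging to a dedicated stream instead of stderr can be sketched roughly as below. This is only an illustration, not the actual libsrshim.c change: the names shim_log_open, shim_log_msg and shim_log_close are hypothetical, and saving/restoring errno is an extra precaution assumed here because errno was one of the suspects above.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sketch: none of these names come from libsrshim.c. */
static FILE *shim_log = NULL;

/* Open (or reopen) the dedicated log stream. Returns 0 on success, -1 on failure. */
int shim_log_open(const char *path)
{
    if (shim_log != NULL)
        fclose(shim_log);
    shim_log = fopen(path, "a");
    if (shim_log == NULL)
        return -1;
    setvbuf(shim_log, NULL, _IOLBF, 0); /* line-buffered, so messages appear promptly */
    return 0;
}

/* printf-style logging that never touches the wrapped program's stderr.
 * Returns the number of characters written, or 0 if no log stream is open.
 * errno is saved and restored so the caller sees the value it had before. */
int shim_log_msg(const char *fmt, ...)
{
    int saved_errno = errno;
    int n = 0;

    if (shim_log != NULL) {
        va_list ap;
        va_start(ap, fmt);
        n = vfprintf(shim_log, fmt, ap);
        va_end(ap);
    }
    errno = saved_errno;
    return n;
}

/* Close the log stream if it is open. */
void shim_log_close(void)
{
    if (shim_log != NULL) {
        fclose(shim_log);
        shim_log = NULL;
    }
}
```

With something like this, a debug call such as shim_log_msg("fclose %s\n", "NO POST") ends up in the log file, and the wrapped program's stderr stays exactly as it would be without the shim.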
OK, so ImageMagick is not needed either... tclsh by itself is enough. Running export SR_SHIMDEBUG=1 followed by tclsh $HOME/test/cmoi_convert/hello.tcl gives the same error: ... |
http://wiki.tcl.tk/8489 |
It looks like tclsh reports failure (a non-zero return code) if a command run through exec writes anything to stderr, regardless of that command's own exit status. |
Found this: If we add the -ignorestderr option to exec, all is fine. exec -ignorestderr convert |
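To make the distinction concrete: a process can write to stderr and still exit with status 0, which is the situation the shim's debug output created. A rough C sketch, assuming a POSIX system (the function name run_and_count_stderr is made up for this illustration):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stderr captured through a pipe.
 * Stores the number of bytes the child wrote to stderr in *stderr_bytes and
 * returns the child's exit status, or -1 if it could not be run or waited for. */
int run_and_count_stderr(char *const argv[], long *stderr_bytes)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: point stderr at the write end of the pipe, then exec. */
        close(fds[0]);
        dup2(fds[1], STDERR_FILENO);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: count everything the child writes to stderr until EOF. */
    close(fds[1]);
    char buf[256];
    ssize_t n;
    long total = 0;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *stderr_bytes = total;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called with an argv of sh -c "echo warn >&2; exit 0", this returns exit status 0 while still counting bytes on stderr. A caller that only looks at the exit status sees success; a caller like Tcl's exec without -ignorestderr, which also looks at stderr, sees a failure.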
Though it turns out tclsh was a poor example, it did serve to illustrate a real issue, as gdb was also segfaulting. The previous fixes have addressed that and need to get into a future release. |
2.18.07b4 was released a few weeks ago. The client reports other things crashing now... I believe this is related to those binaries using uninitialized values, which now land on stack memory that the shim library has already used. Will close for now, but if that proves false, this will need to be re-opened. |
Running in an Ubuntu xenial container... on certain servers (not reproducible on an 18.04 test machine).
Given the following script:
(renamed to get around filtering.)
Anyway, on the systems where it happens, the third invocation via tcl fails like so:
2018-07-21 12:49:00,298 [DEBUG] sr_post file2message start with: /fs/home/fs1/ssc/di/pas037/test/cmoi_convert/2018071712_ObsMap_0008_tcl.png sb=0x7ffeada6e820 islnk=0, isdir=0, isreg=1
2018-07-21 12:49:00,302 [INFO] published: 20180721124900.298754778 sftp://peter@localhost/ /fs/home/fs1/ssc/di/pas037/test/cmoi_convert/2018071712_ObsMap_0008_tcl.png topic=v02.post.fs.home.fs1.ssc.di.pas037.test.cmoi_convert sum=s,842a30434757bd5f9bcedc7cccc987fb03fe568b6817fc567b61e98dfc998cd3c83e3f54d64e18a98795628a59f06290231512ab553021fdc1788913244f3030 source=pas037 to_clusters=hpfx1.science.gc.ca from_cluster=hpfx1.science.gc.ca mtime=20180721124900.263802 atime=20180721122726.586971969 mode=0644 parts=1,49229,1,0,0
while executing
"exec convert $File.ppm ${File}_tcl.png"
(file "/home/pas037/test/cmoi_convert/convert.tcl" line 3)
output3:1