crash open #592

Closed
wcasanova opened this issue Apr 7, 2021 · 10 comments · Fixed by #593

@wcasanova

The latest commit crashes when htop is opened.
Arch Linux
htop-dev-git-3.0.5.r204.gf3a37f9
Removing the htoprc works around it, but the same error comes back once htop is configured again. It seems to happen when the disk read/write columns are set:
IO_READ_RATE, IO_WRITE_RATE

----------------------
The following function calls were active when the issue was detected:
---
htop(CRT_handleSIGSEGV+0xd5)[0x55b6f19a4695]
/usr/lib/libc.so.6(+0x3cf80)[0x7f9cfce8ef80]
htop(+0x2b09a)[0x55b6f19b909a]
htop(+0x2ad9c)[0x55b6f19b8d9c]
htop(ProcessList_scan+0xab)[0x55b6f19aefbb]
htop(CommandLine_run+0x60d)[0x55b6f19a318d]
/usr/lib/libc.so.6(__libc_start_main+0xd5)[0x7f9cfce79b25]
htop(_start+0x2e)[0x55b6f199ed1e]
---

@BenBE
Member

BenBE commented Apr 7, 2021

Can you get a full backtrace (bt full) from a debug build of htop from gdb alongside the file that htop is reading at that moment?

@wcasanova
Author

htop.objdump.zip

@fasterit
Member

fasterit commented Apr 7, 2021

Program received signal SIGFPE, Arithmetic exception.
 0x000055555557fdf2 in LinuxProcessList_readIoFile (now=0, procFd=6, process=0x555555670590) at linux/LinuxProcessList.c:429
429	            process->io_rate_read_bps = ONE_K * (process->io_read_bytes - last_read) / (now - process->io_last_scan_time);

(gdb) bt
#0  0x000055555557fdf2 in LinuxProcessList_readIoFile (now=0, procFd=6, process=0x555555670590) at linux/LinuxProcessList.c:429
#1  LinuxProcessList_recurseProcTree (this=this@entry=0x5555555a0890, parentFd=parentFd@entry=-100, 
    dirname=dirname@entry=0x55555558d424 "/proc", parent=parent@entry=0x0, period=period@entry=20081708.5, now=0)
    at linux/LinuxProcessList.c:1344
#2  0x000055555558226a in ProcessList_goThroughEntries (super=super@entry=0x5555555a0890, 
    pauseProcessUpdate=pauseProcessUpdate@entry=false) at linux/LinuxProcessList.c:2000
#3  0x00005555555753a7 in ProcessList_scan (this=this@entry=0x5555555a0890, pauseProcessUpdate=pauseProcessUpdate@entry=false)
    at ProcessList.c:615
#4  0x000055555556856d in CommandLine_run (name=<optimized out>, argc=<optimized out>, argv=<optimized out>) at CommandLine.c:335
#5  0x00007ffff7bf50b3 in __libc_start_main (main=0x5555555646b0 <main>, argc=1, argv=0x7fffffffe5f8, init=<optimized out>, 
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe5e8) at ../csu/libc-start.c:308
#6  0x00005555555646fe in _start ()

(gdb) bt full 3
#0  0x000055555557fdf2 in LinuxProcessList_readIoFile (now=0, procFd=6, process=0x555555670590) at linux/LinuxProcessList.c:429
        buffer = "rchar: 19374125417\000wchar: 36455852379\000syscr: 6309305\000syscw: 8378424\000read_bytes: 9826943488\000write_bytes: 11302940672\ncancelled_write_bytes: 457519104\n", '\000' <repeats 43 times>, "\002\000\000\000\000\000\000\000"...
        last_read = 0
        last_write = 0
        buf = 0x7fffffffd88b "write_bytes: 11302940672\ncancelled_write_bytes: 457519104\n"
        line = <optimized out>
        r = <optimized out>
        buffer = <optimized out>
        r = <optimized out>
        last_read = <optimized out>
        last_write = <optimized out>
        buf = <optimized out>
        line = <optimized out>
#1  LinuxProcessList_recurseProcTree (this=this@entry=0x5555555a0890, parentFd=parentFd@entry=-100, 
    dirname=dirname@entry=0x55555558d424 "/proc", parent=parent@entry=0x0, period=period@entry=20081708.5, now=0)
    at linux/LinuxProcessList.c:1344
        pid = <optimized out>
        proc = <optimized out>
        procFd = 6
        command = "\272\005\016\001", '\000' <repeats 28 times>, "\060 0 0 0 \000\004(\371\034(\250\071\300\342YUUU\000\000\240\264\333\367\377\177\000\000\240\244\333\367\377\177\000\000P0\306\367\377\177\000\000\300\342YUUU\000\000\300\342YUUU\000\000\240\250\333\367\377\177\000\000\037\035\306\367\377\177\000\000\060 0 0 0 0 0 0 0 0"
        lasttimes = <optimized out>
        tty_nr = <optimized out>
        lp = <optimized out>
        percent_cpu = <optimized out>
        name = <optimized out>
        preExisting = false
        pl = <optimized out>
        entry = <optimized out>
        settings = 0x5555555bcaf0
        dirFd = 5
        dir = <optimized out>
        cpus = 4
        hideKernelThreads = true
        hideUserlandThreads = false
        __PRETTY_FUNCTION__ = "LinuxProcessList_recurseProcTree"
#2  0x000055555558226a in ProcessList_goThroughEntries (super=super@entry=0x5555555a0890, 
    pauseProcessUpdate=pauseProcessUpdate@entry=false) at linux/LinuxProcessList.c:2000
        this = 0x5555555a0890
        settings = 0x5555555bcaf0
        period = 20081708.5
        rootFd = -100
(More stack frames follow...)

@fasterit
Member

fasterit commented Apr 7, 2021

(now - process->io_last_scan_time) is zero due to f3a37f9 (356488a).
The issue is in two places in linux/LinuxProcessList.c.
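
To illustrate the failure mode, here is a minimal sketch of a rate calculation that refuses to divide when no time has elapsed. It is not the change made in PR #593. The helper name `io_rate_per_ms`, its parameters, the millisecond unit, and the choice of NaN as the "unknown" value are all assumptions made for this example.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper, not part of htop: bytes transferred per millisecond
 * between two samples. Returns NAN when the rate cannot be computed. */
double io_rate_per_ms(uint64_t cur_bytes, uint64_t last_bytes,
                      uint64_t now_ms, uint64_t last_scan_ms) {
   /* No elapsed time (for example now == 0 and last scan == 0 on the very
    * first scan): dividing here is what raised SIGFPE in the trace. */
   if (now_ms <= last_scan_ms)
      return NAN;

   /* Counter went backwards: an unsigned subtraction would wrap around. */
   if (cur_bytes < last_bytes)
      return NAN;

   return (double)(cur_bytes - last_bytes) / (double)(now_ms - last_scan_ms);
}
```

With the values from the gdb session (`now` and the last scan time both 0), this helper returns NaN instead of faulting, and the caller can display the rate as unavailable.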

@natoscott
Member

@fasterit @wcasanova thanks - looking into it.

@natoscott
Member

@wcasanova could you try out PR #593 please? I've not been able to reproduce the crash locally but I believe this is the root cause here.

@fasterit
Member

fasterit commented Apr 8, 2021

#593 at 68585fa does not fix the Floating point exception for me.

Breaking at linux/LinuxProcessList.c:429, I get:

!429              process->io_rate_read_bps = ONE_K * (process->io_read_bytes - last_read) / (now - process->io_last_scan_time);
>>> print now
$1 = 0
>>> print process->io_last_scan_time
$2 = 0
# for comparison:
>>> print process->io_read_bytes
$3 = 257432

And 0 - 0 is zero and that faults as a divisor.
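
A note on why this is reported as a "Floating point exception" even though no floating-point math is involved. The gdb values above look like integers (an assumption; the thread does not show the type declarations), and integer division by zero is undefined behavior in C, which on x86-64 Linux typically raises SIGFPE. IEEE 754 floating-point division behaves differently and does not trap by default, as this small sketch shows. The helper name is made up for the example.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical demo helper, not htop code: floating-point 0.0 / 0.0
 * yields NaN under the default floating-point environment, no signal. */
bool zero_over_zero_is_nan(void) {
   /* volatile keeps the compiler from folding the division at compile time */
   volatile double num = 0.0;
   volatile double den = 0.0;
   return isnan(num / den);
}
```

Doing the same division with integer operands would not return a value at all, so an integer divisor has to be checked before it is used.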

@wcasanova
Author

@natoscott it still gives the error:

============================
Please check at https://htop.dev/issues whether this issue has already been reported.
If no similar issue has been reported before, please create a new issue with the following information:

- Your htop version (htop --version)
- Your OS and kernel version (uname -a)
        command = "gdbus\000l-client-\000 \000\000\000\060\000\000\000\000\342)Ӟ\211\243\265H\316\377\377\377\177\000\000\003|\311\367\377\177", '\000' <repeats 27 times>, "\342)Ӟ\211\243\265?\000\000\000\000\000\000\000 \321\377\377\377\177\000\000\200\214\316\367\377\177\000\000Wp\311\367\377\177\000\000\360\320\377\377\377\177\000\000\000\342)Ӟ\211\243\265\230"
        lasttimes = <optimized out>
        tty_nr = <optimized out>
        lp = <optimized out>
        percent_cpu = <optimized out>
        name = <optimized out>
        preExisting = false
        pl = <optimized out>
        entry = <optimized out>
        settings = 0x5555555bc0e0
        dirFd = 5
        dir = <optimized out>
        cpus = 4
        hideKernelThreads = true
        hideUserlandThreads = false
        errorReadingProcess = <optimized out>
#2  0x000055555557edac in LinuxProcessList_recurseProcTree (this=0x55555559df90, parentFd=<optimized out>, dirname=<optimized out>, parent=0x0, period=4212617.5, now=0) at linux/LinuxProcessList.c:1318
        pid = <optimized out>
        proc = <optimized out>
        procFd = 4
        command = "sshd\000md-resolve\000nts_highpri\000UU\000\000\377\377\377\377\377\377\377\377", '\000' <repeats 88 times>, "\377"
        lasttimes = <optimized out>
        tty_nr = <optimized out>
        lp = <optimized out>
        percent_cpu = <optimized out>
        name = <optimized out>
        preExisting = false
        pl = <optimized out>
        entry = 0x555555709c60
        settings = 0x5555555bc0e0
        dirFd = 3
        dir = <optimized out>
        cpus = 4
        hideKernelThreads = true
        hideUserlandThreads = false
        errorReadingProcess = <optimized out>
#3  0x00005555555814a8 in ProcessList_goThroughEntries (super=0x55555559df90, pauseProcessUpdate=<optimized out>) at linux/LinuxProcessList.c:2000
        this = <optimized out>
        settings = 0x5555555bc0e0
        period = 4212617.5
--Type <RET> for more, q to quit, c to continue without paging--
        rootFd = -100
#4  0x0000555555574fcb in ProcessList_scan (this=0x55555559df90, pauseProcessUpdate=<optimized out>) at ProcessList.c:615
        firstScanDone = true
#5  0x000055555556918d in CommandLine_run (name=<optimized out>, argc=<optimized out>, argv=<optimized out>) at CommandLine.c:335
        lc_ctype = <optimized out>
        flags = {pidMatchList = 0x0, commFilter = <optimized out>, userId = <optimized out>, sortKey = 0, delay = -1, useColors = true, enableMouse = true, treeView = <optimized out>, allowUnicode = <optimized out>,
          highlightChanges = <optimized out>, highlightDelaySecs = <optimized out>}
        ut = 0x5555555aece0
        pl = 0x55555559df90
        settings = 0x5555555bc0e0
        header = 0x5555555bf1e0
        panel = 0x7fffffffd950
        state = {settings = 0x5555555bc0e0, ut = 0x5555555aece0, pl = 0x55555559df90, mainPanel = 0x555555704e90, header = 0x5555555bf1e0, pauseProcessUpdate = false, hideProcessSelection = false}
        scr = 0x5555555c2b70
#6  0x00007ffff7b85b25 in __libc_start_main () from /usr/lib/libc.so.6
No symbol table info available.
#7  0x0000555555564d1e in _start ()
No symbol table info available.

natoscott added a commit to natoscott/htop that referenced this issue Apr 9, 2021
@natoscott
Member

natoscott commented Apr 9, 2021

@fasterit @wcasanova oh, my mistake - it's happening in the very first call to ProcessList_scan, not the second as I thought:

#5  0x000055555556918d in CommandLine_run (name=<optimized out>, argc=<optimized out>, argv=<optimized out>) at CommandLine.c:335

So my earlier change did nothing for this case. I've pushed a followup commit to PR #593 - can you try that out once more? Thanks!

@fasterit
Member

fasterit commented Apr 9, 2021

The current version of #593 looks good. I cannot reproduce the FP exception anymore.
