Sometimes mc hangs on directory change #4071

Open
mc-butler opened this issue Mar 19, 2020 · 8 comments
Labels
area: core Issues not related to a specific subsystem prio: medium Has the potential to affect progress ver: 4.8.33 Reproducible in version 4.8.33

Comments

@mc-butler

Important

This issue was migrated from Trac:

Origin https://midnight-commander.org/ticket/4071
Reporter olfway (olfway@….com)

I use mc 4.8.24 on macOS 10.15.3.

❯ env LC_MESSAGES=C /opt/mc/bin/mc -V
GNU Midnight Commander unknown
Built with GLib 2.64.1
Using the S-Lang library with terminfo database
With builtin Editor
With subshell support as default
With support for background operations
With mouse support on xterm
With internationalization support
With multiple codepages support
Virtual File Systems: cpiofs, tarfs, sfs, extfs, ftpfs, fish
Data types: char: 8; int: 32; long: 64; void *: 64; size_t: 64; off_t: 64;
❯ /opt/mc//bin/mc --configure-options
 '--prefix' '/opt/mc' '--without-x' '--with-screen=slang' '--disable-doxygen-html' '--disable-doxygen-dot' '--disable-doxygen-doc' 'CFLAGS=-O0 -g -ggdb' 'LDFLAGS=-L/usr/local/opt/gettext/lib -L/usr/local/opt/gettext/lib' 'CPPFLAGS=-I/usr/local/opt/gettext/include -I/usr/local/opt/gettext/include'

fish shell, version 3.1.0

Sometimes, when I press Enter to change directory, mc just hangs.
After that, a zombie kill process is also left behind.

Backtrace from mc:


(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff73a3a3b2 libsystem_kernel.dylib`__sigsuspend + 10
    frame #1: 0x00000001002f9394 mc`synchronize at common.c:497:9
    frame #2: 0x00000001002f81ba mc`feed_subshell(how=0, fail_on_error=0) at common.c:609:13
    frame #3: 0x00000001002f87e8 mc`do_subshell_chdir(vpath=0x00007fa164c0f8b0, update_prompt=0) at common.c:1345:5
    frame #4: 0x0000000100276299 mc`subshell_chdir(vpath=0x00007fa164c0f8b0) at panel.c:3234:9
    frame #5: 0x00000001002745f6 mc`_do_panel_cd(panel=0x00007fa164c0ec70, new_dir_vpath=0x00007fa164c0f570, cd_type=cd_exact) at panel.c:3275:5
    frame #6: 0x00000001002744e3 mc`do_panel_cd(panel=0x00007fa164c0ec70, new_dir_vpath=0x00007fa164c0f570, cd_type=cd_exact) at panel.c:4628:9
    frame #7: 0x00000001002758be mc`do_cd(new_dir_vpath=0x00007fa164c0f570, exact=cd_exact) at panel.c:5028:11
    frame #8: 0x0000000100279e76 mc`do_enter_on_file_entry(fe=0x00000001008670b8) at panel.c:2795:14
    frame #9: 0x000000010027865e mc`do_enter(panel=0x00007fa164c0ec70) at panel.c:2855:12
    frame #10: 0x00000001002765c1 mc`panel_execute_cmd(panel=0x00007fa164c0ec70, command=1) at panel.c:3446:9
    frame #11: 0x00000001002763f4 mc`panel_key(panel=0x00007fa164c0ec70, key=10) at panel.c:3608:20
    frame #12: 0x0000000100272655 mc`panel_callback(w=0x00007fa164c0ec70, sender=0x0000000000000000, msg=MSG_KEY, parm=10, data=0x0000000000000000) at panel.c:3688:16
    frame #13: 0x000000010023075a mc`send_message(w=0x00007fa164c0ec70, sender=0x0000000000000000, msg=MSG_KEY, parm=10, data=0x0000000000000000) at widget-common.h:216:15
    frame #14: 0x0000000100231a16 mc`dlg_key_event(h=0x00007fa164f05d90, d_key=10) at dialog.c:489:19
    frame #15: 0x0000000100231439 mc`dlg_process_event(h=0x00007fa164f05d90, key=10, event=0x00007ffeef9f5640) at dialog.c:1134:9
    frame #16: 0x0000000100231d48 mc`frontend_dlg_run(h=0x00007fa164f05d90) at dialog.c:545:9
    frame #17: 0x0000000100231b8e mc`dlg_run(h=0x00007fa164f05d90) at dialog.c:1167:5
    frame #18: 0x000000010026d8ed mc`do_nc at midnight.c:1836:16
    frame #19: 0x000000010020bb92 mc`main(argc=1, argv=0x00007ffeef9f5808) at main.c:405:21
    frame #20: 0x00007fff738d97fd libdyld.dylib`start + 1
    frame #21: 0x00007fff738d97fd libdyld.dylib`start + 1
(lldb) frame variable
(lldb) up
frame #1: 0x00000001002f9394 mc`synchronize at common.c:497:9
   494
   495 	    /* Wait until the subshell has stopped */
   496 	    while (subshell_alive && !subshell_stopped)
-> 497 	        sigsuspend (&old_mask);
   498
   499 	    if (subshell_state != ACTIVE)
   500 	    {
(lldb) frame variable
(sigset_t) sigchld_mask = 524288
(sigset_t) old_mask = 0

Backtrace from fish (part of):

frame #2: 0x00000001012380fe fish`exec_external_command(parser=0x00007fb2e2d02030, j=std::__1::shared_ptr<job_t>::element_type @ 0x00007fb2e2f0f3c8 strong=2 weak=1, p=0x00007fb2e2f0f530, proc_io_chain=0x00007ffeeeb43930) at exec.cpp:573:17
   570 	            // We successfully made the attributes and actions; actually call
   571 	            // posix_spawn.
   572 	            int spawn_ret =
-> 573 	                posix_spawn(&pid, actual_cmd, &actions, &attr, const_cast<char *const *>(argv),
   574 	                            const_cast<char *const *>(envv));
   575
   576 	            // This usleep can be used to test for various race conditions

(const char *) actual_cmd = 0x00007ffeeeb435d1 "/bin/kill"
@mc-butler
Author

Changed by olfway (olfway@….com) on Mar 29, 2020 at 14:32 UTC (comment 1)

It happened again: mc hangs.

   495 	    /* Wait until the subshell has stopped */
   496 	    while (subshell_alive && !subshell_stopped) {
-> 497 	        sigsuspend (&old_mask);

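I checked with lldb, and mc is waiting in synchronize at common.c:497.

Current values:
subshell_alive = 1
subshell_stopped = 1

The fish subshell has actually stopped.

I'm not sure how this is possible: with subshell_stopped = 1 the loop condition is false, yet mc is still blocked in sigsuspend.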

@mc-butler
Author

Changed by olfway (olfway@….com) on Mar 29, 2020 at 20:02 UTC (comment 2)

I'm able to reproduce it like this:

1. Go to the ~/Library folder (I guess any folder with lots of subfolders will work).
2. Move the cursor to the last subfolder.
3. Press Up, Enter, Up, Enter, Up, ... as fast as possible.

Usually mc hangs; sometimes I have to go back to the last subfolder and start again.

@mc-butler
Author

Changed by olfway (olfway@….com) on Mar 29, 2020 at 20:11 UTC (comment 3)

I tried rewriting synchronize to use nanosleep instead of the sig* functions, and it works without issues:

     /* Wait until the subshell has stopped */
     while (subshell_alive && !subshell_stopped) {
-        sigsuspend (&old_mask);
+        // sigsuspend (&old_mask);
+        ts.tv_nsec = 1000 * 1000;
+        nanosleep(&ts, NULL);
     }

(I also commented out the other sig* calls in the synchronize function.)

So something seems to be wrong with this while loop: the subshell has already stopped and subshell_stopped = 1, but sigsuspend is still waiting for a signal, which blocks mc.

@mc-butler
Author

Changed by ossi (@ossilator) on Mar 30, 2020 at 10:32 UTC (comment 4)

uh-oh, this whole file is full of race conditions. all access to the two volatile variables while the subshell is running needs to happen with SIGCHLD being blocked - the only function that gets it right is synchronize().

this is a deja-vu from the SIGWINCH episode - we're again left with the choice of using pselect() or the self-pipe trick, preferably again opting for the latter in expectation of using the glib event loop at some point (there is also forkfd() since linux 5.4).

a propos nothing, i found this while reading the code:

diff --git a/src/subshell/common.c b/src/subshell/common.c
index 06699233c..ee6309900 100644
--- a/src/subshell/common.c
+++ b/src/subshell/common.c
@@ -354,5 +354,5 @@ init_subshell_child (const char *pty_name)
     /* Attach all our standard file descriptors to the pty */

-    /* This is done just before the fork, because stderr must still      */
+    /* This is done just before the exec, because stderr must still      */
     /* be connected to the real tty during the above error messages; */
     /* otherwise the user will never see them.                   */

@mc-butler
Author

Changed by andrew_b (@aborodin) on Jan 29, 2022 at 11:10 UTC (comment 5)

Ticket #4331 has been marked as a duplicate of this ticket.

@mc-butler
Author

Changed by olfway (olfway@….com) on Mar 7, 2022 at 22:21 UTC (comment 6)

Just a note: to "unfreeze" mc, you can send SIGCONT to the subshell and then SIGSTOP.

@mc-butler
Author

Changed by olfway (olfway@….com) on Dec 19, 2023 at 19:10 UTC (comment 7)

The corresponding issue on the fish shell side:
fish-shell/fish-shell#6767

@mc-butler mc-butler marked this as a duplicate of #4331 Feb 28, 2025
@zyv zyv added ver: 4.8.33 Reproducible in version 4.8.33 and removed ver: 4.8.24 Reproducible in version 4.8.24 labels Mar 8, 2025
@zyv
Member

zyv commented Mar 8, 2025

I'm updating this issue as it seems to be still reproducible with the latest mc (well, no wonder, not much has changed in our subshell interaction since then) and has become more prevalent with fish 4 as reported by @asl.

I think the suggestion in fish-shell/fish-shell#6767 to use OSC-7 is a good one, but I'm afraid it won't solve the race problem, just probably lead to working directory desyncs instead of hangs, unless the races are found and fixed... or the subshell interaction is switched to an event loop, which might be a smaller and more self-contained project than going full on the event loop everywhere?

@krobelus, it seems like you did something on the fish side of things almost 5 years ago, so if it's possible to get you interested here, I'd be happy to work with you.

I'm also curious what @egmontkob would say about using OSC-7 instead of our own monkey business with PS1 hooks/emulation...

/cc @olfway after migrating to GitHub :)

Development

No branches or pull requests

2 participants