Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

monitor: Ctrl-]/Ctrl-t + Ctrl-f stopped working (IDFGH-9686) #11027

Closed
3 tasks done
listout opened this issue Mar 20, 2023 · 13 comments
Closed
3 tasks done

monitor: Ctrl-]/Ctrl-t + Ctrl-f stopped working (IDFGH-9686) #11027

listout opened this issue Mar 20, 2023 · 13 comments
Assignees
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF

Comments

@listout
Copy link

listout commented Mar 20, 2023

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

v.4.4.3

Operating System used.

Linux

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

None

Development Kit.

Wroverkit V1

Power Supply used.

USB

What is the expected behavior?

According to the help menu in the idf.py monitor when I press Ctrl-t followed by Ctrl-f, the device should be reflashed. Similarly when pressing Ctrl-] in monitor mode, the mode should exit returning me back to the shell.

What is the actual behavior?

I (309) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.
Hello world!
This is esp32 chip with 2 CPU core(s), WiFi/BT/BLE, silicon revision 1, 2MB external flash
Minimum free heap size: 295104 bytes
Restarting in 10 seconds...
Restarting in 9 seconds...
Restarting in 8 seconds...
Restarting in 7 seconds...
Restarting in 6 seconds...
Restarting in 5 seconds...
Restarting in 4 seconds...
Restarting in 3 seconds...

Traceback (most recent call last):
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 361, in <module>
    main()
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 352, in main
    monitor.main_loop()
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 155, in main_loop
    self._main_loop()
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 259, in _main_loop
    super()._main_loop()
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 195, in _main_loop
    self.serial_handler.handle_commands(data, self.target, self.run_make, self.console_reader,
  File "/home/listout/esp/esp-idf/tools/idf_monitor_base/serial_handler.py", line 173, in handle_commands
    run_make_func('encrypted-flash' if self.encrypted else 'flash')
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 141, in run_make
    with self:
  File "/home/listout/esp/esp-idf/tools/idf_monitor.py", line 135, in __enter__
    self.console_reader.stop()
  File "/home/listout/esp/esp-idf/tools/idf_monitor_base/stoppable_thread.py", line 67, in stop
    self._cancel()
  File "/home/listout/esp/esp-idf/tools/idf_monitor_base/console_reader.py", line 97, in _cancel
    fcntl.ioctl(self.console.fd, termios.TIOCSTI, b'\0')
OSError: [Errno 5] Input/output error
idf_monitor failed with exit code 1

Steps to reproduce.

  1. Build any example (such as hello_world)
  2. Flash esp32 device using idf.py -p /dev/ttyUSB0 flash monitor
  3. Try exiting the monitor using Ctrl-]

Debug Logs.

No response

More Information.

Could be related to #1209

@listout listout added the Type: Bug bugs in IDF label Mar 20, 2023
@github-actions github-actions bot changed the title monitor: Ctrl-]/Ctrl-t + Ctrl-f stopped working monitor: Ctrl-]/Ctrl-t + Ctrl-f stopped working (IDFGH-9686) Mar 20, 2023
@espressif-bot espressif-bot added the Status: Opened Issue is new label Mar 20, 2023
@tim-nordell-nimbelink
Copy link
Contributor

tim-nordell-nimbelink commented Mar 20, 2023

Might be related to: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=83efeeeb3d04b22aaed1df99bc70a48fe9d22c4d.

On Arch from zgrep TIOCSTI /proc/config.gz:

# CONFIG_LEGACY_TIOCSTI is not set

and from cat /proc/version:

Linux version 6.2.2-arch2-1 (linux@archlinux) (gcc (GCC) 12.2.1 20230201, GNU ld (GNU Binutils) 2.40) #1 SMP PREEMPT_DYNAMIC Wed, 08 Mar 2023 04:07:29 +0000

@listout
Copy link
Author

listout commented Mar 20, 2023

aha. I'm also on 6.2.2. So what's to do? rebuild the kernel?

@tim-nordell-nimbelink
Copy link
Contributor

I was just running into this, too, and checking to see if anyone else had hit this. I think console_reader.py should be modified to no longer utilize this ioctl. You could certainly use an older kernel.

@tim-nordell-nimbelink
Copy link
Contributor

This does the trick for now, but it really should be moved to a different approach as this busy waits; this matches the approach in Windows in this file. (I could see opening a pipe between the two threads and sending data such that the select gets it, or by using a signal to cause the read() system call to exit.)

diff --git a/tools/idf_monitor_base/console_reader.py b/tools/idf_monitor_base/console_reader.py
index b0ff5bd103..e8fe2d90fe 100644
--- a/tools/idf_monitor_base/console_reader.py
+++ b/tools/idf_monitor_base/console_reader.py
@@ -7,6 +7,7 @@ import queue
 import time
 
 from serial.tools.miniterm import Console
+from select import select
 
 from .console_parser import ConsoleParser
 from .constants import CMD_STOP, TAG_CMD
@@ -50,6 +51,13 @@ class ConsoleReader(StoppableThread):
                         while self.alive:
                             time.sleep(0.1)
                         break
+                    elif os.name == 'posix':
+                        while self.alive:
+                            r, _, _ = select([0], [], [], 0.1)
+                            if 0 in r:
+                                break
+                        if not self.alive:
+                            break
                     c = self.console.getkey()
                 except KeyboardInterrupt:
                     c = '\x03'
@@ -67,6 +75,8 @@ class ConsoleReader(StoppableThread):
             self.console.cleanup()
 
     def _cancel(self):
+        return
+
         # type: () -> None
         if os.name == 'posix' and not self.test_mode:
             # this is the way cancel() is implemented in pyserial 3.3 or newer,

@listout
Copy link
Author

listout commented Mar 22, 2023

@tim-nordell-nimbelink Thanks for the patch, can confirm that it's working. Can you create a PR with this patch? I really don't wanna downgrade my kernel or rebuild my kernel.

@marekmatej
Copy link
Collaborator

marekmatej commented Mar 23, 2023

I have also encountered this issue since ~5.80, currently on Linux 6.2.6-gentoo.
Can confirm that patch from @tim-nordell-nimbelink works for me!

@espressif-bot espressif-bot added Status: Selected for Development Issue is selected for development and removed Status: Opened Issue is new labels Mar 27, 2023
@peterdragun
Copy link
Collaborator

peterdragun commented Mar 27, 2023

Hi, thank you for reporting the issue, we were able to reproduce it and we will work on providing the fix.

If you are having the same issue, provided workaround seems to be working, but I wouldn't recommend using it with an interactive terminal because it does not support escape sequences - some useful features (such as navigation using arrows, autocomplete, etc.) will be missing.

Another workaround would be to re-enable TIOCSTI which can be done by sysctl -w dev.tty.legacy_tiocsti=1, but there is a reason why it was disabled, there are some security concerns.

@tim-nordell-nimbelink
Copy link
Contributor

tim-nordell-nimbelink commented Mar 27, 2023

@peterdragun -

FYI, not all distros allow you to re-enable TIOCSTI without switching away from the main kernel. The kernel configuration allows you to make it into a controllable sysctl, or to completely disable it. In Arch:

# sysctl -w dev.tty.legacy_tiocsti=1
sysctl: setting key "dev.tty.legacy_tiocsti": Invalid argument

Also, while there shouldn't be any problems with escape sequences with the temporary solution I provided above, I did test it just now against the advanced console example. Using strace reveals that, curiously, the code read 1 character of the escape sequence, and the subsequent select() call didn't return immediately despite pending additional data that would have provided the rest of the escape sequence. While I don't think this should be necessary, one can modify stdin to be non-blocking and have the code do the following instead:

--- a/console_reader.py
+++ b/console_reader.py
@@ -29,6 +29,15 @@ class ConsoleReader(StoppableThread):
 
     def run(self):
         # type: () -> None
+
+        # Modify stdin to be non-blocking
+        if os.name == 'posix':
+            from select import select
+            from fcntl import fcntl, F_GETFL, F_SETFL
+            flags = fcntl(0, F_GETFL, 0)
+            flags |= os.O_NONBLOCK
+            fcntl(0, F_SETFL, flags)
+
         self.console.setup()
         try:
             while self.alive:
@@ -43,6 +52,7 @@ class ConsoleReader(StoppableThread):
                             time.sleep(0.1)
                         if not self.alive:
                             break
+                        c = self.console.getkey()
                     elif self.test_mode:
                         # In testing mode the stdin is connected to PTY but is not used for input anything. For PTY
                         # the canceling by fcntl.ioctl isn't working and would hang in self.console.getkey().
@@ -50,7 +60,19 @@ class ConsoleReader(StoppableThread):
                         while self.alive:
                             time.sleep(0.1)
                         break
-                    c = self.console.getkey()
+                    elif os.name == 'posix':
+                        c = self.console.getkey()
+                        if len(c) == 0:
+                            while self.alive:
+                                r, _, _ = select([0], [], [], 0.1)
+                                if 0 in r:
+                                    break
+                            if not self.alive:
+                                break
+                            continue
+                    else:
+                        c = self.console.getkey()
+
                 except KeyboardInterrupt:
                     c = '\x03'
                 if c is not None:
@@ -64,6 +86,12 @@ class ConsoleReader(StoppableThread):
                             self.event_queue.put(ret)
 
         finally:
+            if os.name == 'posix':
+                from fcntl import fcntl, F_GETFL, F_SETFL
+                flags = fcntl(0, F_GETFL, 0)
+                flags &= ~os.O_NONBLOCK
+                fcntl(0, F_SETFL, flags)
+
             self.console.cleanup()
 
     def _cancel(self):

That works against the advanced console loop example code. I wonder why select() was behaving like that; it should be "level driven" and not "edge driven".

Thanks,
Tim

@peterdragun
Copy link
Collaborator

@tim-nordell-nimbelink Thanks for providing the additional fix! This works for me on newer and older kernels and also in interactive and standard modes of the monitor.

You can create a PR to esp-idf-monitor and take credit for your contribution. We can make it work with minor tweaks. If you are not interested in creating PR, let me know, and I will do it.

@tim-nordell-nimbelink
Copy link
Contributor

Ah, I totally realize now what's going on; I was stracing the wrong thread. The thread with the input has internal buffering in the getch() call, so it's reading 8192 bytes at a time, which is why the select() call only triggered once with the original patch. I'll push an updated patch as a PR into the appropriate repo, with some additional commentary as to why we can't just purely rely on the select() invocation.

@espressif-bot espressif-bot added Status: Reviewing Issue is being reviewed Resolution: NA Issue resolution is unavailable Status: Done Issue is done internally and removed Status: Selected for Development Issue is selected for development Status: Reviewing Issue is being reviewed labels Apr 21, 2023
@espressif-bot espressif-bot added Resolution: Done Issue is done internally and removed Resolution: NA Issue resolution is unavailable labels Apr 21, 2023
@dobairoland
Copy link
Collaborator

The fix is linked and is included in esp-idf-monitor==1.0.4 bundled with ESP-IDF v5.1 or newer. For older ESP-IDF versions we will backport the fix separately. The fixes will be linked similarly.

@espressif-bot espressif-bot added Status: In Progress Work is in progress Status: Reviewing Issue is being reviewed and removed Resolution: Done Issue is done internally Status: Done Issue is done internally Status: In Progress Work is in progress labels Apr 21, 2023
@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Done Issue is done internally and removed Status: Reviewing Issue is being reviewed labels May 5, 2023
@karlp
Copy link
Contributor

karlp commented May 24, 2023

so, I can "pip install esp-idf-monitor" and get 1.0.4 "installed" on my v4.4.4 environment, but idf.py monitor still runs the old version, and there doesn't appear to be any binary installed, the wrappers have obviously changed in v5+

@dobairoland
Copy link
Collaborator

The fix for the v4.4 branch is not yet merged.

espressif-bot pushed a commit that referenced this issue Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Done Issue is done internally Status: Done Issue is done internally Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

7 participants