
use-dev-tty causes panic #755

Closed
markus-bauer opened this issue Feb 9, 2023 · 7 comments · Fixed by #762

Comments

@markus-bauer

markus-bauer commented Feb 9, 2023

Description

The new use-dev-tty feature sometimes causes this panic:

panicked at 'polling tty: Poll(Os { code: 4, kind: Interrupted, message: "Interrupted system call" })',
XXX/.cargo/registry/src/.../crossterm-0.26.0/src/event/source/unix/tty.rs:135:28  ...

res => res.expect("polling tty"),

How to reproduce

I have not found a minimal example, and my full code is not public. But I can reliably trigger it with the code below:

# Cargo.toml
[dependencies]
crossterm = { version = "0.26.0", features = ["event-stream", "use-dev-tty"] }
futures = { version = "0.3" }
tokio = { version = "1", features = ["process", "macros", "rt-multi-thread"] }

// main.rs
use crossterm::event::EventStream;
use futures::{future::FutureExt, stream::StreamExt};
use std::process::Stdio;

#[tokio::main]
async fn main() {
    let mut reader = EventStream::new();

    loop {
        let event = reader.next().fuse();
        tokio::select! {
            maybe_event = event => {
                match maybe_event {
                    Some(Ok(event)) => println!("Event::{:?}\r", event),
                    Some(Err(e)) => println!("Error: {:?}\r", e),
                    None => break,
                }
            }
        };

        // Spawn enough tasks, and the event stream will panic.
        for _ in 0..100 {
            tokio::spawn(async {
                // I don't think the actual command matters.
                tokio::process::Command::new("pwd")
                    // null i/o is not needed, but I want to rule this out.
                    .stdin(Stdio::null())
                    .stdout(Stdio::null())
                    .stderr(Stdio::null())
                    .status()
                    .await
            });
        }
    }
}

Note that this won't panic without the use-dev-tty feature, and the panic seems to happen only with this combination (spawned tasks + child command). I don't know enough about tokio to be of more help here.

There's another problem: when the EventStream panics, it does so silently (not counting the panic message). From that point on it stops delivering events, and the main loop is blocked forever. Shouldn't the event stream return a Result<Event> so that callers can handle the case where the EventSource has died? In any case, please avoid panics.
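
As a caller-side stopgap (just a sketch of an idea, nothing crossterm provides; the 5-second window and the use of tokio's "time" feature are my own assumptions), the loop could race next() against a timeout so it at least notices when events stop arriving:

use std::time::Duration;

// Sketch only: race the stream against a timeout so a dead event
// source can't block the loop forever. Requires tokio's "time"
// feature; the 5-second window is arbitrary. An idle terminal and a
// dead source look identical here, so this detects a stall without
// diagnosing it.
match tokio::time::timeout(Duration::from_secs(5), reader.next()).await {
    Ok(Some(Ok(event))) => println!("Event::{:?}\r", event),
    Ok(Some(Err(e))) => println!("Error: {:?}\r", e),
    Ok(None) => break, // stream ended
    Err(_elapsed) => { /* no event within the window */ }
}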

Environment

  • Ubuntu 22.10
  • Rust 1.69.0-nightly
  • Terminal doesn't matter

I've also tested this with the current crossterm on github, with the same results.


Update:

I've changed the code to be more like the official example, which fuses the future first. This still doesn't stop the loop from being blocked after the panic.

@markus-bauer
Author

markus-bauer commented Feb 9, 2023

I did a bit of searching. Unfortunately I don't know enough about these topics. Please just ignore this if it turns out to be wrong.

It might be that EINTR isn't handled correctly (or at all). Presumably the spawned child processes deliver SIGCHLD to the parent when they exit, which interrupts the blocking poll call and makes it return EINTR.

Mio is returning a specific error, which is handled and ignored:

if let Err(e) = self.poll.poll(&mut self.events, timeout.leftover()) {
    // Mio will throw an interrupted error in case of cursor position retrieval.
    // We need to retry until it succeeds.
    // Previous versions of Mio (< 0.7) would automatically retry the poll call
    // if it was interrupted (if EINTR was returned).
    // https://docs.rs/mio/0.7.0/mio/struct.Poll.html#notes
    if e.kind() == io::ErrorKind::Interrupted {
        continue;
    } else {
        return Err(e);
    }
};

Here's the exact point where the panic occurs in the tty version. Ignoring that specific error avoids the panic (obviously). This is not a solution, just a way to locate the source of the panic:

--- a/src/event/source/unix/tty.rs
+++ b/src/event/source/unix/tty.rs
@@ -132,7 +132,24 @@ impl EventSource for UnixInternalEventSource {
             }
             match poll(&mut fds, timeout.leftover()) {
                 Err(filedescriptor::Error::Io(e)) => return Err(e),
-                res => res.expect("polling tty"),
+                res => match res {
+                    Ok(_res) => {}
+                    Err(fd_err) => {
+                        match fd_err {
+                            filedescriptor::Error::Poll(err)
+                                if err.kind() == io::ErrorKind::Interrupted =>
+                            {
+                                continue
+                            }
+                            other_err => {
+                                // TODO:
+                                // This function already returns Result<T>.
+                                // So don't panic, but instead return an error!
+                                panic!("{:?}", other_err)
+                            }
+                        }
+                    }
+                },
             };
             if fds[0].revents & POLLIN != 0 {
                 loop {
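
For comparison, here is roughly what a non-panicking version of that match could look like (my sketch of the TODO above, not the actual fix; #762 is authoritative):

// Sketch of the TODO above, not the fix that landed in #762:
// fold both filedescriptor error variants into the Result the
// function already returns, and retry on EINTR.
match poll(&mut fds, timeout.leftover()) {
    Ok(_) => {}
    // EINTR: the syscall was interrupted (e.g. by SIGCHLD from the
    // spawned children). The surrounding loop recomputes the leftover
    // timeout, so simply retrying is safe.
    Err(filedescriptor::Error::Poll(e)) if e.kind() == io::ErrorKind::Interrupted => continue,
    Err(filedescriptor::Error::Io(e)) | Err(filedescriptor::Error::Poll(e)) => return Err(e),
    // Any other variant: surface it as a generic io::Error instead of panicking.
    Err(other) => return Err(io::Error::new(io::ErrorKind::Other, other.to_string())),
}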

@sigmaSd
Contributor

sigmaSd commented Feb 9, 2023

cc @yyogo

@yyogo
Contributor

yyogo commented Feb 21, 2023

Thank you for the report! This is indeed due to not handling EINTR correctly. I believe the correct solution is to just ignore it, since we already keep polling until we get an event or the timeout expires.
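
(For reference, the classic retry-on-EINTR idiom as a standalone sketch, independent of the crossterm code; retry_on_eintr is a hypothetical helper:)

use std::io;

// Generic retry-on-EINTR idiom: re-issue an interruptible call until
// it either succeeds or fails with a real error. crossterm's read loop
// does the equivalent inline, recomputing the leftover timeout on each
// pass so the overall deadline still holds.
fn retry_on_eintr<T>(mut f: impl FnMut() -> io::Result<T>) -> io::Result<T> {
    loop {
        match f() {
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            res => return res,
        }
    }
}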

@yyogo
Contributor

yyogo commented Feb 21, 2023

By the way, the reason this panics is that filedescriptor::poll can return two error types, Poll and Io, and I only handled the latter.

@yyogo
Contributor

yyogo commented Feb 21, 2023

@markus-bauer I'm unable to reproduce the panic using your example. Can you verify that it no longer reproduces with #762?

@markus-bauer
Author

I don't have time right now, but I will report back later.

Have you tried pressing a key repeatedly, or increasing the number of tasks in the loop? It only happens after a lot of tasks have been spawned. Anyway, in hindsight, I think you should be able to reproduce it if you can make anything send an interrupt.

@markus-bauer
Author

@yyogo Okay, I did a quick test with my example code and with my real code. With your patch it no longer panics; it still panics with the old version.
I wouldn't have expected different results, since your change removes the source of the explicit panic.

It's interesting that you can't reproduce this, because for me it happens reliably, every time.

But it appears to be fixed now. Thanks for your work on this.
