TcpStream::read sometimes blocks forever in 0.1.13 #790

Closed
jonhoo opened this issue Dec 5, 2018 · 2 comments

jonhoo commented Dec 5, 2018

Version

└── tokio v0.1.13
    ├── tokio-codec v0.1.1
    │   └── tokio-io v0.1.10
    ├── tokio-current-thread v0.1.4
    │   └── tokio-executor v0.1.5
    ├── tokio-executor v0.1.5 (*)
    ├── tokio-fs v0.1.4
    │   ├── tokio-io v0.1.10 (*)
    │   └── tokio-threadpool v0.1.9
    │       └── tokio-executor v0.1.5 (*)
    │   └── tokio-io v0.1.10 (*)
    ├── tokio-io v0.1.10 (*)
    ├── tokio-reactor v0.1.7
    │   ├── tokio-executor v0.1.5 (*)
    │   └── tokio-io v0.1.10 (*)
    ├── tokio-tcp v0.1.2
    │   ├── tokio-io v0.1.10 (*)
    │   └── tokio-reactor v0.1.7 (*)
    ├── tokio-threadpool v0.1.9 (*)
    ├── tokio-timer v0.2.8
    │   └── tokio-executor v0.1.5 (*)
    ├── tokio-udp v0.1.3
    │   ├── tokio-codec v0.1.1 (*)
    │   ├── tokio-io v0.1.10 (*)
    │   └── tokio-reactor v0.1.7 (*)
    └── tokio-uds v0.2.4
        ├── tokio-codec v0.1.1 (*)
        ├── tokio-io v0.1.10 (*)
        └── tokio-reactor v0.1.7 (*)

Platform

Linux xpanse 4.19.4-arch1-1-ARCH #1 SMP PREEMPT Fri Nov 23 09:06:58 UTC 2018 x86_64 GNU/Linux

Description

I have a tokio::net::TcpStream that was just yielded by .incoming(), and I need to synchronously read a single byte from it immediately after the connection comes in. I currently do that (stupidly) using the code below. It works fine with tokio 0.1.11 (playground), but on tokio 0.1.13 it hangs forever, because the read_exact on the accepted stream never yields any bytes:

extern crate futures;
extern crate tokio;

use futures::Stream;
use std::io::{prelude::*, ErrorKind};
use std::{net, thread};

fn main() {
    let l = tokio::net::tcp::TcpListener::bind(&"127.0.0.1:0".parse().unwrap()).unwrap();
    let addr = l.local_addr().unwrap();

    let c = thread::spawn(move || {
        let mut c = net::TcpStream::connect(addr).unwrap();
        c.write_all(&[42]).unwrap();
        c.flush().unwrap();
        let mut buf = Vec::new();
        c.read_to_end(&mut buf).unwrap();
        buf
    });

    thread::spawn(move || {
        let mut rt = tokio::runtime::Runtime::new().unwrap();
        rt.block_on_all(l.incoming().for_each(|mut stream| {
            let mut tag = [0];
            let mut i = 0;
            while let Err(e) = stream.read_exact(&mut tag[..]) {
                i += 1;

                if e.kind() == ErrorKind::WouldBlock {
                    if i % 1_000_000 == 0 {
                        // ~= every 1s
                        eprintln!("still haven't gotten tag for new connection ({})", tag[0]);
                    }
                    thread::yield_now();
                    continue;
                }

                // well.. that failed quickly..
                eprintln!("gave up");
                return Err(e);
            }
            eprintln!("all good");
            Ok(())
        }))
    });

    eprintln!("waiting for client...");
    c.join().unwrap();
    eprintln!("client done!");
}

strace shows that the read_exact after the accept never does any syscalls:

1905  socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
1905  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1905  bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
1905  listen(3, 1024)                   = 0
1905  getsockname(3, {sa_family=AF_INET, sin_port=htons(33083), sin_addr=inet_addr("127.0.0.1")}, [128->16]) = 0
1905  write(2, "waiting for client...\n", 22) = 22
1906  socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 4
1906  connect(4, {sa_family=AF_INET, sin_port=htons(33083), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
1906  sendto(4, "*", 1, MSG_NOSIGNAL, NULL, 0) = 1
1906  recvfrom(4,  <unfinished ...>
1908  accept4(3, {sa_family=AF_INET, sin_port=htons(50562), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_CLOEXEC) = 32
1908  write(2, "still haven't gotten tag for new"..., 45) = 45
1908  write(2, "0", 1)                  = 1
1908  write(2, ")\n", 2)                = 2
1905  --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
1906  <... recvfrom resumed> <unfinished ...>) = ?
1910  +++ killed by SIGINT +++
1909  +++ killed by SIGINT +++
1908  +++ killed by SIGINT +++
1907  +++ killed by SIGINT +++
1906  +++ killed by SIGINT +++
1905  +++ killed by SIGINT +++

With tokio 0.1.11, it shows:

1796  socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
1796  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
1796  bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
1796  listen(3, 1024)                   = 0
1796  getsockname(3, {sa_family=AF_INET, sin_port=htons(43177), sin_addr=inet_addr("127.0.0.1")}, [128->16]) = 0
1796  write(2, "waiting for client...\n", 22) = 22
1797  socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC, IPPROTO_IP) = 4
1797  connect(4, {sa_family=AF_INET, sin_port=htons(43177), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
1797  sendto(4, "*", 1, MSG_NOSIGNAL, NULL, 0) = 1
1797  recvfrom(4,  <unfinished ...>
1802  accept4(3, {sa_family=AF_INET, sin_port=htons(44092), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_CLOEXEC) = 8
1802  recvfrom(8, "*", 1, 0, NULL, NULL) = 1
1802  write(2, "all good\n", 9)         = 9
1797  <... recvfrom resumed> "", 32, 0, NULL, NULL) = 0
1802  accept4(3, 0x7fc0189fac70, [128], SOCK_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
1797  +++ exited with 0 +++
1796  write(2, "client done!\n", 13)    = 13
1803  +++ exited with 0 +++
1802  +++ exited with 0 +++
1801  +++ exited with 0 +++
1800  +++ exited with 0 +++
1799  +++ exited with 0 +++
1798  +++ exited with 0 +++
1796  +++ exited with 0 +++

Notice in particular that in tokio 0.1.11, we have

1802  accept4(3, {sa_family=AF_INET, sin_port=htons(44092), sin_addr=inet_addr("127.0.0.1")}, [128->16], SOCK_CLOEXEC) = 8
1802  recvfrom(8, "*", 1, 0, NULL, NULL) = 1

whereas in tokio 0.1.13, that recvfrom never appears.
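
For contrast, here is a minimal std-only sketch of the same spin-on-WouldBlock loop against a plain std::net::TcpStream in non-blocking mode. Nothing sits between the read call and the socket here, so every attempt reaches the kernel. The helper names spin_read_tag and demo_spin are hypothetical and do not come from the reproduction above; the sketch does not use tokio at all.

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Hypothetical helper: spin until one tag byte arrives on a
/// non-blocking std stream. Every iteration issues a real read.
fn spin_read_tag(stream: &mut TcpStream) -> std::io::Result<u8> {
    let mut tag = [0u8; 1];
    loop {
        match stream.read(&mut tag) {
            // Peer closed the connection before sending the tag.
            Ok(0) => return Err(ErrorKind::UnexpectedEof.into()),
            Ok(_) => return Ok(tag[0]),
            // No data yet: give up the time slice and try again.
            Err(ref e) if e.kind() == ErrorKind::WouldBlock => thread::yield_now(),
            Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}

/// Hypothetical driver: a client thread sends the byte 42, the
/// accepting side switches to non-blocking mode and spins for it.
fn demo_spin() -> std::io::Result<u8> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let client = thread::spawn(move || {
        let mut c = TcpStream::connect(addr).unwrap();
        c.write_all(&[42]).unwrap();
    });
    let (mut stream, _peer) = listener.accept()?;
    stream.set_nonblocking(true)?;
    let tag = spin_read_tag(&mut stream);
    client.join().unwrap();
    tag
}
```

Calling demo_spin() from a main function should return the tag byte the client sent. The busy loop is still wasteful; it only illustrates that the pattern itself is not what prevents the recvfrom.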

This may or may not be related to #774.

jonhoo commented Dec 5, 2018

/cc @stjepang

jonhoo commented Dec 5, 2018

From discussion on Gitter: this was never intended to work. We're blocking the executor, which is a big no-no. In particular, the busy loop keeps the executor from polling the reactor, so it never learns that the socket is readable and therefore never issues the read syscall. This could be worked around in one of three ways: by using blocking (which would move the net reactor to a different thread); by using accept_std, reading the byte, and then mapping to non-blocking (using a custom combinator to emulate Incoming); or by operating directly on the raw FD.
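
The second workaround (read the byte on a std stream first, then go non-blocking) could look roughly like the std-only sketch below. It is an illustration under assumptions, not the code that ended up in noria. The names accept_tagged and demo_accept_tagged are hypothetical, the accept runs on an ordinary thread and not inside the executor, and the final hand-off to tokio is left out. In tokio 0.1 that hand-off would presumably go through something like TcpStream::from_std with a reactor handle, which is an assumption here and is not shown.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Hypothetical helper: accept one connection in blocking mode, read
/// its one-byte tag, then switch the socket to non-blocking mode so
/// it could be handed to an event loop afterwards.
fn accept_tagged(listener: &TcpListener) -> io::Result<(u8, TcpStream)> {
    let (mut stream, _peer) = listener.accept()?;
    let mut tag = [0u8; 1];
    // Blocking read: this waits in the kernel until the byte arrives.
    stream.read_exact(&mut tag)?;
    // Only now make the socket non-blocking for the async side.
    stream.set_nonblocking(true)?;
    Ok((tag[0], stream))
}

/// Hypothetical driver: a client thread sends the byte 42 and the
/// accepting side returns the tag it read.
fn demo_accept_tagged() -> io::Result<u8> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;
    let client = thread::spawn(move || {
        let mut c = TcpStream::connect(addr).unwrap();
        c.write_all(&[42]).unwrap();
    });
    let (tag, _stream) = accept_tagged(&listener)?;
    client.join().unwrap();
    Ok(tag)
}
```

Because the read happens while the socket is still in blocking mode, no spin loop is needed. The cost is that the accepting thread is tied up until the peer sends its first byte, so this only makes sense off the executor's worker threads.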

jonhoo closed this as completed Dec 5, 2018
jonhoo added a commit to mit-pdos/noria that referenced this issue Dec 5, 2018
jonhoo added a commit to mit-pdos/noria that referenced this issue Dec 5, 2018