Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not open PTY with O_NONBLOCK on FreeBSD #48

Merged
merged 1 commit into from
Apr 14, 2019
Merged

Conversation

fabianfreyer
Copy link
Contributor

FreeBSD's posix_openpt(2) implementation only supports O_RDWR, O_NOCTTY,
and O_CLOEXEC. According to the manpage:

"The posix_openpt() function shall fail when oflag contains other values."

@pkgw
Copy link
Owner

pkgw commented Jan 3, 2019

Thanks for the PR!

With this change, the master PTY FD will never be set to nonblocking mode on FreeBSD. That seems like a problem since the point of the crate is to get nonblocking/async I/O over a PTY connection. Do we need to alter the flags on the master FD to set O_NONBLOCK after it's created, perhaps? Does FreeBSD just not support the concept of a nonblocking PTY connection?

@fabianfreyer
Copy link
Contributor Author

I'm not sure actually, maybe we can set it with libc::fcntl?

@fabianfreyer
Copy link
Contributor Author

@pkgw PTAL at b56e613.

It would be really great to have tests for this, but as I am honestly still trying to figure out how to use this library (I'm trying to hook it up to a Websocket using actix-web), I don't really know where to start...

@pkgw
Copy link
Owner

pkgw commented Jan 3, 2019

Yeah, I feel bad about the lack of testing framework. But, empirically, not bad enough to add any :-/

With the your PR applied, are you able to test if the stund application works for you on FreeBSD?

@fabianfreyer
Copy link
Contributor Author

Hmm, I will take a look and see how to use stund 👍 - I've only really looked at tokio-pty-process...

FreeBSD's posix_openpt(2) implementation only supports O_RDWR, O_NOCTTY,
and O_CLOEXEC. According to the manpage, "The posix_openpt() function
shall fail when oflag contains other values."
@fabianfreyer
Copy link
Contributor Author

fabianfreyer commented Jan 3, 2019

I've run the following manual smoke tests:

  • cargo run open user@example.org => got output STUND:<something that looks like b64, but b64decode can't deal with>, and kept running.
  • cargo run status => showed user@example.org Open
  • cargo run close user@example.org
  • cargo run status => showed user@example.org Closed

Not sure what more functionality to test and/or what to expect.

Update

When compiling tokio-pty-process without the fcntl call, cargo run open hangs and none of the other calls work.

@pkgw
Copy link
Owner

pkgw commented Jan 4, 2019

I'm pretty sure that your first smoketest is indicating that the nonblocking setting isn't working :-(

I'm having trouble pulling up anything relevant on Google, but looking at the FreeBSD source code, it looks like nonblocking PTY's should be possible on that OS. Do you know any experienced low-level FreeBSD developers who might know more about the details?

The other thing is that this is definitely showing that it would be great to have a better test case for checking that nonblocking mode is working. It's hard for me to think of how exactly to go about that, though … maybe write a bunch of data and see whether you ever get an EWOULDBLOCK? That seems prone to spurious failures, though ... as well as being a hassle to implement.

Links

@fabianfreyer
Copy link
Contributor Author

fabianfreyer commented Jan 4, 2019

I'm pretty sure that your first smoketest is indicating that the nonblocking setting isn't working :-(

How so, should cargo run open ... terminate?

I'm having trouble pulling up anything relevant on Google, but looking at the FreeBSD source code, it looks like nonblocking PTY's should be possible on that OS. Do you know any experienced low-level FreeBSD developers who might know more about the details?

I'm guessing it would probably be worth writing to the freebsd-hackers@freebsd.org mailing list about this? I can do this, but since I feel you have a deeper understanding of the expected behavior, it might be better if you would. If you do, please CC me, or ping me to write the mail :).

The other thing is that this is definitely showing that it would be great to have a better test case for checking that nonblocking mode is working. It's hard for me to think of how exactly to go about that, though … maybe write a bunch of data and see whether you ever get an EWOULDBLOCK? That seems prone to spurious failures, though ... as well as being a hassle to implement.

Might be an idea to spawn a child connected to the slave that sleeps and doesn't read, and write a large bunch of data, until the buffers are full? I'll try to take a look.

Links

  • Possibly relevant FreeBSD code?
    I'm not quite sure when PTS_FINISHED get set, we also might be able to trigger this code path if we write to the AsyncPtyMaster when no child is attached.

@pkgw
Copy link
Owner

pkgw commented Jan 4, 2019

How so, should cargo run open ... terminate?

Yes, the whole point is that the open command should authenticate, and then (from the user perspective) detach from the terminal. Really, ssh is running detached the whole time, but during the login process the stund program forwards I/O between the terminal and ssh so that you can authenticate. The STUND:<random junk> is printed as a command on the server; stund looks for that output to decide that the authentication flow has finished. If that message is getting forwarded to the client, that means that something has gone wrong in the I/O forwarding

I should say that it's entirely possible that the stund open behavior is a red herring for this PR. But what you observed definitely isn't how things are supposed to work ...

If you can email freebsd-hackers, that might get a good answer. But I'll also see if I can install a FreeBSD VM and investigate a bit myself.

Might be an idea to spawn a child connected to the slave that sleeps and doesn't read, and write a large bunch of data, until the buffers are full? I'll try to take a look.

Right, that was what I was thinking. It's a bit of a hassle since you've got to write a custom client program … but well, sometimes testing takes work.

@fabianfreyer
Copy link
Contributor Author

fabianfreyer commented Jan 4, 2019

maybe write a bunch of data and see whether you ever get an EWOULDBLOCK?

Any idea how I would do this? If I call write_all on the master, I get a no Task is currently running error. I'm still pretty new to tokio :/

For reference, this is my naïve attempt:

extern crate tokio;
extern crate tokio_pty_process;

use tokio::prelude::*;

fn main() {
    let mut master = tokio_pty_process::AsyncPtyMaster::open().expect("could not open pty");

    let mut buf = vec![0u8, 100];

    // Reading when there is no data available should block.
    let ret = master.read(&mut buf);

    println!("ret: {:?}", ret);
}

@pkgw
Copy link
Owner

pkgw commented Jan 4, 2019

Hmm, maybe consult the Tokio tutorials? I think you need to create a "reactor" to drive the event loop.

if libc::fcntl(fd, libc::F_SETFL, flags | libc::O_NONBLOCK) == -1 {
return Err(io::Error::last_os_error());
};
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funnily enough, if this hunk is removed, the following still does not panic on FreeBSD:

extern crate tokio;
extern crate tokio_pty_process;

use tokio::prelude::*;
use tokio_pty_process::AsyncPtyMaster;

struct AsyncReadTest(AsyncPtyMaster);

impl Future for AsyncReadTest {
    type Item = ();
    type Error = std::io::Error;

    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
        let mut buf = vec![0u8, 100];
        self.0.read(&mut buf).map(|_| Async::Ready(()))
    }
}

fn main() {
    let err = AsyncPtyMaster::open()
        .map(AsyncReadTest)
        .expect("could not open pty")
        .wait()
        .expect_err("Read succeeded but no data available");

    assert_eq!(err.kind(), std::io::ErrorKind::WouldBlock);
}

Can you confirm this behavior on Linux too, if the AsyncPtyMaster is not opened with O_NONBLOCK?

Copy link
Owner

@pkgw pkgw Jan 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I get this behavior ... but if I strace the program, it isn't actually doing any PTY I/O! I am not sure what's going on — for some reason I can't step through it in GDB at the moment.

Clarification: the program opens a PTY master, but I don't see any read operations on the resulting FD.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I see that too:

$ kdump -i ktrace.out
[...]
 34900 tokio-runtime-worke RET   thr_new 0
 34900 wouldblock RET   fork 0
 34900 wouldblock CALL  sigaltstack(0,0x7fffdfdfcf80)
 34900 wouldblock RET   sigaltstack 0
 34900 wouldblock CALL  mmap(0,0x8800,0x3<PROT_READ|PROT_WRITE>,0x1002<MAP_PRIVATE|MAP_ANON>,0xffffffff,0)
 34900 tokio-runtime-worke CALL  posix_openpt(0x8002)
 34900 wouldblock RET   mmap 34384453632/0x801792000
 34900 wouldblock CALL  sigaltstack(0x7fffdfdfcf80,0)
 34900 wouldblock RET   sigaltstack 0
 34900 tokio-runtime-worke RET   posix_openpt 15/0xf
 34900 wouldblock CALL  thr_set_name(0x18a46,0x80177d0e0)
[...]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From how I understand it:

What state does that start out in? Will it wait for the OS to signal readiness before even attempting the read?

@LucioFranco
Copy link

@fabianfreyer looks like you are missing an executor to run your futures? If you are using any of the tokio items it needs a tokio runtime to run the futures. Wait generally won't work here.

extern crate tokio;
extern crate tokio_pty_process;

use tokio::prelude::*;
use tokio_pty_process::AsyncPtyMaster;

struct AsyncReadTest(AsyncPtyMaster);

impl Future for AsyncReadTest {
    type Item = ();
    type Error = std::io::Error;

    fn poll(&mut self) -> Poll<Self::Item, Self::Error> {
        let mut buf = vec![0u8, 100];
        self.0.read(&mut buf).map(|_| Async::Ready(()))
    }
}

fn main() {
   	let fut = futures::lazy(|| {
		AsyncPtyMaster::open()
        .map(AsyncReadTest)
		.map_err(|e| panic!("{}", e))
	});

	tokio::run(fut);
}

Now tokio run itself creates a tokio::runtime::Runtime which contains the reactor + executor + timer to power all the tokio items. The runtime contains another method called block_on that actually returns the item and err from the future. tokio::run requires a Future<Item = (), Error = ()> + Send + 'static.

The key to his code I wrote above is that the error you got is that you are trying to spawn a future without actually already being in a future based context. By using futures lazy you can create a future that will execute right away (no blocking or no Async::NotReady) and this allows you to now embed yourself within the context that will then allow you to call other futures and put them on to this executor that tokio provides via tokio::runtime::Runtime.

@fabianfreyer
Copy link
Contributor Author

Thanks @LucioFranco! Even with those changes though, there doesn't seem to be a difference to a blocking PTY descriptor.

@LucioFranco
Copy link

@fabianfreyer I don't quite understand what you mean by a blocking PTY descriptor in this case? Could you show me what code is blocking?

@fabianfreyer
Copy link
Contributor Author

@LucioFranco We're now opening the PTY without libc::O_NONBLOCK:

let fd = libc::posix_openpt(libc::O_RDWR | libc::O_NOCTTY | libc::O_NONBLOCK);
// FreeBSD doesn't support O_NONBLOCK on PTYs.
#[cfg(target_os = "freebsd")]
let fd = libc::posix_openpt(libc::O_RDWR | libc::O_NOCTTY);

The file descriptor returned from this is wrapped in an AsyncPtyFile:
File::from_raw_fd(fd)
};
Ok(AsyncPtyMaster(PollEvented2::new(AsyncPtyFile::new(inner))))

According to the FreeBSD source code, I expect the read to block when the O_NONBLOCK flag is not set, or to return EWOULDBLOCK when it is:
https://github.com/freebsd/freebsd/blob/master/sys/kern/tty_pts.c#L171-L175

The above code's purpose is to verify this. However, it returns an io::ErrorKind::WouldBlock in both the following cases:

  • if the O_NONBLOCK flag is set with libc::fcntl on the file descriptor
  • if it isn't

@fabianfreyer
Copy link
Contributor Author

OTOH, I replicated this test case with two C programs which do show the desired behavior:

Blocking

#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
#include <assert.h>
#include <unistd.h>
#include <signal.h>

void blocked(int _) {
	puts("blocked");
	exit(0);
}

int main() {
	alarm(1);
	signal(SIGALRM, blocked);

	int pty_master = posix_openpt(O_RDWR | O_NOCTTY | O_CLOEXEC);

	char *buf[256];
	ssize_t ret = read(pty_master, buf, 256);

	assert(ret == -1);
	assert(errno == EWOULDBLOCK);
}

Output:

$ ./read_block
blocked

Non-Blocking

#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <errno.h>
#include <assert.h>
#include <unistd.h>

int main() {
	int pty_master = posix_openpt(O_RDWR | O_NOCTTY | O_CLOEXEC);

	char *buf[256];

	int flags = fcntl(pty_master, F_GETFL, 0);
	assert(fcntl(pty_master, F_SETFL, flags | O_NONBLOCK) != -1);

	ssize_t ret = read(pty_master, buf, 256);
	assert(ret == -1);
        assert(errno == EWOULDBLOCK);
	perror("read");
}

Output:

$ ./read_noblock    
read: Resource temporarily unavailable

@LucioFranco
Copy link

LucioFranco commented Jan 5, 2019

@fabianfreyer sorry for the delayed response, honestly, this is a bit out of my personal knowledge, it seems you are trying to use a non tokio based non blocking item on tokio and I don't know if that will work.

Anyways, good luck finding the solution! Feel free to ping me here or in gitter if help is needed with higher level tokio types and futures.

@pkgw
Copy link
Owner

pkgw commented Feb 2, 2019

Note to self: apparently CirrusCI offers FreeBSD CI so that this can be tested automatically one the proper approach is devised.

@fabianfreyer
Copy link
Contributor Author

fabianfreyer commented Mar 4, 2019

I just checked the current version on macos, and it seems to be exhibiting exactly the same behavior.
Currently I'm waiting on a Linux install so I can check that too. EDIT: stund on Linux also has this problem.

This leads me to the conclusion that this seems to be a different issue.

Any chance we could merge this into tokio-pty-process, and independently take a look at stund?

@fabianfreyer
Copy link
Contributor Author

@pkgw what do you think of the approach outlined in #48 (comment)? Seeing as the issue I am having seems to be an unrelated issue - any chance we could move this forward?

@pkgw
Copy link
Owner

pkgw commented Apr 14, 2019

Sorry for letting this languish. Yes, let's merge this and worry about the other bits later.

@pkgw pkgw merged commit 2d5e9d3 into pkgw:master Apr 14, 2019
@fabianfreyer
Copy link
Contributor Author

Thanks, great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants