Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Haiku test failure. keepalive / connect_timeout/unbound #85

Open
kallisti5 opened this issue Jun 17, 2020 · 15 comments
Open

Haiku test failure. keepalive / connect_timeout/unbound #85

kallisti5 opened this issue Jun 17, 2020 · 15 comments

Comments

@kallisti5
Copy link
Contributor

Just reporting a test failure on Haiku. Overall it's pretty close!

master as of 32969e5

/Data/socket2-rs> cargo test
  Downloaded remove_dir_all v0.5.3
  Downloaded rand v0.4.6
  Downloaded tempdir v0.3.7
  Downloaded 3 crates (97.1 KB) in 0.93s
   Compiling remove_dir_all v0.5.3
   Compiling rand v0.4.6
   Compiling tempdir v0.3.7
   Compiling socket2 v0.3.12 (/Data/socket2-rs)
    Finished test [unoptimized + debuginfo] target(s) in 5.27s
     Running target/debug/deps/socket2-d90163a26b657973

running 13 tests
test socket::test::connect_timeout_unbound ... FAILED
test socket::test::connect_timeout_unrouteable ... ok
test socket::test::connect_timeout_valid ... ok
test socket::test::keepalive ... FAILED
test socket::test::nodelay ... ok
test socket::test::tcp ... ok
test sys::test_ip ... ok
test tests::domain_fmt_debug ... ok
test tests::domain_for_address ... ok
test tests::protocol_fmt_debug ... ok
test tests::socket_address_ipv4 ... ok
test tests::socket_address_ipv6 ... ok
test tests::type_fmt_debug ... ok

failures:

---- socket::test::connect_timeout_unbound stdout ----
thread 'main' panicked at 'unexpected success', src/socket.rs:925:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- socket::test::keepalive stdout ----
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: -2147483643, kind: InvalidInput, message: "Invalid Argument" }', src/socket.rs:985:9


failures:
    socket::test::connect_timeout_unbound
    socket::test::keepalive

test result: FAILED. 11 passed; 2 failed; 0 ignored; 0 measured; 0 filtered out

error: test failed, to rerun pass '--lib'
@Thomasdezeeuw
Copy link
Collaborator

@alexcrichton is there anyone we can ping for Haiku support?

@alexcrichton
Copy link
Member

I am unaware of someone myself.

@Thomasdezeeuw
Copy link
Collaborator

This is a bit out of the blue, but @jessicah you authored commit 7d3be9b which involves Haiku, could you help with this issue?

@jessicah
Copy link
Contributor

jessicah commented Jun 19, 2020

@kallisti5 is actually one of the core Haiku maintainers...

Regarding the keepalive test, is c_int dependent on the OS arch? I.e. 32-bit on 32-bit arch, 64-bit on 64-bit arch? If so, then this is likely the problem.

Keepalive: https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L839
Setsockopt: https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L885
Haiku expecting a 32-bit integer: https://github.com/haiku/haiku/blob/5c63c64bfd2cebc9010219266f2cab3d510623b3/src/add-ons/kernel/network/stack/net_socket.cpp#L1605

if (length != sizeof(int32))
  return B_BAD_VALUE;

As for the connect timeout unbound, I'm not sure, but sounds like a bug on Haiku's side, with the connection lingering after the socket has been dropped.

@Thomasdezeeuw
Copy link
Collaborator

@kallisti5 is actually one of the core Haiku maintainers...

Ah, I didn't know that.

Regarding the keepalive test, is c_int dependent on the OS arch? I.e. 32-bit on 32-bit arch, 64-bit on 64-bit arch? If so, then this is likely the problem.

I didn't check the spec, but I think it only needs to be at least 16 bits. Even on most 64 bit architectures its actually 32 bit (this is at least true for MacOS and Linux).

Keepalive:

https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L839

Setsockopt:

https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L885

Haiku expecting a 32-bit integer: https://github.com/haiku/haiku/blob/5c63c64bfd2cebc9010219266f2cab3d510623b3/src/add-ons/kernel/network/stack/net_socket.cpp#L1605

if (length != sizeof(int32))
  return B_BAD_VALUE;

Is this true for all setsockopt options on Haiku? If so its seems a relative easy fix: replacing as libc::socklen_t with the correct type in: https://github.com/alexcrichton/socket2-rs/blob/32969e53c75849e4047233d8b55f551a8645b21d/src/sys/unix.rs#L885.

As for the connect timeout unbound, I'm not sure, but sounds like a bug on Haiku's side, with the connection lingering after the socket has been dropped.

@kallisti5 could you look further into that yourself?

@kallisti5
Copy link
Contributor Author

@kallisti5 is actually one of the core Haiku maintainers...

Doh!!!! I wanted to see if anyone else would step up to do the work.

I'll take a look. Thanks for the analysis Jessica :-)

@jessicah
Copy link
Contributor

Is this true for all setsockopt options on Haiku?

No, a couple options can receive a struct as argument, struct linger & struct timeval.

@Thomasdezeeuw
Copy link
Collaborator

@kallisti5 @jessicah is there any way we can setup CI for Haiku (see #78)?

@kallisti5
Copy link
Contributor Author

kallisti5 commented Dec 21, 2020

Hi! We do actually have package Haiku builder nodes, but they're leveraging our native ports (haikuporter) system.
We do support running tests however in it:
https://github.com/haikuports/haikuports/blob/master/dev-lang/openjdk/openjdk14-14.0.2.12.recipe#L238

What CI system are you using? Haiku has full virtio support, and runs fine in cloud environments.
I could go to Haiku, Inc. to see if we'll fund a dedicated full-time x86_64 Haiku vm for CI/CD of rust stuff.

@Thomasdezeeuw
Copy link
Collaborator

@kallisti5 currently we're using GitHub Actions. I don't think we need anything dedicated, currently GitHub Actions completes in ~1 min. libc seems to use Docker https://github.com/rust-lang/libc/tree/master/ci, does Haiku run in that? Issue #78 tracks CI support, if there is anything you can do that would be great and we can keep socket2 for Haiku working.

@kallisti5
Copy link
Contributor Author

kallisti5 commented Dec 22, 2020

Hi!

Haiku's not Linux (we have our own kernel, syscalls, and standard libraries), so we really can't run in containers.

Haiku easily boots in qemu and other full emulation and is fully POSIX compliant with an SSH server running for remote command execution.

There are working vagrant boxes for Haiku, and we have a pretty rich software ecosystem... python, ruby, (early) Rust, (early) Golang support, c, clang, llvm, etc etc

Here is our full searchable software port list: https://depot.haiku-os.org

@Thomasdezeeuw
Copy link
Collaborator

@kallisti5 libc also seems to run NetBSD and Redox in docker, that's why I asked. Do you know of an example Rust project that uses QEMU (or something else) to test on Haiku from which we can setup our own CI for Haiku?

@kallisti5
Copy link
Contributor Author

ooh.. that's new. Let me see what "redoxer" is doing. It would be super handy if we could do something similar

@kallisti5
Copy link
Contributor Author

Latest results under master as of f7023b4

/Data/Code/socket2-rs> cargo test
    Finished test [unoptimized + debuginfo] target(s) in 0.01s
     Running unittests (target/debug/deps/socket2-7f77f3cf6f1ffd13)

running 4 tests
test sockaddr::ipv4 ... ok
test sockaddr::ipv6 ... ok
test sys::in6_addr_convertion ... ok
test sys::in_addr_convertion ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/socket.rs (target/debug/deps/socket-9c38dbe7041dcda2)

running 31 tests
test broadcast ... ok
test common_flags ... ok
test connect_timeout_unbound ... ok
test connect_timeout_unrouteable ... ok
test connect_timeout_valid ... ok
test domain_fmt_debug ... ok
test domain_for_address ... ok
test from_invalid_raw_fd_should_panic ... ok
test keepalive ... ok
test linger ... FAILED
test niche ... ok
test no_common_flags ... ok
test nodelay ... ok
test only_v6 ... FAILED
test out_of_band ... FAILED
test out_of_band_inline ... ok
test protocol_fmt_debug ... ok
test r#type ... ok
test read_timeout ... ok
test recv_buffer_size ... ok
test recv_from_vectored_truncated ... ok
test recv_vectored_truncated ... ok
test reuse_address ... ok
test send_buffer_size ... ok
test send_from_recv_to_vectored ... ok
test send_recv_vectored ... ok
test set_nonblocking ... ok
test tcp_keepalive ... FAILED
test ttl ... ok
test type_fmt_debug ... ok
test unicast_hops_v6 ... ok

failures:

---- linger stdout ----
thread 'main' panicked at 'failed to get initial value: Os { code: -2147454942, kind: Other, message: "Protocol option not available" }', tests/socket.rs:1123:1

---- only_v6 stdout ----
thread 'main' panicked at 'failed to get initial value: Os { code: -2147454933, kind: Other, message: "Operation not supported" }', tests/socket.rs:1138:1

---- out_of_band stdout ----
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `12`,
 right: `1`', tests/socket.rs:495:5

---- tcp_keepalive stdout ----
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: -2147483643, kind: InvalidInput, message: "Invalid Argument" }', tests/socket.rs:717:39


failures:
    linger
    only_v6
    out_of_band
    tcp_keepalive

test result: FAILED. 27 passed; 4 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.27s

error: test failed, to rerun pass '--test socket'

still looking at redoxer, it's not - not complex :-)

@link2xt
Copy link
Contributor

link2xt commented Jul 26, 2021

In tcp_keepalive the problem is this line:

unsafe { setsockopt(fd, libc::IPPROTO_TCP, KEEPALIVE_TIME, secs)? }

This if corresponds to the case when keepalive.time is set. keepalive.time is the time the connection has to be idle before keepalives are sent. There is no such option on Haiku at all. The only option on Haiku related to keepalive is the call setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, 0) to disable keepalives and setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, 1) to enable it. Same for OpenBSD by the way, except that on OpenBSD you can control global setting net.inet.tcp.keepidle via sysctl. But per socket on OpenBSD and Haiku you can only enable/disable keepalives without controlling any constants.

In the line quoted above on both Haiku and OpenBSD the following call is issued:

setsockopt(fd, IPPROTO_TCP, SO_KEEPALIVE, seconds)

which is totally wrong as IPPROTO_TCP level has no SO_KEEPALIVE option, and SO_KEEPALIVE option accepts boolean, not the number of seconds.

By the way on Linux this call only sets the number of seconds, but does not enable keepalives. You still need to do setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, 1) on Linux in addition to setting the time. Currently the code is broken: it tweaks keepalives, but does not enable them.

So overall the whole function set_tcp_keepalive should be rewritten to:

  1. Call setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, keepalive.time.is_some()) on all systems supporting it: Haiku, OpenBSD, Linux, etc. I think enabling/disabling keepalives is supported on all operating systems. This is already done in
    self.set_keepalive(true)?;
  2. Call setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, keepalive.time.unwrap()) (of course use if let, not unwrap) on Linux and other operating systems supporting it. On MacOS for some reason this option is called TCP_KEEPALIVE, see https://www.unix.com/man-page/mojave/4/tcp/
  3. Keep the code setting TCP_KEEPINTV and TCP_KEEPCNT as is, this seems to be correct.

I will probably make a PR with a fix myself so you don't have to read all these notes and redo figuring this out. Haiku has no documentation it seems, I figured out it does not have TCP options and how to use setsockopt there by grepping their source.

Update: made a PR #251

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants