std::net::lookup_host repeats output #24250

Closed
nieksand opened this issue Apr 9, 2015 · 3 comments

Comments

@nieksand
Contributor

nieksand commented Apr 9, 2015

Using std::net::lookup_host seems to duplicate each IP in the result set.

prompt> ./main rust-lang.org
V4(72.8.141.90:0)
V4(72.8.141.90:0)

prompt> ./main google.com | sort
V4(74.125.28.100:0)
V4(74.125.28.100:0)
V4(74.125.28.101:0)
V4(74.125.28.101:0)
V4(74.125.28.102:0)
V4(74.125.28.102:0)
V4(74.125.28.113:0)
V4(74.125.28.113:0)
V4(74.125.28.138:0)
V4(74.125.28.138:0)
V4(74.125.28.139:0)
V4(74.125.28.139:0)
V6([2607:f8b0:400e:c04::66]:0)
V6([2607:f8b0:400e:c04::66]:0)

The documentation snippet (https://doc.rust-lang.org/std/net/fn.lookup_host.html) is enough to reproduce.

I also have a simple gist here: https://gist.github.com/nieksand/9c42100be6dc52df3305

I'm using the latest Rust from the master branch on Mac OS X 10.9.5. I have not checked other platforms. I see this for every hostname I try, so I don't think it's a problem on the nameserver side.

Expected behavior is seeing each IP address listed only once.

@imron

imron commented Apr 13, 2015

This is happening because the underlying call to getaddrinfo doesn't provide an addrinfo as a hint, and so the linked list of returned addrinfo structs contains addrinfo for both TCP and UDP protocols.

A simple fix would be to pass in a hint specifying only one protocol (e.g. TCP).

However, I'm not sure how often you'll find a hostname that has UDP available but not TCP (or vice versa), or whether it's worth capturing that with something like LookupHostUdp and LookupHostTcp.
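
To make the explanation above concrete, here is a toy model of the behavior. It is only a sketch and does not call the real `getaddrinfo`: `SockType`, `Entry`, `resolve_model`, and `addresses_only` are hypothetical names invented for illustration, and the model assumes exactly two socket types per address (the two copies seen in the report above; other platforms may return more).

```rust
use std::net::IpAddr;

/// Socket types a resolver may report for each address (toy subset).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SockType {
    Stream,   // TCP
    Datagram, // UDP
}

/// One node of the result list, loosely mirroring a C `addrinfo` entry.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Entry {
    addr: IpAddr,
    socktype: SockType,
}

/// Toy stand-in for `getaddrinfo` (hypothetical, not the real call):
/// with no hint, each address is reported once per socket type;
/// with a hint, only the requested socket type is reported.
fn resolve_model(addrs: &[IpAddr], hint: Option<SockType>) -> Vec<Entry> {
    let types = match hint {
        Some(t) => vec![t],
        None => vec![SockType::Stream, SockType::Datagram],
    };
    let mut out = Vec::new();
    for &addr in addrs {
        for &socktype in &types {
            out.push(Entry { addr, socktype });
        }
    }
    out
}

/// Keep only the address of each entry and discard the socket type.
/// Once the socket type is dropped, unhinted results look like duplicates.
fn addresses_only(entries: &[Entry]) -> Vec<IpAddr> {
    entries.iter().map(|e| e.addr).collect()
}
```

In this model, `addresses_only(&resolve_model(&addrs, None))` lists every address twice, while passing `Some(SockType::Stream)` as the hint lists each address once.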

@nieksand
Contributor Author

nieksand commented May 6, 2015

Requiring a hint argument specifying tcp vs udp makes sense to me.

Manishearth added a commit to Manishearth/rust that referenced this issue Jul 8, 2016
Use hints with getaddrinfo() in std::net::lookup_host()

As noted in rust-lang#24250, `std::net::lookup_host()` repeats each IPv4 and IPv6 address in the result set. The number of repetitions is OS-dependent; e.g., Linux and FreeBSD give three copies, OpenBSD gives two. The user can filter the duplicates when calling `lookup_host()` explicitly, but not with functions like `TcpStream::connect()`. With the latter, any unsuccessful connection attempt is repeated once for each duplicate of the address.

The program:

```rust
use std::net::TcpStream;

fn main() {
    let _stream = TcpStream::connect("localhost:4444").unwrap();
}
```

results in the following capture:

[capture-before.txt](https://github.com/rust-lang/rust/files/352004/capture-before.txt)

assuming that "localhost" resolves both to ::1 and 127.0.0.1, and that the listening program opens just an IPv4 socket (e.g., `nc -l 127.0.0.1 4444`). The reason for this behavior is explained in an earlier comment on rust-lang#24250: the `getaddrinfo()` call is not constrained by a hint.

Various OSS projects (I checked out Postfix, OpenLDAP, Apache HTTPD and BIND) which use `getaddrinfo()` generally constrain the result set by using a non-NULL `hints` parameter and setting at least `ai_socktype` to `SOCK_STREAM`. `SOCK_DGRAM` would also work. Other parameters are unnecessary for pure name resolution.

The patch in this PR initializes a `hints` struct and passes it to `getaddrinfo()`, which eliminates the duplicates. The same test program as above with this change produces:

[capture-after.txt](https://github.com/rust-lang/rust/files/352042/capture-after.txt)

All `libstd` tests pass with this patch.
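
For code that runs against a standard library without this patch, one possible workaround is to resolve the name yourself, drop the repeated addresses, and hand the remaining list to `TcpStream::connect()`. The sketch below uses the stable `ToSocketAddrs` trait rather than `lookup_host()`; `dedup_addrs` and `connect_once_per_addr` are hypothetical helper names, not part of `std` or of the patch.

```rust
use std::collections::HashSet;
use std::io;
use std::net::{SocketAddr, TcpStream, ToSocketAddrs};

/// Drop repeated addresses while keeping the resolver's original order.
/// (Hypothetical helper, not part of the standard library.)
fn dedup_addrs<I: IntoIterator<Item = SocketAddr>>(addrs: I) -> Vec<SocketAddr> {
    let mut seen = HashSet::new();
    // `insert` returns false when the address is already in the set,
    // so only the first occurrence of each address passes the filter.
    addrs.into_iter().filter(|a| seen.insert(*a)).collect()
}

/// Resolve `target` (e.g. "localhost:4444"), remove duplicates, then let
/// `TcpStream::connect` try each remaining address once, in order.
/// (Hypothetical helper, shown only to illustrate the workaround.)
fn connect_once_per_addr(target: &str) -> io::Result<TcpStream> {
    let addrs = dedup_addrs(target.to_socket_addrs()?);
    // A slice of `SocketAddr` also implements `ToSocketAddrs`.
    TcpStream::connect(&addrs[..])
}
```

Calling `connect_once_per_addr("localhost:4444")` in place of `TcpStream::connect("localhost:4444")` should then make at most one attempt per distinct address, whatever the resolver returns.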
bors added a commit that referenced this issue Jul 9, 2016
Use hints with getaddrinfo() in std::net::lookup_host()

@alexcrichton
Member

Fixed by #34700
