Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO::Socket::INET Documentation Inconsistency [PF_INET as a default] #2162

Closed
pprocacci opened this issue Aug 1, 2018 · 8 comments · Fixed by MoarVM/MoarVM#1166
Closed

IO::Socket::INET Documentation Inconsistency [PF_INET as a default] #2162

pprocacci opened this issue Aug 1, 2018 · 8 comments · Fixed by MoarVM/MoarVM#1166
Labels

Comments

@pprocacci
Copy link

The documentation on IO::Socket::INET shows PF_INET being a default for the family argument to new.

However, it is defaulting to PF_INET6:

perl6 -e 'my \a = IO::Socket::INET.new(:host<www.google.com>, :port<443>);'
Could not connect socket: No route to host
in block at -e line 1

Here's a truss (FreeBSD) showing this:
socket(PF_INET6,SOCK_STREAM,0) = 17 (0x11)
connect(17,{ AF_INET6 [2607:f8b0:4002:c0c::67]:443 },28) ERR#65 'No route to host'

It never attempts ipv4.

The following works as expected:

perl6 -e 'my \a = IO::Socket::INET.new(:host<www.google.com>, :port<443>, :family<PF_INET>);'

I suspect this is an actual bug rather than a documentation problem as it makes more sense that PF_INET be the default.

Environment

  • Operating system: FreeBSD 11.1-RELEASE
  • Compiler version: This is Rakudo version 2018.06-304-g8cedbbd18 built on MoarVM version 2018.06-329-g21ea40f68 implementing Perl 6.c.
@zoffixznet
Copy link
Contributor

The following works as expected

It doesn't. It returns a Failure because you gave incorrect :family arg (you gave a string "PF_INET" instead of an enum/number; should be :family(PF_INET) instead). It doesn't crash because you didn't do anything with that variable to trigger the explosion.

Does this give any errors?

$ perl6 -e 'given IO::Socket::INET.new(:host<www.google.com>, :port<443>, :family(PF_INET)) { .print: "trash\r\n";.recv.say }'
�����F

Here's a truss (FreeBSD) showing this:

Are you sure that's the only line for connecting? I do see an AF_INET6 line (don't know why it's attempted), but it's followed by IPv4 lines too. This is on an ancient Ubuntu on 2018.06-146-g1b3dd35:

$ strace perl6 -e 'IO::Socket::INET.new(:host<www.google.com>, :port<443>);' 2>&1 | grep -E 'PF_INET|connect'
[…]
connect(14, {sa_family=AF_INET6, sin6_port=htons(443), inet_pton(AF_INET6, "2607:f8b0:400b:80e::2004", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
[…]
connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("172.217.2.100")}, 16) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 14
connect(14, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("172.217.2.100")}, 16) = 0

In the code I see PF_INET is set as a default, so something else is probably going on.

@pprocacci
Copy link
Author

pprocacci commented Aug 1, 2018

Here is a full session:

% truss -f -o data perl6 -e 'my \a = IO::Socket::INET.new(:host<www.google.com>, :port<443>);'
Could not connect socket: No route to host
in block <unit> at -e line 1

% fgrep connect data
64043: connect(16,{ AF_INET 10.1.0.98:53 },16) = 0 (0x0)
64043: connect(16,{ AF_INET 10.1.0.98:53 },16) = 0 (0x0)
64043: connect(16,{ AF_INET6 [2607:f8b0:4002:c09::69]:443 },28) ERR#65 'No route to host'
64043: write(2,"Could not connect socket: No rou"...,75) = 75 (0x4b)

You'll notice it only attempts ipv6.

The command you asked me to run provides:

% perl6 -e 'given IO::Socket::INET.new(:host<www.google.com>, :port<443>, :family(PF_INET)) { .print: "trash\r\n";.recv.say }'
Could not connect socket: No route to host
in block <unit> at -e line 1

If I expand that, with using truss to capture what it's doing as far as connecting is concerned:

% truss -o data perl6 -e 'given IO::Socket::INET.new(:host<www.google.com>, :port<443>, :family(PF_INET)) { .print: "trash\r\n";.recv.say }'
Could not connect socket: No route to host
in block <unit> at -e line 1

% fgrep connect data
connect(16,{ AF_INET 10.1.0.98:53 },16) = 0 (0x0)
connect(16,{ AF_INET 10.1.0.98:53 },16) = 0 (0x0)
connect(16,{ AF_INET6 [2607:f8b0:4002:c09::93]:443 },28) ERR#65 'No route to host'
write(2,"Could not connect socket: No rou"...,75) = 75 (0x4b)

Edit: and yes, my mistake with :family(string) .... still learning. The problem however still exists without that small blunder of mine.

@zoffixznet
Copy link
Contributor

OK, so looks like even with family set to PF_INET it only tries IPv6 on FreeBSD (also, for some reason it tries IPv6 on Ubuntu first too).

@Kaiepi would you have an idea about what's going on? I think you might be familiar with this area... and have FreeBSD on hand.

@pprocacci
Copy link
Author

pprocacci commented Aug 1, 2018

Some notes which I think is related:

In:
MoarVM: src/io/syncsocket.c: line 345

  • Resolution of the name is done, struct sockaddr stored in the memory location *dest and returned to the caller.
  • Afterwords dest->ai_family is used for the socket creation.

The important part is src/io/syncsocket.c: line 280

As the notes say, family is derived from the port: (port >>16) & USHORT_MAX
The only way I could be getting PF_INET6 is if this value is either PF_INET6 or AF_UNSPEC.

Assuming it is 0, this is essentially whateveryougot '*' in perl6..

The routine MVM_io_resolve_host_name seems to be broken for several reasons.

  • First, getaddrinfo stores "zero or more" results, yet this function is only interested in the first. This is important to point out because it's quite possible to have more than 1 record. Call it poor man's load balancing. This function should return all applicable records.

  • Not only should it return all applicable records, the connect routine should attempt to connect to all records in series until one succeeds which doesn't look like it's happening, though Zoffix's last comment makes me wonder whether that is true or not.

  • The connect routine uses the family returned from the function. If *dest->ai_family is PF_INET6, then I'll connect using PF_INET6.

Anyways, I'm not 100% sure of any of this, but useful notes here for whoever is more familiar with MoarVM which I am not.

EDIT: []https://github.com/rakudo/rakudo/blob/master/src/core/IO/Socket/INET.pm6

line 117 reads:

my $PIO := nqp::socket($.listening ?? 10 !! 0);

These hardcoded values seem strange too. Are these values families? It seems like it.

10 is PF_INET6 on Linux
28 is PF_INET6 on FreeBSD
0 is AF_UNSPEC.
2 is PF_INET on Linux and FreeBSD

This has got to be part of the problem I would think. Seems coincidental if it isn't.
To me this reads, "If we're listening, create a socket as (linux)AF_INET6/(freebsd)AF_CCITT otherwise create a AF_UNSPEC socket. Family is completely ignored.

@pprocacci
Copy link
Author

pprocacci commented Aug 2, 2018

syncsocket.c.diff.txt

Attached you'll find an untested patch that addresses one of the three issues I pointed out.
I would have tested it more thoroughly but I'm not sure how to rebuild Moar with it applied.
This patch only solves the sync socket case, and not the async sockets. Whoever looks at this needs to take that into account.

It changes the behavior of MVM_io_resolve_host_name to return the addrinfo * from the routine instead of sockaddr. The returned addrinfo is then looped upon within the socket_connect routine trying all returned hosts until one is successful or they all fail.

What this still doesn't address is . . . .

Why PF_INET6 when I'm explicitly setting PF_INET. I shouldn't have to set it at all in fact, it's the default.

line 117 of IO::Socket::INET.pm6:

my $PIO := nqp::socket($.listening ?? 10 !! 0);

^^ possibly related hardcoded values for socket families by the looks of it

@zoffixznet
Copy link
Contributor

how to rebuild Moar with it applied.

FWIW, I use Z-Script and its z m command rebuilds MoarVM with the changes.

@pprocacci
Copy link
Author

pprocacci commented Aug 3, 2018

Thanks for that. My problem is fixed with patches I wrote:

Problem 1 - rakudo/src/core/IO/Socket/INET.pm6

    `my $PIO := nqp::socket($.listening ?? 10 !! 0);`

The parameter here does nothing. A parameter is required, but it isn't used in the C function body.
It's misleading and should be cleaned up.

Problem 2 - :family does nothing for outbound sockets.

As pointed out already and described within the C source code, the port parameter that gets passed around is expecting the family to be available within the port variable at bits 16-63. I've supplied a patch that essentially provides this as it expects but it feels so so so wrong:

- nqp::connect($PIO, nqp::unbox_s($.host), nqp::unbox_i($.port));
+ nqp::connect($PIO, nqp::unbox_s($.host), nqp::unbox_i( ($!family +< 16) +| $.port));

This needs to be evaluated.

Problem 3 - Connection attempts are only done on one host returned by getaddrinfo.

If there are 1 or more hosts and 1 is having a problem temporarily, it's a disservice not to attempt the other hosts. Attached patch has logic, but it's not tested.

Problem 4 - My MoarVM experience is shit.

My patches work for me(tm) and have only been tested on FreeBSD using syncsocket's. asyncsockets NEED to be reviewed as well.

Patches -

MoarVM.syncsocketh.diff.txt
rakudo.INET.pm6.diff.txt

Example -

% truss -o data perl6 -e 'given IO::Socket::INET.new(:host<www.google.com>, :port<443>) { .print: "trash\r\n";.recv.say }'

% egrep 'connect|socket' data
socket(PF_INET,SOCK_DGRAM|SOCK_CLOEXEC,0) = 16 (0x10)
connect(16,{ AF_INET 192.168.1.1:53 },16) = 0 (0x0)
socket(PF_INET,SOCK_STREAM,0) = 16 (0x10)
connect(16,{ AF_INET 108.177.122.99:443 },16) = 0 (0x0)

Notes -

  • This fails with my patches:

perl6 -e 'given IO::Socket::INET.new(:host<www.google.com>, :port<443>, :family(PF_INET6))

This is because it just so happens that PF_INET has the correct numerical value defined in IO::Socket::INET.
On FreeBSD PF_INET6 is 28; 10 on Linux.

@Kaiepi
Copy link
Contributor

Kaiepi commented Aug 5, 2018

Can confirm this happens on my FreeBSD machine:

# tcpdump > log &
# perl6 -e 'IO::Socket::INET.new: :host<youtube.com>, :80port, :family(PF_INET)'
# pkill tcpdump
# cat log
...
05:38:30.084987 IP <vps ip>.55105 > 108.61.10.10.choopa.net.domain: 13598+ A? youtube.com. (29)
05:38:30.085455 IP 108.61.10.10.choopa.net.domain > <vps ip>.55105: 13598 1/0/0 A 172.217.4.78 (45)
05:38:30.085578 IP <vps ip>.26508 > 108.61.10.10.choopa.net.domain: 25569+ AAAA? youtube.com. (29)
05:38:30.085808 IP 108.61.10.10.choopa.net.domain > <vps ip>.26508: 25569 1/0/0 AAAA 2607:f8b0:4009:805::200e (57)
05:38:30.086087 IP6 <vps ip>.44111 > ord37s18-in-x0e.1e100.net.http: Flags [S], seq 957584002, win 65535, options [mss 1440,nop,wscale 6,sackOK,TS val 707961818 ecr 0], length 0
05:38:30.087454 IP6 ord37s18-in-x0e.1e100.net.http > <vps ip>.44111: Flags [S.], seq 4281074533, ack 957584003, win 26960, options [mss 1360,sackOK,TS val 1569764365 ecr 707961818,nop,wscale 8], length 0
05:38:30.087561 IP6 <vps ip>.44111 > ord37s18-in-x0e.1e100.net.http: Flags [.], ack 1, win 1041, options [nop,nop,TS val 707961827 ecr 1569764365], length 0

This fails with my patches:
perl6 -e 'given IO::Socket::INET.new(:host<www.google.com>, :port<443>, :family(PF_INET6))

If having the system's values for PF_INET, PF_INET6, etc. would be useful to know on the Rakudo end of things, you could add an NQP function to get a hash of their values and set them using it so the correct values get passed to MVM_io_resolve_host_name, otherwise adding something like this to it before the #ifndef would fix it:

switch (family) {
    case 0:
        family = PF_LOCAL;
        break;
    case 1:
        family = PF_UNIX;
        break;
    case 2:
        family = PF_INET;
        break;
    case 3:
        family = PF_INET6;
        break;
    case 4:
        family = PF_MAX;
        break;
    default:
        MVM_exception_throw_adhoc(tc, "Received an invalid socket family value");
        break;
}

My patches work for me(tm) and have only been tested on FreeBSD using syncsocket's. asyncsockets NEED to be reviewed as well.

This would break async sockets as is since they don't pass the family down through port and it would assume the type of address it wants is PF_UNSPEC (or PF_LOCAL if the above fix is used). There are a few different ways I can think of to fix this:

  • use a different function for resolving hostnames for async sockets (maybe use uv_getaddrinfo)
  • add an extra parameter like bool has_family to MVM_io_resolve_host_name and check its value when getting the address
  • refactor async sockets to allow specifying families

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants