Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolve: reject host names containing spaces #2073

Closed
wants to merge 1 commit into from

Conversation

mkauf
Copy link
Contributor

@mkauf mkauf commented Nov 12, 2017

Some implementations of getaddrinfo() resolve 127.0.0.1 www.example.com
to 127.0.0.1. Reject such invalid host names.

There are conference talks by Orange Tsai about this topic. He specifically mentions curl and explains why this is a problem:

Affected getaddrinfo() implementations:

Not affected:

  • OpenBSD
  • Windows

Test program: getaddrinfotest.c

Some implementations of getaddrinfo() resolve "127.0.0.1 www.example.com"
to "127.0.0.1". Reject such invalid host names.

Reported-by: Orange Tsai
@jay
Copy link
Member

jay commented Nov 12, 2017

no objection, as discussed in the redhat bug that seems to be a valid delim where you could do "foo bar baz" and then x[0], x[4], x[8] without terminating, but I can't think of any reason we would want to allow that in the hostname. i'll watch his presentation this week so i can get more detail

@bagder
Copy link
Member

bagder commented Nov 12, 2017

Shouldn't we rather fail already in url.c:parseurlandfillconn() if there's a space in the host name?

@bagder
Copy link
Member

bagder commented Nov 13, 2017

I will maintain and emphasize that curl still doesn't do a 100% validity check on the URL so the fact that curl considers a URL to be slightly different than other URL parsers when given crap is not surprising. I will also maintain that this is also a direct result of the fact that the URL spec situation right now is mostly anarchy and decay. It is hard to know how to parse "URLs".

@jay
Copy link
Member

jay commented Nov 13, 2017

That's related but somewhat separate. Essentially the parser is a little forgiving to be compatible with the wackiness out there, but I can't think of any broken implementation writing a Location like
http://foo bar/baz
I'm sure you would agree that just doesn't make any sense.

@bagder
Copy link
Member

bagder commented Nov 13, 2017

I do indeed think we should reject spaces in host names. My comment about not checking 100% correctness was more in regards to the general problem discussed in the presentation by Orange.

@mkauf
Copy link
Contributor Author

mkauf commented Nov 14, 2017

"foo bar baz"

getaddrinfo() only accepts names containing whitespace if the first element is a numerical IP address. The reason is that some getaddrinfo() implementations call inet_aton(). The input for inet_aton() may be "zero-terminated" or "whitespace-terminated". See inet_addr.c (glibc)

Shouldn't we rather fail already in url.c:parseurlandfillconn() if there's a space in the host name?

Makes sense, because there are some other URL syntax checks in this function.

On the other hand, it's probably easier to let getaddrinfo() check the syntax of the hostname. The rules for valid hostnames may be complex.

@bagder
Copy link
Member

bagder commented Nov 14, 2017

On the other hand, it's probably easier to let getaddrinfo() check the syntax of the hostname. The rules for valid hostnames may be complex.

Right, a check in url.c:parseurlandfillconn() could check for valid URL letters, and space is not one...

@krytarowski
Copy link

Is the NetBSD behavior wrong? If so, I will report it as a bug.

@bagder
Copy link
Member

bagder commented Nov 17, 2017

@krytarowski I have no doubts that the libc hackers will have their entirely own and unique perspective on whether this is the correct behavior or not... ask them!

bagder added a commit that referenced this pull request Nov 17, 2017
Host names like "127.0.0.1 moo" would otherwise be accepted by some
getaddrinfo() implementations.

Fixes #2073
@mkauf
Copy link
Contributor Author

mkauf commented Nov 17, 2017

@krytarowski yes, please file a bug report for NetBSD. While you are at it, could you also file a bug report for FreeBSD...? They probably use the same code anyway and could share the bugfix.

@krytarowski
Copy link

yes, please file a bug report for NetBSD.

OK, thanks!

could you also file a bug report for FreeBSD...?

No.

@krytarowski
Copy link

Hmm.. I was trying to find a reproducer in C that behaves differently on NetBSD/FreeBSD/Linux vs OpenBSD. I don't observe a different behavior.

Could you please show an example code to show the problem.

bagder added a commit that referenced this pull request Nov 17, 2017
Host names like "127.0.0.1 moo" would otherwise be accepted by some
getaddrinfo() implementations.

Fixes #2073
@mkauf
Copy link
Contributor Author

mkauf commented Nov 18, 2017

Could you please show an example code to show the problem.

Example code: getaddrinfotest.c

@krytarowski
Copy link

I have an access to a remote OpenBSD host with 6.1.

$ ./a.out                                                                                                                            
127.0.0.1
$ uname -a
OpenBSD obsd 6.1 GENERIC#19 i386

My desktop:

rugged$ ./a.out  
127.0.0.1
rugged$ uname -a
NetBSD rugged 8.99.1 NetBSD 8.99.1 (GENERIC) #7: Mon Jul  3 15:56:30 CEST 2017  root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64

The result is the same.

@mkauf
Copy link
Contributor Author

mkauf commented Nov 18, 2017

In my tests, OpenBSD has sent a query to the DNS server to resolve 127.0.0.1 www.example.com. So the result depends on the DNS server that the system uses, but OpenBSD is not (directly) affected by the bug. I have tested with Google's DNS server (8.8.8.8) in /etc/resolv.conf.

My test results (OpenBSD run inside a VMware VM):

$ uname -a
OpenBSD openbsd.localdomain 6.2 GENERIC#163 i386
$ ./a.out
getaddrinfo: no address associated with name
$ uname -a
OpenBSD openbsd.localdomain 6.1 GENERIC#291 i386
$ ./a.out
getaddrinfo: no address associated with name
$ uname -a
OpenBSD openbsd.localdomain 6.0 GENERIC#1917 i386
$ ./a.out
getaddrinfo: no address associated with name

@krytarowski
Copy link

So, is there a better test-case regardless of DNS server?

bagder added a commit that referenced this pull request Nov 20, 2017
Host names like "127.0.0.1 moo" would otherwise be accepted by some
getaddrinfo() implementations.

Updated test 1034 and 1035 accordingly.

Fixes #2073
@mkauf
Copy link
Contributor Author

mkauf commented Nov 21, 2017

So, is there a better test-case regardless of DNS server?

I don't know. I don't mind reporting the bug myself with the testcase at hand.

@krytarowski
Copy link

krytarowski commented Nov 22, 2017

Someone (TCP/IP stack engineer) suggested me that OpenBSD is wrong and NetBSD is behaving correctly. This behavior depending on DNS is suspicious.

@bagder bagder closed this in fa93922 Nov 22, 2017
@mkauf
Copy link
Contributor Author

mkauf commented Dec 17, 2017

I have reported the bug in getaddrinfo() to FreeBSD and NetBSD.

@mkauf mkauf deleted the getaddrinfo_whitespace branch August 11, 2018 11:05
@lock lock bot locked as resolved and limited conversation to collaborators Nov 9, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants