Skip to content

HttpProxyHandler generates invalid CONNECT url and Host header when address is resolved#7784

Merged
normanmaurer merged 1 commit into
netty:4.1from
slandelle:4.1
Mar 27, 2018
Merged

HttpProxyHandler generates invalid CONNECT url and Host header when address is resolved#7784
normanmaurer merged 1 commit into
netty:4.1from
slandelle:4.1

Conversation

@slandelle
Copy link
Copy Markdown
Contributor

Motivation:

HttpProxyHandler uses NetUtil#toSocketAddressString to compute
CONNECT url and Host header.

The url is correct when the address is unresolved, as
NetUtil#toSocketAddressString will then use
getHoststring/getHostname. If the address is already resolved, the
url will be based on the IP instead of the hostname.

There’s an additional minor issue with the Host header: default port
443 should be omitted.

Modifications:

  • Only use NetUtil.toSocketAddressString when address is unresolved,
    use getHostname otherwise
  • Don’t append port to Host header when it’s 443

Result:

HttpProxyHandler performs properly when connecting to a resolved address

@normanmaurer
Copy link
Copy Markdown
Member

@slandelle this produced multiple test errors... please check

@slandelle
Copy link
Copy Markdown
Contributor Author

Yes, I'll have a look.

But I think those tests are broken.
For example, setting the SslHandler before the HttpProxyHandler in the pipeline causes the CONNECT request to be sent over HTTPS instead of clear HTTP.
Those tests only pass because the proxy is a fake one and is broken too.

@normanmaurer
Copy link
Copy Markdown
Member

@slandelle then fix the broken tests ;)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slandelle should we make this a utility method in the http package ? Like in HttpUtils ? I think this is something that is useful in general.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do, sure. But does it make sense to introduce this now while work on Netty 5 targeting Java 8 is about to start?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slandelle I think it still does as netty 5 is most likely still be 1 year away

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This utility would be better suited for NetUtil than HttpUtil, don't you think?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure... isnt it kind of related how you format stuff for the http header ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, I thought you were only talking about the first line above your comment, not the whole block. Will do.

@slandelle
Copy link
Copy Markdown
Contributor Author

@normanmaurer @trustin Note that tests passes, but I didn't fix the logic in ProxyHandlerTest that looks broken to me. That would be a separate issue.

@normanmaurer
Copy link
Copy Markdown
Member

@slandelle sounds good... if you have time we love PRs as you know ;)

@kachayev
Copy link
Copy Markdown
Contributor

kachayev commented Mar 16, 2018

@slandelle @normanmaurer I have a question regarding

default port 443 should be omitted

RFC 7230 and older RFC2616 on Host header and RFC7231 on CONNECT method do not require excluding 443 port from the header.

RFC2616 says

A "host" without any trailing port information implies the default port for the service requested

Which might be applied to 80 and 443 if we still honor 2616, but not exclusively to 443.

@slandelle
Copy link
Copy Markdown
Contributor Author

@kachayev From my experience with AHC), there are some servers that really expect with port. And that's how browsers compute Host headers too.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you make this change?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

@kachayev
Copy link
Copy Markdown
Contributor

kachayev commented Mar 16, 2018

@slandelle With the latest changes getHostHeader implementation makes perfect sense and correlates with RFCs (excluding both default ports). But here you always call the method with secure set to true, when in fact I can use http-only proxy. And here url is calculated by checking only 443 (my point that it's better either honor both default ports or none of them).

@slandelle
Copy link
Copy Markdown
Contributor Author

@kachayev You wouldn't use this handler for http-only proxy, the mechanism is different and doesn't use a CONNECT request (directly connect to proxy and send absolute url instead of relative). I have to check for WebSockets then.

@kachayev
Copy link
Copy Markdown
Contributor

@slandelle "wouldn't use" doesn't mean it's not allowed. RFC7231 I mentioned above does not specify tunnel to be used only with a secure connection. CURL, for example, does not establish tunnel when connecting to HTTP proxy by default but you still can force this:

$ curl -vvv -o /dev/null -x http://localhost:8888 --proxytunnel http://www.google.com
* Rebuilt URL to: http://www.google.com/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
*   Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 8888 (#0)
* Establish HTTP proxy tunnel to www.google.com:80
> CONNECT www.google.com:80 HTTP/1.1
> Host: www.google.com:80
> User-Agent: curl/7.54.0
> Proxy-Connection: Keep-Alive
>
< HTTP/1.1 200 Connection established
<
* Proxy replied OK to CONNECT request
> GET / HTTP/1.1
> Host: www.google.com
> User-Agent: curl/7.54.0
> Accept: */*
>
< HTTP/1.1 200 OK

@slandelle
Copy link
Copy Markdown
Contributor Author

@kachayev Which proxy do you use? Then, what do you suggest? Ignore 80 and 443? It's not possible to guess the actual scheme and current API doesn't let you specify.

@kachayev
Copy link
Copy Markdown
Contributor

kachayev commented Mar 16, 2018

@slandelle Charles for Mac, but that's irrelevant because a lot of proxies do support CONNECT over plain HTTP. My suggestion is not to ignore even default ports (neither 443 nor 80) as it might break functionality in some cases (at least we cannot guarantee that it would not).

@slandelle
Copy link
Copy Markdown
Contributor Author

slandelle commented Mar 16, 2018

My suggestion is not to ignore even default ports (neither 443 nor 80) as it might break functionality in some cases (at least we cannot guarantee that it would not).

From my experience, it's the other way around: things broke when the default port was there. And that's what AHC and browser does. I'm fine with not ignoring by default, but I'll introduce an option to override.

@slandelle
Copy link
Copy Markdown
Contributor Author

@normanmaurer PTAL again

@kachayev
Copy link
Copy Markdown
Contributor

@slandelle 👍

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slandelle you need to revert all the removals of throws... as its a breaking change. People who may have override the method and have throws on it will be in trouble.

@slandelle
Copy link
Copy Markdown
Contributor Author

@normanmaurer fixed

@carl-mastrangelo
Copy link
Copy Markdown
Member

cc: @zpencer

@normanmaurer
Copy link
Copy Markdown
Member

@carl-mastrangelo @ejona86 @trustin any of you can verify this one ?

@zpencer
Copy link
Copy Markdown
Contributor

zpencer commented Mar 20, 2018

I can take a look

@ejona86
Copy link
Copy Markdown
Member

ejona86 commented Mar 20, 2018

FWIW, the getHostString changes all look good. It's the ignoreDefaultPortsInConnectHostHeader that I needed to look at more (including the discussion here) before feeling comfortable approving.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a test for this case.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slandelle can you do this ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced it would make sense to test this single line of code.
Then, url and Host header computations are entangled so I can save allocations.
I can't test HttpProxyHandler#newInitialMessage because I would need to override destinationAddress() which is final.
I'm going to extract CONNECT request creation in a static method.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it should be fine. Stay tuned.

@slandelle
Copy link
Copy Markdown
Contributor Author

@normanmaurer @zpencer Host header tests added, PTAL again.

@normanmaurer
Copy link
Copy Markdown
Member

@trustin any concerns ?

@trustin
Copy link
Copy Markdown
Member

trustin commented Mar 22, 2018 via email

Copy link
Copy Markdown
Member

@normanmaurer normanmaurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits... otherwise LGTM

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/to/so/ ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can remove the else as you return anyway.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add . on EOL.

@slandelle
Copy link
Copy Markdown
Contributor Author

@normanmaurer done

Copy link
Copy Markdown
Member

@trustin trustin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small comment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess a user of this method may think the return value will contain the port number. How about adding a boolean parameter withPort, so that this method adds :<port> when it's true?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about trying to find a better name instead? Maybe formatHostnameForHttp?
I'm not fond of expanding an API beyond what's being used internally so it can serve for whatever other purpose we're not aware of.
WDYT?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's also fine.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

…ddress is resolved

Motivation:

HttpProxyHandler uses `NetUtil#toSocketAddressString` to compute
CONNECT url and Host header.

The url is correct when the address is unresolved, as
`NetUtil#toSocketAddressString` will then use
`getHoststring`/`getHostname`. If the address is already resolved, the
url will be based on the IP instead of the hostname.

There’s an additional minor issue with the Host header: default port
443 should be omitted.

Modifications:

* Introduce NetUtil#getHostname
* Introduce HttpUtil#formatHostnameForHttp to format an
InetSocketAddress to
HTTP format
* Change url computation to favor hostname instead of IP
* Introduce HttpProxyHandler ignoreDefaultPortsInConnectHostHeader
parameter to ignore 80 and 443 ports in Host header

Result:

HttpProxyHandler performs properly when connecting to a resolved address
@normanmaurer normanmaurer merged commit d60cd02 into netty:4.1 Mar 27, 2018
@normanmaurer
Copy link
Copy Markdown
Member

@slandelle thanks!

@normanmaurer normanmaurer added this to the 4.1.23.Final milestone Mar 27, 2018
@slandelle
Copy link
Copy Markdown
Contributor Author

@normanmaurer Thanks for merging

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants