Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow cURL's rules for parsing and matching NO_PROXY #1332

Merged
merged 9 commits into from Oct 7, 2021

Conversation

abatkin
Copy link
Contributor

@abatkin abatkin commented Sep 20, 2021

There are a few ways in which reqwest's handling of NO_PROXY differs from cURL (and other implementations). The biggest issue is that whitespace between entries should be ignored/trimmed, but is not (i.e. "NO_PROXY='a, b'" would never match "b"). In addition, according to cURL's rules, a NO_PROXY entry without a leading dot should match the domain itself as well as any subdomains (reqwest only handles exact matches if there is no leading dot) and entries with a leading dot should only match subdomains (but request allows exact matches). Finally, cURL allows a special entry "*" to match all entries (effectively disabling use of the proxy).

All changes in behavior have corresponding changes to the tests.

I tried to keep the logic clear/clean/idiomatic but I'm definitely open to suggestions for improvement.

src/proxy.rs Show resolved Hide resolved
@seanmonstar
Copy link
Owner

Thanks for this! I think trying to match curl's rules is a great goal. Just to help me reason on the change, will this "break" any behavior that people might have come to rely on? I realize in a crazy way, people can rely on any ol' bug, but I mean if things were sufficiently different.

@abatkin
Copy link
Contributor Author

abatkin commented Sep 23, 2021

The addition of wildcard (*) support is a new feature that shouldn't break anything, and handling embedded spaces in lists (i.e. google.com,microsoft.com and google.com, microsoft.com should be equivalent) I look at as a bugfix.

This does change behavior in a breaking way for a couple cases:

no_proxy prior behavior new behavior
no_proxy=google.com (no leading dot) google.com (exact match) would match and not use the proxy, www.google.com (subdomains) would not match and would use the proxy google.com (exact match) will will match and will use the proxy, www.google.com (subdomains) will also match and will not use the proxy
no_proxy=.google.com (leading dot) google.com (exact match) and www.google.com (subdomains) would match and not use the proxy google.com (exact match) would not match and will use the proxy, www.google.com (subdomains) will continue to match and use the proxy

I would argue that the first of these changes is absolutely correct (i.e. if this breaks someone, they need to fix their no_proxy) whereas the second one is a little more questionable. Doing some more research, it does appear that - while cURL follows these (newly implemented) rules - not all other implementations do. In other words, some implementations treat no_proxy=google.com and no_proxy=.google.com identically (match on exact strings sans dot, as well as substrings).

I can undo that second bit if you think it is safer. So: in this instance, should reqwest prefer better compatibility with the existing behavior, or copy cURL's behavior? (I'm torn)

@abatkin
Copy link
Contributor Author

abatkin commented Sep 23, 2021

I should add that reqwest's no_proxy behavior already diverges from cURL: reqwest allows IP address (both v4 and v6) matches to specify a "prefix mask" (i.e. no_proxy=192.168.1.0/24 would match 192.168.1.17) whereas cURL only allows exact string matches. There are other implementations that do what reqwest does, and I'd say the behavior here is far superior compared to cURL.

@abatkin
Copy link
Contributor Author

abatkin commented Oct 3, 2021

@seanmonstar Rebased against master, reverted the (incompatible) bit where NO_PROXY entries beginning with . no longer matched themselves, and added some docs to the NoProxy struct.

Copy link
Owner

@seanmonstar seanmonstar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted the (incompatible) bit where NO_PROXY entries beginning with . no longer matched themselves

I kinda think we could make that change even though it's breaking, since it doesn't seem like someone would normally be using .foo.bar unless trying to get the exact behavior of curl. But we can also make that a follow-up... Either way, thanks for this! It's excellent!

@seanmonstar seanmonstar enabled auto-merge (squash) October 5, 2021 00:19
@seanmonstar seanmonstar merged commit 203cd5b into seanmonstar:master Oct 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants