Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relax parsing restrictions around host and hostname #1606

Merged
merged 2 commits into from Feb 16, 2020

Conversation

@pvande
Copy link
Contributor

@pvande pvande commented Feb 16, 2020

This approach allows us to match and process any potential input and still meet the contractual requirements of Request#split_authority without requiring the input to be well-formed or RFC-compliant. In turn, this eliminates unexpected failures on invalid-but-functional inputs.

If we deemed it necessary to do validation (particularly on IP addresses), the AUTHORITY regex still provides enough context on which validations should apply, though doing so would require some consideration on what should be returned upon validation failure.

If validations can be ruled unnecessary – I suspect this should be the case – the implementation could be replaced by the semantically identical:

def split_authority(authority)
  /\A(?<host>\[\g<addr>\]|(?<addr>.*?))(:(?<port>\d+))?\Z/ =~ authority
  return host, addr, port&.to_i
end

This gives up differentiation between IPv6, IPv4, and DNS addresses, but is arguably simpler.

(Resolves #1604)

pvande added 2 commits Feb 16, 2020
This approach allows us to match and process any potential input and still meet the contractual requirements of `Request#split_authority` _without_ requiring the input to be well-formed or RFC-compliant.  In turn, this eliminates unexpected failures on invalid-but-functional inputs.

If we deemed it necessary to do validation (particularly on IP addresses), the `AUTHORITY` regex still provides enough context on which validations should apply, though doing so would require some consideration on what should be returned upon validation failure.

If validations can be ruled unnecessary – I suspect this should be the case – the implementation could be replaced by the semantically identical:

``` ruby
def split_authority(authority)
  /\A(?<host>\[\g<addr>\]|(?<addr>.*?))(:(?<port>\d+))?\Z/ =~ authority
  return host, addr, port&.to_i
end
```

This gives up differentiation between IPv6, IPv4, and DNS addresses, but is arguably simpler.
@pvande pvande requested a review from ioquatix Feb 16, 2020
@ioquatix
Copy link
Member

@ioquatix ioquatix commented Feb 16, 2020

This looks good to me, you took what I did and made it way better. Thanks!

@ioquatix ioquatix merged commit 766761b into rack:master Feb 16, 2020
18 checks passed
18 checks passed
test (ubuntu-latest, 2.3)
Details
test (ubuntu-latest, 2.4)
Details
test (ubuntu-latest, 2.5)
Details
test (ubuntu-latest, 2.6)
Details
test (ubuntu-latest, 2.7)
Details
test (ubuntu-latest, jruby)
Details
test (ubuntu-latest, truffleruby-head)
Details
test (macos-latest, 2.3)
Details
test (macos-latest, 2.4)
Details
test (macos-latest, 2.5)
Details
test (macos-latest, 2.6)
Details
test (macos-latest, 2.7)
Details
test (macos-latest, jruby)
Details
test (macos-latest, truffleruby-head)
Details
external (ubuntu-latest, 2.6)
Details
external (ubuntu-latest, 2.7)
Details
external (macos-latest, 2.6)
Details
external (macos-latest, 2.7)
Details
@ioquatix
Copy link
Member

@ioquatix ioquatix commented Feb 16, 2020

I’ll backport to 2-2-stable as I consider this a bug fix.

@ioquatix ioquatix added this to In progress in Development via automation Feb 16, 2020
@ioquatix ioquatix moved this from In progress to Backport in Development Feb 16, 2020
@ioquatix ioquatix added this to the v2.2.3 milestone Feb 16, 2020
AUTHORITY = /^
# The host:
AUTHORITY = /
\A

This comment has been minimized.

@ioquatix

ioquatix Feb 16, 2020
Member

@pvande what is the difference between \A and \Z vs ^ and $?

This comment has been minimized.

@jeremyevans

jeremyevans Feb 16, 2020
Collaborator

\A matches the beginning of the string. ^ matches any beginning of line in the string. \Z matches the end of string, allowing possible newline at the end of the string. $ matches the end of any line in the string.

In general, you are much more likely to want \A and \Z (or \z, which matches the end of string) than ^ and $. Using ^ and $ usually leads to bugs, in my experience.

@pvande pvande deleted the pvande:relax-hostname-validations branch Feb 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development
  
Backport
Linked issues

Successfully merging this pull request may close these issues.

3 participants
You can’t perform that action at this time.