Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle URLs with a colon after host but no port #5108

Merged
merged 3 commits into from
Jun 13, 2019

Conversation

ethomson
Copy link
Member

Core git copes with URLs that have a colon after the port, but no actual numeric value. eg http://example.com:/foo.git or http://example.com:. That's horrible, but RFC 3986 says:

URI producers and normalizers should omit the port component and its
":" delimiter if port is empty or if its value would be the same as
that of the scheme's default.

Which indicates that they may and therefore we must accept it.

Test that we can handle URLs with a colon but no following port number, fix our http-parser implementation, and require use of our http-parser which corrects this issue.

Fixes #5100.

Core git copes with URLs that have a colon after the port, but no actual
numeric value.  eg `http://example.com:/foo.git` or
`http://example.com:`.  That's horrible, but RFC 3986 says:

> URI producers and normalizers should omit the port component and its
> ":" delimiter if port is empty or if its value would be the same as
> that of the scheme's default.

Which indicates that they may and therefore we must accept it.

Test that we can handle URLs with a colon but no following port number.
When the end of the host is reached, and we're at the colon separating
the host with the port (ie, there is no numeric port) then do not error.
This is allowed by RFC 3986.
@ethomson
Copy link
Member Author

I suspect this will be contentious. 😀

@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch 2 times, most recently from f135d57 to d8d0ac6 Compare June 12, 2019 12:41
@ethomson
Copy link
Member Author

I updated this to prefer out http-parser but allow users to specify the system http-parser if they want, even sans a version check, because this bug really isn't that serious. Basically, it allows them to opt-in to very slightly broken URL parsing if they want.

I updated the CMake option for this so that users with an existing option wouldn't accidentally opt-in to it, which also allowed us to align closer to the zlib option.

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading RFC3986 several times I agree that your fix is correct. Some bikeshedding on the CMake option name, otherwise I'm fine with your changes. Thanks!

CMakeLists.txt Outdated
OPTION(DEBUG_POOL "Enable debug pool allocator" OFF)
OPTION(ENABLE_WERROR "Enable compilation with -Werror" OFF)
OPTION(USE_BUNDLED_ZLIB "Use the bundled version of zlib" OFF)
OPTION(USE_BUNDLED_HTTP_PARSER "Use the bundled HTTP Parser" ON)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, didn't know we already had an option. If we're about to change, then we should use a generic USE_HTTP_PARSER=(bundled|system), if you ask me. The reasoning is that this would be much easier to extend. For example at a future point, we'd be able to add an option USE_HTTP_PARSER=/path/to/http_parser.

http-parser that we have fixed in our implementation. The bundled
library is now the default, but if you wish to force the use of the
system http-parser implementation despite incompatibilities, you can
specify `-DUSE_BUNDLED_HTTP_PARSER=OFF` to CMake.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should treat changes to CMake options as breaking changes and thus highlight this in a separate section

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we have a section titled "Breaking API Changes" at the bottom. CMake changes don't feel like they fit there, so I created a new section "Breaking CMake configuration changes" and slotted this in there. I also moved the "breaking" stuff to the top. We should be done breaking APIs, but if not, it should be the first thing people see in the release notes, IMO.

@pks-t
Copy link
Member

pks-t commented Jun 13, 2019 via email

@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch 2 times, most recently from 531d9b3 to 3fb6d3d Compare June 13, 2019 19:50
Our bundled http-parser includes bugfixes, therefore we should prefer
our http-parser until such time as we can identify that the system
http-parser has these bugfixes (using a version check).

Since these bugs are - at present - minor, retain the ability for users
to force that they want to use the system http-parser anyway.  This does
change the cmake specification so that people _must_ opt-in to the new
behavior knowingly.
@ethomson ethomson force-pushed the ethomson/urlparse_empty_port branch from 3fb6d3d to fb529a0 Compare June 13, 2019 20:24
@ethomson ethomson merged commit e277ff4 into master Jun 13, 2019
@ethomson
Copy link
Member Author

Upstream: nodejs/http-parser#483

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handling of port component in URLs (with schema) differs from git
2 participants