url: Fix parsing for when 'file' is the default protocol #1124

Closed

@jay
Member
jay commented Nov 13, 2016

Follow-up to 3463408.

Prior to this set of changes hostnames were silently stripped and the
file scheme did not work properly as the default protocol.

Ref: https://curl.haxx.se/mail/lib-2016-11/0081.html


Also, I thought the error message on file://foo/bar was confusing, so I changed it:

Before: curl: (3) Valid host name with slash missing in URL
After:  curl: (3) Invalid file://hostname/, expected localhost or 127.0.0.1 or none
@mention-bot

@jay, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @dfandrich and @captain-caveman2k to be potential reviewers.

lib/url.c
- protobuf, path)) &&
- strcasecompare(protobuf, "file")) {
+ for(i = 0; i < 16 && data->change.url[i] &&
+ data->change.url[i] != '\n'; ++i) {
@bagder
bagder Nov 13, 2016 Member

This function already verifies that there is no CR or LF in the URL, immediately before this code snippet so this newline check is superfluous.

@bagder
bagder Nov 13, 2016 Member

Thinking about it, I suppose the check for 16 bytes scheme could instead be modified to the longest supported scheme. Which I think is a draw between TELNET and GOPHER, at merely 6 bytes.

@jay
jay Nov 14, 2016 Member

OK, I'll work on a second draft. I did the things you commented on that way because I was following what is already in the function as a model.
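
For readers following the review, the bounded scheme scan under discussion can be sketched roughly as below. This is only an illustration of the suggestion to cap the scan at the longest supported scheme, not the code from the patch. The names `MAX_SCHEME_LEN` and `find_scheme_colon` are hypothetical, and the value 6 is taken from the comment above rather than from the curl sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit: the longest scheme length mentioned in the review. */
#define MAX_SCHEME_LEN 6

/* Scan at most MAX_SCHEME_LEN + 1 bytes of the URL for the ':' that ends
 * the scheme. Returns the index of the colon, or -1 if none was found in
 * that window. A return value of 0 means the URL starts with ':' and has
 * an empty scheme, which the caller would report as a bad URL. No newline
 * check is done here, since the review notes CR/LF is rejected earlier. */
static int find_scheme_colon(const char *url)
{
  size_t i;
  assert(url != NULL);
  for(i = 0; i <= MAX_SCHEME_LEN && url[i]; ++i) {
    if(url[i] == ':')
      return (int)i;
  }
  return -1;
}
```

Note that under this sketch a schemeless Windows path such as `c:/foo/bar.txt` yields index 1, so a one-letter "scheme" is indistinguishable from a drive letter at this stage and has to be handled separately.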

lib/url.c
+ data->change.url[i] != '\n'; ++i) {
+ if(data->change.url[i] == ':') {
+ if(!i) {
+ failf(data, "Bad URL");
@bagder
bagder Nov 13, 2016 Member

Maybe even spell out that there was no colon/valid scheme in the URL?

@@ -4073,7 +4098,8 @@ static CURLcode parseurlandfillconn(struct Curl_easy *data,
char *ptr;
if(!checkprefix("localhost/", path) &&
!checkprefix("127.0.0.1/", path)) {
- failf(data, "Valid host name with slash missing in URL");
+ failf(data, "Invalid file://hostname/, "
+ "expected localhost or 127.0.0.1 or none");
@bagder
bagder Nov 13, 2016 Member

We're back to what we discussed on the mailing list. I tried to make the message also convey that there needs to be a valid hostname and slash in the URL as this message will be shown for file://localhost as well, which does have a valid host name but no slash. But sure, error messages are hard and if you think this is better then I don't mind.
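
To make the point about the missing slash concrete, here is a minimal sketch of the host check, assuming the input is whatever follows `file://`. The helper names `has_prefix` and `file_host_ok` are hypothetical. `has_prefix` is a simplified, case-sensitive stand-in for curl's `checkprefix`, and accepting an empty host is an assumption drawn from the "or none" wording of the new message.

```c
#include <assert.h>
#include <string.h>

/* Simplified, case-sensitive stand-in for checkprefix() (hypothetical). */
static int has_prefix(const char *prefix, const char *s)
{
  return strncmp(prefix, s, strlen(prefix)) == 0;
}

/* 'rest' is the part of the URL after "file://".
 * Returns 1 if the host part is acceptable, 0 otherwise. */
static int file_host_ok(const char *rest)
{
  assert(rest != NULL);
  if(rest[0] == '/')
    return 1; /* assumption: empty host, as in file:///path */
  /* The trailing slash is part of the prefix, so a bare "localhost"
   * with no slash after it is rejected too. */
  return has_prefix("localhost/", rest) || has_prefix("127.0.0.1/", rest);
}
```

With this sketch both `foo/bar` and a bare `localhost` are rejected, which is why a single error message has to cover an invalid host name as well as a valid one that lacks the slash.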

@jay
Member
jay commented Dec 6, 2016

@bagder I made the changes you requested, however I did not change the magic numbers of 15/16 since I wanted to be consistent with the size of protobuf sscanf used elsewhere in the function.

jay added some commits Nov 13, 2016
@jay jay url: Fix parsing for when 'file' is the default protocol. draft1
Follow-up to 3463408.

Prior to 3463408 file:// hostnames were silently stripped.

Prior to this commit it did not work when a schemeless url was used with
file as the default protocol.

Ref: https://curl.haxx.se/mail/lib-2016-11/0081.html
Ref: curl#1124
3cf8303
@jay jay url: Fix parsing for when 'file' is the default protocol. draft2
Squash this into draft1

- Support --proto-default file c:/foo/bar.txt

- Support file://c:/foo/bar.txt

Assisted-by: Anatol Belski
5857a56
@jay
Member
jay commented Jan 9, 2017

@bagder can you check this again, I did a second draft to cover --proto-default file c:/foo/bar.txt and file://c:/foo/bar.txt. Note I didn't use isalpha for the drive letters since I wasn't sure it would be C locale. I could put the ('a' <= path[0] && path[0] <= 'z') || ('A' <= path[0] && path[0] <= 'Z') into a macro if preferred.

@weltling
Contributor
weltling commented Jan 9, 2017

@jay the current version of your patch fully restores backward compatibility. Please let me know if I have to test any follow-up changes later on.

Thanks.

@bagder
bagder approved these changes Jan 10, 2017

It looks fine to me. I only have one little concern: this has no #ifdefs for windows or windows-like systems so what happens when you try this URL on a *nix system? Shouldn't it fail the URL parsing then?

@jay
Member
jay commented Jan 11, 2017 edited

> I only have one little concern: this has no #ifdefs for windows or windows-like systems so what happens when you try this URL on a *nix system? Shouldn't it fail the URL parsing then?

I thought it would be cleaner, and it could fail on fopen if the OS doesn't support drive letters. If you want I can wrap those areas in #if defined(WIN32) || defined(MSDOS)

@bagder
Member
bagder commented Jan 11, 2017

> I thought it would be cleaner, and it could fail on fopen if the OS doesn't support drive letters.

You're right. The biggest difference is then what error code it would return back, if we avoid the unlikely event that there actually can be a file named like that on a non-windows machine.

IMO: I think it is safer that we return "malformed URL" earlier and pay the price with the additional #ifdef.

@jay
Member
jay commented Jan 11, 2017

> IMO: I think it is safer that we return "malformed URL" earlier and pay the price with the additional #ifdef.

Well, how about this: immediately after the file URL parsing, fail if a drive letter is detected and the platform is not MSDOS/Windows. This way the flow is the same for all platforms, but there's only one guard and we catch it as malformed rather than passing it around. For example, see draft3.

@jay jay url: Fix parsing for when 'file' is the default protocol. draft3
Squash me into previous draft

- Fail when a file:// drive letter is detected and not MSDOS/Windows.
dd99723
@bagder
Member
bagder commented Jan 12, 2017

That's perfectly fine with me and makes the error message even clearer. 👍

@jay jay added a commit that closed this pull request Jan 12, 2017
@jay jay url: Fix parsing for when 'file' is the default protocol
Follow-up to 3463408.

Prior to 3463408 file:// hostnames were silently stripped.

Prior to this commit it did not work when a schemeless url was used with
file as the default protocol.

Ref: https://curl.haxx.se/mail/lib-2016-11/0081.html
Closes #1124

Also fix for drive letters:

- Support --proto-default file c:/foo/bar.txt

- Support file://c:/foo/bar.txt

- Fail when a file:// drive letter is detected and not MSDOS/Windows.

Bug: #1187
Reported-by: Anatol Belski
Assisted-by: Anatol Belski
1d4202a
@jay jay closed this in 1d4202a Jan 12, 2017
@jay jay deleted the jay:fix_file_default-protocol branch Jan 12, 2017