Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URLs without protocol causes error in windows but not linux #619

Closed
florian-j-auer opened this issue Oct 14, 2019 · 8 comments
Closed

URLs without protocol causes error in windows but not linux #619

florian-j-auer opened this issue Oct 14, 2019 · 8 comments

Comments

@florian-j-auer
Copy link

I got a different behavior using httr in windows and in linux. It apeared first time at the end of August on the automatic build of my bioconductor package (ndexr).

It seems that omitting the protocol in the url causes an error on windows machines since:

httr::GET(url="www.ndexbio.org/v2/admin/status")
#Error in if (is_http) { : argument is of length zero

linux or mac os are not affected.

@cderv
Copy link
Contributor

cderv commented Oct 14, 2019

I think this is an issue with curl with curl::fetch_memory resulting differently on windows and linux.

This is with curl 4.2.

On rstudio cloud, with R 3.6.0

library(curl)
    url <- "www.ndexbio.org/v2/admin/status"
    h <- new_handle()
    resp <- curl_fetch_memory(url, handle = h)
    resp$url
#> [1] "http://www.ndexbio.org/v2/admin/status"

Created on 2019-10-14 by the reprex package (v0.3.0)

With Windows 10, on my computer,

library(curl)
url <- "www.ndexbio.org/v2/admin/status"
h <- new_handle()
resp <- curl_fetch_memory(url, handle = h)
resp$url
#> [1] "www.ndexbio.org/v2/admin/status"

Created on 2019-10-14 by the reprex package (v0.3.0)

As you see, the http is added in the result on linux, but not on windows.
httr is currently testing the curl result to determine if the result is http or not, and that is where the error is. There must have been a change in curl I guess...

@cderv
Copy link
Contributor

cderv commented Oct 14, 2019

I did further tests on windows and I found that it worked with curl 3.3 but not with curl 4.0.
In 4.0, there was an upgrade to libcurl 7.64.1. So the issue is in curl.

@cderv
Copy link
Contributor

cderv commented Oct 14, 2019

I have just also realise that this is a duplicate of #607

@jonathon-love
Copy link

hi,

i must be missing something, because i can't see how this error could be caused by curl. parse_url() will return a scheme of NULL when the scheme is missing:

https://github.com/r-lib/httr/blob/master/R/url.r#L38-L48

then request_perform() fails here because $scheme is NULL with that error:

https://github.com/r-lib/httr/blob/master/R/request.R#L159-L161

trivial fix is for parse_url() to set the scheme to http when none is provided.

happy to submit a PR.

cheers

@cderv
Copy link
Contributor

cderv commented Oct 15, 2019

You spotted the right lines, but parse_url then is_http are on resp$scheme with resp being the response from request_fetch which uses curl_fetch_memory

resp <- request_fetch(req$output, req$url, handle)

hence the hint on curl behavior.
There was a change of behavior in curl 4.0 on windows. Before, the scheme was added by curl itself (req$url has no scheme, resp$url has a scheme), from 4.0 it wasn't. On linux, it is still the case, the scheme is added. Having a different behavior on windows and linux is not ideal.

However, I completely agree that a fix could be done in httr directly by handling missing scheme before doing request_fetch. or assuming curl response if from http. Though it should not break the initial fix while assuming a missing scheme is for http because the aim of is_http is to deal with non http method (introduced in #549).

It possible that the curl behavior is now intented and a fix must be suggested here.
Please, feel free to submit PR if you wish. However, I won't be the one to review it, you'll have to wait for httr team. here, I am just trying to help find the root cause.

@cderv
Copy link
Contributor

cderv commented Oct 15, 2019

It possible that the curl behavior is now intented

From a comment from curl developer in jeroen/curl#209 (comment), it seems it could be a regression

@cderv
Copy link
Contributor

cderv commented Oct 16, 2019

This is now fixed in last dev version of curl. See the issue there.
But I'll guess it will be fixed in httr with next curl release.

@jonathon-love
Copy link

it would still be good to merge #620 though ... the lag between fixes and them arriving in debian/ubuntu distros is often very long. the ubuntu LTS, for example, will probably never receive that update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants