httr2 #1

hadley · 2016-04-01T23:47:22Z

httr is a library for two main tasks: creating http requests and parsing the responses. Currently this dichotomy is a little muddled because:

There is no explicit request object - it is only created internally by
the request functions. This tends to lead to large request functions that
have many arguments passed in ...
There's no consistent naming scheme that indicates whether a function
works with a request or a response. This is particularly confusing for
functions like content_type(): does it set the content type of
the request or extract the content type of the response?

Additionally, httr was designed prior to the pipe and so uses ... rather than functions that modify an object. This makes the API feel rather different to similar APIs (e.g. rvest). It also makes it harder to test, because you can only easily test the result of issuing a request, not the internal request object. Overall the API feels a little dated, and a little underdesigned.
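A minimal sketch of why an explicit request object helps with testing. It uses only base R; the class name sketch_request and the list fields are assumptions made for illustration, not part of httr. The native pipe |> (R 4.1 or later) stands in for %>% so that the block runs without magrittr.

```r
# Hypothetical sketch: a request is a plain list with an S3 class.
req <- function(url) {
  stopifnot(is.character(url), length(url) == 1)
  structure(
    list(url = url, method = NULL, headers = character(), body = NULL),
    class = "sketch_request"
  )
}

# Each modifier takes a request and returns a modified copy.
req_method <- function(req, method) {
  req$method <- toupper(method)
  req
}

r <- req("http://url.com/") |> req_method("delete")

# Nothing has been sent, yet the request can be inspected and tested.
stopifnot(identical(r$method, "DELETE"))
stopifnot(identical(r$url, "http://url.com/"))
```

Because every modifier returns a new value, a unit test can assert on the fields of the request without any network access.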

Request API

Basics

req("http://url.com/") %>% req_fetch()

# req_fetch could be a generic so you could still do
req_fetch("http://url.com/")
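One way the generic idea could work is plain S3 dispatch, sketched below in base R. The transport is stubbed out: the sketch_request method only returns a description string, where a real implementation would hand the request to curl. The class name and the stub are assumptions for illustration.

```r
# Hypothetical request constructor for this sketch.
req <- function(url) {
  structure(list(url = url, method = "GET"), class = "sketch_request")
}

# The generic dispatches on whatever starts the chain.
req_fetch <- function(x, ...) UseMethod("req_fetch")

# A bare URL is first promoted to a request, then dispatched again.
req_fetch.character <- function(x, ...) req_fetch(req(x), ...)

# Stub: a real method would perform the HTTP request here.
req_fetch.sketch_request <- function(x, ...) {
  paste(x$method, x$url)
}

from_string  <- req_fetch("http://url.com/")
from_request <- req("http://url.com/") |> req_fetch()
stopifnot(identical(from_string, from_request))
```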

Performing the request

The HTTP method doesn't affect the input arguments or the output type, so that suggests that the key API verb should be the output, rather than the HTTP method:

req("http://url.com") %>% req_fetch()
req("http://url.com") %>% req_async()
req("http://url.com") %>% req_save(path = path)
req("http://url.com") %>% req_stream(fun = fun)

The request verb would default to GET, unless a body was set, in which case it would change to POST. Otherwise, you could override it yourself in two ways:

req("http://url.com") %>% req_method("DELETE") %>% req_fetch()
req("http://url.com") %>% req_fetch("DELETE")

The first form would be most useful when generating partial requests for API wrappers.
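The defaulting rule can be sketched as a small base R helper. The name resolve_method is hypothetical, and the precedence between the two override forms is an assumption: here an argument given at fetch time wins over a method stored on the request.

```r
# Hypothetical helper: decide which HTTP method a request would use.
resolve_method <- function(req, override = NULL) {
  # Assumed precedence: method passed at fetch time, e.g. req_fetch("DELETE")
  if (!is.null(override)) return(toupper(override))
  # Then a method stored earlier, e.g. via req_method("DELETE")
  if (!is.null(req$method)) return(req$method)
  # Otherwise GET, unless a body has been set
  if (is.null(req$body)) "GET" else "POST"
}

plain     <- list(url = "http://url.com")
with_body <- list(url = "http://url.com", body = list(a = 1))
stored    <- list(url = "http://url.com", method = "DELETE")

stopifnot(resolve_method(plain) == "GET")
stopifnot(resolve_method(with_body) == "POST")
stopifnot(resolve_method(with_body, "put") == "PUT")
stopifnot(resolve_method(stored) == "DELETE")
```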

Url

# Would generally make smaller functions that took multiple arguments
# This would make iterative url construction much nicer for APIs
req("http://url.com/") %>% 
  req_path("a", username, "b") %>%  # replaces
  req_path_suffix("y") %>%  # appends
  req_path_prefix("x") %>%  # for completeness
  req_query(y = 1) %>% 
  req_params() %>% 
  req_fetch()

# req would need to parse the url so you could start with
req("http://url.com/api/v1/?q=2")
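A rough base R sketch of how the path helpers and URL parsing might behave. The internal representation (base, path segments, query list) and the helper req_url() are assumptions for illustration. The sketch does no URL encoding and drops trailing slashes.

```r
# Hypothetical: parse a URL into base, path segments, and query parameters.
req <- function(url) {
  m <- regmatches(url, regexec("^(https?://[^/?]+)([^?]*)(\\?(.*))?$", url))[[1]]
  stopifnot(length(m) == 5)
  segs <- strsplit(m[3], "/", fixed = TRUE)[[1]]
  kv <- strsplit(m[5], "&", fixed = TRUE)[[1]]
  pairs <- strsplit(kv[nzchar(kv)], "=", fixed = TRUE)
  query <- lapply(pairs, `[`, 2)
  names(query) <- vapply(pairs, `[`, character(1), 1)
  list(base = m[2], path = segs[nzchar(segs)], query = query)
}

# Replace, append to, or prepend to the path segments.
req_path        <- function(req, ...) { req$path <- as.character(c(...)); req }
req_path_suffix <- function(req, ...) { req$path <- c(req$path, as.character(c(...))); req }
req_path_prefix <- function(req, ...) { req$path <- c(as.character(c(...)), req$path); req }

# Add or overwrite query parameters by name.
req_query <- function(req, ...) {
  req$query <- utils::modifyList(req$query, list(...))
  req
}

# Reassemble the URL (no percent-encoding in this sketch).
req_url <- function(req) {
  url <- paste0(req$base, "/", paste(req$path, collapse = "/"))
  if (length(req$query) > 0) {
    qs <- paste0(names(req$query), "=", unlist(req$query), collapse = "&")
    url <- paste0(url, "?", qs)
  }
  url
}

username <- "hadley"
built <- req("http://url.com/") |>
  req_path("a", username, "b") |>
  req_path_suffix("y") |>
  req_path_prefix("x") |>
  req_query(y = 1) |>
  req_url()

parsed <- req("http://url.com/api/v1/?q=2")
```

Here built is "http://url.com/x/a/hadley/b/y?y=1", and parsed holds the segments "api" and "v1" plus the query parameter q.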

Body

# This would also lead to a nicer API for bodies
req("http://url.com/") %>% 
  req_body_json(a = 1, b = 2, c = 3) %>% 
  req_fetch() 

# Setting body would change default request type, but you could
# override
req("http://url.com/") %>% 
  req_body_json(a = 1, b = 2, c = 3) %>% 
  req_fetch("PUT") 

req("http://url.com/") %>% 
  req_body_json(a = 1, b = 2, c = 3) %>% 
  req_save(path = "~/Desktop/bigfile.blah") 
  # If path is directory, automatically add name from url?

# req_body_file()
# req_body_form()
# req_body_json()
# req_body_multipart()
# req_body_raw()

Authentication

req("http://url.com/") %>% req_auth_basic()
req("http://url.com/") %>% req_auth_oauth1()
req("http://url.com/") %>% req_auth_oauth2()

Headers

req("http://url.com/") %>% 
  req_header(`Content-type` = "application/json") %>% 
  req_header(a_list_from_somewhere_else) # would unlist() inputs as needed.
# list(...) %>% flatten() %>% map_chr(as.character)
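The flattening step could look roughly like this in base R, without purrr. This is a sketch under assumptions: headers are stored as a named character vector, lists are passed unnamed (a named list argument would get its name prefixed by unlist()), and header names are treated as case-sensitive, which real HTTP headers are not.

```r
# Hypothetical request constructor for this sketch.
req <- function(url) list(url = url, headers = character())

# Accept named values and (unnamed) lists of named values; store as strings.
req_header <- function(req, ...) {
  new <- unlist(list(...))
  new <- stats::setNames(as.character(new), names(new))
  req$headers[names(new)] <- new   # later values overwrite earlier ones
  req
}

a_list_from_somewhere_else <- list(Accept = "text/html", `X-Retries` = 3)

r <- req("http://url.com/") |>
  req_header(`Content-type` = "application/json") |>
  req_header(a_list_from_somewhere_else)
```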

Curl

# And there would be a new function for setting specific curl requests
# These would be applied after other request parameters
req_config()

Response API

Basics

resp <- req("http://google.com") %>% req_fetch()

# Pipeable API to check that the response is what you expect
resp %>% 
  resp_check_ok() %>% 
  resp_check_body_xml() %>% 
  resp_content_xml()

# Other functions extract headers
resp %>% resp_headers()
resp %>% resp_content_type()

# And other http components
resp %>% resp_status()
resp %>% resp_url()
resp %>% resp_timings()
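A sketch of how the pipeable checks might work: each check either stops with an error or returns the response unchanged so the chain can continue. The response below is a hand-built stand-in, and treating any status of 400 or above as a failure is an assumption, since the proposal does not define "ok".

```r
# Stand-in response; a real one would come from req_fetch().
resp <- structure(
  list(
    status  = 200L,
    headers = c(`content-type` = "application/xml; charset=utf-8"),
    body    = charToRaw("<a/>")
  ),
  class = "sketch_response"
)

resp_status <- function(resp) resp$status

# Media type only, without parameters such as "; charset=utf-8".
resp_content_type <- function(resp) {
  trimws(strsplit(resp$headers[["content-type"]], ";", fixed = TRUE)[[1]][1])
}

# Assumption: 4xx and 5xx count as failures.
resp_check_ok <- function(resp) {
  if (resp_status(resp) >= 400L) {
    stop("HTTP error ", resp_status(resp), call. = FALSE)
  }
  invisible(resp)
}

resp_check_body_xml <- function(resp) {
  type <- resp_content_type(resp)
  if (!type %in% c("application/xml", "text/xml")) {
    stop("Expected XML, got ", type, call. = FALSE)
  }
  invisible(resp)
}

checked <- resp |> resp_check_ok() |> resp_check_body_xml()
stopifnot(identical(checked, resp))
```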

Content

resp %>% resp_content_raw()
resp %>% resp_content_text(encoding = "UTF-8")

# Would rely on user to check content type using helper from above
resp %>% resp_content_json()
resp %>% resp_content_xml()
resp %>% resp_content_html()
resp %>% resp_content_png()
resp %>% resp_content_jpeg()

Would not have resp_content_auto() because I think time has shown that this is a bad idea.

New package?

Should be a new package, httr2?

Pros:

Can start from scratch without having to work with existing API
Could aim for high unit test coverage from the beginning.
Documentation will be less confusing because it doesn't have to describe
two APIs.

Cons:

May re-introduce bugs because I miss important logic
Will have to maintain two packages in the short term (in the long term
would deprecate httr).

I think the pros probably outweigh the cons - the API will be a sufficiently large change that it's worth starting from scratch.

@craigcitro, @jeroenooms, @jennybc I'd love your thoughts on this, if you have a little spare time.


hadley · 2016-05-22T11:59:54Z

(Also note that I probably won't work on this for a while since I have higher priority stuff that I should be working on, but this might happen next time I have a couple of days downtime and a good internet connection)

jeroen · 2016-05-23T16:23:05Z

It has been on my to do list to experiment with async libcurl bindings in order to support many concurrent requests. Something along the lines of jeroen/curl#51 but perhaps a bit simpler, like promises in JavaScript.

# New scheduler
pool <- multi_handle()

# Add requests with callback, returns immediately
req("http://url.com") %>% schedule(pool) %>% then(function(x){
  message("Done: ", x$status)
})

req("http://url.com") %>% schedule(pool) %>% then(function(x){
  message("Another one: ", x$status)
})

# This would block until all requests are finished
complete(pool)

@jcheng5 and I briefly chatted about using the R event loop rather than something like complete and never block. But this might not be reliable on Windows.

Only vaguely related to the above :)

craigcitro · 2016-05-24T21:05:01Z

More to say later, but one quick thought to throw out there:

I do think it'd be a bummer to no longer have a simple GET('http://some.url/') call for interactive use. There's a fine line to walk here -- I much prefer the explicit composability described above, especially when it comes to testing, but it's often useful to be able to quickly grab a page to use for examples or while experimenting.

I'm not sure what the right answer is here -- I chatted about this with @hadley who was also on the fence. What do other folks think?

jennybc · 2016-05-25T06:29:12Z

Should be a new package, httr2?

It feels like that would be easier on everyone. Maybe you should name first packages foo0, so you can just drop the zero on the second pass :)

There's no consistent naming scheme that indicates whether a function works with a request or a response.
... large request functions that have many arguments passed in ...

This rings true. I have wondered whether it's best to build the pieces first (e.g. the URL), then pass to the verb. Or just do everything at once and write a monster VERB() call. The pipe-based design clears some of that up. I have pieces of the request that are conditionally included (e.g. token if URL matches some regex, progress if interactive). How do you express that? BTW progress isn't mentioned in the request API notes.

I do think it'd be a bummer to no longer have a simple GET('http://some.url/') call for interactive use.

I agree. And maybe not just for interactive use. When you're reading API docs, it's always in terms of GET, POST, etc. so it is nice to have those same verbs.

hadley · 2016-05-25T15:36:59Z

@jennybc my main worry about also having a simpler GET(), POST() etc set of functions is that it's going to be qplot() all over again - yes, it's easier to get started with, and it would be easier to copy examples from the web, but as soon as you want to do anything complicated, you'll want to switch to the full system.

(And progress would be something like req(url) %>% req_show_progress() %>% req_fetch())

jcheng5 · 2016-05-25T16:01:33Z

What if req_fetch can take a string? (Or really, any of the functions that operate on a request could take a string instead.)

hadley · 2016-05-25T16:32:56Z

@jcheng5 you mean like this example req_fetch("http://url.com/") 😛

You could certainly start any chain with a string instead of a request, but I think you'd still need to explicitly send the request with req_fetch() etc.

jennybc · 2016-05-25T16:40:57Z

Will the design of .httr-oauth stay the same?

I know you are generally anti-options() but what about allowing us to set user agent package-wide with an option?

jcheng5 · 2016-05-25T16:49:04Z

@hadley Reading comprehension FTW

@craigcitro If req_fetch(url) works, do you still need GET(url)? Or you're thinking that GET would take options as parameters?

hadley · 2016-05-25T17:13:09Z

@jennybc Yes, .httr-oauth is likely to stay the same (and you'd be able to use both httr and httr2 with the same file)

I'd really prefer not to use options, but I think it won't be as important since with the pipeable syntax, you'd just create a basic prototype request that every function in your package would customise.
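A sketch of the prototype-request pattern in base R. The package name mypkg, the helper req_user_agent(), and the wrapper functions are all hypothetical. The point is that a user agent set once on a shared prototype reaches every wrapper without an option, and the prototype itself is never modified.

```r
# Hypothetical minimal request helpers for this sketch.
req <- function(url) list(url = url, headers = character())

req_user_agent <- function(req, agent) {
  req$headers[["User-Agent"]] <- agent
  req
}

req_path_suffix <- function(req, ...) {
  req$url <- paste0(sub("/$", "", req$url), "/", paste(..., sep = "/"))
  req
}

# Hypothetical package internals: one prototype shared by every wrapper.
base_req <- req("http://url.com/api") |> req_user_agent("mypkg/0.1")

get_user <- function(name) base_req |> req_path_suffix("users", name)
get_repo <- function(owner, repo) base_req |> req_path_suffix("repos", owner, repo)

u <- get_user("hadley")
p <- get_repo("hadley", "httr")
```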

jeroen · 2016-06-06T22:02:31Z

Wrote a first working version of fully async requests in the ‘curl’ package. See the ?multi help page and examples. Maybe this helps rethinking the httr api?

jennybc · 2017-01-05T16:49:00Z

Re "Performing the request", it would be nice if the Request API offered a way to specify the behaviour available currently via RETRY().

hadley · 2017-07-24T08:10:17Z
