Possible to implement asynchronous (i.e. non-blocking io) in curl? #51
Comments
And a follow up: getURIs =
    function(uris, ..., multiHandle = getCurlMultiHandle(), .perform = TRUE)
    {
        content = list()
        curls = list()
        for(i in uris) {
            # One curl handle and one text gatherer per URI
            curl = getCurlHandle()
            content[[i]] = basicTextGatherer()
            opts = curlOptions(URL = i, writefunction = content[[i]]$update, ...)
            curlSetOpt(.opts = opts, curl = curl)
            multiHandle = push(multiHandle, curl)
        }
        if(.perform) {
            # Block until every transfer has finished, then collect the text
            complete(multiHandle)
            lapply(content, function(x) x$value())
        } else {
            return(list(multiHandle = multiHandle, content = content))
        }
    }
There is also getURIAsynchronous.

Here's an example use case where this would be helpful. I want to submit 1,000 requests to the server, and each request takes 10 minutes to process (the server has to look up some data and do some math that takes a long time). However, the server can handle many thousands of simultaneous requests. Currently, I'm looping through something like this:

    requests <- lapply(urls, POST, ...)

This blocks on each request and takes 1,000 x 10 minutes to complete. It'd be really nice to be able to send each request off to the server without blocking until the request has completed. Then, after they have all been submitted, we can block on collecting the results with a loop like this:

    requests <- lapply(urls, POST, ..., async=TRUE)
    results <- lapply(requests, httr::complete)

Where |
I think the natural R solution might be a non-blocking connection object, although R is limited to 128 connections, which is kind of lame. |
Would that be a feature request for R core? If so, how would I submit it to R core? |
There is now experimental support for async requests in the dev version of curl:

    install.packages("https://github.com/jeroenooms/curl/archive/master.tar.gz", repos = NULL)
    library(curl)
    ?multi

Let me know if you have any feedback. Here is an example: https://github.com/jeroenooms/curl/blob/master/examples/crawler.R |
Wahoo!!! |
curl 2.0 which includes the async stuff should be on CRAN this week. |
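For the use case described earlier in this thread (many slow requests to a server that can process them concurrently), a minimal sketch using the multi interface might look like the following. The helper name `fetch_all` is hypothetical and not part of curl; it assumes plain GET requests and curl >= 2.0, and uses only `new_pool()`, `curl_fetch_multi()`, and `multi_run()`.

```r
library(curl)

# Hypothetical helper (not part of the curl package): start every request
# without blocking, then block once until all of them have finished.
fetch_all <- function(urls, timeout = Inf) {
  if (length(urls) == 0) return(list())
  pool <- new_pool()
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    local({
      idx <- i  # capture the index for the callbacks below
      curl_fetch_multi(
        urls[[idx]],
        # Called with the response object when the transfer succeeds
        done = function(res) results[[idx]] <<- res,
        # Called with an error message when the transfer fails
        fail = function(msg) results[[idx]] <<- list(error = msg),
        pool = pool
      )
    })
  }
  # Nothing is sent until multi_run() is called; it drives all transfers
  # in the pool concurrently and returns once they are done (or on timeout).
  multi_run(timeout = timeout, pool = pool)
  results
}
```

Calling `fetch_all(urls)` with a character vector of URLs should return one entry per URL, in the same order; successful entries are the response lists passed to `done` (with fields such as `status_code` and `content`). A POST variant would presumably pass a configured `handle` to `curl_fetch_multi()`, which this sketch leaves out.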
From r-lib/httr#271: