possibly add queue fxn for downloads #266

Closed
sckott opened this Issue Jun 23, 2017 · 5 comments

sckott commented Jun 23, 2017

via https://discuss.ropensci.org/t/queueing-gbif-download-requests/718

gbif_queue <- function(...) {
  reqs <- lazyeval::lazy_dots(...)
  results <- list()
  groups <- split(reqs, ceiling(seq_along(reqs)/3))

  for (i in seq_along(groups)) {
    cat("running group of three: ", i, "\n")
    res <- lapply(groups[[i]], function(w) {
      tmp <- tryCatch(lazyeval::lazy_eval(w), error = function(e) e)
      if (inherits(tmp, "error")) {
        "http request error"
      } else {
        tmp
      }
    })

    # filter out errors
    res_noerrors <- Filter(function(x) inherits(x, "occ_download"), res)
    still_running <- TRUE
    while (still_running) {
      metas <- lapply(res_noerrors, occ_download_meta)
      status <- vapply(metas, "[[", "", "status", USE.NAMES = FALSE)
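      # stop polling once every download has reached a terminal state;
      # 'failed' and 'cancelled' are included so the loop cannot hang on them
      still_running <- !all(tolower(status) %in% c('succeeded', 'killed', 'failed', 'cancelled'))
      if (still_running) Sys.sleep(2)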
    }
    results[[i]] <- res
  }

  results <- unlist(results, recursive = FALSE)

  return(results)
}

usage

library(rgbif)
library(lazyeval)

output <- gbif_queue(
  occ_download('taxonKey = 3119195', "year = 1976"),
  occ_download('taxonKey = 3119195', "year = 2001"),
  occ_download('taxonKey = 3119195', "year = 2001", "month <= 8"),
  occ_download('taxonKey = 5229208', "year = 2011"),
  occ_download('taxonKey = 2480946', "year = 2015"),
  occ_download("country = NZ", "year = 1999", "month = 3"),
  occ_download("catalogNumber = Bird.27847588", "year = 1998", "month = 2")
)

Improvements:

  • Kick off a new download as soon as any individual download finishes. Right now a set of 3 is sent off, and we wait for all 3 to finish before kicking off the next 3. Refilling slots one at a time should speed things up.
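
A minimal sketch of that rolling approach, independent of rgbif so it can be run without network access. The names run_rolling_queue, start, is_done, max_active and poll are hypothetical and not part of rgbif; in this issue's setting, start would presumably evaluate a lazy occ_download() request and is_done would check the status returned by occ_download_meta().

```r
# Rolling-window queue: keep at most `max_active` jobs running and start the
# next pending job as soon as any running job is reported done.
# `start` and `is_done` are hypothetical callbacks supplied by the caller.
run_rolling_queue <- function(jobs, start, is_done, max_active = 3, poll = 2) {
  pending <- seq_along(jobs)               # indices of jobs not yet started
  active <- integer(0)                     # indices of jobs currently running
  handles <- vector("list", length(jobs))  # one result slot per job

  while (length(pending) > 0 || length(active) > 0) {
    # fill free slots from the front of the pending list
    while (length(active) < max_active && length(pending) > 0) {
      i <- pending[1]
      pending <- pending[-1]
      h <- tryCatch(start(jobs[[i]]), error = function(e) e)
      handles[i] <- list(h)                # list() keeps NULL results in place
      # a job that failed to start never occupies a slot
      if (!inherits(h, "error")) active <- c(active, i)
    }
    if (length(active) == 0) next

    # poll the running jobs and drop the finished ones
    done <- vapply(handles[active], is_done, logical(1))
    active <- active[!done]
    if (length(active) > 0) Sys.sleep(poll)
  }
  handles
}
```

The result is a list in the same order as the input jobs, with an error condition in place of any job that could not be started.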

sckott added a commit that referenced this issue Aug 3, 2017

sckott commented Aug 3, 2017

  • Add some kind of progress indicator to track how far along the queue is.
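
One possible shape for this, as a rough sketch only: a helper that turns the vector of download statuses into a one-line summary that could be printed on each polling pass. The name progress_line and its terminal argument are hypothetical; the default terminal states are the two the queue function above already checks for.

```r
# Summarise a character vector of download statuses as a single line,
# e.g. for printing with message() on every polling pass.
# `progress_line` is a hypothetical helper, not an rgbif function.
progress_line <- function(status, terminal = c("succeeded", "killed")) {
  status <- tolower(status)
  finished <- status %in% terminal
  counts <- table(status)                  # counts per status, names sorted
  sprintf("%d/%d finished (%s)", sum(finished), length(status),
          paste(names(counts), counts, sep = ": ", collapse = ", "))
}
```

Inside the polling loop this could be called as message(progress_line(status)).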
sckott commented Aug 9, 2017

Kathryn reports problems; see chat with her.

sckott commented May 15, 2018

sckott commented Jun 15, 2018

work on queue branch, just made some commits

sckott commented Jun 15, 2018

queueing, examples:

install from queue branch first:

remotes::install_github("ropensci/rgbif@queue")

pass in any number of occ download requests

out <- occ_download_queue(
  occ_download('taxonKey = 3119195', "year = 1976"),
  occ_download('taxonKey = 3119195', "year = 2001", "month <= 8"),
  occ_download("country = NZ", "year = 1999", "month = 3"),
  occ_download("catalogNumber = Bird.27847588", "year = 1998", "month = 2")
)

or prepare them lazily in a list

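library(lazyeval)
years <- 1967:1970
# build one lazy request per year; lapply() gives each request its own `yr`,
# so the year is not looked up from a shared loop counter at evaluation time
queries <- lapply(years, function(yr) {
  lazy(occ_download('taxonKey = 3119195', paste0("year = ", yr)))
})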
out <- occ_download_queue(.list = queries)

you'll then need to use other rgbif functions to process the output downstream, e.g.,

lapply(out, occ_download_get)

@sckott sckott added the downloads label Jun 16, 2018

@sckott sckott added this to the v1.0 milestone Jun 16, 2018

@sckott sckott closed this in #305 Jun 16, 2018

@sckott sckott added the queue label Jun 27, 2018
