limit on number of packages as argument to cran_downloads #56

Open
adfi opened this issue Jul 16, 2020 · 7 comments
Comments

@adfi

adfi commented Jul 16, 2020

Hi,

I tried to get download counts for 8000 packages and ran into an HTTP 414 (Request-URI Too Long). After some trial and error, the limit seems to be 905 packages, reproducible with the following code:

cran_downloads(package = rep('cranlogs', 906))

I can split up the requests but it would be nicer to have that done by the package. Also the limit is not documented. Let me know if I'm doing something the package wasn't intended for.

@gaborcsardi
Contributor

Well, that's the URL length limit I guess, because the package names are sent in the URL. We could have a POST API, and then there is no limit.
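
To see why the limit depends on the names rather than on a fixed package count, here is a rough sketch of how the request path grows. The path layout (a fixed prefix followed by the comma-separated names) is an assumption about how the names are placed in the URL, and `request_path_length` is a hypothetical helper, not part of cranlogs.

```r
# Hypothetical helper: length of the request path if package names are
# joined with commas and appended to a fixed prefix (assumed layout).
request_path_length <- function(packages,
                                prefix = "/downloads/daily/last-day/") {
  nchar(prefix) + nchar(paste(packages, collapse = ","))
}

# Each extra package adds its name length plus one separator character.
len_905 <- request_path_length(rep("cranlogs", 905))
len_906 <- request_path_length(rep("cranlogs", 906))
len_906 - len_905
```

With longer package names the same character budget is used up by fewer packages, so a count such as 905 only holds for this particular test input.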

@adfi
Author

adfi commented Jul 17, 2020

So where does the change need to happen? In cranlogs.app?

@gaborcsardi
Contributor

Everywhere. Frankly, it is simpler to return all packages: if you want 8000, you might as well get all of them. :)

@bschilder

@gaborcsardi This could be done within cranlogs by submitting the list of packages in batches, right?

@bschilder

bschilder commented Dec 19, 2022

We would need to know what the maximum batch size can be (i.e., at what point the URI gets too long, on average):

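# Placeholder batch size (an assumption): 906 copies of 'cranlogs' already hit HTTP 414 above
batch_size <- 500
pkgs <- rownames(utils::available.packages())
# Split the package names into consecutive chunks of at most batch_size
batches <- split(pkgs, ceiling(seq_along(pkgs) / batch_size))
cran <- lapply(seq_along(batches), function(i) {
  message("Batch: ", i, " / ", length(batches))
  # One request per batch keeps each URL below the length limit
  cranlogs::cran_downloads(
    packages = batches[[i]],
    from = "1990-01-01",
    to = Sys.Date() - 1
  )
}) |>
  data.table::rbindlist(fill = TRUE)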

Should be an easy fix. Happy to make a PR.
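
Since the constraint is URL length rather than package count, one alternative is to split by a character budget instead. The following is a minimal sketch, not the change made in any PR: `split_by_budget`, `fetch_in_batches`, and the `max_chars` default of 6000 are all assumptions for illustration, and the fetch function is passed in so the batching can be tried without network access.

```r
# Hypothetical helper: greedily pack names into batches whose comma-joined
# length stays at or below max_chars (an assumed budget, not a documented limit).
split_by_budget <- function(packages, max_chars = 6000) {
  batches <- list()
  current <- character(0)
  current_len <- 0
  for (pkg in packages) {
    # Cost of adding pkg: its length plus a comma if the batch is non-empty.
    cost <- nchar(pkg) + (length(current) > 0)
    if (length(current) > 0 && current_len + cost > max_chars) {
      # Budget exceeded: close the current batch and start a new one.
      batches[[length(batches) + 1]] <- current
      current <- character(0)
      current_len <- 0
      cost <- nchar(pkg)
    }
    current <- c(current, pkg)
    current_len <- current_len + cost
  }
  if (length(current) > 0) batches[[length(batches) + 1]] <- current
  batches
}

# Hypothetical wrapper: call fetch() once per batch and row-bind the results.
# fetch must take a character vector and return a data frame. Duplicates are
# dropped so each package is requested once (a design choice of this sketch).
fetch_in_batches <- function(packages, fetch, max_chars = 6000) {
  results <- lapply(split_by_budget(unique(packages), max_chars), fetch)
  do.call(rbind, results)
}

# Possible usage with the real package (requires network access):
# downloads <- fetch_in_batches(
#   rownames(utils::available.packages()),
#   function(b) cranlogs::cran_downloads(packages = b, when = "last-week")
# )
```

Budgeting by characters keeps every request under the same length regardless of how long the individual package names are, at the cost of uneven batch sizes.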

@bschilder

@adfi, I agree this should be handled internally by the package, or the limitation should at least be documented.

@bschilder

Done here, @adfi: #67
