pxweb_advanced_get Too Many Requests (RFC 6585) (HTTP 429) error #7
Dear Elliot,

Thank you very much for using my package and reporting this issue! Under the hood, the {BFS} package uses the R package {pxweb} to query the BFS API. After a quick look I haven't found any option to increase the batch size or add a delay. I will investigate more; feel free to share in this issue any discovery or suggestion from your side.

Another solution is to reduce the size of the dataset by querying only specific dimensions. Please let me know if this works for you.

# Install dev version
devtools::install_github("lgnbhl/BFS")

library(BFS)
# choose a BFS number and language
number_bfs <- "px-x-1003020000_103"
language <- "en"
# create the BFS api url
pxweb_api_url <- paste0("https://www.pxweb.bfs.admin.ch/api/v1/",
language, "/", number_bfs, "/", number_bfs, ".px")
# Get BFS table metadata using {pxweb}
px_meta <- pxweb::pxweb_get(pxweb_api_url)
# list variables items
str(px_meta$variables)
# Manually create BFS query dimensions
# Use `code` and `values` elements in `px_meta$variables`
# Use "*" to select all
dims <- list("Jahr" = c("2020", "2021"),
"Monat" = c("YYYY"),
"Indikator" = c("*"))
# Query BFS data with specific dimensions
BFS::bfs_get_data(
number_bfs = number_bfs,
language = language,
query = dims
)
Best,
Short question @lgnbhl: I get this message too, but I guess the BFS has purposely added restrictions on the API for security reasons. Is there another solution to work around the API limits in a clever way, apart from using VPN switchers and other networking magic?
@philipp-baumann no, I am not aware of other ways to work around the API limits.
Hi @elliotbeck and @philipp-baumann,

I ran the R code shared in this issue again and it now works just fine for me. Is the following R code still throwing an error for you?

BFS::bfs_get_data(number_bfs = "px-x-1003020000_103", language = "de")

Maybe something has changed in the BFS API or in the {pxweb} R package since this issue was submitted...

By the way, the new version of the BFS package (for now only available on GitHub) provides a new function to download any file locally by BFS number (or asset number). For a large PX file, this speeds up the R code a lot.

devtools::install_github("lgnbhl/BFS")
BFS::bfs_download_asset(
number_bfs = "px-x-1003020000_103", #number_asset also possible
destfile = "px-x-1003020000_103.px"
)
library(pxR) # install.packages("pxR")
large_dataset <- pxR::read.px(filename = "px-x-1003020000_103.px") |>
as.data.frame()
Please note that reading a PX file using …
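Since every download also counts against the API rate limit, it can help to reuse the local copy across runs. The wrapper below is a hypothetical sketch (`download_px_once` is not part of {BFS}); it only calls the downloader when the file is missing:

```r
# Hypothetical convenience wrapper: download the PX file only once,
# reusing the local copy on later runs. The downloader is injectable
# so the guard logic can be exercised without network access.
download_px_once <- function(number_bfs, destfile,
                             downloader = BFS::bfs_download_asset) {
  if (!file.exists(destfile)) {
    downloader(number_bfs = number_bfs, destfile = destfile)
  }
  destfile
}

# download_px_once("px-x-1003020000_103", "px-x-1003020000_103.px")
```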
Thanks! I'll give it a test tomorrow and let you know. Cheers
I still get the Too Many Requests (RFC 6585) (HTTP 429) error.
With a Swiss IP I get "px-x-1003020000_103.px" without error, both with the batched approach and the new bfs_download_asset() approach.

r$> sessioninfo::session_info()
─ Session info ───────────────────────────────────────────────────────────────────────────
setting value
version R version 4.1.3 (2022-03-10)
os Ubuntu 22.04.2 LTS
system x86_64, linux-gnu
ui X11
language
collate en_US.UTF-8
ctype en_US.UTF-8
tz Europe/Zurich
date 2023-07-27
pandoc 2.9.2.1 @ /usr/bin/pandoc
─ Packages ───────────────────────────────────────────────────────────────────────────────
! package * version date (UTC) lib source
anytime 0.3.9 2020-08-27 [1] CRAN (R 4.1.3)
backports 1.4.1 2021-12-13 [1] CRAN (R 4.1.2)
BFS 0.5.1.999 2023-07-27 [1] Github (lgnbhl/BFS@a583276)
bit 4.0.5 2022-11-15 [1] CRAN (R 4.1.3)
bit64 4.0.5 2020-08-30 [1] CRAN (R 4.1.2)
blob 1.2.3 2022-04-10 [1] CRAN (R 4.1.3)
cachem 1.0.8 2023-05-01 [1] CRAN (R 4.1.3)
callr 3.7.3 2022-11-02 [1] CRAN (R 4.1.3)
checkmate 2.2.0 2023-04-27 [1] CRAN (R 4.1.3)
cli 3.6.1 2023-03-23 [1] CRAN (R 4.1.3)
crancache 0.0.0.9001 2022-01-20 [1] Github (r-lib/crancache@7ea4e47)
cranlike 1.0.2 2018-11-26 [1] CRAN (R 4.1.2)
crayon 1.5.2 2022-09-29 [1] CRAN (R 4.1.3)
V curl 5.0.0 2023-06-07 [1] CRAN (R 4.1.3) (on disk 5.0.1)
DBI 1.1.3 2022-06-18 [1] RSPM (R 4.1.0)
debugme 1.1.0 2017-10-22 [1] CRAN (R 4.1.2)
desc 1.4.2 2022-09-08 [1] CRAN (R 4.1.3)
digest 0.6.33 2023-07-07 [1] CRAN (R 4.1.3)
dplyr 1.1.2 2023-04-20 [1] CRAN (R 4.1.3)
fansi 1.0.4 2023-01-22 [1] CRAN (R 4.1.3)
fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.1.3)
generics 0.1.3 2022-07-05 [1] CRAN (R 4.1.3)
glue 1.6.2 2022-02-24 [1] CRAN (R 4.1.2)
httr 1.4.6 2023-05-08 [1] CRAN (R 4.1.3)
httr2 0.2.3 2023-05-08 [1] CRAN (R 4.1.3)
janitor 2.2.0 2023-02-02 [1] CRAN (R 4.1.3)
jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.1.3)
lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.1.3)
lubridate 1.9.2 2023-02-10 [1] CRAN (R 4.1.3)
magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.1.3)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.1.2)
parsedate 1.2.1 2021-04-20 [1] CRAN (R 4.1.2)
pillar 1.9.0 2023-03-22 [1] CRAN (R 4.1.3)
pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.1.3)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.2)
plyr * 1.8.7 2022-03-24 [1] CRAN (R 4.1.3)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.2)
processx 3.8.2 2023-06-30 [1] CRAN (R 4.1.3)
ps 1.7.5 2023-04-18 [1] CRAN (R 4.1.3)
purrr 1.0.1 2023-01-10 [1] CRAN (R 4.1.3)
pxR * 0.42.7 2022-11-23 [1] CRAN (R 4.1.3)
pxweb 0.16.2 2022-10-31 [1] CRAN (R 4.1.3)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.2)
rappdirs 0.3.3 2021-01-31 [1] CRAN (R 4.1.2)
Rcpp 1.0.11 2023-07-06 [1] CRAN (R 4.1.3)
rematch2 2.1.2 2020-05-01 [1] CRAN (R 4.1.2)
remotes * 2.4.2 2021-11-30 [1] CRAN (R 4.1.3)
reshape2 * 1.4.4 2020-04-09 [1] CRAN (R 4.1.3)
RJSONIO * 1.3-1.6 2021-09-16 [1] CRAN (R 4.1.2)
rlang 1.1.1 2023-04-28 [1] CRAN (R 4.1.3)
rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.1.3)
RSQLite 2.2.14 2022-05-07 [1] CRAN (R 4.1.3)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
snakecase 0.11.0 2019-05-25 [1] CRAN (R 4.1.3)
stringi 1.7.12 2023-01-11 [1] CRAN (R 4.1.3)
stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.1.3)
tibble 3.2.1 2023-03-20 [1] CRAN (R 4.1.3)
tidyRSS 2.0.7 2023-03-05 [1] CRAN (R 4.1.3)
tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.1.3)
timechange 0.2.0 2023-01-11 [1] CRAN (R 4.1.3)
utf8 1.2.3 2023-01-31 [1] CRAN (R 4.1.3)
vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.1.3)
withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.3)
xml2 1.3.5 2023-07-06 [1] CRAN (R 4.1.3)
[1] /home/philipp/R/x86_64-pc-linux-gnu-library/4.1
[2] /opt/R/4.1.3/lib/R/library
V ── Loaded and on-disk version mismatch.
──────────────────────────────────────────────────────────────────────────────────────────
@philipp-baumann yes, there is a time window limit of 10: https://www.pxweb.bfs.admin.ch/api/v1/de/?config
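The `?config` endpoint returns those limits as JSON. A minimal sketch of inspecting them with {jsonlite} (the field names follow the PxWeb API convention; the values below are illustrative, not necessarily the live configuration):

```r
library(jsonlite)  # install.packages("jsonlite")

# Live call (commented out to avoid hitting the API):
# config <- fromJSON("https://www.pxweb.bfs.admin.ch/api/v1/de/?config")

# Illustrative payload with the fields relevant to rate limiting;
# the actual values are served by the endpoint above.
config <- fromJSON('{
  "maxValues": 5000,
  "maxCalls": 10,
  "timeWindow": 10
}')

# maxCalls requests are allowed per timeWindow seconds
sprintf("%d calls per %d seconds", config$maxCalls, config$timeWindow)
```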
Thanks @lgnbhl for pointing to that config.
I ran the code again. The error is not caused by a new version of the {pxweb} R package (currently 0.16.2), as they have not pushed a new version since 2022-10-31.

I have updated the documentation to reflect our discussion: https://github.com/lgnbhl/BFS#too-many-requests-error-message

Best,
Please find below an R script showing a programmatic solution to query a large BFS dataset. This R code creates a list of smaller queries and joins them using purrr::pmap_dfr().

To avoid getting an error message due to the BFS API limits, I added a new "delay" argument in bfs_get_data().

Be sure to have at least v0.5.6 of the BFS package installed.

#devtools::install_github("lgnbhl/BFS") # for BFS v0.5.6
library(BFS)
library(purrr)
# should at least use version 0.5.6
packageVersion("BFS") >= "0.5.6"
# choose a BFS number and language
number_bfs <- "px-x-1003020000_103"
language <- "en"
# get metadata
meta <- bfs_get_metadata(number_bfs = number_bfs, language = language)
# create dimension object
dims <- meta$values
names(dims) <- meta$code
# split 1st dimension "Jahr" into chunks of 1 element
# NOTE: depending on the data, another dimension may be more suitable, e.g. dims[[2]]
dims1 <- dims[[1]]
dim_split <- split(dims1, cut(seq_along(dims1), length(dims1), labels = FALSE))
names(dim_split) <- rep(names(dims)[1], length(dim_split))
# create query list
query_list <- vector(mode = "list", length = length(dim_split))
for (i in seq_along(dim_split)) {
  query_list[[i]] <- c(dim_split[i], dims[-1])
}
names(query_list) <- rep("query", length(query_list))
# list of arguments for loop
args_list <- list(
number_bfs = rep(number_bfs, length(query_list)),
language = rep(language, length(query_list)),
delay = rep(10, length(query_list)), # 10 seconds delay before query
query = query_list
)
# loop with smaller queries using bfs_get_data()
df <- purrr::pmap_dfr(.l = args_list, .f = bfs_get_data, .progress = TRUE)
df
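The script above issues one query per value of the first dimension. A small hypothetical variant (`chunk_values` is not part of {BFS}) splits a dimension into chunks of up to n values instead, trading fewer API calls against larger responses:

```r
# Split a character vector of dimension values into chunks of at most n
# elements; each chunk would then become one smaller query.
chunk_values <- function(values, n) {
  split(values, ceiling(seq_along(values) / n))
}

chunk_values(as.character(2015:2021), n = 3)
# three chunks: 2015-2017, 2018-2020, 2021
```

Each chunk can be plugged into the query list exactly as the per-element split above, keeping every request under the API's maxValues limit.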
@philipp-baumann @elliotbeck feel free to let me know if this solution works for you :)
Dear Félix
Thanks a lot for the nice package! I came across the following issue when downloading a rather large query:

Maybe this could be resolved by increasing the batch size or adding a delay?

Best,
Elliot