"You have no access" error if trying to obtain eurostat data #181

Open
snaiperis opened this issue Jun 1, 2020 · 20 comments

@snaiperis

Hello,

I get an error using eurostat package.

> dd <- get_eurostat("namq_10_gdp")
You have no access to ec.europe.eu.
      Please check your connection and/or review your proxy settings

I looked into the internals of check_access_to_data(). The underlying error is:

> temp <- tempfile()
> http_url <- "http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson"
> download.file(http_url, temp)
trying URL 'http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson'
Error in download.file(http_url, temp) : 
  cannot open URL 'http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson'
In addition: Warning message:
In download.file(http_url, temp) :
  InternetOpenUrl failed: 'A connection with the server could not be established'

wget is able to download this URL after 2 redirects:

>wget http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
--2020-06-01 16:44:09--  http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
Resolving ec.europa.eu (ec.europa.eu)... 147.67.210.30, 147.67.34.30
Connecting to ec.europa.eu (ec.europa.eu)|147.67.210.30|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson [following]
--2020-06-01 16:44:09--  https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
Connecting to ec.europa.eu (ec.europa.eu)|147.67.210.30|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson [following]
--2020-06-01 16:44:09--  https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
Resolving gisco-services.ec.europa.eu (gisco-services.ec.europa.eu)... 40.113.93.170
Connecting to gisco-services.ec.europa.eu (gisco-services.ec.europa.eu)|40.113.93.170|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 141040 (138K)
Saving to: 'NUTS_RG_60M_2006_4326_LEVL_0.geojson'

NUTS_RG_60M_2006_4326_LEVL_0.geojs 100%[================================================================>] 137,73K  --.-KB/s    in 0,1s

2020-06-01 16:44:10 (1024 KB/s) - 'NUTS_RG_60M_2006_4326_LEVL_0.geojson' saved [141040/141040]

Win7 OS, no proxies or other network limitations.

Best regards

@antagomir
Member

Thanks. I cannot reproduce this. If it is not an institutional limitation in your network settings, then I am not sure how to solve it. Is this a persistent (not temporary) issue?

@glilienthal

glilienthal commented Jun 22, 2020

I am seeing the same thing, on my laptop at home (Win10) as well as on a remote web server. It started on 2020-06-18 and is persisting.

check_access_to_data()
returns
FALSE

@glilienthal

While this is being fixed, here is my workaround:

Get the files utils.R and tidy_eurostat.R
Source them
Download the tsv.gz file and unzip it
Then run the following (if you want GDP):

dat <- readr::read_tsv("data/eurostat/nama_10_gdp.tsv", na = ":",
                       col_types = readr::cols(.default = readr::col_character()))

GDP <- tidy_eurostat(dat)
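
For anyone who cannot source the package's internal helpers, the reshaping step can be roughly approximated in base R. This is only a sketch, not the package's tidy_eurostat(): the function name parse_eurostat_tsv() is hypothetical, and it assumes the bulk TSV layout in which the first column holds comma-separated dimension codes (with a header such as "unit,geo\time"), the remaining columns are time periods, and cells may carry a trailing flag or ":" for missing values.

```r
# Hypothetical helper, not part of the eurostat package.
# `con` is a file path or a connection to an unzipped Eurostat-style bulk TSV.
parse_eurostat_tsv <- function(con) {
  wide <- utils::read.delim(con, sep = "\t", colClasses = "character",
                            check.names = FALSE, quote = "")

  # The first header cell looks like "unit,geo\time": keep the part before the
  # backslash and split it into dimension names.
  key_name  <- names(wide)[1]
  dim_names <- strsplit(sub("\\\\.*$", "", key_name), ",", fixed = TRUE)[[1]]

  # Split each row key ("CP_MEUR,FI") into one column per dimension.
  keys <- do.call(rbind, strsplit(wide[[1]], ",", fixed = TRUE))
  colnames(keys) <- dim_names

  periods <- trimws(names(wide)[-1])

  # Stack one block of rows per time period (wide to long).
  long <- lapply(seq_along(periods), function(j) {
    cell <- trimws(wide[[j + 1]])
    # Drop everything from the first non-numeric character on, so "100.5 p"
    # becomes 100.5 and ":" becomes an empty string, which converts to NA.
    value <- suppressWarnings(as.numeric(sub("[^0-9.eE+-].*$", "", cell)))
    data.frame(keys, time = periods[j], values = value,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, long)
}
```

With a file unzipped as in the steps above, parse_eurostat_tsv("data/eurostat/nama_10_gdp.tsv") would return one row per dimension combination and period. Flags are discarded in this sketch.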

@glilienthal

And now it is working again, after three days of downtime...

check_access_to_data()
[1] TRUE

(on both my machines...)

@jhuovari

This is strange. And in the meantime, were you able to access the data "manually"?

@DanVal80

DanVal80 commented Jul 31, 2020

Same issue as snaiperis. I am on Ubuntu 20.04 using R 4.0.2.

> library(eurostat)
> gdp <- eurostat::get_eurostat("namq_10_gdp")
You have no access to ec.europe.eu.
      Please check your connection and/or review your proxy settings

> check_access_to_data()
[1] FALSE

> temp <- tempfile()
> http_url <- "http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson"
> curl::curl_download(http_url, temp)
Error in curl::curl_download(http_url, temp) :
  Timeout was reached: [] Operation timed out after 10001 milliseconds with 0 out of 0 bytes received

> packageVersion("eurostat")
[1] ‘3.6.1’
> packageVersion("curl")
[1] ‘4.3’
> curl_version()
$version
[1] "7.68.0"

$ssl_version
[1] "GnuTLS/3.6.13"

$libz_version
[1] "1.2.11"

$libssh_version
[1] "libssh/0.9.3/openssl/zlib"

$libidn_version
[1] "2.2.0"

$host
[1] "x86_64-pc-linux-gnu"

$protocols
 [1] "dict"   "file"   "ftp"    "ftps"   "gopher" "http"   "https"  "imap"
 [9] "imaps"  "ldap"   "ldaps"  "pop3"   "pop3s"  "rtmp"   "rtsp"   "scp"
[17] "sftp"   "smb"    "smbs"   "smtp"   "smtps"  "telnet" "tftp"

$ipv6
[1] TRUE

$http2
[1] TRUE

$idn
[1] TRUE

In the command line, wget works:

$ wget http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
--2020-07-31 11:47:48--  http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
Resolving ec.europa.eu (ec.europa.eu)... 147.67.34.30, 147.67.210.30, 2a01:7080:14:100::666:30, ...
Connecting to ec.europa.eu (ec.europa.eu)|147.67.34.30|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson [following]
--2020-07-31 11:47:48--  https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
Connecting to ec.europa.eu (ec.europa.eu)|147.67.34.30|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson [following]
--2020-07-31 11:47:49--  https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson
Resolving gisco-services.ec.europa.eu (gisco-services.ec.europa.eu)... 40.113.93.170
Connecting to gisco-services.ec.europa.eu (gisco-services.ec.europa.eu)|40.113.93.170|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 141040 (138K)
Saving to: ‘NUTS_RG_60M_2006_4326_LEVL_0.geojson.1’

NUTS_RG_60M_2006_4326_LEVL_0.geojson.1               100%[=====================================================================================================================>] 137.73K  --.-KB/s    in 0.05s

2020-07-31 11:47:49 (2.70 MB/s) - ‘NUTS_RG_60M_2006_4326_LEVL_0.geojson.1’ saved [141040/141040]

curl also works, provided I pass -L to follow the redirects.

$ curl -L  http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson > test.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   807  100   807    0     0   9170      0 --:--:-- --:--:-- --:--:--  9170
100   309  100   309    0     0   1161      0 --:--:-- --:--:-- --:--:--  3433
100  137k  100  137k    0     0   321k      0 --:--:-- --:--:-- --:--:--  321k

@jhuovari

jhuovari commented Aug 3, 2020

Could it be that, for some reason, download.file uses method = "curl" in these cases? According to the documentation, extra = "-L" is then needed to follow redirects. By default it shouldn't use that method, and redirects should work.

Could you test:

tfile <- tempfile()
url <- "https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fnamq_10_gdp.tsv.gz"
test <- utils::download.file(url, tfile, method = "libcurl")
test2 <- utils::download.file(url, tfile, method = "curl", extra = "-L")
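
To compare several methods in one go, a small wrapper along these lines might help. It is a sketch only: try_download_methods() and its downloader argument are hypothetical names introduced here, the set of methods is an assumption about what is available on a given machine, and "-L" is passed only for method = "curl", as discussed above.

```r
# Hypothetical helper: try the same URL with several download.file methods and
# report which ones succeed. `downloader` can be swapped out for testing.
try_download_methods <- function(url,
                                 methods = c("libcurl", "curl", "wget"),
                                 downloader = utils::download.file) {
  vapply(methods, function(m) {
    dest <- tempfile()
    on.exit(unlink(dest), add = TRUE)
    # The external curl binary needs -L to follow redirects.
    extra <- if (m == "curl") "-L" else getOption("download.file.extra")
    status <- tryCatch(
      suppressWarnings(
        downloader(url, dest, method = m, extra = extra, quiet = TRUE)
      ),
      error = function(e) NA_integer_
    )
    # download.file returns 0 (invisibly) on success.
    isTRUE(status == 0L)
  }, logical(1))
}
```

Calling try_download_methods(url) returns a named logical vector, one entry per method, which makes it easy to paste the result into a report like this one.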

@DanVal80

DanVal80 commented Aug 6, 2020

OK, that's weird. Both test and test2 succeed:

> tfile <- tempfile()
> url <- "https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fnamq_10_gdp.tsv.gz"

> test <- utils::download.file(url, tfile, method = "libcurl")
trying URL 'https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fnamq_10_gdp.tsv.gz'
Content type 'application/octet-stream;charset=UTF-8' length 14051440 bytes (13.4 MB)
==================================================
downloaded 13.4 MB

> test2 <- utils::download.file(url, tfile, method = "curl", extra = "-L")
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 13.4M  100 13.4M    0     0  1907k      0  0:00:07  0:00:07 --:--:-- 1961k

but:

> gdp <- eurostat::get_eurostat("namq_10_gdp")
You have no access to ec.europe.eu.
      Please check your connection and/or review your proxy settings

> eurostat::check_access_to_data()
[1] FALSE

@jhuovari

jhuovari commented Aug 6, 2020

Thanks for testing. So this is a download.file problem. By default it should use wininet or libcurl, but it seems to be using curl. Have you set the download.file.method option? Could you check what getOption("download.file.method") returns?

@jhuovari

jhuovari commented Aug 6, 2020

Could you also test whether the following works with the same URL?

test <- readr::read_tsv(url, na = ":", col_types = readr::cols(.default = readr::col_character()))

@DanVal80

DanVal80 commented Aug 6, 2020

Hi, you are welcome. Here is the output of the commands you requested.

> getOption("download.file.method")
NULL
> test <- readr::read_tsv(url, na = ":", col_types = readr::cols(.default = readr::col_character()))
|=================================================================| 100%   47 MB

Note that I am on Linux, so (I guess) check_access_to_data() performs the download with curl::curl_download() and not with download.file(). Something interesting, though:

> url1 <- 'https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fnamq_10_gdp.tsv.gz'
> url2 <- 'http://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/geojson/NUTS_RG_60M_2006_4326_LEVL_0.geojson'
# url1 is the file I want. url2 is the file used by eurostat::check_access_to_data()

> curl::curl_download(url1, tfile, quiet = FALSE)
 Downloaded 14095138 bytes...

> curl::curl_download(url2, tfile, quiet = FALSE)
 [100%] Downloaded 807 bytes...
 [100%] Downloaded 309 bytes...
Error in curl::curl_download(url2, tfile, quiet = FALSE, ) :
  Timeout was reached: [] Operation timed out after 10000 milliseconds with 0 out of 0 bytes received

Can this be the problem?
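
One way to test this hypothesis is to probe both URLs with a longer timeout than the 10 seconds seen in the error. The following is a sketch under assumptions: probe_url() and its fetch argument are hypothetical names, and whether a longer timeout actually avoids the failure on the redirected URL is not established in this thread.

```r
# Hypothetical helper: TRUE if `url` can be fetched within `timeout_s` seconds,
# FALSE otherwise. The default `fetch` uses the curl package with redirects
# enabled; it can be replaced by any function(url, dest, timeout_s).
probe_url <- function(url, timeout_s = 30,
                      fetch = function(url, dest, timeout_s) {
                        h <- curl::new_handle(timeout = timeout_s,
                                              followlocation = TRUE)
                        curl::curl_download(url, dest, quiet = TRUE, handle = h)
                      }) {
  dest <- tempfile()
  on.exit(unlink(dest), add = TRUE)
  tryCatch({
    fetch(url, dest, timeout_s)
    TRUE
  }, error = function(e) {
    message("Could not fetch ", url, ": ", conditionMessage(e))
    FALSE
  })
}
```

Running probe_url(url1) and probe_url(url2, timeout_s = 60) side by side would show whether only the redirected GISCO URL fails, and whether the failure depends on the timeout.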

@jhuovari

jhuovari commented Aug 6, 2020

So using readr directly works. It also seems to be faster. You can try installing from a new branch:
remotes::install_github("ropengov/eurostat", ref = "speed")

There also seem to be other things we could do to speed up the package.

@fpa2

fpa2 commented Oct 1, 2020

Hi, first, thanks for your hard work on this...
I have been a regular user of the package and had no problems at all in the past 8 months.
Today, I was downloading some data (again with no problems).

Then, suddenly I got the same error...

> xtemp <- get_eurostat("sts_inpr_m")
You have no access to ec.europe.eu. 
Please check your connection and/or review your proxy settings

So, I tried the above to see what I get:

> getOption("download.file.method")
[1] "wininet"

url <- "https://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fnamq_10_gdp.tsv.gz"
test <- readr::read_tsv(url, na = ":", col_types = readr::cols(.default = readr::col_character()))

I also updated as suggested:

> remotes::install_github("ropengov/eurostat", ref = "speed")

but still getting the same error...

Any further ideas to test on Windows?

Btw, I can navigate Eurostat's website and find the data in the browser, so this does not really seem to be a problem with my IP or proxy settings.

Update: I tried again after 4 hours and it now works.

@antagomir
Member

Thanks for the update. I do not know what was temporarily out of order, perhaps something at the website? I propose we investigate solutions only if this becomes a more persistent issue.

Does this only work in the "speed" branch, or could we merge the necessary parts into master and delete the speed branch, @jhuovari?

@jhuovari

jhuovari commented Oct 2, 2020

It is still only in "speed". I was supposed to finalize it and merge, but I haven't had time to do it. I will try to do it soon.

@monteirojaf

Greetings from Basel!

I have started to get the same error message indicating that I have no access to Eurostat data. Interestingly, last year I was able to get data without problems. I am using the following code:

data.table(label_eurostat(get_eurostat("urb_lpop1")),
orig=get_eurostat("urb_lpop1")[1],
orig=get_eurostat("urb_lpop1")[2],
set="urb_lpop1")

Including a proxy config does not work either:

data.table(label_eurostat(get_eurostat("urb_lpop1", config = use_proxy(url="http://xxxxxxxxxx",port=3128, username = "xxxxxx", password = "xxxxxx"))),
orig=get_eurostat("urb_lpop1")[1],
orig=get_eurostat("urb_lpop1")[2],
set="urb_lpop1")

Is there any update on this issue?

@antagomir
Member

@jhuovari any comments / updates?

@jhuovari

Sorry, that is still unfinished. However, I am now also behind a proxy and having issues, so I have an interest here. Unfortunately, I am also busy...

Meanwhile, you could try setting
options(download.file.method = "wininet") or "auto"
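
To try a method without changing the session permanently, the option can be set for a single call and then restored. This is a minimal sketch: with_download_method() is a hypothetical helper, not part of eurostat, and note that "wininet" is only available on Windows.

```r
# Hypothetical helper: evaluate `code` with download.file.method temporarily
# set to `method`, restoring the previous value afterwards (even on error).
with_download_method <- function(method, code) {
  old <- options(download.file.method = method)
  on.exit(options(old), add = TRUE)
  force(code)  # `code` is evaluated lazily, i.e. after the option is set
}
```

For example, with_download_method("libcurl", eurostat::get_eurostat("namq_10_gdp")) would run a single download under that setting. Whether the package honours this option in every code path is an assumption to verify.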

@umbe1987

I also have this issue from time to time.

For instance, I am having it right now: the Eurostat website is accessible, but trying to download data from R with get_eurostat gives "You have no access to ec.europe.eu."

tp <-
  get_eurostat(id = "migr_asytpfq",
               filters = list(geo = eu27,
                              citizen = "UA"))

Error:

You have no access to ec.europe.eu.
      Please check your connection and/or review your proxy settings

Info on my R configuration:

> R.version
               _                                          
platform       x86_64-pc-linux-gnu                        
arch           x86_64                                     
os             linux-gnu                                  
system         x86_64, linux-gnu                          
status         Patched                                    
major          4                                          
minor          2.2                                        
year           2022                                       
month          11                                         
day            10                                         
svn rev        83330                                      
language       R                                          
version.string R version 4.2.2 Patched (2022-11-10 r83330)
nickname       Innocent and Trusting

@pitkant
Member

pitkant commented Sep 28, 2023

If you have problems (especially behind a proxy connection), could you test the httr2 branch of eurostat and report whether it works?

remotes::install_github("ropengov/eurostat", ref = "httr2")

I notice that some queries here concern bulk download files. To use the proxy option, you have to call the get_eurostat_json() function directly.

I tested the httr2 proxy functionality with public proxies listed at https://www.proxynova.com/proxy-server-list/, and the success rate seemed to depend mostly on the quality of the chosen proxy. With a select few I was able to get things working, but most failed with various types of timeouts. Private proxies may of course be much less burdened and of higher quality than these public ones.
