HTTP error 403 #20

rrik · 2021-06-16T23:39:44Z

Hello,

I am getting a 403 error when attempting the following

`> GetIncome("FB", 2016)
Error in fileFromCache(file) :
Error in download.file(file, cached.file, quiet = !verbose) :
cannot open URL 'https://www.sec.gov/Archives/edgar/data/1326801/000132680116000043/fb-20151231.xsd'

In addition: Warning message:
In download.file(file, cached.file, quiet = !verbose) :
cannot open URL 'https://www.sec.gov/Archives/edgar/data/1326801/000132680116000043/fb-20151231.xsd': HTTP status was '403 Forbidden'`

Do the source links need updating? Thank you!

darh78 · 2021-06-20T17:44:11Z

Hello,
I'm having a similar issue, but with "404 Not Found":

GetIncome("TSLA", 2020)
Error in fileFromCache(file.inst) : 
  Error in download.file(file, cached.file, quiet = !verbose) : 
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1318605/000156459020004475/tsla-20191231.xml'

In addition: Warning message:
In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1318605/000156459020004475/tsla-20191231.xml': HTTP status was '404 Not Found'

selgamal · 2021-06-20T21:02:11Z

@darh78 That file doesn't exist. Try:
https://www.sec.gov/Archives/edgar/data/1318605/000156459020004475/tsla-10k_20191231_htm.xml

@rrik that happens to me too with older submissions. It seems to be related to the SEC fair use policy. You can try downloading the file manually and putting it in the cache folder, or you can run the code a few times; it will eventually download the file.
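The "run the code a few times" suggestion above can be sketched as a small retry wrapper. This is only a sketch: `retry_with_backoff` is a hypothetical helper, not part of finreportr, and the number of tries and the wait time are arbitrary assumptions.

```r
# Hypothetical helper (not part of finreportr): call `fn` until it succeeds
# or `tries` attempts have been used up, pausing a little longer each time.
retry_with_backoff <- function(fn, tries = 5, wait = 1) {
  last_error <- NULL
  for (attempt in seq_len(tries)) {
    # Capture an error as a value instead of aborting the loop.
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    if (attempt < tries) {
      Sys.sleep(wait * attempt)  # linear back-off between attempts
    }
  }
  stop("all ", tries, " attempts failed; last error: ",
       conditionMessage(last_error))
}

# Possible usage (needs finreportr and network access, so it is left commented):
# income <- retry_with_backoff(function() finreportr::GetIncome("FB", 2016))
```

Pausing between attempts is deliberate: retrying immediately is more likely to keep hitting the same rate limit.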

PatronMaster · 2021-07-25T15:54:59Z

Hi,

I also got the same error, in this part of the code:

   if (foreign == FALSE) {
        url <- paste0("http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany&CIK=", 
            symbol, "&type=10-k&dateb=&owner=exclude&count=100")
    }
    filings <- xml2::read_html(url)

I tried changing count to 1 and it works, so it seems this page detects that we are not a browser and blocks the request. We may need to use RSelenium :(

uramnama · 2021-07-30T18:14:13Z

I have been receiving the same error. Is there any workaround?

smartgamer · 2021-08-29T19:04:09Z

same error here:

CompanyInfo("GOOG")
Error in open.connection(x, "rb") : HTTP error 403.

ramirezjaime · 2021-09-09T17:22:21Z

Same error 403 in all functions

AnnualReports ("TSLA")
Error in open.connection(x, "rb") : HTTP error 403.

ramirezjaime · 2021-09-09T17:23:03Z

R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] edgarWebR_1.1.0 finreportr_1.0.2

loaded via a namespace (and not attached):
[1] xml2_1.3.2 magrittr_2.0.1 tidyselect_1.1.1 rvest_1.0.1 R6_2.5.1 rlang_0.4.11
[7] fansi_0.5.0 stringr_1.4.0 httr_1.4.2 dplyr_1.0.7 tools_4.1.0 utf8_1.2.2
[13] DBI_1.1.1 selectr_0.4-2 ellipsis_0.3.2 assertthat_0.2.1 tibble_3.1.4 lifecycle_1.0.0
[19] crayon_1.4.1 purrr_0.3.4 vctrs_0.3.8 curl_4.3.2 glue_1.4.2 stringi_1.7.4
[25] compiler_4.1.0 pillar_1.6.2 generics_0.1.0 pkgconfig_2.0.3

j-uchiha · 2021-11-07T01:29:34Z

I am also experiencing this problem.

vsoler · 2021-12-01T23:17:18Z

Here is my workaround to your problem.

The problem is that the SEC wants the scraper to identify itself in what is called the User-Agent header.

Before placing my request for data I execute ...

     options(HTTPUserAgent = "your name here   my_name@domain.com")

The user agent setting is only remembered during the current session.

With this workaround, everything works fine for me, no more 403 errors.

VS
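Since the option only lasts for the session, one way to scope it is a small wrapper that sets it, runs some code, and restores the previous value. This is a sketch under assumptions: `with_sec_user_agent` is a hypothetical helper, and the name and e-mail address are placeholders.

```r
# Hypothetical helper: evaluate `code` with a descriptive User-Agent set,
# then put back whatever value the option had before.
with_sec_user_agent <- function(user_agent, code) {
  old <- options(HTTPUserAgent = user_agent)
  on.exit(options(old), add = TRUE)
  force(code)  # `code` is lazy, so it is evaluated only after the option is set
}

# The option is visible inside the call ...
with_sec_user_agent("Jane Doe jane.doe@example.com",
                    getOption("HTTPUserAgent"))

# Possible usage (needs finreportr and network access, so it is left commented):
# with_sec_user_agent("Jane Doe jane.doe@example.com",
#                     finreportr::CompanyInfo("GOOG"))
```

To apply the setting to every session instead, the same `options()` line could go in a `.Rprofile` file.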

eweiss99 · 2022-01-21T16:34:41Z

I used vsoler's suggestion to use the options statement and I'm still having trouble:

GetIncome("MA", 2020)

Error in fileFromCache(file.inst) : 
  Error in download.file(file, cached.file, quiet = !verbose) : 
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1141391/000114139120000032/ma-20191231.xml'

In addition: Warning messages:
1: In download.file(file, cached.file, quiet = !verbose) :
  downloaded length 0 != reported length 324
2: In download.file(file, cached.file, quiet = !verbose) :
  cannot open URL 'https://www.sec.gov/Archives/edgar/data/1141391/000114139120000032/ma-20191231.xml': HTTP status was '404 Not Found'

According to the SEC the user-agent must be used in the request header.
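For a manual download, the header can also be passed explicitly rather than through the session option. A minimal sketch, assuming R 3.6.0 or later (where `download.file()` accepts a `headers` argument); `sec_headers` and `fetch_sec_file` are hypothetical helpers and the contact string is a placeholder. Note that this only addresses the 403 case: a URL that does not exist will still return 404.

```r
# Hypothetical helper: build the request headers as a named character vector.
sec_headers <- function(contact) {
  c("User-Agent" = contact)
}

# Hypothetical helper: download one file, sending the header with the request.
# The `headers` argument of download.file() requires R >= 3.6.0.
fetch_sec_file <- function(url, destfile, contact) {
  utils::download.file(url, destfile, mode = "wb", quiet = TRUE,
                       headers = sec_headers(contact))
  invisible(destfile)
}

# Possible usage (needs network access, so it is left commented):
# fetch_sec_file(
#   "https://www.sec.gov/Archives/edgar/data/1326801/000132680116000043/fb-20151231.xsd",
#   file.path(tempdir(), "fb-20151231.xsd"),
#   "Jane Doe jane.doe@example.com"
# )
```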

Padiol · 2022-03-27T02:56:33Z

Hi guys,

Any chance of an update that solves the problem here?
I am still running into errors despite using the user agent, but only for specific years.

billytaipei101 · 2022-04-09T16:28:39Z

My workaround for this problem was to install two missing packages, 'XBRL' and 'Rcpp'.

Alex-Sigma · 2022-08-13T17:15:45Z

Guys, could you please suggest a current solution for this problem (HTTP error 403)?
Secondly, is this package actively maintained or not?
Thanks in advance!

riazarbi · 2022-11-16T12:39:32Z

There are several errors being conflated in this issue.

The 403 errors occur because your client is not authorised. You have not set (or have improperly set) your User-Agent header, so the SEC is refusing access.

The 404 error mentioned by @eweiss99 occurs because the file that finreportr is trying to download does not exist. The finreportr package guesses the file name of the submission file by adding the date to the ticker code (ma-20191231.xml). But, for whatever reason, the filer didn't name their submission file like that. If you go to the actual accession web page, you see that the file is actually called ma12312019-10xk_htm.xml. This is a legit bug in finreportr because it is not correctly determining the file name.

IMO the best fix here would be for finreportr to actually download the header file for the accession number, extract the table with the file descriptions, and select the correct file name on the basis of the description.
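To illustrate the difference between the two approaches, here is a rough sketch. It is not finreportr's actual code: both helpers are hypothetical, and the `documents` table is made-up example data in which only the name `ma12312019-10xk_htm.xml` comes from the real filing; the other file names and all descriptions are assumptions.

```r
# The naming rule described above: ticker, a dash, the period end date, ".xml".
guess_instance_name <- function(ticker, period_end) {
  paste0(tolower(ticker), "-", format(as.Date(period_end), "%Y%m%d"), ".xml")
}

# Hypothetical alternative: pick the instance document by its description.
pick_instance_file <- function(documents) {
  is_instance <- grepl("INSTANCE", documents$description, ignore.case = TRUE) &
    grepl("\\.xml$", documents$filename)
  hits <- documents$filename[is_instance]
  if (length(hits) == 0) {
    stop("no XBRL instance document listed for this filing")
  }
  hits[[1]]
}

# Made-up file table standing in for the one on the accession page.
documents <- data.frame(
  description = c("10-K",
                  "XBRL TAXONOMY EXTENSION SCHEMA",
                  "EXTRACTED XBRL INSTANCE DOCUMENT"),
  filename = c("ma12312019-10xk.htm",      # assumed name
               "ma-20191231.xsd",          # assumed name
               "ma12312019-10xk_htm.xml"), # name reported in this thread
  stringsAsFactors = FALSE
)

guess_instance_name("MA", "2019-12-31")  # "ma-20191231.xml", which gives the 404
pick_instance_file(documents)            # "ma12312019-10xk_htm.xml"
```

Selecting by description does not depend on how the filer chose to name the file, which is the point of the proposed fix.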

I've got a bit of momentum here, so I'll try to see if it's a simple fix and make a pull request.

matthewgson · 2023-09-07T14:14:43Z

@vsoler's answer on

options(HTTPUserAgent = "your name here   my_name@domain.com")

worked like a charm. Hope this can be seen on the main readme page!
