Limitation of scraping in historical series through Google Quotes with web.dataReader #422

Closed
nativanando opened this issue Nov 21, 2017 · 5 comments

Comments

@nativanando

I'm trying to scrape the historical series through pandas-datareader's web.DataReader, but the data returned only covers 2016 to 2017, even though I request 2001 onwards.

Has there been an update limiting this search, or is it a bug caused by the API?

Here is my code:

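import datetime

import pandas_datareader.data as web


class Crawler:

    def __init__(self, nome_empresa, codigo):
        # Requested date range for the historical series
        self.start = datetime.datetime(2001, 1, 1)
        self.end = datetime.datetime(2017, 8, 31)
        self.nome_empresa = nome_empresa  # company name, used for the output file name
        self.codigo = codigo  # ticker symbol passed to the data source

    def executa_busca(self):
        # Fetch daily quotes from the 'google' source and save them as CSV
        file = web.DataReader(self.codigo, 'google', self.start, self.end)
        file.to_csv('~/Documentos/TCC/dist-tcc/Implementacao/dados_calculados/dados_brutos/' + self.nome_empresa + '.txt')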
@rsvp

rsvp commented Nov 21, 2017

The issue is currently unresolved.
For an overview, and alternatives, see
rsvp/fecon235#7

@paintdog

In this thread, VicTangg presented a working URL scheme that restores "full" access to the historical data. It would be nice if this URL scheme could be used to fix this bug soon.

For example:
https://finance.google.com/finance/historical?q=ETR:SIE&startdate=2000/01/01&enddate=2017/05/22&output=csv
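
A minimal sketch of how a URL in this scheme could be assembled with the standard library. The helper name `build_new_style_url` is hypothetical; the endpoint and parameter names are copied from the example URL above. This endpoint is not a documented API, so nothing here is guaranteed to keep working.

```python
import datetime
from urllib.parse import urlencode

# Endpoint taken from the example URL in this thread (not a documented API).
BASE_URL = "https://finance.google.com/finance/historical"


def build_new_style_url(symbol, start, end):
    """Build a URL with numeric YYYY/MM/DD dates (hypothetical helper)."""
    params = {
        "q": symbol,
        "startdate": start.strftime("%Y/%m/%d"),
        "enddate": end.strftime("%Y/%m/%d"),
        "output": "csv",
    }
    # safe=":/" keeps ':' and '/' unescaped so the result matches the example.
    return BASE_URL + "?" + urlencode(params, safe=":/")


url = build_new_style_url(
    "ETR:SIE", datetime.date(2000, 1, 1), datetime.date(2017, 5, 22)
)
print(url)
```

With the dates from the example, this reproduces the URL quoted above.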

@gliptak
Contributor

gliptak commented Nov 29, 2017

@paintdog What does "full" access mean?

This "old style" URL (generated by current pandas_datareader) does work https://finance.google.com/finance/historical?q=ETR:SIE&startdate=Jan%2001,%202017&enddate=Nov%2029,%202017&output=csv
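
For comparison, a rough sketch of how the date format in this "old style" URL could be produced. This is not the pandas_datareader source; the helper names `old_style_date` and `build_old_style_url` are hypothetical, and the format is inferred only from the URL quoted above.

```python
import datetime
from urllib.parse import quote

# Endpoint taken from the URL quoted above (not a documented API).
OLD_BASE_URL = "https://finance.google.com/finance/historical"

# Fixed English abbreviations, so the output does not depend on the locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def old_style_date(d):
    """Format a date as e.g. 'Jan 01, 2017' (hypothetical helper)."""
    return f"{MONTHS[d.month - 1]} {d.day:02d}, {d.year}"


def build_old_style_url(symbol, start, end):
    """Build a URL with percent-encoded 'Mon DD, YYYY' dates (hypothetical helper)."""
    # quote() turns spaces into %20; commas and colons are kept as-is.
    return (
        f"{OLD_BASE_URL}?q={quote(symbol, safe=':')}"
        f"&startdate={quote(old_style_date(start), safe=',')}"
        f"&enddate={quote(old_style_date(end), safe=',')}"
        "&output=csv"
    )


old_url = build_old_style_url(
    "ETR:SIE", datetime.date(2017, 1, 1), datetime.date(2017, 11, 29)
)
print(old_url)
```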

@paintdog
Copy link

paintdog commented Dec 2, 2017

Have you tested your "old style" URL? It doesn't work!

With your URL, the stock prices start at 2-Jan-17. With the new URL scheme, they start at 2-Jan-02. Please try it out.

If I understand the current source code correctly, pandas-datareader uses http:// instead of https:// to generate the URL. Maybe that is the error?

I think we should fix this as soon as possible.
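
One way to compare the two URL styles is to look at the earliest date in each returned CSV. The sketch below assumes the first column holds dates in the `2-Jan-17` form mentioned above and that the first row is a header; the helper name `earliest_date` is hypothetical, and the sample rows are made-up placeholders, not real quotes.

```python
import csv
import datetime
import io


def earliest_date(csv_text):
    """Return the earliest date found in the first column (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row (assumed to be present)
    # "%d-%b-%y" parses strings such as "2-Jan-17"; %b expects English month
    # abbreviations unless the program has switched the LC_TIME locale.
    dates = [
        datetime.datetime.strptime(row[0], "%d-%b-%y").date()
        for row in reader
        if row
    ]
    return min(dates)


# Placeholder data for illustration only.
sample = "Date,Close\n29-Nov-17,1.0\n3-Jan-17,2.0\n2-Jan-17,3.0\n"
print(earliest_date(sample))
```

Running this on the text downloaded from each URL would show directly where each series starts.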

@bashtage
Contributor

Google Quotes has been deprecated.
