Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Updated tickers #9750

Merged
merged 13 commits into from Sep 14, 2017
Merged

[MRG+1] Updated tickers #9750

merged 13 commits into from Sep 14, 2017

Conversation

@vrishank97
Copy link
Contributor

@vrishank97 vrishank97 commented Sep 13, 2017

*Adds Nasdaq, NYSE, etc Tickers and updates/removes obsolete tickers

Fixes part of #9749.

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

Sorry, I'd inadvertently made a merge conflict for you. Are you able to resolve it? Thanks!

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 13, 2017

How do I go about resolving them?
I'm new here.

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

The only conflict is that I had put a prefix on SAP already.

Either use the github interface (the Resolve Conflicts button) or, on a command-line:

git fetch https://github.com/scikit-learn/scikit-learn master
git merge FETCH_HEAD
# ... edit examples/applications/plot_stock_market.py ...
git add examples/applications/plot_stock_market.py
git commit
git push
@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 13, 2017

Thanks

@jnothman jnothman added this to the 0.19.1 milestone Sep 13, 2017
@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

Now this is working okay, except that I suspect that Google Finance is that is no longer reporting historical prices for tickers that no longer exist. CVC is breaking (https://finance.google.com/finance/historical?q=CVC) because its' been taken over by ATUS.

You should test the changes on your local machine if possible, so that any errors can be fixed without waiting for our continuous integration servers.

Copy link
Member

@jnothman jnothman left a comment

I don't get why this is working still with NYSE:CVC in there...?

@@ -0,0 +1,312 @@
"""

This comment has been minimized.

@jnothman

jnothman Sep 13, 2017
Member

I don't think you should have added this file

vrishank97 added 2 commits Sep 13, 2017
@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

I will admit I'm not sure why this broke. It works okay for me locally :\

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 13, 2017

How should I proceed from here?

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

LGTM

@jnothman jnothman changed the title Updated tickers [MRG+1] Updated tickers Sep 13, 2017
@lesteve
Copy link
Member

@lesteve lesteve commented Sep 13, 2017

I will admit I'm not sure why this broke. It works okay for me locally :\

Same thing for me. I did not manage to reproduce the error locally. I think I am fine to add the NYSE and NASDAQ qualifiers but not to remove the tickers if there is no need to ...

Note that (at least for me locally at the time of writing),

To be honest another option would be to put the datasets on figshare and avoid this kind of errors in the future.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 13, 2017

Small comment @vrishank97, please use "Fix #issueNumber" in your PR description, this way the associated issue gets closed automatically when the PR is merged. For more details, look at this. I have edited your PR description but try to remember this for next time.

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 13, 2017

Thanks. I'll keep that in mind.

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

Okay. Let's leave CVC. I'm a bit confused about the erratic errors being produced by Google Finance, but I assume they're just rolling out some changes or have synchrony glitches. I like this change in any case.

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 13, 2017

By "leave CVC" i mean keep it in.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

OK now I can reproduce the problem on master. Now for some reason with this PR I get 244 entries for 'NYSE:DD-B' instead of the expected 251 ... not sure why.

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 14, 2017

The name changed from 'NYSE:DD' to 'NYSE:DD-B' recently there might not enough data for the time span.

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 14, 2017

Should I go ahead and remove 'NYSE:DD-B' ?

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

For some reason it looks like the startDate and endDate is not taken into account ...
Look at http://www.google.com/finance/historical?q=NASDAQ%3AAABA&startdate=Jan+01%2C+2003&enddate=Jan+01%2C+2008&output=csv for example that starts on September 15th 2016 and stops on September 13th 2017.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

Maybe that explains some of the problems we are seeing e.g. the need to remove CVC.

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 14, 2017

How about we add a fixed start and end date in
url = 'http://www.google.com/finance/historical?' + params
to future proof the code.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

@vrishank97 you need to investigate to see if there is a known ongoing problem with the google finance API. There is not much we can do until the situation stabilizes ... for example the link I posted in http://www.google.com/finance/historical?q=NASDAQ%3AAABA&startdate=Jan+01%2C+2003&enddate=Jan+01%2C+2008&output=csv seems to sometimes take startdate and enddate into account and sometimes ignore it completely.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

I pushed fb66956 in master to improve the error message when the data we get from Google finance is not as it should.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

@vrishank97 can you pull upstream master into your branch? From your commit history it looks like you already know how to do that.

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 14, 2017

I've pulled upstream master into my branch.
How about using yahoo finance until the issues with google settle?

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 14, 2017

@vrishank97
Copy link
Contributor Author

@vrishank97 vrishank97 commented Sep 14, 2017

We can also consider the NASDAQ website.
They offer csv downloads too.

http://www.nasdaq.com/symbol/gsk/historical

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

Isn't the Yahoo finance API deprecated at least?

We were getting the finance data from matplotlib (from matplotlib.finance import quotes_historical_yahoo_och) and the URL it uses is not valid any more. No idea whether the same csv data is accessible somewhere from Yahoo.

I am fine with adding the exchange in front of the symbols. Any other change I am slightly reluctant because I don't think this is fixing anything at all and there is no way to know at the moment because of the problems I mentioned in the data returned by Google Finance.

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 14, 2017

I am going to merge this PR like this, e.g. only adding exchange names.

@vrishank97 this would be great if you could investigate why Google finance starts misbehaving, whether it is a known problem and possible ways to work around the problem.

@lesteve lesteve merged commit 8fa7771 into scikit-learn:master Sep 14, 2017
0 of 4 checks passed
0 of 4 checks passed
ci/circleci CircleCI is running your tests
Details
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
lgtm analysis: Python Running analyses for revisions
Details
@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@jnothman
Copy link
Member

@jnothman jnothman commented Sep 14, 2017

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 15, 2017

Well, the doc build has succeeded and failed since this merge. Maybe give Google a few days to settle down????

I am not sure what is happening indeed, I would try to wait and see whether this issue goes away. If that does not, I still have a cache of the matplotlib finance data on my computer, we can upload that to figshare and forget about this.

btw, @lesteve, I really don't like the expected_len_data in a function which has dates as a parameter...

Oops that is a good point certainly ... I will fix that.

@lesteve
Copy link
Member

@lesteve lesteve commented Sep 15, 2017

Oops that is a good point certainly ... I will fix that.

Done in 7a2ce27.

lesteve added a commit to jnothman/scikit-learn that referenced this pull request Sep 29, 2017
jnothman added a commit to jnothman/scikit-learn that referenced this pull request Oct 3, 2017
maskani-moh added a commit to maskani-moh/scikit-learn that referenced this pull request Nov 15, 2017
jwjohnson314 pushed a commit to jwjohnson314/scikit-learn that referenced this pull request Dec 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants