[MRG] Updated plot_stock_market.py to use Google Finance #9010

Merged
merged 4 commits into from Jun 7, 2017
Changes from 2 commits
3 changes: 1 addition & 2 deletions doc/conf.py
@@ -241,8 +241,7 @@
 'matplotlib': 'http://matplotlib.org',
 'numpy': 'http://docs.scipy.org/doc/numpy-1.8.1',
 'scipy': 'http://docs.scipy.org/doc/scipy-0.13.3/reference'},
-'expected_failing_examples': [
-    '../examples/applications/plot_stock_market.py']
+'expected_failing_examples': []
Member:
I think you can remove the expected_failing_examples but please double-check

Contributor Author (@superbobry, Jun 6, 2017):

Done. The key was added in this commit: 9759019.

}
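For context, a minimal sketch of what this hunk changes, assuming the dictionary being edited is the Sphinx-Gallery configuration in doc/conf.py (the enclosing variable name is not visible in the excerpt). Sphinx-Gallery's `expected_failing_examples` option lists example scripts that are allowed to raise during the documentation build. The names `conf_before`, `conf_after` and `no_longer_excused` are illustrative only.

```python
# Hypothetical before/after view of the entry touched by the hunk.
# Only the key name and the path come from the diff; the variable
# names are assumptions made for this illustration.
conf_before = {
    'expected_failing_examples': [
        '../examples/applications/plot_stock_market.py'],
}
conf_after = {
    'expected_failing_examples': [],
}

# Examples that were tolerated as failing before but must now run cleanly.
no_longer_excused = sorted(
    set(conf_before['expected_failing_examples'])
    - set(conf_after['expected_failing_examples'])
)
print(no_longer_excused)
```

With the list emptied, a failure in plot_stock_market.py would again count as a documentation build failure, which is why the reviewer asks for a double-check.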


67 changes: 49 additions & 18 deletions examples/applications/plot_stock_market.py
@@ -64,27 +64,60 @@
 # Author: Gael Varoquaux gael.varoquaux@normalesup.org
 # License: BSD 3 clause

-import datetime
+from datetime import datetime

 import numpy as np
-import matplotlib.pyplot as plt
-try:
-    from matplotlib.finance import quotes_historical_yahoo_ochl
-except ImportError:
-    # quotes_historical_yahoo_ochl was named quotes_historical_yahoo before matplotlib 1.4
-    from matplotlib.finance import quotes_historical_yahoo as quotes_historical_yahoo_ochl
+from matplotlib import pyplot as plt
Member:
any reason for changing this import from how it was?

We have it both ways in the codebase, but the first form is overwhelmingly more common. There should be no difference, so it's not a big deal; I'm just curious why you changed it.

Contributor Author:
Just a matter of personal preference. I'll revert it back, it is unrelated to the PR.

from matplotlib.collections import LineCollection
+from six.moves.urllib.request import urlopen
+from six.moves.urllib.parse import urlencode
from sklearn import cluster, covariance, manifold

###############################################################################
# Retrieve the data from Internet

+def quotes_historical_google(symbol, date1, date2):
+    """Get the historical data from Google finance.
+
+    Parameters
+    ----------
+    symbol : str
+        Ticker symbol to query for, for example ``"DELL"``.
+    date1 : datetime.datetime
+        Start date.
+    date2 : datetime.datetime
+        End date.
+
+    Returns
+    -------
+    X : array
+        The columns are ``date`` -- datetime, ``open``, ``high``,
+        ``low``, ``close`` and ``volume`` of type float.
+    """
+    params = urlencode({
+        'q': symbol,
+        'startdate': date1.strftime('%b %d, %Y'),
+        'enddate': date2.strftime('%b %d, %Y'),
+        'output': 'csv'
+    })
+    url = 'http://www.google.com/finance/historical?' + params
+    print(url)
Member:
I would not print the url.

Contributor Author:
Yep, this is leftover debugging code. Will remove.

+    with urlopen(url) as response:
Member:
This doesn't work in Python 2.7: the object returned by `urlopen` is not a context manager there.

+        dtype = {
+            'names': ['date', 'open', 'high', 'low', 'close', 'volume'],
+            'formats': ['object', 'f4', 'f4', 'f4', 'f4', 'f4']
+        }
+        converters = {0: lambda s: datetime.strptime(s.decode(), '%d-%b-%y')}
+        return np.genfromtxt(response, delimiter=',', skip_header=1,
+                             dtype=dtype, converters=converters,
+                             missing_values='-', filling_values=-1)
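
One possible way to address the Python 2.7 remark is to wrap the response in `contextlib.closing`, which only requires a `close()` method. The sketch below is not the change that was merged in this PR; `build_history_url`, `fetch_bytes` and the `opener` parameter are hypothetical names introduced here, and the fallback imports stand in for the `six.moves` imports used in the diff. It makes no claim about whether the endpoint still responds.

```python
from contextlib import closing
from datetime import datetime

try:  # Python 3
    from urllib.parse import urlencode
    from urllib.request import urlopen
except ImportError:  # Python 2
    from urllib import urlencode
    from urllib2 import urlopen


def build_history_url(symbol, date1, date2):
    """Build the CSV query URL with the same parameters as the diff."""
    params = urlencode([
        ('q', symbol),
        ('startdate', date1.strftime('%b %d, %Y')),
        ('enddate', date2.strftime('%b %d, %Y')),
        ('output', 'csv'),
    ])
    return 'http://www.google.com/finance/historical?' + params


def fetch_bytes(url, opener=urlopen):
    """Return the raw response body and always close the response.

    ``closing`` supplies the context-manager protocol, so the object
    returned by ``opener`` only needs ``read()`` and ``close()``.
    """
    with closing(opener(url)) as response:
        return response.read()
```

Passing a stub as `opener` lets the download step be exercised without network access, for example `fetch_bytes(url, opener=lambda u: fake_response)` where `fake_response` is any object with `read()` and `close()`.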


# Choose a time period reasonably calm (not too long ago so that we get
# high-tech firms, and before the 2008 crash)
-d1 = datetime.datetime(2003, 1, 1)
-d2 = datetime.datetime(2008, 1, 1)
+d1 = datetime(2003, 1, 1)
+d2 = datetime(2008, 1, 1)

# kraft symbol has now changed from KFT to MDLZ in yahoo
symbol_dict = {
'TOT': 'Total',
'XOM': 'Exxon',
@@ -102,7 +135,6 @@
'AMZN': 'Amazon',
'TM': 'Toyota',
'CAJ': 'Canon',
'MTU': 'Mitsubishi',
'SNE': 'Sony',
'F': 'Ford',
'HMC': 'Honda',
@@ -113,7 +145,6 @@
'MMM': '3M',
'MCD': 'Mc Donalds',
'PEP': 'Pepsi',
'MDLZ': 'Kraft Foods',
'K': 'Kellogg',
'UN': 'Unilever',
'MAR': 'Marriott',
@@ -131,9 +162,7 @@
'CSCO': 'Cisco',
'TXN': 'Texas instruments',
Member:
while we're at it, the i should probably be uppercase, same for the 'e' in American Express, and I think "Mc Donalds" should be "McDonald's".

Contributor Author:
Done.

'XRX': 'Xerox',
'LMT': 'Lookheed Martin',
'WMT': 'Wal-Mart',
'WBA': 'Walgreen',
'HD': 'Home Depot',
'GSK': 'GlaxoSmithKline',
'PFE': 'Pfizer',
@@ -149,15 +178,17 @@

symbols, names = np.array(list(symbol_dict.items())).T

-quotes = [quotes_historical_yahoo_ochl(symbol, d1, d2, asobject=True)
-          for symbol in symbols]
+quotes = [
+    quotes_historical_google(symbol, d1, d2) for symbol in symbols
+]

-open = np.array([q.open for q in quotes]).astype(np.float)
-close = np.array([q.close for q in quotes]).astype(np.float)
+close = np.stack([q['close'] for q in quotes])
+open = np.stack([q['open'] for q in quotes])

# The daily variations of the quotes are what carry most information
variation = close - open
Member:
Again, not this PR's fault, but `open` is a built-in function in Python and this assignment shadows it. Could we rename these close_quote / open_quote, or whatever the explicit finance terms are?

Member:
close_prices/open_prices seems fine with me.

Contributor Author:
Fixed.
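
The rename discussed above might look like the following sketch. The exact code that was committed is not shown in this excerpt; the two small structured arrays are made-up stand-ins for the per-symbol quote arrays, and only the names `open_prices` / `close_prices` come from the review thread.

```python
import numpy as np

# Made-up quotes for two symbols over two days (illustrative data only).
quote_dtype = [('open', 'f4'), ('close', 'f4')]
quotes = [
    np.array([(10.0, 10.5), (10.5, 10.25)], dtype=quote_dtype),
    np.array([(20.0, 19.0), (19.0, 19.5)], dtype=quote_dtype),
]

# Descriptive names avoid shadowing the built-in open().
close_prices = np.stack([q['close'] for q in quotes])
open_prices = np.stack([q['open'] for q in quotes])

# The daily variations of the quotes, one row per symbol.
variation = close_prices - open_prices
print(variation.shape)
```

Because the built-in is no longer rebound, a later `open(path)` call in the same script would still open a file as expected.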



###############################################################################
# Learn a graphical structure from the correlations
edge_model = covariance.GraphLassoCV()