#### Download Free Intraday Stock Data from Google Finance with Python

The Google Finance API is available at [finance.google.com](https://finance.google.com/finance/getprices?p=1d&f=d,o,h,l,c,v&q=MSFT&i=60&x=NASD). The full request URL for this example is:

https://finance.google.com/finance/getprices?p=1d&f=d,o,h,l,c,v&q=MSFT&i=60&x=NASD


With this URL we use the getprices method and pass in the desired parameters after the ? sign. In this example:

* p is the period (horizon) over which to fetch data: 1 day
* f defines what should be fetched: date, open, high, low, close, volume
* q is the ticker symbol of the stock to fetch: MSFT (Microsoft)
* i defines the interval in seconds: every 60 seconds
* x is the exchange from which to fetch: NASDAQ
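
As a minimal sketch, the same query string can be assembled from a dictionary with the standard library's `urllib.parse.urlencode`. The helper name `build_getprices_url` is hypothetical and introduced here only for illustration. Passing `safe=','` keeps the commas in the `f` field list unescaped, so the result matches the URL shown above.

```python
from urllib.parse import urlencode

BASE_URL = 'https://finance.google.com/finance/getprices'

def build_getprices_url(ticker, period=60, days=1, exchange='NASD'):
    """Hypothetical helper: assemble the getprices URL from its parameters."""
    params = {
        'p': '{}d'.format(days),   # horizon, e.g. '1d' for one day
        'f': 'd,o,h,l,c,v',        # fields: date, open, high, low, close, volume
        'q': ticker,               # ticker symbol
        'i': period,               # interval in seconds
        'x': exchange,             # exchange code
    }
    # safe=',' leaves the commas in the field list unescaped
    return BASE_URL + '?' + urlencode(params, safe=',')

print(build_getprices_url('MSFT'))
```

Building the URL from a dictionary keeps each parameter next to its meaning and avoids hand-escaping the ticker symbol.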

#### Downloading Intraday Data from Google Finance

To download the free intraday stock data, we request the above URL and parse the response for the fields we are interested in. The required modules are:

In [8]:
import csv
import datetime
import re
import codecs
import requests
import pandas as pd
import cufflinks as cf #cufflinks binds pandas to plotly for easy plotting
from plotly.offline import init_notebook_mode, iplot

Write a function to download the data as follows:

In [9]:
def get_google_finance_intraday(ticker, period=60, days=1, exchange='NASD'):
    """
    Retrieve intraday stock data from Google Finance.
    
    Parameters
    ----------
    ticker : str
        Company ticker symbol.
    period : int
        Interval between stock values in seconds.
        period = 60 corresponds to one-minute data
        period = 86400 corresponds to daily data
    days : int
        Number of days of data to retrieve.
    exchange : str
        Exchange from which the quotes should be fetched
    
    Returns
    ---------------
    df : pandas.DataFrame
        DataFrame containing the opening price, high price, low price,
        closing price, and volume. The index contains the times associated with
        the retrieved price values.
    """
 
    # build url
    url = 'https://finance.google.com/finance/getprices' + \
          '?p={days}d&f=d,o,h,l,c,v&q={ticker}&i={period}&x={exchange}'.format(ticker=ticker, 
                                                                               period=period, 
                                                                               days=days,
                                                                               exchange=exchange)
    
    page = requests.get(url)
    reader = csv.reader(codecs.iterdecode(page.content.splitlines(), "utf-8"))
    columns = ['Open', 'High', 'Low', 'Close', 'Volume']
    rows = []
    times = []
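    for row in reader:
        # skip blank lines and header lines; data rows start with 'a' or a digit
        if row and re.match(r'^[a\d]', row[0]):
            if row[0].startswith('a'):
                # 'a' followed by a Unix timestamp marks the start of a block
                start = datetime.datetime.fromtimestamp(int(row[0][1:]))
                times.append(start)
            else:
                # otherwise the value is an offset in units of `period` seconds
                times.append(start+datetime.timedelta(seconds=period*int(row[0])))
            # in Python 3, map() is lazy, so materialize the values as a list
            rows.append(list(map(float, row[1:])))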
    if len(rows):
        return pd.DataFrame(rows, index=pd.DatetimeIndex(times, name='Date'), columns=columns)
    else:
        return pd.DataFrame(rows, index=pd.DatetimeIndex(times, name='Date'))


The function above requests the data, keeps only the rows that hold quotes, and parses them into a DataFrame indexed by time.
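
The timestamp handling is the least obvious step, so here is a minimal offline sketch of it. It assumes the row layout the function relies on: a row whose first field is `a` followed by a Unix timestamp starts a block, and later rows carry an integer offset counted in units of `period` seconds. The sample text is made up for illustration and is not real market data, and `parse_rows` is a hypothetical helper name. Unlike the function above, the sketch uses UTC rather than local time so that its output does not depend on the machine's time zone.

```python
import csv
import datetime

# Made-up sample in the shape the function expects; not real market data.
SAMPLE = """INTERVAL=60
a1519914600,10.0,10.5,9.5,10.2,1000
1,10.2,10.4,10.1,10.3,500
2,10.3,10.6,10.2,10.5,750
"""

def parse_rows(text, period=60):
    """Hypothetical helper: return (times, rows) parsed from response text."""
    times, rows = [], []
    start = None
    for row in csv.reader(text.splitlines()):
        if not row:
            continue  # blank line
        first = row[0]
        if first.startswith('a') and first[1:].isdigit():
            # 'a' + Unix timestamp: absolute start of a block
            start = datetime.datetime.fromtimestamp(
                int(first[1:]), tz=datetime.timezone.utc)
            times.append(start)
        elif first.isdigit() and start is not None:
            # plain integer: offset from the block start in `period` units
            times.append(start + datetime.timedelta(seconds=period * int(first)))
        else:
            continue  # header or metadata line
        rows.append([float(value) for value in row[1:]])
    return times, rows

times, rows = parse_rows(SAMPLE)
for t, r in zip(times, rows):
    print(t.isoformat(), r)
```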

In [10]:
# input data
ticker = 'AMZN'
period = 60
days = 1
exchange = 'NASD'

In [11]:
df = get_google_finance_intraday(ticker, period=period, days=days, exchange=exchange)

In [12]:
df.head()

Date,Open,High,Low,Close,Volume
2018-03-01 09:30:00,1513.8,1515.8,1512.23,1513.6,46797.0
2018-03-01 09:31:00,1515.33,1515.85,1511.7,1515.85,20523.0
2018-03-01 09:32:00,1517.5,1517.5,1513.1001,1515.5841,17115.0
2018-03-01 09:33:00,1517.29,1518.49,1516.194,1517.0,23144.0
2018-03-01 09:34:00,1513.83,1517.7402,1513.78,1517.22,13789.0


#### Plotting

We can plot the data with a few lines of code using the plotly and cufflinks modules. First we initialize notebook mode and switch cufflinks to offline mode:

In [13]:
# initialize notebook mode
init_notebook_mode(connected=True)
# If the notebook reports "IOPub data rate exceeded", start Jupyter from the
# terminal with a higher limit on the data that can be sent to the notebook:
#   jupyter notebook --NotebookApp.iopub_data_rate_limit=10000000000
# set cufflinks to offline mode
cf.go_offline()

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.


In [14]:
df[['Open', 'High', 'Low', 'Close']].iplot(kind='candle', up_color='#9900cc', down_color='#00ffcc', theme='solar', 
                                           title='Candlesticks for Intraday Prices of {ticker}'.format(ticker=ticker), xTitle='Time')
