# Analyze Stock Prices
The goal of this notebook is to analyze stock prices. We will extract prices from an online source, such as Yahoo Finance or Morningstar, and then analyze and visualize them.

We will try to address the following specific questions:

  * How can we download/extract prices from an online source?
  * How has a stock's price changed over time?
  * What is the average daily return of a stock?
  * What was the correlation between the daily returns of different stocks?
  * Can we use regression/ML models to predict future stock prices?

In [3]:
# Import libraries
from __future__ import division  # future imports must come first; a no-op on Python 3
import pandas as pd
from pandas import Series, DataFrame
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline
import pandas_datareader.data as web
from datetime import datetime

We will first create a list of the stocks that we are interested in analyzing in this study. The `stocks` variable stores the stock tickers. We set the period of analysis using the `start` and `end` variables. Then, using `DataReader`, we read the historic open, close, high, and low prices.

In [6]:
# Create a list of stocks to analyze
stocks = ['HASI','PEGI','CAFD','BEP','CVA']

# Let's set today's date as the time up to which the analysis has to be carried out
end = datetime.now()

# Set the historic analysis period in years (yrs = 1), and initialize the start date
yrs = 1
start = datetime(end.year-yrs,end.month,end.day) 

# Use Morningstar as the data source to grab the stock data
for stock in stocks:
    # globals() binds each DataFrame to a global variable named after its ticker
    globals()[stock] = web.DataReader(stock, 'morningstar', start, end)

Using `globals()` stores each stock's data in a variable named after its ticker, so it is available across this notebook and within any local functions. We then call `tail()` and `describe()` to inspect the stock prices stored in the dataframe. The `tail()` method shows the last five records in the dataframe (the `head()` method shows the first five). The `describe()` method, on the other hand, shows the summary statistics of each column (high, low, open, close, volume, etc.) of the dataframe.
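As a point of comparison with the `globals()` approach, here is a minimal sketch that keeps each ticker's dataframe in a dictionary instead. The `fake_reader` helper and its synthetic prices are assumptions made purely so the sketch runs offline; in the notebook it would be replaced by the real download call.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real download call: returns synthetic
# closing prices so that the sketch runs without a network connection.
def fake_reader(ticker, start, end):
    dates = pd.bdate_range(start, end)
    seed = sum(ord(c) for c in ticker)  # deterministic seed per ticker
    rng = np.random.RandomState(seed)
    close = 20 + rng.normal(0, 0.2, len(dates)).cumsum()
    return pd.DataFrame({'Close': close}, index=dates)

stocks = ['HASI', 'PEGI', 'CAFD', 'BEP', 'CVA']

# One dictionary entry per ticker instead of one global variable per ticker
prices = {s: fake_reader(s, '2017-03-24', '2018-03-23') for s in stocks}

# Look up a single ticker by name
hasi = prices['HASI']

# Combine the closing prices of all tickers into one dataframe
closes = pd.DataFrame({s: prices[s]['Close'] for s in stocks}, columns=stocks)
print(closes.tail())
```

A dictionary makes it easy to loop over all tickers or combine them into a single dataframe, at the cost of writing `prices['HASI']` instead of `HASI`.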

In [10]:
HASI.tail()

Symbol,Date,Close,High,Low,Open,Volume
HASI,2018-03-19,18.56,18.95,18.28,18.94,258278
HASI,2018-03-20,18.61,18.739,18.43,18.56,391662
HASI,2018-03-21,18.78,18.9,18.52,18.66,220597
HASI,2018-03-22,18.8,19.13,18.7,18.7,309696
HASI,2018-03-23,18.83,18.97,18.78,18.83,339865
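The output above shows that the dataframe is indexed by both `Symbol` and `Date`. For a single ticker the `Symbol` level is constant, so it can be convenient to drop it and index by date only. The following is a minimal sketch that rebuilds the five rows shown above by hand (the name `sample` is an assumption for illustration) rather than downloading them:

```python
import pandas as pd

# Rebuild the five rows shown above with a (Symbol, Date) MultiIndex
dates = pd.to_datetime(['2018-03-19', '2018-03-20', '2018-03-21',
                        '2018-03-22', '2018-03-23'])
index = pd.MultiIndex.from_product([['HASI'], dates], names=['Symbol', 'Date'])
sample = pd.DataFrame({
    'Close': [18.56, 18.61, 18.78, 18.80, 18.83],
    'High': [18.95, 18.739, 18.90, 19.13, 18.97],
    'Low': [18.28, 18.43, 18.52, 18.70, 18.78],
    'Open': [18.94, 18.56, 18.66, 18.70, 18.83],
    'Volume': [258278, 391662, 220597, 309696, 339865],
}, index=index)

# Drop the constant Symbol level so that rows are indexed by Date only
by_date = sample.reset_index(level='Symbol', drop=True)

# Rows can now be looked up directly by date
print(by_date.loc[pd.Timestamp('2018-03-21'), 'Close'])
```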


In [11]:
HASI.describe()

,Close,High,Low,Open,Volume
count,260.0,260.0,260.0,260.0,260.0
mean,22.490654,22.70522,22.29139,22.501769,329648.8
std,1.708258,1.695137,1.724651,1.710681,255009.0
min,17.6,17.9,17.33,17.59,0.0
25%,21.79,21.9625,21.60675,21.7675,203544.8
50%,23.01,23.16,22.83,23.03,271339.0
75%,23.77,24.045,23.5125,23.7625,387751.5
max,25.22,25.28,24.8,25.22,2858697.0
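In the summary above, the minimum `Volume` is 0.0 and the row count is 260, which is more than the roughly 252 trading days in a typical U.S. year. This suggests that the data may include rows for weekdays on which the market was closed. Assuming that is the case, such rows could be filtered out before computing returns. The sketch below uses a small made-up dataframe (the values are hypothetical, not taken from the download):

```python
import pandas as pd

# Made-up sample: the middle row stands in for a non-trading weekday,
# with zero volume and the previous close carried forward.
df = pd.DataFrame(
    {'Close': [10.0, 10.0, 10.2],
     'Volume': [150000, 0, 180000]},
    index=pd.to_datetime(['2017-12-29', '2018-01-01', '2018-01-02']),
)

# Keep only the rows where shares actually traded
traded = df[df['Volume'] > 0]

print(len(df), len(traded))
print(traded.describe())
```

Whether to drop these rows depends on the analysis: they leave price levels unchanged but add zero-return days, which can pull the average daily return toward zero.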
