# Data Analysis & Visualisation of Stock Data

* Project scope: Analyze and visualize stock data using Python libraries
* Data source: Yahoo Finance, accessed through pandas-datareader
* Data: Using pandas-datareader, we will get stock information for the following banks:
    * Bank of America
    * Citigroup
    * Goldman Sachs
    * JPMorgan Chase
    * Morgan Stanley
    * Wells Fargo


## Import libraries

In [None]:
# First things first:
# install pandas-datareader by running "pip install pandas-datareader" at the command line

# Import Data Analysis libraries
from pandas_datareader import data, wb
import datetime
import numpy as np
import pandas as pd

# Import Data Visualisation libraries
import matplotlib.pyplot as plt
import seaborn as sns
import cufflinks as cf
cf.go_offline()

# For display in notebook
%matplotlib inline

## Load data

The idea is to get stock data from Jan 1st 2006 to Jan 1st 2016 for each of the banks listed above, and to store each bank's data in a separate DataFrame whose variable name is the bank's ticker symbol. Steps to get the data:

* First, use datetime to create the start and end datetime objects
* Then, use datareader to fetch each stock's data from Yahoo

In [80]:
# start and end date time objects
start = datetime.datetime(2006, 1, 1)
end = datetime.datetime(2016, 1, 1)

In [81]:
# datareader to get info
# Bank of America
BAC = data.DataReader("BAC", 'yahoo', start, end)

# CitiGroup
C = data.DataReader("C", 'yahoo', start, end)

# Goldman Sachs
GS = data.DataReader("GS", 'yahoo', start, end)

# JPMorgan Chase
JPM = data.DataReader("JPM", 'yahoo', start, end)

# Morgan Stanley
MS = data.DataReader("MS", 'yahoo', start, end)

# Wells Fargo
WFC = data.DataReader("WFC", 'yahoo', start, end)
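
The six near-identical calls above could also be written as a loop. The following is a minimal sketch, not the notebook's own method: `fetch_all` and `fake_reader` are hypothetical names introduced here, and the stand-in reader returns synthetic numbers so the snippet runs without network access. In the notebook, `data.DataReader` would be passed in place of `fake_reader`.

```python
import datetime

import pandas as pd


def fetch_all(tickers, reader, source, start, end):
    """Return a dict mapping each ticker to the DataFrame the reader gives back."""
    return {ticker: reader(ticker, source, start, end) for ticker in tickers}


def fake_reader(ticker, source, start, end):
    """Hypothetical stand-in for data.DataReader; the prices are synthetic."""
    dates = pd.bdate_range(start, end)
    return pd.DataFrame({"Close": list(range(len(dates)))}, index=dates)


start = datetime.datetime(2006, 1, 1)
end = datetime.datetime(2006, 1, 10)
tickers = ["BAC", "C", "GS", "JPM", "MS", "WFC"]

# One DataFrame per ticker, keyed by ticker symbol
frames = fetch_all(tickers, fake_reader, "yahoo", start, end)
print(sorted(frames))
```

A dict keyed by ticker avoids six separate variables, at the cost of writing `frames["BAC"]` instead of `BAC`.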

**Let's view the BAC data:**

In [86]:
BAC.head()

| Date | High | Low | Open | Close | Volume | Adj Close |
|---|---|---|---|---|---|---|
| 2006-01-03 | 47.18 | 46.150002 | 46.919998 | 47.080002 | 16296700.0 | 36.535149 |
| 2006-01-04 | 47.240002 | 46.450001 | 47.0 | 46.580002 | 17757900.0 | 36.147141 |
| 2006-01-05 | 46.830002 | 46.32 | 46.580002 | 46.639999 | 14970700.0 | 36.193695 |
| 2006-01-06 | 46.91 | 46.349998 | 46.799999 | 46.57 | 12599800.0 | 36.13937 |
| 2006-01-09 | 46.970001 | 46.360001 | 46.720001 | 46.599998 | 15619400.0 | 36.162647 |


The data for all six banks can also be fetched in a single call by passing a list of tickers:

df = data.DataReader(['BAC', 'C', 'GS', 'JPM', 'MS', 'WFC'],'yahoo', start, end)
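
When a list of tickers is passed, the columns come back as a two-level index. Assuming the attribute (Close, Volume, ...) is the outer level and the ticker the inner one, which is an assumption about the pandas-datareader version in use, the levels can be swapped to get a ticker-first layout. The frame below is synthetic and the level names `Attributes` and `Symbols` are likewise assumed.

```python
import pandas as pd

# Synthetic stand-in for the multi-ticker result; the values are made up
dates = pd.to_datetime(["2006-01-03", "2006-01-04"])
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["BAC", "C"]],
    names=["Attributes", "Symbols"],  # assumed level names
)
df = pd.DataFrame(
    [[1.0, 2.0, 10.0, 20.0], [1.5, 2.5, 15.0, 25.0]],
    index=dates,
    columns=columns,
)

# Put the ticker level first, then sort so each ticker's columns sit together
by_ticker = df.swaplevel(0, 1, axis=1).sort_index(axis=1)
print(by_ticker["BAC"])
```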

In [82]:
# ticker symbols in alphabetical order
tickers = ['BAC', 'C', 'GS', 'JPM', 'MS', 'WFC']

In [83]:
bank_stocks = pd.concat([BAC, C, GS, JPM, MS, WFC],axis=1,keys=tickers)

In [84]:
bank_stocks.columns.names = ['Bank Ticker','Stock Info']
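
To see what `keys=` and `columns.names` do in the two cells above, the following sketch runs the same steps on two tiny synthetic frames. The numbers and the reduced ticker list are made up for illustration.

```python
import pandas as pd

dates = pd.to_datetime(["2006-01-03", "2006-01-04"])
bac = pd.DataFrame({"Open": [1.0, 2.0], "Close": [1.5, 2.5]}, index=dates)
c = pd.DataFrame({"Open": [10.0, 20.0], "Close": [15.0, 25.0]}, index=dates)

# keys= adds an outer column level, one label per input frame
demo = pd.concat([bac, c], axis=1, keys=["BAC", "C"])
demo.columns.names = ["Bank Ticker", "Stock Info"]

print(demo.columns.tolist())
# A single bank's frame is recovered by selecting on the outer level
print(demo["C"])
```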

**Let's view the bank_stocks DataFrame:**

In [85]:
bank_stocks.head()

| Bank Ticker | BAC | BAC | BAC | BAC | BAC | BAC | C | C | C | C | ... | MS | MS | MS | MS | WFC | WFC | WFC | WFC | WFC | WFC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Stock Info | High | Low | Open | Close | Volume | Adj Close | High | Low | Open | Close | ... | Open | Close | Volume | Adj Close | High | Low | Open | Close | Volume | Adj Close |
| Date | | | | | | | | | | | | | | | | | | | | | |
| 2006-01-03 | 47.18 | 46.150002 | 46.919998 | 47.080002 | 16296700.0 | 36.535149 | 493.799988 | 481.100006 | 490.0 | 492.899994 | ... | 57.169998 | 58.310001 | 5377000.0 | 39.379498 | 31.975 | 31.195 | 31.6 | 31.9 | 11016400.0 | 22.067434 |
| 2006-01-04 | 47.240002 | 46.450001 | 47.0 | 46.580002 | 17757900.0 | 36.147141 | 491.0 | 483.5 | 488.600006 | 483.799988 | ... | 58.700001 | 58.349998 | 7977800.0 | 39.406509 | 31.82 | 31.365 | 31.799999 | 31.530001 | 10870000.0 | 21.811476 |
| 2006-01-05 | 46.830002 | 46.32 | 46.580002 | 46.639999 | 14970700.0 | 36.193695 | 487.799988 | 484.0 | 484.399994 | 486.200012 | ... | 58.549999 | 58.509998 | 5778000.0 | 39.514561 | 31.555 | 31.309999 | 31.5 | 31.495001 | 10158000.0 | 21.787266 |
| 2006-01-06 | 46.91 | 46.349998 | 46.799999 | 46.57 | 12599800.0 | 36.13937 | 489.0 | 482.0 | 488.799988 | 486.200012 | ... | 58.77 | 58.57 | 6889800.0 | 39.555077 | 31.775 | 31.385 | 31.58 | 31.68 | 8403800.0 | 21.915241 |
| 2006-01-09 | 46.970001 | 46.360001 | 46.720001 | 46.599998 | 15619400.0 | 36.162647 | 487.399994 | 483.0 | 486.0 | 483.899994 | ... | 58.630001 | 59.189999 | 4144500.0 | 39.973793 | 31.825001 | 31.555 | 31.674999 | 31.674999 | 5619600.0 | 21.911785 |


## Let's analyze and visualize the data

**1. Maximum Close price for each stock over the time period**

In [67]:
bank_stocks.xs(key='Close',axis=1,level='Stock Info').max()

Bank Ticker
BAC     54.900002
C      564.099976
GS     247.919998
JPM     70.080002
MS      89.300003
WFC     58.520000
dtype: float64
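
The `xs` call above picks the `Close` column of every bank at once, so `.max()` returns one value per ticker. A small sketch on synthetic data shows the mechanics. It also uses `idxmax()` to find the date on which each maximum occurred, which is an addition here rather than part of the original analysis.

```python
import pandas as pd

# Synthetic two-bank frame with the same column layout as bank_stocks
dates = pd.to_datetime(["2006-01-03", "2006-01-04", "2006-01-05"])
columns = pd.MultiIndex.from_product(
    [["BAC", "C"], ["Open", "Close"]],
    names=["Bank Ticker", "Stock Info"],
)
demo = pd.DataFrame(
    [[1.0, 2.0, 10.0, 30.0], [2.0, 5.0, 20.0, 25.0], [3.0, 4.0, 30.0, 20.0]],
    index=dates,
    columns=columns,
)

# Cross-section: keep only the Close column of each bank
close = demo.xs(key="Close", axis=1, level="Stock Info")
max_close = close.max()     # highest Close per ticker
max_date = close.idxmax()   # date of that highest Close per ticker
print(max_close)
print(max_date)
```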

**2. This analysis needs to be continued...**

## Things to note after analysis

After analysing and visualizing the data, here are the things we know: