# Finance Data Project 

In this data project we will focus on exploratory data analysis of stock prices. Keep in mind that this project is only meant for practicing visualization and pandas skills. It is not a robust financial analysis and should not be taken as financial advice.
____
We'll focus on bank stocks and see how they progressed throughout the [financial crisis](https://en.wikipedia.org/wiki/Financial_crisis_of_2007%E2%80%9308).

# The Imports

In [None]:
from pandas_datareader import data, wb
import pandas as pd
import numpy as np
import datetime
%matplotlib inline

# Data

We will get stock information for the following banks:
* Bank of America
* CitiGroup
* Goldman Sachs
* JPMorgan Chase
* Morgan Stanley
* Wells Fargo

In [None]:
df = pd.read_pickle('../input/dataset/all_banks')
df['BAC']

In [None]:
# Bank of America
BAC = df['BAC']

# CitiGroup
C = df['C']

# Goldman Sachs
GS = df['GS']

# JPMorgan Chase
JPM = df['JPM']

# Morgan Stanley
MS = df['MS']

# Wells Fargo
WFC = df['WFC']

Create a list of the ticker symbols (as strings) in alphabetical order. Call this list tickers.

In [None]:
tickers = ['BAC', 'C', 'GS', 'JPM', 'MS', 'WFC']

Use pd.concat to concatenate the bank DataFrames into a single DataFrame called bank_stocks. Set the keys argument equal to the tickers list, and pay attention to which axis you concatenate on.

In [None]:
bank_stocks = pd.concat([BAC, C, GS, JPM, MS, WFC],axis=1,keys=tickers)

Set the names of the column levels:

In [None]:
bank_stocks.columns.names = ['Bank Ticker','Stock Info']
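
To see what the keys argument does, here is a minimal sketch on made-up data. The tickers AAA and BBB and all prices are hypothetical and only stand in for the real bank frames.

```python
import pandas as pd

# Two tiny, made-up price tables standing in for the per-bank frames.
dates = pd.to_datetime(['2006-01-03', '2006-01-04'])
aaa = pd.DataFrame({'Open': [10.0, 10.5], 'Close': [10.4, 10.2]}, index=dates)
bbb = pd.DataFrame({'Open': [20.0, 20.3], 'Close': [20.1, 20.6]}, index=dates)

# axis=1 places the frames side by side; keys adds an outer column level.
combined = pd.concat([aaa, bbb], axis=1, keys=['AAA', 'BBB'])
combined.columns.names = ['Bank Ticker', 'Stock Info']

print(combined.columns.nlevels)   # number of column levels
print(combined['BBB']['Close'])   # select one bank, then one field
```

Concatenating on axis=0 instead would stack the banks on top of each other and put the tickers in the row index, which is not the layout the rest of this notebook expects.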

Check the head of the bank_stocks dataframe.

In [None]:
bank_stocks.head()

# EDA

Let's explore the data a bit!


What is the max Close price for each bank's stock throughout the time period?

In [None]:
bank_stocks.xs(key='Close',axis=1,level='Stock Info').max()
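
The .xs call above takes a cross-section across the inner column level. A small sketch on hypothetical data (tickers AAA and BBB, invented prices) shows what it returns:

```python
import pandas as pd

dates = pd.to_datetime(['2006-01-03', '2006-01-04', '2006-01-05'])
columns = pd.MultiIndex.from_product(
    [['AAA', 'BBB'], ['Open', 'Close']],
    names=['Bank Ticker', 'Stock Info'],
)
# Column order: (AAA, Open), (AAA, Close), (BBB, Open), (BBB, Close)
prices = pd.DataFrame(
    [[10.0, 10.4, 20.0, 20.1],
     [10.5, 10.2, 20.3, 20.6],
     [10.1, 10.9, 20.7, 20.2]],
    index=dates, columns=columns,
)

# Keep only the 'Close' column of every ticker; the inner level is dropped.
closes = prices.xs(key='Close', axis=1, level='Stock Info')
max_close = closes.max()   # one maximum per ticker
print(max_close)
```

The result of .xs has one plain column per ticker, so column-wise methods such as .max() give one value per bank.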

Create a new empty DataFrame called returns. This DataFrame will contain the returns for each bank's stock. Returns are typically defined by:

$$r_t = \frac{p_t - p_{t-1}}{p_{t-1}} = \frac{p_t}{p_{t-1}} - 1$$

In [None]:
returns = pd.DataFrame()

We can use the pandas pct_change() method on the Close column to create a column representing this return value. Create a for loop that, for each bank stock ticker, creates this returns column and sets it as a column in the returns DataFrame.

In [None]:
for tick in tickers:
    returns[tick+' Return'] = bank_stocks[tick]['Close'].pct_change()
returns.head()
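
As a quick check that pct_change() matches the return formula, here is a sketch on invented prices. It also shows one possible loop-free variant; the names closes and all_returns are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

# Made-up closing prices: up 10%, then down 10%.
close = pd.Series([100.0, 110.0, 99.0])

via_method = close.pct_change()
via_formula = close / close.shift(1) - 1   # p_t / p_{t-1} - 1

print(np.allclose(via_method, via_formula, equal_nan=True))

# pct_change also works column-wise on a whole DataFrame of closes.
closes = pd.DataFrame({'AAA': [100.0, 110.0, 99.0], 'BBB': [50.0, 55.0, 44.0]})
all_returns = closes.pct_change().add_suffix(' Return')
print(all_returns)
```

The first row is always NaN because there is no previous price to compare against, which is why the pairplot below skips it.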

Create a pairplot using seaborn of the returns dataframe.

In [None]:
import seaborn as sns
sns.pairplot(returns[1:])

Using this returns DataFrame, figure out on which dates each bank stock had its best and worst single-day returns. You should notice that 4 of the banks share the same day for the worst drop.

In [None]:
returns.idxmin()

In [None]:
returns.idxmax()
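
idxmin() and idxmax() return the index label, here a date, of the smallest and largest value in each column. A minimal sketch with invented returns:

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(['2008-01-02', '2008-01-03', '2008-01-04', '2008-01-07'])
toy_returns = pd.DataFrame(
    {'AAA Return': [np.nan, 0.02, -0.05, 0.01],
     'BBB Return': [np.nan, -0.03, 0.04, 0.00]},
    index=dates,
)

worst_days = toy_returns.idxmin()   # date of the lowest return per column
best_days = toy_returns.idxmax()    # date of the highest return per column
print(worst_days)
print(best_days)
```

Both methods skip NaN values by default, so the empty first row of a returns table does not affect the result.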

Take a look at the standard deviation of the returns. Which stock would you classify as the riskiest over the entire time period? Which would you classify as the riskiest for the year 2015?

In [None]:
returns.std()

In [None]:
returns.loc['2015-01-01':'2015-12-31'].std()
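
Because the index holds dates, a whole year can also be selected with a partial date string. The sketch below uses randomly generated data, not the bank returns, to show that both spellings pick the same rows:

```python
import numpy as np
import pandas as pd

# Random business-day returns spanning a little more than the year 2015.
dates = pd.date_range('2014-12-29', '2016-01-05', freq='B')
rng = np.random.default_rng(0)
toy_returns = pd.DataFrame(
    {'AAA Return': rng.normal(0, 0.01, len(dates))}, index=dates
)

explicit = toy_returns.loc['2015-01-01':'2015-12-31']   # explicit date range
shorthand = toy_returns.loc['2015']                     # partial-string year

print(explicit.equals(shorthand))
print(shorthand.std())
```

Unlike positional slicing, a date-string slice with .loc includes both endpoints.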

Create a distplot using seaborn of the 2015 returns for Morgan Stanley

In [None]:
sns.distplot(returns.loc['2015-01-01':'2015-12-31']['MS Return'],color='green',bins=100)

Create a distplot using seaborn of the 2008 returns for CitiGroup

In [None]:
sns.distplot(returns.loc['2008-01-01':'2008-12-31']['C Return'],color='red',bins=100)

____
# More Visualization

We have many visualization libraries to choose from, such as seaborn, matplotlib, plotly and cufflinks, or just pandas.

### Imports

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline

# Optional Plotly Method Imports
import plotly
import cufflinks as cf
cf.go_offline()

Create a line plot showing the Close price for each bank over the entire time index. There are at least two ways to do this: a for loop or a .xs cross-section.

In [None]:
for tick in tickers:
    bank_stocks[tick]['Close'].plot(figsize=(12,4),label=tick)
plt.legend()

In [None]:
bank_stocks.xs(key='Close',axis=1,level='Stock Info').plot()

In [None]:
bank_stocks.xs(key='Close',axis=1,level='Stock Info').iplot()

# Moving Averages

Let's analyze the moving averages for these stocks in the year 2008. 

Plot the rolling 30-day average against the Close price for Bank of America's stock for the year 2008.

In [None]:
plt.figure(figsize=(12,6))
BAC['Close'].loc['2008-01-01':'2009-01-01'].rolling(window=30).mean().plot(label='30 Day Avg')
BAC['Close'].loc['2008-01-01':'2009-01-01'].plot(label='BAC CLOSE')
plt.legend()
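
One detail of the cell above: the data is sliced to 2008 first and averaged afterwards, so the first 29 points of the average are NaN and the line starts late. The sketch below uses a tiny made-up series and a 3-day window to compare that order with rolling first and slicing afterwards:

```python
import pandas as pd

# Six business days straddling the new year, with made-up prices 1..6.
dates = pd.date_range('2007-12-27', periods=6, freq='B')
close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=dates)

# Slice to 2008 first: the window has no history, so it starts with NaN.
slice_then_roll = close.loc['2008'].rolling(window=3).mean()

# Roll over the full series first: 2008 values can use late-2007 prices.
roll_then_slice = close.rolling(window=3).mean().loc['2008']

print(slice_then_roll)
print(roll_then_slice)
```

Either order is a reasonable choice. Rolling first gives an average for every day of the year, at the cost of using prices from outside the plotted range.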

Create a heatmap of the correlation between the stocks' Close prices.

In [None]:
sns.heatmap(bank_stocks.xs(key='Close',axis=1,level='Stock Info').corr(),annot=True)

Use seaborn's clustermap to cluster the correlations together:

In [None]:
sns.clustermap(bank_stocks.xs(key='Close',axis=1,level='Stock Info').corr(),annot=True)

In [None]:
close_corr = bank_stocks.xs(key='Close',axis=1,level='Stock Info').corr()
close_corr.iplot(kind='heatmap',colorscale='rdylbu')

# Part 2 

We use the cufflinks library to create some technical analysis plots.

Use .iplot(kind='candle') to create a candle plot of Bank of America's stock from Jan 1st 2015 to Jan 1st 2016.

In [None]:
BAC[['Open', 'High', 'Low', 'Close']].loc['2015-01-01':'2016-01-01'].iplot(kind='candle')

Use .ta_plot(study='sma') to create a Simple Moving Averages plot of Morgan Stanley for the year 2015.

In [None]:
MS['Close'].loc['2015-01-01':'2016-01-01'].ta_plot(study='sma',periods=[13,21,55],title='Simple Moving Averages')

Use .ta_plot(study='boll') to create a Bollinger Band Plot for Bank of America for the year 2015.

In [None]:
BAC['Close'].loc['2015-01-01':'2016-01-01'].ta_plot(study='boll')
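
For readers who want to see what a Bollinger Band plot contains, the bands can be sketched with plain pandas. The 20-day window and 2 standard deviations below are the conventional textbook choices and are assumptions here, not values confirmed from cufflinks. The price series is randomly generated.

```python
import numpy as np
import pandas as pd

# Random-walk stand-in for a closing price series (not real data).
rng = np.random.default_rng(0)
dates = pd.date_range('2015-01-01', periods=60, freq='B')
close = pd.Series(50 + rng.normal(0, 1, 60).cumsum(), index=dates)

window, num_std = 20, 2                  # assumed, conventional parameters
middle = close.rolling(window).mean()    # simple moving average
spread = close.rolling(window).std()     # rolling sample std (ddof=1)

bands = pd.DataFrame({
    'Close': close,
    'Middle': middle,
    'Upper': middle + num_std * spread,
    'Lower': middle - num_std * spread,
})
print(bands.tail())
```

Calling bands.plot() would draw the price together with its moving average and the upper and lower bands.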

# Data Analysis and Visualisation Are Done
There are still many other ways to approach both.

I would like to hear your views on this notebook. I am new to this, so if you have a good, different approach to the analysis or visualisation, please share!

If you find this notebook useful, please upvote!

