# New Hire Training - Python Fundamentals: Day 2 Lesson Codes - Visualizations
This Jupyter Notebook summarizes the code demonstrated in class by The Marquee Group during the J.P. Morgan New Hire Training.

## Session 1 – Visualization

This session provides an overview of popular visualization packages in Python, including Matplotlib, Seaborn, and Plotly. During the session participants will:
- Create visualizations with the Matplotlib and Seaborn packages
- Plot and interpret scatterplots and time series plots
- Adjust graph formatting such as titles, axis labels, and legends
- Learn to create and interpret more advanced graphs such as histograms, box plots and volatility surface charts
- Create interactive charts with Plotly Express that can be exported as shareable HTML files or hosted online

### <font color = 'blue'> **Section 1 - Importing Packages and Data** </font>

In [4]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

In [6]:
sp500 = pd.read_csv("ADAPT2021/StockData/SP500.csv", parse_dates=['Date'], index_col=['Date'])
aapl = pd.read_csv("ADAPT2021/StockData/AAPL.csv", parse_dates=['Date'], index_col=['Date'])

In [3]:
#%% Calculating Returns
# Daily return = (Close / Previous Close) - 1
sp500['Returns'] = sp500['Close'] / sp500['Close'].shift(1) - 1

# pct_change() computes the same daily return in one call
aapl['Returns'] = aapl['Close'].pct_change()
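The two return calculations above should give the same numbers. A minimal sketch that checks this on a small, made-up price series (the prices and dates are illustrative, not taken from the course data):

```python
import pandas as pd

# Hypothetical closing prices, used only for illustration
prices = pd.Series(
    [100.0, 102.0, 99.96, 104.958],
    index=pd.date_range("2021-01-04", periods=4, freq="B"),
    name="Close",
)

manual = prices / prices.shift(1) - 1   # (Close / Previous Close) - 1
builtin = prices.pct_change()           # pandas shortcut

# The first row is NaN in both columns because there is no previous close
print(pd.concat({"manual": manual, "pct_change": builtin}, axis=1))
```

Either form works; `pct_change()` is shorter, while the `shift(1)` version makes the formula visible.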

### <font color = 'blue'> **Section 2 - Simple Visualizations with Pandas** </font>

#### Line Graph from Pandas Column

In [4]:
#%% Visualizations
# pandas Series and DataFrames have a built-in .plot() method
sp500['Close'].plot()

#### Multiple Line Graphs on same Plot
- Successive .plot() calls in the same cell are drawn on the same graph

In [5]:
sp500['Close'].plot()
aapl['High'].plot()
plt.show()
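Successive `.plot()` calls share a chart because pandas draws on the current Matplotlib Axes. A minimal sketch that makes this explicit by passing `ax=`; the two series and their labels are made up for illustration:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data standing in for two price series
dates = pd.date_range("2021-01-04", periods=5, freq="B")
index_close = pd.Series([3700, 3727, 3748, 3804, 3825], index=dates)
stock_high = pd.Series([133, 131, 127, 131, 132], index=dates)

# .plot() returns the Axes it drew on; reuse it for the second series
ax = index_close.plot(label="Index close")
stock_high.plot(ax=ax, label="Stock high")
ax.legend()
plt.show()
```

Passing `ax=` is optional here, but it removes any doubt about which chart a series ends up on.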

### <font color = 'blue'> **Section 3 - Simple Visualizations with Matplotlib** </font>

- Using Matplotlib, you can customize the title, legend, axis labels, etc.
- plt.title(), plt.xlabel(), plt.ylabel(), plt.legend()
- https://matplotlib.org/3.1.0/gallery/showcase/anatomy.html

In [6]:
#%% Simple Visualization
# General pattern:
#   1. dataframe[column].plot()
#   2. change the graph settings (title, labels, ...)
#   3. plt.show()
sp500['Close'].plot()
plt.ylabel("Index Close Price")
plt.title("S&P 500 (Yahoo Finance)")
plt.xlabel("Closing Date")
plt.show()
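The same labels can also be set on the Axes object directly instead of through the `plt.` functions. A short sketch of that object-oriented form, using placeholder prices rather than the course data:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder prices, for illustration only
dates = pd.date_range("2021-01-04", periods=5, freq="B")
close = pd.Series([3700.0, 3727.0, 3748.0, 3804.0, 3825.0], index=dates)

fig, ax = plt.subplots()
ax.plot(close.index, close.values, label="Close")
ax.set_title("Index Close (illustrative data)")   # same role as plt.title()
ax.set_xlabel("Closing Date")                     # same role as plt.xlabel()
ax.set_ylabel("Index Close Price")                # same role as plt.ylabel()
ax.legend()                                       # same role as plt.legend()
plt.show()
```

This form tends to be easier to follow once a figure holds more than one chart, since each call names the Axes it changes.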

### <font color = 'blue'> **Section 4 - Annotated Line Chart with Matplotlib** </font>

In [7]:
#With Annotations
#max share price
maxPrice = sp500['Close'].max()
minPrice = sp500['Close'].min()
#dates when the max and min happened ---> use .idxmax() and .idxmin()
maxDate = sp500['Close'].idxmax()
minDate = sp500['Close'].idxmin()

In [8]:
sp500['Close'].plot()
# aapl['Close'].plot()
plt.ylabel("Index Close Price")
plt.title("S&P 500 (Yahoo Finance)")
plt.xlabel("Closing Date")

#Add some annotations
#Plot the max and min share price in this time period
    #need the x,y coordinates of the annotations
plt.plot(maxDate, maxPrice, color='green', marker='o')
plt.plot(minDate, minPrice, color='red', marker='D')   
    #matplotlib marker
    #https://matplotlib.org/3.1.0/api/markers_api.html
    
#plot the labels using .annotate(text, xy=(location of the marker), xytext=(location of the label))
        #pd.DateOffset(days=n) adds n days to a datetime value
plt.annotate("High",xy=(maxDate,maxPrice),xytext=(pd.to_datetime("2015-05-01"), 2800),
             arrowprops=dict(facecolor='black',shrink=0.05))
plt.annotate("Low",xy=(minDate,minPrice),
             xytext=(minDate + pd.DateOffset(days=50), minPrice + 10))

plt.show()
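The label for the low is placed 50 days to the right of the marker by adding a `pd.DateOffset` to the date. A small sketch of how that date arithmetic behaves; the dates are hypothetical:

```python
import pandas as pd

min_date = pd.Timestamp("2021-01-01")  # hypothetical date of the low

# Shift the label position 50 calendar days to the right of the marker
label_date = min_date + pd.DateOffset(days=50)
print(label_date.date())

# Calendar-aware units are also available, e.g. one month forward
month_later = pd.Timestamp("2021-01-31") + pd.DateOffset(months=1)
print(month_later.date())
```

Naming the unit (`days=`, `months=`) keeps the intent clear when the offset is read later.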

### <font color = 'blue'> **Section 5 - Multiple Line Charts** </font>

In [9]:
#%% Multiple line charts
sp500['Close'].plot()
# aapl['Close'].plot()
plt.plot(aapl.index, aapl['Close'], color='red')
plt.legend(['S&P 500','Apple Inc'])
plt.show()

In [10]:
#%% Multiple line charts on separate graphs
#create each individual graph and then plot them together
# plt.subplot(a, b, c); a = num of rows, b = num of cols, c = graph # (counted from 1, left to right, top to bottom)

jnj = pd.read_csv("ADAPT2021/StockData/JNJ.csv", parse_dates=['Date'], index_col=['Date'])
jpm = pd.read_csv("ADAPT2021/StockData/JPM.csv", parse_dates=['Date'], index_col=['Date'])

numChartRows = 2
numChartCols = 2

plt.figure(figsize=(10, 6))

plt.subplot(numChartRows,numChartCols,1)
plt.plot(sp500.index,sp500['Close'])
plt.title("S&P Close")

plt.subplot(numChartRows,numChartCols,2)
plt.plot(aapl.index,aapl['Close'])
plt.title("AAPL")

plt.subplot(numChartRows,numChartCols,3)
plt.plot(jnj.index,jnj['Close'])
plt.title("JNJ")

plt.subplot(numChartRows,numChartCols,4)
plt.plot(jpm.index,jpm['Close'])
plt.title("JPM")

# plt.ylim(bottom=0,top=aapl['Close'].max())

plt.tight_layout() #adjusts the spacing so titles and axis labels of neighbouring charts do not overlap
plt.show()
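When every panel follows the same pattern, the grid can also be built in one call and filled in a loop. A sketch under the assumption of four made-up tickers with random-walk prices (none of these names or values come from the course files):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dates = pd.date_range("2021-01-04", periods=20, freq="B")
rng = np.random.default_rng(0)

# Hypothetical tickers with random-walk prices
series = {
    name: pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(), index=dates)
    for name in ["AAA", "BBB", "CCC", "DDD"]
}

# plt.subplots(rows, cols) returns the figure and a 2x2 array of Axes
fig, axes = plt.subplots(2, 2, figsize=(10, 6))
for ax, (name, prices) in zip(axes.flat, series.items()):
    ax.plot(prices.index, prices.values)
    ax.set_title(name)

fig.tight_layout()
plt.show()
```

`axes.flat` walks the grid left to right, top to bottom, which matches the numbering used by `plt.subplot`.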

In [11]:
#%% Multiple line charts diff Y-Axis

# plt.subplots() returns two objects that are unpacked into two names,
# in the same way as a function that returns two values:
# def fnCubeSquare(x):
#     return x**3, x**2
# cube, square = fnCubeSquare(5)

fig, ax1 = plt.subplots()
    #fig is the container, ax1 is the chart inside
ax1.plot(sp500.index, sp500['Close'],'blue')
ax1.set_ylabel("S&P 500",color='blue')

ax2 = ax1.twinx() #twinx creates a second chart that shares the x-axis of ax1
    #both charts cover the same range of dates, each with its own y-axis
ax2.plot(aapl.index, aapl['Close'],'black', alpha=0.6)
ax2.set_ylabel("AAPL",color='black')
# fig.show()
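With two y-axes, each Axes tracks its own lines, so a legend created on one of them lists only that line. One possible way to get a single legend is to collect the line handles from both; the sketch below uses placeholder prices and labels:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Placeholder prices on very different scales
dates = pd.date_range("2021-01-04", periods=5, freq="B")
index_close = pd.Series([3700.0, 3727.0, 3748.0, 3804.0, 3825.0], index=dates)
stock_close = pd.Series([129.4, 131.0, 126.6, 130.9, 132.1], index=dates)

fig, ax1 = plt.subplots()
(line1,) = ax1.plot(index_close.index, index_close.values, color="blue", label="Index")
ax1.set_ylabel("Index", color="blue")

ax2 = ax1.twinx()  # second Axes sharing the x-axis, y-axis drawn on the right
(line2,) = ax2.plot(stock_close.index, stock_close.values,
                    color="black", alpha=0.6, label="Stock")
ax2.set_ylabel("Stock", color="black")

# Pass the handles from both Axes to one legend call
ax1.legend(handles=[line1, line2], loc="upper left")
plt.show()
```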

In [12]:
#%% Multiple line charts with different Y-axes using pandas
aapl['Close'].plot()
sp500['Close'].plot(secondary_y=True)
plt.show()

### <font color = 'blue'> **Section 6 - Histograms** </font>

In [13]:
sp500['Returns'].hist(figsize=(7,7))
plt.show()

In [14]:
#Providing the # of bins
aapl['Returns'].hist(bins=100)
plt.show()

#### Histogram with Density Line

In [15]:
#Histogram with density line
from scipy.stats import norm
import numpy as np
returns = aapl['Returns'].dropna().values

#Find the mean and standard deviation of the AAPL returns and plot a normal distribution overlay
avgReturn = np.mean(returns)
stdReturn = np.std(returns)

normReturns = np.linspace(avgReturn - 3*stdReturn, avgReturn + 3*stdReturn, 100)
#density=True puts the bars on the same scale as the probability density curve
plt.hist(returns, bins=100, density=True)
plt.plot(normReturns, norm.pdf(normReturns, avgReturn, stdReturn))
#place the label relative to the axes so it stays visible whatever the data range
plt.annotate(r'$\mu$ = %.3f $\sigma$ = %.3f' % (avgReturn, stdReturn),
             xy=(0.65, 0.9), xycoords='axes fraction')
plt.title('Apple Returns')
plt.show()
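A normal density curve and histogram bars are comparable only when both are on the same scale. The sketch below, using simulated returns with an assumed mean and volatility, shows that `density=True` rescales the bars so their total area is 1, which is the scale a probability density uses:

```python
import numpy as np

rng = np.random.default_rng(42)
# Simulated daily returns (hypothetical mean and volatility)
sample = rng.normal(loc=0.001, scale=0.02, size=5000)

mu, sigma = sample.mean(), sample.std()

# density=True rescales bar heights so that the total bar area is 1
heights, edges = np.histogram(sample, bins=50, density=True)
area = (heights * np.diff(edges)).sum()

# Normal PDF at the bin centres, on the same scale as the normalised bars
centres = (edges[:-1] + edges[1:]) / 2
pdf = np.exp(-0.5 * ((centres - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

print(f"bar area = {area:.3f}, peak bar = {heights.max():.1f}, peak pdf = {pdf.max():.1f}")
```

With raw counts instead of densities, the bars would be far taller than the curve and the overlay would look flat.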

### <font color = 'blue'> **Section 7 - Seaborn Package** </font>
- http://seaborn.pydata.org/

#### Box & Whiskers Plot (Boxplot)

In [1]:
#Seaborn box whiskers
import seaborn as sns
sp500['Month'] = sp500.index.month
sns.boxplot(x="Month",y="Returns",data=sp500)
plt.show()
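In a box plot, the box for each month spans the first to third quartile, with a line at the median. The same numbers can be tabulated with a pandas `groupby`; this sketch uses made-up values chosen so the quartiles are easy to check by hand:

```python
import pandas as pd

dates = pd.to_datetime(
    ["2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08",
     "2021-02-01", "2021-02-02", "2021-02-03"]
)
# Hypothetical values, not real returns
returns = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0], index=dates)

# Group by calendar month, then compute the quartiles within each month
quartiles = (
    returns.groupby(returns.index.month)
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
quartiles.index.name = "Month"
print(quartiles)
```

A table like this is a useful cross-check when a box looks unusually wide or narrow.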

In [17]:
sns.lineplot(x=sp500.index, y="Close", data=sp500)
plt.show()

### <font color = 'blue'> **Section 8 - Styles** </font>

In [2]:
import matplotlib.style as style
style.available

In [7]:
#Code to test out different styles
for x in style.available:
    style.use(x)
    print(x)
    plt.plot(aapl.index, aapl['Close'])
    plt.show()
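`style.use` changes the global settings, so after the loop above the styles that were applied remain in effect for later charts. One way to try a style without affecting the rest of the notebook is a `with plt.style.context(...)` block; the data here is a placeholder:

```python
import matplotlib.pyplot as plt

before = plt.rcParams["axes.facecolor"]

# The style applies only inside the with-block
with plt.style.context("ggplot"):
    inside = plt.rcParams["axes.facecolor"]
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [2, 4, 3])  # placeholder data
    plt.show()

after = plt.rcParams["axes.facecolor"]
print(before, inside, after)
```

Calling `style.use('default')` is another way to return to the standard look after experimenting.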

### <font color = 'blue'> **Section 9 - Plotly Package** </font>
- allows for interactive charts
- https://plotly.com/python/plotly-express/

In [None]:
fig = px.line(sp500, y="Close", title='S&P 500')
fig.show()

In [None]:
sp500['Ticker'] = 'SP500'
aapl['Ticker'] = 'AAPL'

intc = pd.read_csv("ADAPT2021/StockData/INTC.csv", parse_dates=['Date'], index_col=['Date'])
intc['Returns'] = intc['Close'].pct_change()
intc['Ticker'] = 'INTC'

ibm = pd.read_csv("ADAPT2021/StockData/IBM.csv", parse_dates=['Date'], index_col=['Date'])
ibm['Returns'] = ibm['Close'].pct_change()
ibm['Ticker'] = 'IBM'

stockData = pd.concat([intc,aapl,ibm])
stockData
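Stacking the tables gives a "long" DataFrame in which the `Ticker` column tells the rows apart, which is the shape the `color="Ticker"` argument relies on. In the cell above the returns are computed before stacking; if they were computed afterwards, they would need to be grouped by ticker so that one stock's first row is not compared with another stock's last row. A sketch with two invented tickers:

```python
import pandas as pd

dates = pd.date_range("2021-01-04", periods=3, freq="B")

# Two hypothetical stocks with made-up prices
aaa = pd.DataFrame({"Close": [10.0, 11.0, 12.1]}, index=dates)
bbb = pd.DataFrame({"Close": [50.0, 45.0, 49.5]}, index=dates)
aaa["Ticker"] = "AAA"
bbb["Ticker"] = "BBB"

# Stack the rows; dates repeat once per ticker
long_df = pd.concat([aaa, bbb])

# Returns computed within each ticker, so the first row of each is NaN
long_df["Returns"] = long_df.groupby("Ticker")["Close"].pct_change()
print(long_df)
```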

In [None]:
px.line(stockData, x=stockData.index, y="Close", color="Ticker")

In [None]:
px.line(stockData, x=stockData.index, y="Returns", color="Ticker")

#### Example of Stacked Bar Chart

In [None]:
finDeals = pd.read_excel("ADAPT2021/ExData/Data Manipulation Worksheet.xlsx",sheet_name='Financing Table Clean')
figBar = px.bar(finDeals,
             x='INDUSTRY',
             y='SIZE',
             color='INDUSTRY', hover_name="ISSUER",
            title="Total Deal Value by Industry")
figBar.show()

#code to save the graph as an html file:
#figBar.write_html(file="ADAPT2021/Output/plotlyExample.html",auto_open=True)

### <font color = 'blue'> **Section 10 - Volatility Surface Plot** </font>

In [11]:
#%% SPX
df = pd.read_csv("ADAPT2021/Week2Data/spx.csv")
df.set_index(['Strike'],inplace=True)
#df.info()
df.head(20)

In [12]:
Z=df.values * 100
Y=df.index.values
X=df.columns.astype(str).values
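A surface chart expects `Z` as a 2-D grid with one row per `Y` value and one column per `X` value. The sketch below checks that layout on a small invented table; it assumes, as the axis titles in this section suggest, that rows are strikes, columns are maturities, and the values are volatilities stored as decimals:

```python
import pandas as pd

# Hypothetical implied vols (decimals): rows are strikes, columns are maturities
df = pd.DataFrame(
    {
        "1M": [0.25, 0.20, 0.22],
        "3M": [0.24, 0.21, 0.22],
        "6M": [0.23, 0.21, 0.22],
        "1Y": [0.23, 0.22, 0.22],
    },
    index=pd.Index([3600, 3800, 4000], name="Strike"),
)

Z = df.values * 100                 # percent, shape (n_strikes, n_maturities)
Y = df.index.values                 # one strike per row of Z
X = df.columns.astype(str).values   # one maturity label per column of Z

print(Z.shape, len(Y), len(X))
```

If the shapes do not line up this way, transposing the table (`df.T`) before building `Z` is the usual fix.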

In [13]:
import plotly.graph_objs as go

fig = go.Figure(
    data=[ go.Surface(z=Z,y=Y,x=X)],
    
    layout=dict(
        title = 'Vol Surface',
        autosize = True,
        width = 900,
        height = 700,
        margin = dict(l = 65, r = 50, b = 65, t = 90),
        scene = dict(
            aspectratio = dict(x = 1, y = 1, z = .6),
            xaxis = dict(title='Maturity'),
            yaxis = dict(title='Strike'),
            zaxis = dict(title='Volatility')
            )
        )
    )

fig.show()