# **Finance Data Visualization Project**

---



This project is an exploratory analysis of stock price data, designed to practice and demonstrate skills in data visualization and the pandas library. Please note that this is for educational purposes and should not be considered financial advice.

---



We'll focus on four bank stocks and follow their prices from the start of 2005, through the 2008 financial crisis, to the end of 2024.

---



In [39]:
import pandas as pd
import numpy as np
import yfinance
from datetime import datetime

import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.io as pio
pio.renderers.default = "colab"

import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

We fetch the data with yfinance, pulling the daily price history of the following banks:

*   Bank of America
*   Morgan Stanley
*   Goldman Sachs
*   JPMorgan Chase & Co.
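The four fetch cells that follow repeat the same two calls per ticker. As a minimal sketch, the same step could be written once and looped over the symbols. The helper name `fetch_histories` and its `load` parameter are hypothetical (not part of yfinance); `load` exists only so the function can be tried without a network connection. By default it falls back to the `yfinance.Ticker(...).history(start=..., end=...)` call used in this notebook.

```python
from datetime import datetime

import pandas as pd


def fetch_histories(symbols, start, end, load=None):
    """Return a dict mapping each symbol to its price-history DataFrame.

    `load` is a hypothetical hook: any callable taking (symbol, start, end)
    and returning a DataFrame. When omitted, yfinance is used.
    """
    if load is None:
        import yfinance  # imported lazily so the helper works offline with a custom loader

        def load(symbol, start, end):
            return yfinance.Ticker(symbol).history(start=start, end=end)

    return {symbol: load(symbol, start, end) for symbol in symbols}


# Example call (needs network access, so it is left commented out):
# histories = fetch_histories(["BAC", "MS", "GS", "JPM"],
#                             datetime(2005, 1, 1), datetime(2025, 1, 1))
```

A dict keyed by ticker also fits `pd.concat(histories, axis=1)`, which uses the dict keys as the outer column level.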

In [4]:
start = datetime(2005,1,1)
end = datetime(2025,1,1)

In [5]:
BAC = yfinance.Ticker("BAC")
BAC_data = BAC.history(start=start, end=end)
BAC_data.reset_index(inplace = True)

In [6]:
MS = yfinance.Ticker("MS")
MS_data = MS.history(start=start,end=end)
MS_data.reset_index(inplace=True)

In [7]:
GS = yfinance.Ticker("GS")
GS_data = GS.history(start=start,end=end)
GS_data.reset_index(inplace=True)

In [8]:
JPM = yfinance.Ticker("JPM")
JPM_data = JPM.history(start=start,end=end)
JPM_data.reset_index(inplace=True)

In [26]:
tickers = ["BAC","MS","GS","JPM"]

In [10]:
Bank_Stocks = pd.concat([BAC_data,MS_data,GS_data,JPM_data],axis = 1, keys = tickers)

In [11]:
Bank_Stocks.columns.names = ['Bank Ticker Symbol','Stock Info']

In [12]:
Bank_Stocks.head()

Bank Ticker Symbol,BAC,BAC,BAC,BAC,BAC,BAC,BAC,BAC,MS,MS,...,GS,GS,JPM,JPM,JPM,JPM,JPM,JPM,JPM,JPM
Stock Info,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits,Date,Open,...,Dividends,Stock Splits,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2005-01-03 00:00:00-05:00,29.29477,29.476843,28.99341,29.169203,10238100,0.0,0.0,2005-01-03 00:00:00-05:00,29.839254,...,0.0,0.0,2005-01-03 00:00:00-05:00,22.667167,22.787737,22.397319,22.477701,14957900,0.0,0.0
1,2005-01-04 00:00:00-05:00,29.137829,29.338736,28.717181,28.767406,10264100,0.0,0.0,2005-01-04 00:00:00-05:00,29.855239,...,0.0,0.0,2005-01-04 00:00:00-05:00,22.57615,22.622484,22.193896,22.246021,11360900,0.34,0.0
2,2005-01-05 00:00:00-05:00,28.654386,28.842736,28.409531,28.434645,14796100,0.0,0.0,2005-01-05 00:00:00-05:00,29.348776,...,0.0,0.0,2005-01-05 00:00:00-05:00,22.356073,22.541408,22.234447,22.292366,9770200,0.0,0.0
3,2005-01-06 00:00:00-05:00,28.516255,28.648102,28.246288,28.39069,14602200,0.0,0.0,2005-01-06 00:00:00-05:00,29.828589,...,0.0,0.0,2005-01-06 00:00:00-05:00,22.454532,22.576158,22.379239,22.419781,9115900,0.0,0.0
4,2005-01-07 00:00:00-05:00,28.440925,28.478596,28.076782,28.083059,10547200,0.0,0.0,2005-01-07 00:00:00-05:00,30.095142,...,0.0,0.0,2005-01-07 00:00:00-05:00,22.408191,22.512442,22.23444,22.240232,9971200,0.0,0.0
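One thing worth noting about the table above: because each frame had `reset_index` applied before `pd.concat`, rows are matched by position, and every ticker carries its own `Date` column. That works when all tickers share the same trading days. The sketch below uses made-up prices for two hypothetical tickers (`AAA`, `BBB`) to show the alternative of keeping the dates as the index, so that `concat` aligns rows by date instead.

```python
import pandas as pd

# Made-up prices; "BBB" is missing 2005-01-04 to show the alignment behavior.
aaa = pd.DataFrame(
    {"Close": [10.0, 11.0, 12.0]},
    index=pd.DatetimeIndex(["2005-01-03", "2005-01-04", "2005-01-05"], name="Date"),
)
bbb = pd.DataFrame(
    {"Close": [20.0, 22.0]},
    index=pd.DatetimeIndex(["2005-01-03", "2005-01-05"], name="Date"),
)

# Dates kept as the index: concat aligns rows by date and fills gaps with NaN.
by_date = pd.concat([aaa, bbb], axis=1, keys=["AAA", "BBB"])
by_date.columns.names = ["Bank Ticker Symbol", "Stock Info"]

# Dates moved into columns: rows are matched by position, so the second row
# pairs 2005-01-04 (AAA) with 2005-01-05 (BBB).
by_position = pd.concat(
    [aaa.reset_index(), bbb.reset_index()], axis=1, keys=["AAA", "BBB"]
)

closes = by_date.xs("Close", level="Stock Info", axis=1)
print(closes)
```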


# **EDA - Exploratory Data Analysis**
---



Let's explore the data!



**What are the maximum Open, Close, High, and Low prices for each bank?**

In [74]:
Bank_Stocks.xs('Open',level = 'Stock Info', axis = 1).max()

In [77]:
Bank_Stocks.xs('Close',level = 'Stock Info', axis = 1).max()

In [75]:
Bank_Stocks.xs('High',level = 'Stock Info', axis = 1).max()

In [76]:
Bank_Stocks.xs('Low',level = 'Stock Info', axis = 1).max()
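The four cells above each return one Series of maxima. As a sketch, the same numbers could be collected into a single table by taking the column-wise maximum and pivoting the `Stock Info` level into columns. The frame below is made-up data for two hypothetical tickers with the same two-level column layout as `Bank_Stocks`; the names `prices`, `price_fields`, and `max_table` are illustrative only.

```python
import pandas as pd

# Made-up frame with the same (ticker, field) column layout as Bank_Stocks.
columns = pd.MultiIndex.from_product(
    [["AAA", "BBB"], ["Open", "High", "Low", "Close", "Volume"]],
    names=["Bank Ticker Symbol", "Stock Info"],
)
prices = pd.DataFrame(
    [
        [10.0, 12.0, 9.0, 11.0, 100.0, 20.0, 21.0, 19.0, 20.5, 200.0],
        [11.0, 13.0, 10.0, 12.5, 150.0, 20.5, 22.0, 20.0, 21.0, 180.0],
    ],
    columns=columns,
)

# Keep only the price fields, so non-price columns do not end up in the table.
price_fields = ["Open", "High", "Low", "Close"]
is_price = prices.columns.get_level_values("Stock Info").isin(price_fields)

# max() gives a Series indexed by (ticker, field); unstack turns the field
# level into columns, leaving one row per ticker.
max_table = prices.loc[:, is_price].max().unstack("Stock Info")[price_fields]
print(max_table)
```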

Now we can calculate each bank's daily return as the percentage change in its closing price from one trading day to the next.

In [28]:
returns = pd.DataFrame()
for tick in tickers:
    # Daily return: percentage change of the close versus the previous row
    returns[tick + ' Return'] = Bank_Stocks[tick]['Close'].pct_change()
display(returns)


,BAC Return,MS Return,GS Return,JPM Return
0,,,,
1,-0.013775,-0.010734,-0.006480,-0.010307
2,-0.011567,-0.005786,-0.004507,0.002083
3,-0.001546,0.023645,0.013776,0.005716
4,-0.010836,-0.003909,-0.004276,-0.008008
...,...,...,...,...
5028,0.011164,0.020972,0.021041,0.016444
5029,0.003831,0.007634,-0.002677,0.003425
5030,-0.004714,-0.009920,-0.008688,-0.008102
5031,-0.009698,-0.007968,-0.004565,-0.007671
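The first row is `NaN` because there is no previous close to compare against. For reference, `pct_change` computes $r_t = p_t / p_{t-1} - 1$ for each row. A minimal check on three made-up closing prices (not real data) shows the shortcut and the manual formula agreeing:

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0], name="Close")  # made-up closing prices

returns = prices.pct_change()          # pandas shortcut
manual = prices / prices.shift(1) - 1  # r_t = p_t / p_{t-1} - 1

print(pd.DataFrame({"pct_change": returns, "manual": manual}))
```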


Let's plot this DataFrame.

In [30]:
# Drop the first row, which is NaN because it has no previous close
returns_for_plotting = returns[1:].copy()

In [32]:
returns_for_plotting["Highest Stock Returns"] = returns_for_plotting.idxmax(axis=1)

In [72]:
fig_scatter = px.scatter_matrix(
    returns_for_plotting,
    dimensions=['BAC Return', 'MS Return', 'GS Return', 'JPM Return'],
    color="Highest Stock Returns",
    title="Daily Stock Returns"
)

fig_scatter.show()

# **More Visualizations!**

---



In [90]:
close_prices_all_stocks = Bank_Stocks.xs('Close', level='Stock Info', axis=1)
correlation_matrix = close_prices_all_stocks.corr()
fig_heatmap = px.imshow(
    correlation_matrix,
    text_auto=True,
    title='Correlation Heatmap of Bank Stock Close Prices',
    color_continuous_scale='electric',  # sequential 'electric' colorscale
    range_color=[-1, 1]  # pin the color range to the full correlation interval
)
fig_heatmap.show()

In [87]:
fig_line = px.line(close_prices_all_stocks, title='Interactive Close Prices')
fig_line.show()