# Financial data analysis

Financial data analysis examines financial information, such as stock prices, market trends, and company performance, to derive insights that support decision-making. Typical measures include returns, volatility, and risk metrics. In this article, I’ll walk you through a financial data analysis with Python so that you can analyze financial data yourself and base decisions on the results.

## Analysis goal:

This analysis aims to explore financial data from NIFTY50 stocks to uncover insights that can guide investment strategies and risk management decisions. The dataset consists of 24 days of historical closing prices for 50 stocks, with the Date column representing trading days.

## Analysis scope:

The scope of the analysis includes calculating descriptive statistics to summarize stock behaviour, constructing and evaluating a portfolio for returns and risk, assessing volatility and Value at Risk (VaR), identifying trends through technical indicators like moving averages and Bollinger Bands, and forecasting future stock prices using Monte Carlo simulations.
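To make a few of these terms concrete before touching the real data, here is a minimal sketch on a short, made-up price series. The numbers, the 3-day window, and the 95% confidence level are assumptions chosen for illustration, not values taken from the dataset or from the analysis itself.

```python
import pandas as pd

# Hypothetical closing prices for a single stock (not from the dataset).
prices = pd.Series([100.0, 102.0, 101.0, 103.0, 104.0, 102.0, 105.0])

# Daily simple returns: P_t / P_(t-1) - 1. The first value is NaN, so drop it.
daily_returns = prices.pct_change().dropna()

# Volatility, measured here as the sample standard deviation of daily returns.
volatility = daily_returns.std()

# Historical 95% Value at Risk: the 5th percentile of observed daily returns.
var_95 = daily_returns.quantile(0.05)

# 3-day simple moving average of the price (NaN until the window is full).
sma_3 = prices.rolling(window=3).mean()

print(f"volatility: {volatility:.4f}, 95% VaR: {var_95:.4f}")
```

With only 24 trading days in the dataset, any percentile-based risk figure rests on very few observations, so it is best read as a rough indication.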

In [1]:
# importing required library

import pandas as pd

In [3]:
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

In [4]:
# importing the dataset
df = pd.read_csv("data/nifty50_closing_prices.csv")
df.head()

Unnamed: 0,Date,RELIANCE.NS,HDFCBANK.NS,ICICIBANK.NS,INFY.NS,TCS.NS,KOTAKBANK.NS,HINDUNILVR.NS,ITC.NS,LT.NS,SBIN.NS,BAJFINANCE.NS,BHARTIARTL.NS,HCLTECH.NS,ASIANPAINT.NS,AXISBANK.NS,DMART.NS,MARUTI.NS,ULTRACEMCO.NS,HDFC.NS,TITAN.NS,SUNPHARMA.NS,M&M.NS,NESTLEIND.NS,WIPRO.NS,ADANIGREEN.NS,TATASTEEL.NS,JSWSTEEL.NS,POWERGRID.NS,ONGC.NS,NTPC.NS,COALINDIA.NS,BPCL.NS,IOC.NS,TECHM.NS,INDUSINDBK.NS,DIVISLAB.NS,GRASIM.NS,CIPLA.NS,BAJAJFINSV.NS,TATAMOTORS.NS,HEROMOTOCO.NS,DRREDDY.NS,SHREECEM.NS,BRITANNIA.NS,UPL.NS,EICHERMOT.NS,SBILIFE.NS,ADANIPORTS.NS,BAJAJ-AUTO.NS,HINDALCO.NS
0,2024-08-20 00:00:00+05:30,2991.899902,1637.699951,1179.449951,1872.199951,4523.299805,1805.650024,2751.050049,498.799988,3572.699951,820.299988,6722.200195,1449.150024,1686.75,3103.199951,1168.0,5079.200195,12214.950195,11349.700195,,3474.899902,1766.349976,2771.300049,2518.5,524.650024,1924.599976,153.929993,917.150024,340.5,327.555695,406.25,524.599976,349.399994,172.229996,1628.599976,1381.300049,4723.149902,2636.699951,1562.849976,1602.099976,1086.900024,5244.399902,6965.350098,24730.550781,5765.799805,566.150024,4883.25,1761.300049,1492.550049,9779.700195,672.900024
1,2024-08-21 00:00:00+05:30,2997.350098,1625.800049,1174.849976,1872.699951,4551.5,1812.949951,2791.199951,505.399994,3596.050049,815.549988,6735.350098,1463.449951,1677.25,3151.550049,1174.400024,5099.450195,12220.950195,11200.900391,,3560.399902,1764.650024,2769.399902,2551.75,526.349976,1920.849976,151.919998,925.799988,336.649994,325.174194,408.950012,532.200012,351.200012,173.889999,1604.650024,1384.0,4900.799805,2684.850098,1594.599976,1620.949951,1085.199951,5284.700195,7062.450195,24808.050781,5837.350098,568.299988,4913.549805,1800.599976,1503.5,9852.0,685.599976
2,2024-08-22 00:00:00+05:30,2996.25,1631.300049,1191.099976,1880.25,4502.0,1821.5,2792.800049,504.549988,3606.5,820.299988,6743.600098,1486.349976,1676.150024,3186.600098,1169.949951,5057.850098,12276.349609,11309.400391,,3604.399902,1750.650024,2732.949951,2551.0,519.0,1886.349976,154.139999,933.25,334.0,321.850006,403.350006,528.849976,350.100006,173.789993,1611.25,1381.900024,4911.450195,2755.149902,1585.800049,1625.699951,1068.449951,5329.950195,6969.049805,25012.400391,5836.799805,579.150024,4933.549805,1795.25,1492.300049,9914.200195,685.549988
3,2024-08-23 00:00:00+05:30,2999.949951,1625.050049,1203.5,1862.099976,4463.899902,1818.0,2815.600098,505.799988,3598.550049,815.349976,6735.850098,1506.75,1661.449951,3154.649902,1165.949951,4901.5,12302.299805,11341.799805,,3570.0,1775.75,2759.0,2529.199951,512.400024,1900.900024,154.199997,941.049988,336.25,318.899994,401.950012,538.849976,352.200012,173.130005,1598.400024,1388.550049,4855.950195,2748.550049,1574.550049,1639.900024,1085.150024,5384.899902,6954.5,24706.050781,5792.649902,573.700012,4898.100098,1789.300049,1491.300049,10406.450195,685.099976
4,2024-08-26 00:00:00+05:30,3025.199951,1639.949951,1213.300049,1876.150024,4502.450195,1812.5,2821.149902,505.700012,3641.899902,815.049988,6778.350098,1513.550049,1719.449951,3171.350098,1170.300049,4959.399902,12243.799805,11337.099609,,3630.199951,1772.449951,2793.100098,2519.550049,520.0,1888.900024,155.699997,963.5,338.25,327.850006,414.850006,538.099976,351.149994,173.460007,1640.150024,1384.5,4926.25,2736.600098,1593.949951,1686.199951,1092.400024,5343.75,6943.299805,24906.449219,5796.950195,577.450012,4875.200195,1796.25,1482.550049,10432.549805,711.849976
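The Date values are strings that carry a +05:30 offset. For time-series work it usually helps to parse them and use them as the index. The snippet below is a sketch of that idea on a small stand-in frame built from the first rows shown above. The names `sample` and `prices` are illustrative and not part of the original notebook.

```python
import pandas as pd

# Stand-in for df: two tickers and three of the rows shown above.
sample = pd.DataFrame({
    "Date": [
        "2024-08-22 00:00:00+05:30",
        "2024-08-20 00:00:00+05:30",
        "2024-08-21 00:00:00+05:30",
    ],
    "RELIANCE.NS": [2996.25, 2991.899902, 2997.350098],
    "TCS.NS": [4502.0, 4523.299805, 4551.5],
})

# Parse the strings into timezone-aware timestamps.
sample["Date"] = pd.to_datetime(sample["Date"])

# Use the dates as the index and make sure rows are in chronological order.
prices = sample.set_index("Date").sort_index()

print(prices.index.min(), prices.index.max())
```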


## Data preprocessing 

In [5]:
# checking missing values
missing_values = df.isnull().sum()
missing_values

Date              0
RELIANCE.NS       0
HDFCBANK.NS       0
ICICIBANK.NS      0
INFY.NS           0
TCS.NS            0
KOTAKBANK.NS      0
HINDUNILVR.NS     0
ITC.NS            0
LT.NS             0
SBIN.NS           0
BAJFINANCE.NS     0
BHARTIARTL.NS     0
HCLTECH.NS        0
ASIANPAINT.NS     0
AXISBANK.NS       0
DMART.NS          0
MARUTI.NS         0
ULTRACEMCO.NS     0
HDFC.NS          24
TITAN.NS          0
SUNPHARMA.NS      0
M&M.NS            0
NESTLEIND.NS      0
WIPRO.NS          0
ADANIGREEN.NS     0
TATASTEEL.NS      0
JSWSTEEL.NS       0
POWERGRID.NS      0
ONGC.NS           0
NTPC.NS           0
COALINDIA.NS      0
BPCL.NS           0
IOC.NS            0
TECHM.NS          0
INDUSINDBK.NS     0
DIVISLAB.NS       0
GRASIM.NS         0
CIPLA.NS          0
BAJAJFINSV.NS     0
TATAMOTORS.NS     0
HEROMOTOCO.NS     0
DRREDDY.NS        0
SHREECEM.NS       0
BRITANNIA.NS      0
UPL.NS            0
EICHERMOT.NS      0
SBILIFE.NS        0
ADANIPORTS.NS     0
BAJAJ-AUTO.NS     0
HINDALCO.NS       0
dtype: int64


In [6]:
# check that every value in the Date column can be parsed as a datetime
date_format_check = pd.to_datetime(df['Date'], errors='coerce').notna().all()
date_format_check

True
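To see why this check works, note that `errors='coerce'` turns any value that cannot be parsed into `NaT` instead of raising an error, so `notna().all()` is `True` only when every entry is a valid date. A small sketch with a deliberately broken value (the `dates` series is made up for illustration):

```python
import pandas as pd

# Made-up date strings, one of which is not a valid date.
dates = pd.Series(["2024-08-20", "not a date", "2024-08-22"])

# Unparseable entries become NaT rather than raising an exception.
parsed = pd.to_datetime(dates, errors="coerce")

# True only if every entry was parsed successfully.
all_valid = parsed.notna().all()

# The original strings that failed to parse, useful for debugging.
bad_rows = dates[parsed.isna()]

print(all_valid)
print(bad_rows)
```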

In [7]:
# check if the data has enough rows for time-series analysis (at least 20 rows)
sufficient_rows = df.shape[0] >= 20
sufficient_rows

True

In [9]:
# preparing a summary of the checks 
data_preparation_status = {
    "Missing Values in columns": missing_values[missing_values > 0].to_dict(),
    "Date Column format valid": date_format_check,
    "Sufficient rows for time-series analysis": sufficient_rows
}

data_preparation_status


{'Missing Values in columns': {'HDFC.NS': 24},
 'Date Column format valid': True,
 'Sufficient rows for time-series analysis': True}
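The summary shows that HDFC.NS is missing in all 24 rows, so that column carries no information. One possible way to handle it is to drop every column that is entirely empty. The sketch below uses a small stand-in frame instead of the real file, and the names `sample` and `cleaned` are illustrative only.

```python
import numpy as np
import pandas as pd

# Stand-in for df: three tickers, one of them entirely empty.
sample = pd.DataFrame({
    "Date": ["2024-08-20", "2024-08-21", "2024-08-22"],
    "RELIANCE.NS": [2991.90, 2997.35, 2996.25],
    "HDFC.NS": [np.nan, np.nan, np.nan],
    "TCS.NS": [4523.30, 4551.50, 4502.00],
})

# Drop only the columns in which every value is missing.
cleaned = sample.dropna(axis=1, how="all")

# Record which columns were removed so the step stays auditable.
dropped_columns = sorted(set(sample.columns) - set(cleaned.columns))

print(dropped_columns)
```

Using `how="all"` leaves partially filled columns untouched, so a stock with only a few gaps would not be discarded by this step.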