# Normality and empirical distribution of USD-INR

## Checklist
- [x] Fetch data
- [x] Check normality
    - [x] QQ-plot (visual)
    - [x] Shapiro-Wilk test
- [x] Empirical distribution for the data, if not normal

## Data fetching

In [None]:
# Load libraries
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import statsmodels.api as sm
from statsmodels.distributions.empirical_distribution import ECDF


plt.rcParams["figure.figsize"] = (16, 9)  # Standard size

In [None]:
# Fetch the data from Yahoo Finance
ticker = "INR=X"  # Yahoo Finance symbol for the USD/INR exchange rate
start_date = "2015-05-27"
end_date = "2025-05-28"  # 10 years of data + 1 day

data = yf.download(ticker, start=start_date, end=end_date)
data = data[['Close']].copy()  # Copy so that adding columns later does not trigger SettingWithCopyWarning

print(data.head())

## Normality check

In [None]:
# Calculate daily log returns: r_t = ln(P_t / P_{t-1})
data['Log_Return'] = np.log(data['Close'] / data['Close'].shift(1))
returns = data['Log_Return'].dropna()  # Drop the first row (NaN produced by the shift)
#print(data.head())
print(returns.head())
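
The log-return calculation above can be checked on a tiny example. The sketch below uses a made-up price series (the values are hypothetical, not real USD-INR quotes) and shows two equivalent ways of computing log returns, along with the property that daily log returns sum to the log return over the whole period.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only (not real USD-INR quotes)
prices = pd.Series([80.0, 80.8, 80.4, 81.2], name="Close")

# Log return: r_t = ln(P_t / P_{t-1}); the first value is NaN and is dropped
log_returns = np.log(prices / prices.shift(1)).dropna()

# Equivalent formulation: first difference of the log prices
log_returns_alt = np.log(prices).diff().dropna()

# Additivity: the daily log returns sum to the log return over the full period
total = np.log(prices.iloc[-1] / prices.iloc[0])

print(log_returns)
print("Both formulations agree:", np.allclose(log_returns, log_returns_alt))
print("Sum equals total log return:", np.isclose(log_returns.sum(), total))
```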

In [None]:
# EDA

# Histogram
sns.histplot(returns, kde=True, bins=50)
plt.title('Histogram of USD-INR Daily Log Returns')
plt.xlabel('Log Return')
plt.ylabel('Frequency')
plt.grid(True)
plt.show()

At first glance, the histogram looks roughly bell-shaped, so the data could plausibly follow a normal distribution.
A histogram is a coarse check, however, so we investigate further with a Q-Q plot and the Shapiro-Wilk test.

In [None]:
# QQ-plot

sm.qqplot(returns, line='s', fit=True)  # fit=True standardizes the sample before plotting
plt.title('Q-Q Plot of USD-INR Daily Log Returns')
plt.xlabel('Theoretical Quantiles')
plt.ylabel('Sample Quantiles')
plt.grid(True)
plt.show()

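The Q-Q plot shows that the data is not perfectly normal: many points deviate from the reference line, which indicates more extreme returns (outliers) than a normal distribution would produce.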
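
One way to put a number on what the Q-Q plot shows is to compute the sample skewness and excess kurtosis, both of which are close to zero for normally distributed data. The sketch below is an illustration on synthetic data, not part of the original analysis: `tail_summary` is a hypothetical helper, and the Student-t sample merely stands in for a heavy-tailed return series.

```python
import numpy as np
from scipy import stats


def tail_summary(x):
    """Return sample skewness and excess kurtosis (both near 0 for normal data)."""
    x = np.asarray(x, dtype=float)
    return {
        "skewness": float(stats.skew(x)),
        # fisher=True reports excess kurtosis, i.e. kurtosis minus 3
        "excess_kurtosis": float(stats.kurtosis(x, fisher=True)),
    }


# Synthetic samples: one normal, one heavy-tailed (Student-t with 3 degrees of freedom)
rng = np.random.default_rng(0)
normal_sample = rng.normal(size=5000)
heavy_sample = rng.standard_t(df=3, size=5000)

normal_stats = tail_summary(normal_sample)
heavy_stats = tail_summary(heavy_sample)

print("Normal sample:      ", normal_stats)
print("Heavy-tailed sample:", heavy_stats)
```

In the notebook, calling `tail_summary(returns)` would give the same two statistics for the USD-INR log returns. A clearly positive excess kurtosis would be consistent with the outliers seen in the Q-Q plot.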

In [None]:
# Shapiro-Wilk test (null hypothesis: the sample is drawn from a normal distribution)

shapiro_test = stats.shapiro(returns)
print(f"Test statistic (W): {shapiro_test.statistic}")
print(f"p-value: {shapiro_test.pvalue}")
print()

if shapiro_test.pvalue < 0.05:
    print("Conclusion: Reject the null hypothesis. The data is NOT normally distributed.")
else:
    print("Conclusion: Fail to reject the null hypothesis. The data MAY be normally distributed.")


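The Shapiro-Wilk test rejects the null hypothesis of normality at the 5% significance level, so we conclude that the daily log returns are **not normally distributed**.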
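
To see how the test behaves, it can be run on synthetic samples whose distribution is known. The sketch below is a sanity check under stated assumptions, not part of the original analysis: `shapiro_verdict` is a hypothetical wrapper around `scipy.stats.shapiro`, and the Student-t sample is a stand-in for heavy-tailed returns.

```python
import numpy as np
from scipy import stats


def shapiro_verdict(x, alpha=0.05):
    """Run the Shapiro-Wilk test and report whether normality is rejected at level alpha."""
    result = stats.shapiro(x)
    return float(result.statistic), float(result.pvalue), bool(result.pvalue < alpha)


# Synthetic samples with known distributions
rng = np.random.default_rng(42)
samples = {
    "normal": rng.normal(size=2000),
    "heavy-tailed (Student-t, df=3)": rng.standard_t(df=3, size=2000),
}

verdicts = {name: shapiro_verdict(sample) for name, sample in samples.items()}
for name, (w, p, reject) in verdicts.items():
    print(f"{name}: W={w:.4f}, p={p:.3g}, reject normality={reject}")
```

With a sample of this size, the heavy-tailed series should be rejected decisively. The normal series is still rejected about 5% of the time by construction, which is what the significance level means.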


## Empirical distribution

In [None]:
# Creating an empirical distribution
ecdf = ECDF(returns)

plt.plot(ecdf.x[1:], ecdf.y[1:], drawstyle="steps-post")
plt.xlabel("Log Return")
plt.ylabel("Cumulative Probability")
plt.show()
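
Once fitted, the `ECDF` object can be called like a function to read off cumulative probabilities, and empirical quantiles can be taken directly from the sample. The sketch below uses a small hypothetical sample rather than the USD-INR returns, so the numbers are for illustration only.

```python
import numpy as np
from statsmodels.distributions.empirical_distribution import ECDF

# Hypothetical daily log returns, for illustration only
sample = np.array([-0.02, -0.01, 0.0, 0.01, 0.03])
ecdf_demo = ECDF(sample)

# Fraction of observations less than or equal to zero
p_le_zero = ecdf_demo(0.0)

# Probability of a return in the interval (-0.015, 0.02]
p_interval = ecdf_demo(0.02) - ecdf_demo(-0.015)

# Empirical 5% quantile (NumPy's default linear interpolation between order statistics)
q05 = np.quantile(sample, 0.05)

print("P(return <= 0):", p_le_zero)
print("P(-0.015 < return <= 0.02):", p_interval)
print("Empirical 5% quantile:", q05)
```

In the notebook, the same calls on `ecdf` and `returns` would describe the USD-INR data without assuming normality.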