# Mini Project 2

**2025 Introduction to Quantitative Methods in Finance**

**The Erdös Institute**


### Hypothesis Testing of Standard Assumptions in Theoretical Financial Mathematics

In the theory of mathematical finance, it is common to assume the log returns of a stock/index are normally distributed.
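
For concreteness, the log return over one period is $\ln(S_t / S_{t-1})$, where $S_t$ is the closing price at time $t$. The sketch below computes it for a short price series; the `prices` values are made up for illustration and are not market data.

```python
import numpy as np

# Hypothetical closing prices, for illustration only (not market data).
prices = np.array([100.0, 102.0, 101.0, 105.0])

# Log return for period t: ln(S_t / S_{t-1}), i.e. the difference of log prices.
log_returns = np.diff(np.log(prices))

# Log returns add across periods, so their sum is the log return over the whole span.
total_log_return = log_returns.sum()

print(log_returns)
print(total_log_return)
```

The additivity shown in the last step is one reason log returns are convenient to model: a sum of independent normal variables is again normal.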


Investigate if the log returns of stocks or indexes of your choosing are normally distributed. Some suggestions for exploration include:

    1) Test whether there are periods of time when the log returns of a stock/index show evidence of being normally distributed.
    
    2) Test whether removing extreme return data yields a distribution with evidence of being normal.
    
    3) Create a personalized portfolio of stocks whose historical log returns are normally distributed.
    
    4) Test whether the portfolio you created in the first mini-project has significant periods of time with evidence of normally distributed log returns.
    
    5) Gather historical data for a number of stocks and run a normality test on each stock's log returns to see whether any show evidence of being normally distributed.
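
One possible way to approach the first suggestion is to split a return series into consecutive windows and run a normality test on each. The sketch below is only an illustration under stated assumptions: the helper `windowed_normality_pvalues` and the 63-day window (roughly one trading quarter) are hypothetical choices, and the input is synthetic heavy-tailed data rather than real stock returns.

```python
import numpy as np
import scipy.stats as stats


def windowed_normality_pvalues(log_returns, window=63):
    """Return D'Agostino-Pearson p-values for consecutive non-overlapping windows.

    Hypothetical helper for illustration; any leftover observations that do not
    fill a complete window are ignored.
    """
    log_returns = np.asarray(log_returns, dtype=float)
    n_windows = len(log_returns) // window
    p_values = []
    for i in range(n_windows):
        chunk = log_returns[i * window:(i + 1) * window]
        p_values.append(stats.normaltest(chunk).pvalue)
    return np.array(p_values)


# Synthetic stand-in for real log returns (assumption: scaled Student-t draws).
rng = np.random.default_rng(0)
synthetic_returns = 0.02 * rng.standard_t(df=3, size=504)

p_values = windowed_normality_pvalues(synthetic_returns, window=63)
for i, p in enumerate(p_values):
    verdict = "reject normality" if p < 0.05 else "no evidence against normality"
    print(f"window {i}: p-value = {p:.4f} ({verdict})")
```

Shorter windows give the test less power, so a non-rejection in a small window is weaker evidence than the same result on a long sample. Testing many windows also means some rejections at the 5% level are expected by chance even for truly normal data.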

In [2]:
# Package imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
import seaborn as sns
import yfinance as yf
import datetime as dt
sns.set_style('darkgrid')

Let's check whether there is any period of time over which Tesla's log returns show evidence of being normally distributed.

In [3]:
def t_logreturns(num_years=1):
    """
    Downloads TSLA closing prices and computes the daily log returns.

    Args:
      num_years: Number of years of history to download, counting back from today.

    Returns:
      A NumPy array of daily log returns of the TSLA closing price.
    """
    n = num_years
    start_date = dt.datetime.today() - dt.timedelta(days = n*365)
    end_date = dt.datetime.today()
    
    TSLA_stock = yf.download('TSLA', start = start_date, end = end_date)
    TSLA_returns = TSLA_stock['Close']/TSLA_stock['Close'].shift(1)
    TSLA_logreturns = np.log(TSLA_returns.dropna())['TSLA'].values
    
    return TSLA_logreturns

In [5]:
t_logreturns(1)

YF.download() has changed argument auto_adjust default to True


[*********************100%***********************]  1 of 1 completed


array([-3.17333272e-03,  1.46489232e-02, -3.97899740e-03, -1.01025696e-02,
       -8.65947999e-03,  1.31512570e-03,  1.66604546e-02, -2.58852610e-03,
       -2.10102673e-02, -1.81743406e-02,  3.81134775e-02,  2.87990106e-02,
       -2.47460850e-02,  5.16191096e-02, -1.38600221e-02, -1.79574905e-02,
        7.89947152e-03, -2.35232298e-03,  2.57901147e-02,  4.70220521e-02,
        5.33281982e-03,  2.32738132e-03,  5.87798336e-02,  9.71019503e-02,
        6.33730490e-02,  2.06068823e-02,  5.62978988e-03,  3.64508859e-02,
        3.53897209e-03, -8.82207721e-02,  2.94343165e-02,  1.76098296e-02,
        1.53969978e-02, -3.19197003e-02,  2.93330222e-03, -4.10761384e-02,
        5.01827190e-02, -2.06076490e-02, -1.31642944e-01,  1.95311291e-02,
       -2.04520896e-03,  5.44502837e-02, -4.17020794e-02,  4.15728217e-02,
       -6.77870950e-02, -4.33017143e-02, -4.32486202e-02,  8.81060209e-03,
       -4.52676929e-02,  3.62559013e-02,  5.81690374e-03, -1.26293886e-02,
        5.10324687e-02, -

In [12]:
start_date = dt.datetime.today() - dt.timedelta(days = 365)
end_date = dt.datetime.today()
    
TSLA_stock = yf.download('TSLA', start = start_date, end = end_date)
TSLA_returns = TSLA_stock['Close']/TSLA_stock['Close'].shift(1)
TSLA_logreturns = np.log(TSLA_returns.dropna())

[*********************100%***********************]  1 of 1 completed


In [13]:
TSLA_logreturns

| Date | TSLA |
|---|---|
| 2024-05-29 | -0.003173 |
| 2024-05-30 | 0.014649 |
| 2024-05-31 | -0.003979 |
| 2024-06-03 | -0.010103 |
| 2024-06-04 | -0.008659 |
| ... | ... |
| 2025-05-20 | 0.005044 |
| 2025-05-21 | -0.027123 |
| 2025-05-22 | 0.019004 |
| 2025-05-23 | -0.004997 |


In [12]:
#Collect p-values of normality tests
p_tsla=stats.normaltest(t_logreturns(1))[1]


#Print evidence/non-evidence of normality
print(f"TSLA log return distribution: p-value = {p_tsla:.4f}")
if p_tsla < 0.05:
    print("→ Statistically significant evidence that the data is NOT normally distributed.")
else:
    print("→ No statistically significant evidence against normality.")
    
print('--'*40) 
print('--'*40) 

[*********************100%***********************]  1 of 1 completed

TSLA log return distribution: p-value = 0.0000
→ Statistically significant evidence that the data is NOT normally distributed.
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
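
Since the full-year sample rejects normality, a natural next step is the second suggestion: drop the most extreme returns and test again. The sketch below is a minimal illustration, not a definitive method. The helper `trim_extremes` and the 2.5%/97.5% quantile cutoffs are hypothetical choices, and the data is synthetic (scaled Student-t draws) so the example runs without a download.

```python
import numpy as np
import scipy.stats as stats


def trim_extremes(log_returns, lower_q=0.025, upper_q=0.975):
    """Keep only observations between the given sample quantiles (inclusive).

    Hypothetical helper for illustration; the cutoffs are arbitrary defaults.
    """
    log_returns = np.asarray(log_returns, dtype=float)
    lo, hi = np.quantile(log_returns, [lower_q, upper_q])
    return log_returns[(log_returns >= lo) & (log_returns <= hi)]


# Synthetic stand-in for one year of daily log returns (assumption, not TSLA data).
rng = np.random.default_rng(1)
synthetic_returns = 0.02 * rng.standard_t(df=3, size=250)
trimmed = trim_extremes(synthetic_returns)

# Run the same D'Agostino-Pearson test before and after trimming.
p_full = stats.normaltest(synthetic_returns).pvalue
p_trimmed = stats.normaltest(trimmed).pvalue

print(f"Full sample ({len(synthetic_returns)} obs): p-value = {p_full:.4f}")
print(f"Trimmed sample ({len(trimmed)} obs): p-value = {p_trimmed:.4f}")
```

To apply this to the real data, pass the output of `t_logreturns(1)` in place of `synthetic_returns`. Trimming changes the distribution being tested, so a higher p-value afterwards describes the trimmed sample only and says nothing about the tails that were removed.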



