# Hypothesis Testing
### Statistically testing whether the Nifty50 closes lower on options expiry days
##### Null Hypothesis (H0): The mean daily return on options expiry days is not lower than on non-expiry days.
##### Alternative Hypothesis (H1): The mean daily return on options expiry days is lower than on non-expiry days.

In [1]:
# Import necessary libraries
from dateutil.relativedelta import relativedelta
from datetime import date, timedelta
from scipy import stats
import yfinance as yf
import pandas as pd
import numpy as np

### Data Collection
#### Source: Historical stock data is fetched from Yahoo Finance using yfinance.
#### Time Period: Last 3 years of Nifty50 data.

In [2]:

# Define the ticker symbol for Nifty 50
ticker_symbol = '^NSEI'

# Define the start and end dates for the data
end_date = date.today()  # Set the end date as today's date
start_date = end_date - relativedelta(years=3)  # Set the start date as 3 years before the end date

# Download historical data
df = yf.download(ticker_symbol, start=start_date, end=end_date)

[*********************100%***********************]  1 of 1 completed


### Data Preparation
#### Resetting Index: The date index is reset for easier manipulation.
#### Day Identification: The day of the week is identified to determine if it's an expiry day (Thursday).
#### Market Direction: A new column is created to indicate whether the market moved up or down based on the percentage change in closing price.

In [8]:
# Reset the index of the DataFrame to the default integer index and modify the DataFrame in place
df.reset_index(inplace=True)

# Create a new column 'Day of Week' that contains the name of the day (e.g., 'Monday', 'Tuesday') from the 'Date' column
df['Day of Week'] = df['Date'].dt.day_name()

# Create a new column 'No of WeekDay' that contains the day of the week as a number (0=Monday, 6=Sunday)
df['No of WeekDay'] = df['Date'].dt.weekday

# Calculate the percentage change in the 'Close' column from the previous day and create a new column '%_CloseChange'
df['%_CloseChange'] = df['Close'].pct_change()

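# Create a new column 'Market_Direction' to indicate whether the market moved up or down based on '%_CloseChange'
# Note: rows with a zero or missing change (including the first row, which has no previous close) are labeled 'DOWN'
df['Market_Direction'] = np.where(df['%_CloseChange'] > 0, 'UP', 'DOWN')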
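
To make the preparation steps concrete, here is a minimal sketch on a tiny hand-made frame. The dates and prices are invented for illustration and are not Nifty50 data, and `prepare` is a hypothetical helper name that does not appear in the notebook. Unlike the cell above, it leaves the direction empty where there is no previous close to compare against.

```python
import numpy as np
import pandas as pd

def prepare(frame):
    """Add weekday, daily return, and direction columns (hypothetical helper)."""
    out = frame.copy()
    out['Day of Week'] = out['Date'].dt.day_name()
    out['No of WeekDay'] = out['Date'].dt.weekday
    out['%_CloseChange'] = out['Close'].pct_change()
    labels = np.where(out['%_CloseChange'] > 0, 'UP', 'DOWN')
    # Keep the label only where a return exists; the first row becomes missing
    has_return = out['%_CloseChange'].notna()
    out['Market_Direction'] = pd.Series(labels, index=out.index).where(has_return)
    return out

# Invented prices for four consecutive trading days (illustration only)
toy = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03', '2024-01-04']),
    'Close': [100.0, 102.0, 101.0, 101.0],
})
prepared = prepare(toy)
print(prepared)
```

A flat day (zero change) is still labeled 'DOWN' here, matching the `np.where` condition used in the notebook.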


### Hypothesis Testing
#### Groups: The data is split into two groups: returns on the expiry weekday and returns on a comparison (non-expiry) weekday.
#### T-Test: A one-tailed, two-sample t-test checks whether the mean return on expiry days is significantly lower.
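
For reference, the one-tailed test can be written out as follows. The symbols are introduced here for illustration: $\mu_E$ and $\mu_N$ are the mean returns of the expiry and non-expiry groups, $\bar{x}$, $s^2$ and $n$ are the sample mean, sample variance and size of each group. The pooled-variance form is shown because `scipy.stats.ttest_ind` assumes equal variances unless `equal_var=False` is passed.

```latex
H_0:\ \mu_E \ge \mu_N \qquad H_1:\ \mu_E < \mu_N

t = \frac{\bar{x}_E - \bar{x}_N}{s_p \sqrt{\dfrac{1}{n_E} + \dfrac{1}{n_N}}},
\qquad
s_p^2 = \frac{(n_E - 1)\, s_E^2 + (n_N - 1)\, s_N^2}{n_E + n_N - 2}
```

With `alternative='less'`, the p-value is the probability that a t-distributed variable with $n_E + n_N - 2$ degrees of freedom falls at or below the observed $t$.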

In [17]:
import numpy as np
from scipy import stats

def test_expiry_effect(df, expiry_day='Thursday', non_expiry_day='Monday'):
    conditions = [
        (df['Day of Week'] == expiry_day),
        (df['Day of Week'] == non_expiry_day)
    ]
    choices = ['Expiry', 'Non - Expiry']

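    # Create the column which will divide the data into expiry days and non-expiry days.
    # A string default keeps all labels the same dtype; recent NumPy versions raise a
    # TypeError when string choices are combined with default=np.nan.
    df['Is_Expiry'] = np.select(conditions, choices, default='Other')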

    # Separate the data into two groups
    expiry_returns = df[df['Is_Expiry'] == 'Expiry']['%_CloseChange'].dropna()
    non_expiry_returns = df[df['Is_Expiry'] == 'Non - Expiry']['%_CloseChange'].dropna()

    # Perform a one-tailed two-sample t-test (H1: mean expiry-day return is lower)
    t_stat, p_value = stats.ttest_ind(expiry_returns, non_expiry_returns, alternative='less')

    # Conclusion: report the statistics once, then decide at the 5% significance level
    print(f"T-statistic: {t_stat}, P-value: {p_value}")
    if p_value < 0.05:
        print("Reject the null hypothesis. Option expiry days have a significantly lower return.")
    else:
        print("Fail to reject the null hypothesis. No significant difference in returns.")

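# Note: this call treats Monday as the "expiry" group and Tuesday as the comparison group.
# Call test_expiry_effect(df) to use the defaults (Thursday vs. Monday) instead.
test_expiry_effect(df, 'Monday', 'Tuesday')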

T-statistic: -0.599127120703716, P-value: 0.2747771252728836
Fail to reject the null hypothesis. No significant difference in returns.
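
The hypotheses at the top compare expiry days with all non-expiry days, whereas the function above compares two individual weekdays. One way to sketch the broader comparison is shown below. This is an illustrative variant rather than the notebook's method: `expiry_vs_rest` is a hypothetical helper, the returns are random numbers rather than market data, and Welch's t-test (`equal_var=False`) is swapped in because the two groups differ in size and may differ in variance.

```python
import numpy as np
import pandas as pd
from scipy import stats

def expiry_vs_rest(frame, expiry_day='Thursday'):
    """Compare returns on expiry_day with every other trading day (hypothetical helper)."""
    returns = frame.dropna(subset=['%_CloseChange'])
    is_expiry = returns['Day of Week'] == expiry_day
    expiry = returns.loc[is_expiry, '%_CloseChange']
    rest = returns.loc[~is_expiry, '%_CloseChange']
    # equal_var=False selects Welch's t-test, which does not assume equal variances
    t_stat, p_value = stats.ttest_ind(expiry, rest, equal_var=False, alternative='less')
    return t_stat, p_value, len(expiry), len(rest)

# Synthetic returns on 250 business days (random numbers, not market data)
rng = np.random.default_rng(0)
dates = pd.bdate_range('2023-01-02', periods=250)
synthetic = pd.DataFrame({
    'Day of Week': dates.day_name(),
    '%_CloseChange': rng.normal(0.0, 0.01, size=len(dates)),
})
t_stat, p_value, n_expiry, n_rest = expiry_vs_rest(synthetic)
print(f"T-statistic: {t_stat:.4f}, P-value: {p_value:.4f}, n = {n_expiry} vs {n_rest}")
```

Passing the notebook's prepared `df` instead of `synthetic` would apply the same comparison to the downloaded data, assuming it has the `'Day of Week'` and `'%_CloseChange'` columns created earlier.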


In [6]:
df.to_csv('nifty_data.csv')