# Hypothesis Testing
### Objective: To statistically test whether the market tends to close lower on Nifty50 options expiry days
##### Null Hypothesis (H0): The stock market does not close lower on options expiry days compared to non-expiry days. 
##### Alternative Hypothesis (H1): The stock market closes lower on options expiry days compared to non-expiry days.
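
One way to state these hypotheses precisely, assuming the comparison is made on mean daily close-to-close returns. The symbols $\mu_E$ (mean return on expiry days) and $\mu_N$ (mean return on non-expiry days) are introduced here for illustration and do not appear in the original notebook:

```latex
\begin{aligned}
H_0 &: \mu_E \ge \mu_N \\
H_1 &: \mu_E < \mu_N
\end{aligned}
```

Because $H_1$ points in one direction only, a one-tailed test is the natural fit.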

In [None]:
# Import necessary libraries
from dateutil.relativedelta import relativedelta
from datetime import date, timedelta
from scipy import stats
import yfinance as yf
import pandas as pd
import numpy as np

### Data Collection
#### Source: Historical stock data is fetched from Yahoo Finance using yfinance.
#### Time Period: Last 3 years of Nifty50 data.

In [None]:

# Define the ticker symbol for Nifty 50
ticker_symbol = '^NSEI'

# Define the start and end dates for the data
end_date = date.today()  # Set the end date as today's date
start_date = end_date - relativedelta(years=3)  # Set the start date as 3 years before the end date

# Download historical data
df = yf.download(ticker_symbol, start=start_date, end=end_date)
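
Depending on the installed yfinance version, the downloaded frame may carry two-level columns (price field, ticker) rather than plain names such as `Close`. The sketch below shows one possible way to normalise this with pandas alone. The helper name `flatten_price_columns` is hypothetical, and it assumes the first column level holds the price field; it runs on a toy frame rather than real data.

```python
import pandas as pd


def flatten_price_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose columns are single-level price field names."""
    out = frame.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # Assumption: level 0 is the price field (e.g. 'Close'), level 1 the ticker
        out.columns = out.columns.get_level_values(0)
    return out


# Toy frame mimicking a two-level (field, ticker) header; values are made up
toy = pd.DataFrame(
    [[100.0, 101.0], [102.0, 103.0]],
    columns=pd.MultiIndex.from_tuples([("Close", "^NSEI"), ("Open", "^NSEI")]),
)
flat = flatten_price_columns(toy)
print(list(flat.columns))  # ['Close', 'Open']
```

A frame that already has single-level columns passes through unchanged, so the helper would be safe to apply right after the download in either case.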

### Data Preparation
#### Resetting Index: The date index is reset for easier manipulation.
#### Day Identification: The day of the week is identified to determine if it's an expiry day (Thursday).
#### Market Direction: A new column is created to indicate whether the market moved up or down based on the percentage change in closing price.

In [None]:
# Reset the index of the DataFrame to the default integer index and modify the DataFrame in place
df.reset_index(inplace=True)

# Create a new column 'Day of Week' that contains the name of the day (e.g., 'Monday', 'Tuesday') from the 'Date' column
df['Day of Week'] = df['Date'].dt.day_name()

# Create a new column 'No of WeekDay' that contains the day of the week as a number (0=Monday, 6=Sunday)
df['No of WeekDay'] = df['Date'].dt.weekday

# Calculate the percentage change in the 'Close' column from the previous day and create a new column '%_CloseChange'
df['%_CloseChange'] = df['Close'].pct_change()

# Create a new column 'Market_Direction': 'UP' for a positive change, 'DOWN' otherwise
# Note: flat days and the first row (whose change is NaN) are also labelled 'DOWN'
df['Market_Direction'] = np.where(df['%_CloseChange'] > 0, 'UP', 'DOWN')
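
To illustrate how the percentage change and direction label behave at the edges, here is a small sketch on made-up closing prices. It is a variant, not the notebook's method: it uses `np.select` to label flat days and the undefined first row separately, and the labels `'FLAT'` and `'N/A'` are assumptions introduced for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
toy = pd.DataFrame({"Close": [100.0, 102.0, 102.0, 99.96]})

# Fractional change versus the previous row; the first row has no predecessor (NaN)
toy["%_CloseChange"] = toy["Close"].pct_change()

change = toy["%_CloseChange"]
conditions = [change > 0, change < 0, change == 0]

# Rows matching no condition (here: the NaN first row) receive the default label
toy["Market_Direction"] = np.select(conditions, ["UP", "DOWN", "FLAT"], default="N/A")
print(toy["Market_Direction"].tolist())  # ['N/A', 'UP', 'FLAT', 'DOWN']
```

Note that `pct_change()` returns a fraction (0.02 for a 2% rise), not a value already multiplied by 100.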


### Hypothesis Testing
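#### Groups: The data is split into two groups: returns on the chosen expiry weekday and returns on one chosen non-expiry weekday. All other weekdays are left out of the comparison.
#### T-Test: A one-tailed, two-sample t-test is performed to see if the mean return on expiry days is significantly lower than on non-expiry days.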
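
For reference, with its default `equal_var=True`, `scipy.stats.ttest_ind` computes the pooled-variance (Student) statistic. The notation below is introduced for illustration: $\bar{x}_E, s_E^2, n_E$ are the sample mean, sample variance and size of the expiry-day returns, and $\bar{x}_N, s_N^2, n_N$ are the same quantities for the non-expiry group.

```latex
t = \frac{\bar{x}_E - \bar{x}_N}{s_p \sqrt{\dfrac{1}{n_E} + \dfrac{1}{n_N}}},
\qquad
s_p^2 = \frac{(n_E - 1)\, s_E^2 + (n_N - 1)\, s_N^2}{n_E + n_N - 2}
```

With `alternative='less'`, the p-value is the lower-tail probability $P(T \le t)$ for a t distribution with $n_E + n_N - 2$ degrees of freedom, so a strongly negative $t$ is evidence for $H_1$.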

In [26]:
import numpy as np
from scipy import stats

def test_expiry_effect(df, expiry_day='Thursday', non_expiry_day='Monday'):
    # Define conditions to classify days as 'Expiry' or 'Non - Expiry'
    conditions = [
        (df['Day of Week'] == expiry_day),        # Condition for expiry day (e.g., Thursday)
        (df['Day of Week'] == non_expiry_day)     # Condition for non-expiry day (e.g., Monday)
    ]
    
    # Define the corresponding labels for the conditions
    choices = ['Expiry', 'Non - Expiry']

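    # Create a new column 'Is_Expiry' in the DataFrame based on the conditions and choices.
    # The default is a string: mixing string choices with np.nan can raise a TypeError on recent NumPy versions.
    df['Is_Expiry'] = np.select(conditions, choices, default='Other')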

    # Filter the DataFrame to separate returns for expiry and non-expiry days
    expiry_returns = df[df['Is_Expiry'] == 'Expiry']['%_CloseChange'].dropna()          # Returns for expiry days
    non_expiry_returns = df[df['Is_Expiry'] == 'Non - Expiry']['%_CloseChange'].dropna() # Returns for non-expiry days

    # Perform a one-tailed t-test to check if expiry day returns are significantly lower than non-expiry day returns
    t_stat, p_value = stats.ttest_ind(expiry_returns, non_expiry_returns, alternative='less')

    # Report the test statistic and the p-value
    print(f"T-statistic: {t_stat}, P-value: {p_value}")
    print("---------------------***---------------------")

    # Interpret the p-value at the 5% significance level
    if p_value < 0.05:
        # If p-value is less than 0.05, reject the null hypothesis
        print("Reject the null hypothesis. Option expiry days have a significantly lower return.")
    else:
        # If p-value is greater than or equal to 0.05, fail to reject the null hypothesis
        print("Fail to reject the null hypothesis. No significant difference in returns.")

# Run the test with 'Monday' as the expiry day and 'Tuesday' as the non-expiry day (overrides the function defaults)
test_expiry_effect(df, expiry_day='Monday', non_expiry_day='Tuesday')


T-statistic: -0.599127120703716, P-value: 0.2747771252728836
---------------------***---------------------
Fail to reject the null hypothesis. No significant difference in returns.
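
As a cross-check that does not rely on the t-test's distributional assumptions, a one-sided permutation test could be run on the same two groups. This is a different technique from the t-test used above, not part of the original notebook. The function name `permutation_test_less`, the number of permutations and the seeds are all assumptions, and the example runs on synthetic returns, not market data.

```python
import numpy as np


def permutation_test_less(a, b, n_permutations=10_000, seed=0):
    """One-sided permutation test of H1: mean(a) < mean(b).

    Returns the observed difference in means and the estimated p-value.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)

    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])

    count = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled returns to two groups of the original sizes
        shuffled = rng.permutation(pooled)
        diff = shuffled[:a.size].mean() - shuffled[a.size:].mean()
        if diff <= observed:
            count += 1

    # Add-one correction keeps the estimated p-value strictly positive
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value


# Synthetic returns drawn from the same distribution, for illustration only
data_rng = np.random.default_rng(42)
group_a = data_rng.normal(loc=0.0, scale=0.01, size=150)
group_b = data_rng.normal(loc=0.0, scale=0.01, size=150)

diff, p = permutation_test_less(group_a, group_b)
print(f"Observed mean difference: {diff:.6f}, permutation p-value: {p:.4f}")
```

To apply this to the notebook's data, the two return series built inside `test_expiry_effect` would be passed in place of the synthetic groups. A permutation p-value close to the t-test's p-value would suggest the conclusion does not hinge on the t-test's assumptions.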


In [None]:
# Save the prepared data to CSV without writing the integer index as an extra column
df.to_csv('nifty_data.csv', index=False)