Base Model: Z-score Trading System:

The Z-score trading system is a statistical approach used for identifying trading opportunities based on deviations from the mean.
In this model, the spread between Bank Nifty IV and Nifty IV is calculated. The spread represents the relative volatility difference between the two indices.
The Z-score is then computed for the spread, indicating how many standard deviations the spread is from its mean.
Based on predefined threshold levels, trading positions are determined:
If the Z-score exceeds the entry threshold, a long position is initiated.
If the Z-score falls below the negative entry threshold, a short position is initiated.
If the absolute Z-score is below the exit threshold, the position is closed.
For each row, the profit or loss (P/L) is computed as the spread multiplied by the time to expiry (tte) raised to the power of 0.7.
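The rules above can be sketched on a small toy series. The spread and `tte` values below are invented for illustration and do not come from the dataset. The Z-score uses the population standard deviation (`ddof=0`), which is also the default of `scipy.stats.zscore`, and the entry threshold of 1 mirrors the default in the code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: spread values and time to expiry (not from the dataset)
toy = pd.DataFrame({
    "Spread": [1.0, 2.0, 3.0, 10.0, 2.0, -6.0],
    "tte": [6, 5, 4, 3, 2, 1],
})

# Z-score: (spread - mean) / standard deviation, computed over the full sample
mean = toy["Spread"].mean()
std = toy["Spread"].std(ddof=0)
toy["Z_Score"] = (toy["Spread"] - mean) / std

# Entry rules: long above +entry, short below -entry, flat otherwise
entry = 1.0
toy["Position"] = 0
toy.loc[toy["Z_Score"] > entry, "Position"] = 1
toy.loc[toy["Z_Score"] < -entry, "Position"] = -1

# P/L as described above: spread scaled by tte ** 0.7
toy["PL"] = toy["Spread"] * toy["tte"] ** 0.7

print(toy)
```

Only the two extreme rows (spread 10 and spread -6) lie more than one standard deviation from the mean, so they are the only rows that receive a non-zero position.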
Advanced Model: Linear Regression:

The advanced model utilizes linear regression to predict future values of the spread.
The current spread is the single feature, and the next period's spread is the target variable.
The dataset is split chronologically, with the first 80% of rows used for training and the remaining 20% held out for testing.
After fitting the linear regression model on the training rows, spread predictions are generated for every row.
Trading positions are determined based on the comparison between the actual spread and the predicted spread:
If the actual spread is greater than the predicted spread, a short position is taken.
Otherwise, a long position is taken.
As in the base model, the P/L is scaled by the time to expiry raised to the power of 0.7, but here it is the position multiplied by the next period's spread.
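A minimal sketch of the regression step, under stated assumptions: the spread series is synthetic (an AR(1)-style process generated only for illustration), and `np.polyfit` with degree 1 stands in for `LinearRegression`, since both fit the same least-squares line when there is one feature and an intercept. Unlike the notebook code, this sketch predicts only on the held-out 20%.

```python
import numpy as np

# Hypothetical spread series (AR(1)-style, generated for illustration only)
rng = np.random.default_rng(0)
spread = np.empty(200)
spread[0] = 5.0
for t in range(1, 200):
    spread[t] = 5.0 + 0.8 * (spread[t - 1] - 5.0) + rng.normal(scale=0.5)

x = spread[:-1]  # feature: current spread
y = spread[1:]   # target: next period's spread

# Chronological 80/20 split (no shuffling, to avoid look-ahead)
split = int(len(x) * 0.8)
slope, intercept = np.polyfit(x[:split], y[:split], deg=1)

# Out-of-sample predictions on the held-out rows
pred = intercept + slope * x[split:]

# Short when the current spread is above the predicted next spread, else long
position = np.where(x[split:] > pred, -1, 1)

mse = np.mean((y[split:] - pred) ** 2)
print(f"slope={slope:.3f} intercept={intercept:.3f} test MSE={mse:.3f}")
```

Scoring predictions only on rows the model has not seen gives a less optimistic picture than predicting on the full sample.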
Both models are evaluated based on their absolute P/L, Sharpe Ratio, and Drawdown metrics to assess their performance. The goal is to compare the effectiveness of the two approaches in generating profits while managing risk. Adjustments and improvements to these models can be made iteratively based on empirical results and further research.
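The three metrics can be sketched for any per-period P/L series. The helper name `evaluate_pl`, its `periods_per_year` parameter, and the sample values are hypothetical. As one possible design choice, the Sharpe ratio here is computed on the P/L values directly rather than on their percentage change, and the annualisation factor assumes one row per trading day.

```python
import numpy as np
import pandas as pd

def evaluate_pl(pl: pd.Series, periods_per_year: int = 252):
    """Return total P/L, annualised Sharpe ratio, and maximum drawdown."""
    total = pl.sum()
    # Sharpe on per-period P/L; NaN if the series has zero variance
    std = pl.std()
    sharpe = np.sqrt(periods_per_year) * pl.mean() / std if std > 0 else np.nan
    # Drawdown: most negative gap between cumulative P/L and its running peak
    cumulative = pl.cumsum()
    drawdown = (cumulative - cumulative.cummax()).min()
    return total, sharpe, drawdown

# Hypothetical per-period P/L values
pl = pd.Series([1.0, -2.0, 3.0, -1.0, 2.0])
total, sharpe, drawdown = evaluate_pl(pl)
print(total, sharpe, drawdown)
```

Because the helper takes a series rather than a fixed column name, each model's own P/L column, for example `data['PL_Adv']`, could be scored separately.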

In [10]:
import pandas as pd
import numpy as np
from scipy.stats import zscore
from sklearn.linear_model import LinearRegression
# Load the dataset
data = pd.read_parquet("data.parquet")

# Data preprocessing
# Ensure the index is a DatetimeIndex
data.index = pd.to_datetime(data.index)
# Forward-fill missing values, if any
data.ffill(inplace=True)

# Check that the required columns are present in the dataset
if 'banknifty' not in data.columns or 'nifty' not in data.columns:
    raise KeyError("Columns 'banknifty' and/or 'nifty' are not present in the dataset.")

# Calculate spread if columns are present
data['Spread'] = data['banknifty'] - data['nifty']

# Base Model: Z-score Trading System
def z_score_trading(data, threshold_entry=1, threshold_exit=0):
    data = data.copy()  # Work on a copy so the caller's DataFrame is not modified
    data['Z_Score'] = zscore(data['Spread'])  # Z-score over the full sample
    data['Position'] = 0
    data.loc[data['Z_Score'] > threshold_entry, 'Position'] = 1  # Long position
    data.loc[data['Z_Score'] < -threshold_entry, 'Position'] = -1  # Short position
    # Exit position; with the default threshold_exit=0 this mask is never true
    data.loc[data['Z_Score'].abs() < threshold_exit, 'Position'] = 0

    # Calculate P/L from the spread and time to expiry ('Position' is not used here)
    data['PL'] = data['Spread'] * (data['tte'] ** 0.7)

    return data


# Advanced Model: Linear Regression
def linear_regression_model(data):
    data = data.copy()  # Work on a copy so the caller's DataFrame is not modified

    # Target: the next period's spread (shift(-1) looks one step ahead,
    # so 'Spread_Lagged' actually holds the lead, not a lag)
    data['Spread_Lagged'] = data['Spread'].shift(-1)

    # Drop rows containing NaN; the last row always has one after the shift
    data.dropna(inplace=True)

    # Feature (current spread) and target (next period's spread)
    X = data[['Spread']]
    y = data['Spread_Lagged']

    # Chronological train-test split (80-20)
    split_index = int(len(data) * 0.8)
    X_train, X_test = X[:split_index], X[split_index:]
    y_train, y_test = y[:split_index], y[split_index:]

    # Fit linear regression model on the training portion
    model = LinearRegression()
    model.fit(X_train, y_train)

    # Predictions for every row (training and test rows alike)
    data['Spread_Predicted'] = model.predict(X)

    # Short when the spread is above its predicted next value, long otherwise
    data['Position_Adv'] = np.where(data['Spread'] > data['Spread_Predicted'], -1, 1)

    # P/L: position times next period's spread, scaled by time to expiry
    data['PL_Adv'] = data['Position_Adv'] * data['Spread_Lagged'] * (data['tte'] ** 0.7)

    return data

# Evaluate strategy performance
def evaluate_strategy(data):
    # Absolute P/L: sum of the 'PL' column
    absolute_PL = data['PL'].sum()

    # Sharpe ratio of the row-to-row percentage change in 'PL',
    # annualised with sqrt(252), which assumes one row per trading day
    returns = data['PL'].pct_change().dropna()
    sharpe_ratio = np.sqrt(252) * returns.mean() / returns.std()

    # Drawdown: largest drop of cumulative P/L below its running maximum
    cumulative_PL = data['PL'].cumsum()
    drawdown = (cumulative_PL - cumulative_PL.expanding().max()).min()

    return absolute_PL, sharpe_ratio, drawdown

# Run base model
data = z_score_trading(data)

# Evaluate base model
absolute_PL_base, sharpe_ratio_base, drawdown_base = evaluate_strategy(data)

print("Base Model Results:")
print("Absolute P/L:", absolute_PL_base)
print("Sharpe Ratio:", sharpe_ratio_base)
print("Drawdown:", drawdown_base)

# Run advanced model
data = linear_regression_model(data)

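# Evaluate advanced model
# Note: evaluate_strategy reads the 'PL' column, so this scores the base-model
# P/L on the rows kept by linear_regression_model, not the 'PL_Adv' column
absolute_PL_adv, sharpe_ratio_adv, drawdown_adv = evaluate_strategy(data)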

print("\nAdvanced Model Results:")
print("Absolute P/L:", absolute_PL_adv)
print("Sharpe Ratio:", sharpe_ratio_adv)
print("Drawdown:", drawdown_adv)


Base Model Results:
Absolute P/L: 319282.3419873928
Sharpe Ratio: 0.15550860081937026
Drawdown: -0.0767731094674673

Advanced Model Results:
Absolute P/L: 319282.0758630607
Sharpe Ratio: 0.1555116124780812
Drawdown: -0.0767731094674673
