# Mean-Reverting Trading Strategy Using Cointegration in Python

Cointegration is a statistical property of two or more time series: each series may wander on its own, but some linear combination of them is stationary, so the series move together over the long run. In finance, this property is often used to develop mean-reverting algorithmic trading strategies, because deviations of that combination from its mean tend to be temporary.
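
For two series, the usual textbook form of this definition can be written as follows. The symbols $x_t$, $y_t$, $\alpha$, $\beta$ and $\varepsilon_t$ are notation introduced here for illustration and do not come from the notebook:

$$
y_t = \alpha + \beta x_t + \varepsilon_t, \qquad x_t,\, y_t \sim I(1), \qquad \varepsilon_t \sim I(0)
$$

Here $I(1)$ denotes a non-stationary series that becomes stationary after differencing once, and $I(0)$ denotes a stationary series. A mean-reverting strategy trades on $\varepsilon_t$, which is expected to return toward zero.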

In [2]:
!pip install statsmodels 

Collecting statsmodels
  Downloading statsmodels-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.5 kB)
Collecting patsy>=0.5.4 (from statsmodels)
  Downloading patsy-0.5.6-py2.py3-none-any.whl.metadata (3.5 kB)
Downloading statsmodels-0.14.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.8 MB)
Downloading patsy-0.5.6-py2.py3-none-any.whl (233 kB)
Installing collected packages: patsy, statsmodels
Successfully installed patsy-0.5.6 statsmodels-0.14.1


In [8]:
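!pip install tkinter

ERROR: Could not find a version that satisfies the requirement tkinter (from versions: none)
ERROR: No matching distribution found for tkinter

tkinter is not distributed on PyPI, so pip cannot install it. It is part of the Python standard library and, on Debian or Ubuntu, is provided by the system package python3-tk.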

The DataFrame must not contain NaN values, or the test will fail, so we first remove them with the dropna() method. Then we run the Engle-Granger test and check the p-value. A p-value below 0.05 lets us reject the null hypothesis of no cointegration at the 5% level, which we take as evidence that the stocks are cointegrated.

Great, the reported p-value is 0.0, so we reject the null hypothesis of no cointegration and treat the two stocks as cointegrated.



### Fit a regression and form the spread
Now that we have confirmed that Apple and Tesla are cointegrated, we move to the next step: fitting an Ordinary Least Squares (OLS) regression of one stock's returns on the other's to estimate the beta, which serves as the hedge ratio.

Once we have the beta, we use it to calculate the spread, that is, one series minus beta times the other.
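
As a rough illustration of this step, the sketch below estimates the beta with `np.polyfit` and forms the spread. The series `x` and `y` are synthetic and hypothetical. The regression here is run on price levels, which is an assumption of this example; substitute returns if that is what your analysis uses.

```python
import numpy as np

# Synthetic stand-in data: y depends linearly on x with a true beta of 1.5
rng = np.random.default_rng(1)
n = 500
x = 100 + np.cumsum(rng.normal(size=n))            # hypothetical series X
y = 20 + 1.5 * x + rng.normal(scale=1.0, size=n)   # hypothetical series Y

# OLS of y on x with an intercept; polyfit returns [slope, intercept] for deg=1
beta, alpha = np.polyfit(x, y, deg=1)

# Spread: what is left of y after removing beta units of x
spread = y - beta * x

print(f"Estimated beta: {beta:.3f}")
print(f"Spread mean: {spread.mean():.3f}, spread std: {spread.std():.3f}")
```

Because the regression includes an intercept, the spread fluctuates around `alpha` rather than around zero. Subtract `alpha` if a zero-centred spread is more convenient.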

In [7]:
import numpy as np
import pandas as pd
import yfinance as yf
from scipy.stats import norm
import matplotlib.pyplot as plt
from tkinter import *
from tkinter import ttk
from math import log10 , floor


# round to significant figures
def round_it(x, sig):
    return round(x, sig-int(floor(log10(abs(x))))-1)


# Display the result of simulations
def display_future_fx():
    ticker = stock_entry.get().upper()
    
    # Retrieve stock data for the selected stock
    stock = yf.Ticker(ticker)
    data = stock.history(period=realize_period_entry.get())
    
    # Define initial price
    initial_price = data.iloc[-1].Close
    
    # Define initial mean and volatility
    ret = data.Close.pct_change(1).dropna()
    mu = ret.mean() if est_ret_entry.get() == 'True' else 0
    volatility = ret.std()

    # Define time horizon and number of simulations
    time_horizon = forecast_period_entry.get()
    time_horizon = int(time_horizon[:-1]) # in days
    num_simulations = int(num_sim_entry.get())
    
    # Generate random normal distribution of daily returns
    daily_returns = np.random.normal(mu, volatility, (time_horizon, num_simulations))

    # Compute cumulative returns and forecasted prices for each simulation
    cumulative_returns = np.cumprod(1 + daily_returns, axis=0)
    forecast_prices = initial_price * np.r_[np.ones((1, num_simulations)), cumulative_returns]
    
    # # Compute cumulative returns and forecasted prices for each simulation (log return version)
    # cumulative_returns = np.cumsum(daily_returns, axis=0)
    # forecast_prices = initial_price * np.exp(cumulative_returns)

    # Compute mean and standard deviation of forecasted prices
    mean_price = np.mean(forecast_prices[-1])
    std_dev_price = np.std(forecast_prices[-1])

    # Compute 95% confidence interval for forecasted prices
    confidence_interval = (mean_price - 1.96 * std_dev_price, mean_price + 1.96 * std_dev_price)
    
    # Compute normal pdf of daily return
    x = np.linspace(mu - 3 * volatility, mu + 3 * volatility, 100)
    norm_pdf = norm.pdf(x, mu, volatility)

    # Plot the result
    fig, ax = plt.subplots(nrows=3, ncols=1, figsize=(12,24))
    ax[0].plot(range(1 + time_horizon), forecast_prices, '--o')
    ax[0].set_xlabel('Day')
    ax[0].set_ylabel('Price ($)')
    price_formated = round_it(initial_price, 3) if initial_price < 1 else f'{initial_price:.2f}'
    ax[0].set_title(f'Monte Carlo Simulation of Price Movement of {ticker} in the Future {time_horizon} Days (Current Close {price_formated})')
    ax[1].hist(forecast_prices[-1], bins=50, density=True)
    ax[1].set_xlabel('Price ($)')
    ax[1].set_ylabel('Probability Density')
    ax[1].set_title(f'Monte Carlo Simulation of {ticker} After {time_horizon} Days (Using Close Price on {data.index[-1].strftime("%m/%d/%Y")})')
    ax[1].axvline(mean_price, color='r', linestyle='-', label='Mean')
    ax[1].axvline(confidence_interval[0], color='g', linestyle='--', label='95% Confidence Interval')
    ax[1].axvline(confidence_interval[1], color='g', linestyle='--')
    ax[1].legend()
    ax[2].plot(x, norm_pdf)
    ax[2].set_xlabel('Daily Return')
    ax[2].set_ylabel('Probability Density')
    ax[2].set_title(f'Distribution of {ticker} Daily Return (Assuming Normal Distribution)')
    ax[2].axvline(mu, color='g', linestyle='--', label='Mean')
    ax[2].axvline(mu - 1.65 * volatility, color='orange', linestyle='--', label='95% VaR')
    ax[2].axvline(mu - 2.33 * volatility, color='red', linestyle='--', label='99% VaR')
    ax[2].legend()
    plt.tight_layout(pad=10)
    plt.show()
    
# Create GUI window
root = Tk()
root.title("Monte Carlo Simulation GUI")

# Create input fields and labels
Label(root, text="Stock Symbol (Yahoo Finance): ").grid(row=0)
Label(root, text="Volatility Reference Period (Normally: 1mo or 3mo): ").grid(row=1)
Label(root, text="Forecast Period (Format: _ _d): ").grid(row=2)
Label(root, text="No. of Simulations: ").grid(row=3)
Label(root, text="Use historical return to estimate the mean: ").grid(row=4)

# Input the ticker
stock_entry = Entry(root, width = 40)

# Input the no. of sim
num_sim_entry = Entry(root, width = 40)
num_sim_entry.insert(END, '1000')

# Combobox creation
realize_period_var = StringVar()
realize_period_entry = ttk.Combobox(root, width=37, textvariable=realize_period_var)
# Adding combobox drop down list
realize_period_entry['values'] = ('1mo', '3mo', '6mo', '1y')
realize_period_entry.insert(END, '1mo')

# Combobox creation
forecast_period_var = StringVar()
forecast_period_entry = ttk.Combobox(root, width=37, textvariable=forecast_period_var)
# Adding combobox drop down list
forecast_period_entry['values'] = ('3d', '5d', '7d', '10d', '14d', '15d', '23d', '30d')
forecast_period_entry.insert(END, '3d')

# Combobox creation
est_ret_var = StringVar()
est_ret_entry = ttk.Combobox(root, width=37, textvariable=est_ret_var)
# Adding combobox drop down list
est_ret_entry['values'] = ('True', 'False')
est_ret_entry.insert(END, 'False')

stock_entry.grid(row=0, column=1)
realize_period_entry.grid(row=1, column=1)
forecast_period_entry.grid(row=2, column=1)
num_sim_entry.grid(row=3, column=1)
est_ret_entry.grid(row=4, column=1)

# Create button to retrieve data, run simulation and display the result
display_button = Button(root, text="Run", command=display_future_fx)
display_button.grid(row=5, columnspan=2)

# Start GUI loop
root.mainloop()

ModuleNotFoundError: No module named 'tkinter'

### Monte Carlo Simulation
To predict the future price movements of a stock, we need to generate a range of possible future prices. We can achieve this by using Monte Carlo simulation. The following steps outline the methodology behind the Monte Carlo simulation:

1. Collect historical data: Gather historical prices for the stock we want to analyze and use them to calculate the average daily return and its standard deviation (the volatility).
2. Generate random numbers: Draw random daily returns from a normal distribution with that mean and standard deviation.
3. Calculate future prices: Compound the random returns from step 2, starting from the latest closing price, to obtain a price for each day in the forecast period.
4. Repeat steps 2–3: Repeat many times to generate a large number of possible price paths.
5. Analyze the results: Summarize the simulated prices, for example with their mean and a 95% interval, to determine the range of possible future prices and make an educated guess about where the stock is heading.
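
The steps above can be sketched without the GUI as follows. This is a minimal illustration rather than the notebook's exact code: `simulate_prices` is a hypothetical helper name, and the starting price, mean, and volatility are made-up inputs instead of values estimated from downloaded data.

```python
import numpy as np

def simulate_prices(initial_price, mu, volatility, horizon_days, num_simulations, seed=None):
    """Return simulated price paths of shape (horizon_days + 1, num_simulations)."""
    rng = np.random.default_rng(seed)
    # Step 2: draw normally distributed daily returns
    daily_returns = rng.normal(mu, volatility, size=(horizon_days, num_simulations))
    # Step 3: compound the returns into cumulative growth factors
    growth = np.cumprod(1.0 + daily_returns, axis=0)
    # Prepend a row of ones so every path starts at the initial price
    start = np.ones((1, num_simulations))
    return initial_price * np.vstack([start, growth])

# Steps 2-4: simulate many paths at once (illustrative inputs, not market data)
paths = simulate_prices(initial_price=100.0, mu=0.0, volatility=0.02,
                        horizon_days=5, num_simulations=10_000, seed=42)

# Step 5: summarize the distribution of final-day prices
final_prices = paths[-1]
low, high = np.percentile(final_prices, [2.5, 97.5])
print(f"Mean final price: {final_prices.mean():.2f}")
print(f"Central 95% range: {low:.2f} to {high:.2f}")
```

The range here is taken from empirical percentiles of the simulated prices. The GUI code computes mean ± 1.96 standard deviations instead, which gives a similar answer when the final prices are close to normally distributed.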

### How much historical data is enough?
The choice of the volatility reference period for Monte Carlo simulation depends on various factors, such as the frequency of data updates, the volatility pattern of the asset, and the investment horizon of the user.

A 30-day reference period is commonly used in financial markets to represent the short-term volatility of an asset. It can capture the recent market conditions and is suitable for traders or investors with a short-term investment horizon. On the other hand, a 90-day reference period can provide a broader view of the asset’s volatility and may be suitable for investors with a longer-term investment horizon.

However, there is no one-size-fits-all solution for choosing the volatility reference period. It is recommended to experiment with different reference periods and observe how they affect the Monte Carlo simulation results. Keep in mind that the simulation treats the volatility estimated over the reference period as constant across the forecast horizon, so the period should reflect the market conditions you expect to persist.
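
To see how much the reference period can matter, the sketch below estimates volatility over several trailing windows of one synthetic return series. The returns are made up (a calm stretch followed by a more volatile recent month), and the trading-day counts assigned to each period label are approximations assumed for this example.

```python
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical daily returns: 230 calm days followed by 21 volatile days
returns = np.concatenate([
    rng.normal(0.0, 0.01, 230),
    rng.normal(0.0, 0.03, 21),
])

# Approximate number of trading days per reference period (assumption)
windows = {"1mo": 21, "3mo": 63, "6mo": 126, "1y": 251}

# Sample standard deviation of the most recent `days` returns for each window
vol_by_window = {label: returns[-days:].std(ddof=1) for label, days in windows.items()}

for label, vol in vol_by_window.items():
    print(f"{label:>3}: daily volatility = {vol:.4f}")
```

In this constructed case the shortest window reports the highest volatility because it covers only the turbulent month, while longer windows dilute it with calmer history.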

Reference: https://medium.com/@yatshunlee/1-in-14-million-monte-carlo-simulation-gui-app-to-predict-the-future-price-movements-with-python-66c187c10091