<a href="https://colab.research.google.com/github/mjgpinheiro/Physics_models/blob/main/TimeSeriesForecasting_Stocks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Install the necessary libraries
!pip install pandas_ta yfinance pmdarima
!pip install prophet

import yfinance as yf
import pandas as pd
import numpy as np
import pandas_ta as ta

from prophet import Prophet
from statsmodels.tsa.arima.model import ARIMA  # the legacy statsmodels.tsa.arima_model.ARIMA is no longer usable in statsmodels >= 0.13
from pmdarima import auto_arima
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM
from sklearn.metrics import mean_squared_error, mean_absolute_error

# Data Gathering
symbol = "EURUSD=X"
start_date = "2010-01-01"
end_date = pd.Timestamp.now().strftime('%Y-%m-%d')  # End date is today (yfinance treats `end` as exclusive), so the data changes between runs
df = yf.download(symbol, start=start_date, end=end_date, interval="1d")

# Technical Indicators
df['RSI'] = ta.rsi(df['Close'])
# pandas_ta returns the MACD columns in the order: MACD line, histogram, signal line
df[['MACD', 'MACD_HIST', 'MACD_SIGNAL']] = ta.macd(df['Close'])
df.dropna(inplace=True)

# ARIMA Model for Forecasting
model_arima = auto_arima(df['Close'], seasonal=True, m=7)
forecast_arima = model_arima.predict(n_periods=7)

# Prophet Model for Forecasting
df_prophet = df.reset_index()[['Date', 'Close']].rename(columns={'Date':'ds', 'Close':'y'})
model_prophet = Prophet()
model_prophet.fit(df_prophet)
future = model_prophet.make_future_dataframe(periods=7)
forecast = model_prophet.predict(future)
forecast_tail = forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(7)

prophet_forecast = forecast['yhat'].tail(7).values

# LSTM Parameters and Model Training
LSTM_UNITS = 50
EPOCHS = 5
BATCH_SIZE = 32
SEQUENCE_LENGTH = 60

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df['Close'].values.reshape(-1, 1))
training_data_len = int(np.ceil(len(scaled_data) * .95))
train_data = scaled_data[0:int(training_data_len), :]
x_train, y_train = [], []

for i in range(SEQUENCE_LENGTH, len(train_data)):
    x_train.append(train_data[i-SEQUENCE_LENGTH:i, 0])
    y_train.append(train_data[i, 0])
x_train, y_train = np.array(x_train), np.array(y_train)
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

model = Sequential()
model.add(LSTM(units=LSTM_UNITS, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(LSTM(units=LSTM_UNITS))
model.add(Dense(units=1))
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE)

# LSTM Forecasting
x_forecast = np.array(scaled_data[len(scaled_data) - SEQUENCE_LENGTH:])
x_forecast = np.reshape(x_forecast, (1, x_forecast.shape[0], 1))
forecasted_lstm = []

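for _ in range(7):
    pred = model.predict(x_forecast)  # shape (1, 1)
    next_value = pred[0, 0]           # extract the scalar prediction
    forecasted_lstm.append(next_value)
    # Slide the window one step left and place the new prediction at the end
    x_forecast = np.roll(x_forecast, shift=-1, axis=1)
    x_forecast[0, -1, 0] = next_value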

forecasted_lstm = scaler.inverse_transform(np.array(forecasted_lstm).reshape(-1,1))

# Debugging Information
print("forecast shape:", forecast.shape)
print("forecast head:\n", forecast.head())
print("forecast index:\n", forecast.index)

# Decision Thresholds
BUY_THRESHOLD = 1.002
SELL_THRESHOLD = 0.998

# Aggregating the Forecasts
# Flatten every forecast to a 1-D NumPy array before averaging
# (depending on the pmdarima version, the ARIMA forecast may be a pandas Series)
average_forecast = (np.asarray(forecast['yhat'].tail(7)).ravel()
                    + np.asarray(forecast_arima).ravel()
                    + np.asarray(forecasted_lstm).ravel()) / 3

# Decision Making
action = []
for i in range(len(average_forecast) - 1):
    if average_forecast[i+1] > average_forecast[i] * BUY_THRESHOLD:
        action.append('Buy')
    elif average_forecast[i+1] < average_forecast[i] * SELL_THRESHOLD:
        action.append('Sell')
    else:
        action.append('Hold')

print(action)

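# Validation: score one-step-ahead LSTM predictions on the held-out final 5% of the series
# Each test window needs the SEQUENCE_LENGTH values that precede it, so start that far back
test_data = scaled_data[training_data_len - SEQUENCE_LENGTH:, :]
x_test = np.array([test_data[i-SEQUENCE_LENGTH:i, 0] for i in range(SEQUENCE_LENGTH, len(test_data))])
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))
predictions = scaler.inverse_transform(model.predict(x_test))  # back to price units
y_test = df['Close'].values[training_data_len:].reshape(-1, 1)
print(f"Mean Squared Error (LSTM): {mean_squared_error(y_test, predictions)}")
print(f"Mean Absolute Error (LSTM): {mean_absolute_error(y_test, predictions)}")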


Description:
This Python script forecasts a financial time series (here the EUR/USD exchange rate) by combining three time series forecasting methods: ARIMA, Prophet, and an LSTM neural network. After forecasting, it derives a simple trading signal (buy, sell, or hold) from the average of the three models' predictions. It also computes and displays error metrics for the LSTM model to gauge its performance.

Detailed Steps:
Library Installation and Importation:

Necessary packages for data acquisition, forecasting, and neural network modeling are installed and imported.
Data Gathering:

Daily data for the chosen ticker symbol (here EURUSD=X, the euro to US dollar exchange rate) is downloaded from Yahoo Finance, from the start date 2010-01-01 up to the current date.
Technical Indicator Calculation:

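The Relative Strength Index (RSI) and the Moving Average Convergence Divergence (MACD) indicators are computed from the closing price with the pandas_ta library and added to the dataframe. Rows where the indicators are still undefined are then dropped. None of the three forecasting models uses these columns as inputs; each model is fit on the closing price alone.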
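
To make the MACD columns concrete, here is a minimal NumPy sketch of the textbook definition: the MACD line is a fast EMA minus a slow EMA, the signal line is an EMA of the MACD line, and the histogram is their difference. The helper names `ema` and `macd` are hypothetical, the 12/26/9 spans are the conventional defaults, and the EMA is seeded with the first value, which is an assumption; pandas_ta may handle the warm-up period differently, so the numbers need not match its output exactly.

```python
import numpy as np

def ema(values, span):
    """Recursive exponential moving average, seeded with the first value (assumption)."""
    values = np.asarray(values, dtype=float)
    alpha = 2.0 / (span + 1)
    out = np.empty_like(values)
    out[0] = values[0]
    for i in range(1, len(values)):
        out[i] = alpha * values[i] + (1 - alpha) * out[i - 1]
    return out

def macd(close, fast=12, slow=26, signal=9):
    """Return (MACD line, signal line, histogram) for a 1-D price series."""
    line = ema(close, fast) - ema(close, slow)
    signal_line = ema(line, signal)
    hist = line - signal_line
    return line, signal_line, hist

# Illustrative data only: a steadily rising series of 60 prices.
close = np.linspace(1.0, 2.0, 60)
macd_line, signal_line, hist = macd(close)
```

On a steadily rising series the fast EMA stays above the slow one, so the MACD line is positive; on a flat series all three outputs are zero.
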
ARIMA Model for Forecasting:

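The auto_arima function from pmdarima searches over candidate model orders and selects one by information criterion (AIC by default). Because it is called with seasonal=True and m=7, the search includes seasonal terms with a period of 7 observations. The selected model is then used to forecast the next 7 periods.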
Prophet Model for Forecasting:

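The data is reshaped into the two-column format Prophet expects by renaming the date column to ds and the closing price to y. The model is fit on this data with default settings, make_future_dataframe(periods=7) extends the timeline by 7 rows, and the last 7 rows of the prediction are taken as the forecast. Because the default frequency is daily, those 7 rows are consecutive calendar days, weekends included.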
LSTM Neural Network Model Setup and Forecasting:

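The closing prices are scaled to the range [0, 1] with MinMaxScaler, which is fit on the full series.
The first 95% of the scaled series forms the training set. It is cut into sliding windows of 60 consecutive values (SEQUENCE_LENGTH), each paired with the value that follows it as the prediction target.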
A simple LSTM neural network is designed with two LSTM layers followed by a dense layer.
The network is trained, and forecasts for the next 7 periods are generated iteratively, one at a time, by feeding the most recent prediction back into the model as part of the input sequence.
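
The windowing and the iterative forecast described above can be sketched with NumPy alone. This is an illustration, not the notebook's code: `make_windows`, `recursive_forecast`, and `predict_next` are hypothetical names, and `predict_next` is a stand-in that averages the last three inputs where the notebook would call the trained LSTM.

```python
import numpy as np

def make_windows(series, seq_len):
    """Split a 1-D series into (inputs, targets) for next-step prediction.

    inputs[k] holds seq_len consecutive values; targets[k] is the value
    that immediately follows them.
    """
    series = np.asarray(series, dtype=float)
    x = np.array([series[i - seq_len:i] for i in range(seq_len, len(series))])
    y = series[seq_len:]
    # Add a trailing feature axis: (samples, timesteps, features=1).
    return x.reshape(x.shape[0], seq_len, 1), y

def recursive_forecast(predict_next, last_window, steps):
    """Forecast `steps` values by feeding each prediction back as input."""
    window = np.asarray(last_window, dtype=float).copy()
    out = []
    for _ in range(steps):
        nxt = float(predict_next(window))
        out.append(nxt)
        # Drop the oldest value and append the newest prediction.
        window = np.append(window[1:], nxt)
    return np.array(out)

# Hypothetical stand-in for the trained network: mean of the last 3 values.
def predict_next(window):
    return window[-3:].mean()

x, y = make_windows(np.arange(10.0), seq_len=4)
forecast_demo = recursive_forecast(predict_next, [1.0, 2.0, 3.0], steps=2)
```

Because every step after the first consumes earlier predictions rather than observed values, errors can compound over the forecast horizon.
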
Aggregating the Forecasts:

The 7-step predictions from the three models (ARIMA, Prophet, and LSTM) are flattened to one-dimensional arrays and averaged with equal weights to produce a combined forecast.
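
Flattening matters here because adding an array of shape (7,) to one of shape (7, 1) broadcasts to a (7, 7) matrix instead of raising an error. A minimal sketch of a defensive equal-weight average follows; `average_forecasts` is a hypothetical helper and the numbers are made up for illustration.

```python
import numpy as np

def average_forecasts(*forecasts):
    """Equal-weight average of same-length forecasts, returned as a 1-D array."""
    # np.asarray(...).ravel() flattens lists, column vectors, and pandas Series alike.
    arrays = [np.asarray(f, dtype=float).ravel() for f in forecasts]
    lengths = {a.shape[0] for a in arrays}
    if len(lengths) != 1:
        raise ValueError(f"forecasts differ in length: {sorted(lengths)}")
    return np.mean(arrays, axis=0)

# Illustrative inputs: a list, a column vector, and a tuple.
combined = average_forecasts([1.0, 2.0], np.array([[3.0], [4.0]]), (5.0, 6.0))
```
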
Decision Making:

Trading decisions (buy, sell, or hold) are derived by comparing each value of the averaged forecast with the next one. If the next value exceeds the current one by more than 0.2% (BUY_THRESHOLD = 1.002), the action is 'Buy'; if it falls more than 0.2% below the current one (SELL_THRESHOLD = 0.998), the action is 'Sell'. Otherwise, the action is 'Hold'. A 7-step forecast therefore yields 6 actions.
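
The rule can be written as a small pure-Python function. This is a sketch: `decide_actions` is a hypothetical name, the thresholds default to the values used in the script, and the prices are made up for illustration.

```python
def decide_actions(forecast, buy_threshold=1.002, sell_threshold=0.998):
    """Map consecutive forecast values to 'Buy', 'Sell', or 'Hold'."""
    values = list(forecast)
    actions = []
    for current, nxt in zip(values, values[1:]):
        if nxt > current * buy_threshold:
            actions.append('Buy')
        elif nxt < current * sell_threshold:
            actions.append('Sell')
        else:
            actions.append('Hold')
    return actions

# Illustrative forecast: a 0.27% rise, a near-flat step, then a 0.28% fall.
actions_demo = decide_actions([1.1000, 1.1030, 1.1031, 1.1000])
# → ['Buy', 'Hold', 'Sell']
```
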
Validation:

The LSTM model is evaluated with Mean Squared Error (MSE) and Mean Absolute Error (MAE) against the held-out final 5% of closing prices (y_test, the rows after training_data_len). The metrics require one-step-ahead predictions for those rows, converted back to price units and stored in predictions.
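
For reference, both metrics reduce to a few lines of NumPy. The sketch below uses a hypothetical helper `mse_mae` and made-up prices; the script itself relies on scikit-learn's mean_squared_error and mean_absolute_error.

```python
import numpy as np

def mse_mae(y_true, y_pred):
    """Return (mean squared error, mean absolute error) for two equal-length series."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same length")
    err = y_pred - y_true
    return float(np.mean(err ** 2)), float(np.mean(np.abs(err)))

# Illustrative values: errors of +0.01, -0.02, and 0.00.
mse, mae = mse_mae([1.10, 1.12, 1.11], [1.11, 1.10, 1.11])
```

MSE is expressed in squared price units and penalizes large misses more heavily, while MAE stays in price units and is easier to read against the 0.2% decision thresholds.
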
Usage:
This script forecasts a single price series, such as a stock or a currency exchange rate, with three models and turns the averaged forecast into buy, sell, or hold signals through fixed thresholds. Trading on any model's output carries risk, so the strategy should be validated further, for example by backtesting on historical data, before it is used in real-world trading.