# Predicting Bitcoin Prices via Mathematical and Financial Models: A Study of the Time-Series Analysis of the economic and macroeconomic factors and of the technical indicators and sentiment analysis

Boulouma A., 2023

### Problem

The problem is to develop a predictive model for the bitcoin market prices that can accurately forecast the price at time $t+1$ based on the price at time $t$.

### Research Question

What is the best predictive model to use for forecasting bitcoin market prices, and what is the predictive power of each model?



## Model 2 - VARMA-NN: A Hybrid Model for Multivariate Time Series Forecasting Using Neural Networks and Vector Autoregressive Moving Average Models

Time series forecasting is an essential problem in several fields such as finance, economics, and engineering, and has been an active research topic for several decades. Among the existing forecasting models, the VARMA-NN model is a more powerful and flexible extension of the ARIMA-NN model, as it can handle multivariate time series data and capture both linear and non-linear dependencies between variables. The VARMA-NN model combines the VARMA model with a neural network model, similar to the ARIMA-NN model. This paper discusses the VARMA-NN model proposed by Zhang, G. P. (2003) in detail and its implementation to forecast economic and macroeconomic indicators, technical indicators, and sentiment analysis for predicting future values.

The VARMA model is a generalization of the VAR model, where it includes the moving average terms in addition to the autoregressive terms. The VARMA model is specified by two parameters: the order of the autoregressive terms $p$ and the order of the moving average terms $q$. The VARMA model can be written as:

$$\sum_{i=1}^{p} \Phi_i L^i \Delta Y_t = \sum_{j=1}^{q} \Theta_j L^j \epsilon_t$$

where $\Delta Y_t = (Y_t - Y_{t-1})$ is the differenced time series, $L$ is the lag operator, $\Phi_i$ and $\Theta_j$ are the autoregressive and moving average coefficient matrices, respectively, and $\epsilon_t$ is the white noise error term at time $t$.

The VARMA-NN model combines the VARMA model with a neural network model in a similar way to the ARIMA-NN model. The VARMA-NN model can be formalized as follows:

1. Preprocessing: The time series data is preprocessed to remove any outliers or missing values. The data is then split into training and testing sets.

2. VARMA modeling: The VARMA model is fit to the training data. The VARMA model is specified by two parameters: $p$ and $q$. The VARMA model captures the linear dependencies between past and future values of the multivariate time series.

3. Neural network modeling: A neural network model is built using the training data. The neural network can be a feedforward neural network or a recurrent neural network (RNN). The neural network captures the non-linear dependencies that may exist in the data.

4. Hybrid modeling: The predictions from the VARMA model and the neural network model are combined using a weighted average. The weights are determined by the relative performance of the two models on the training data.

5. Evaluation: The performance of the VARMA-NN model is evaluated using the testing data. The evaluation metrics used can include mean squared error (MSE), root mean squared error (RMSE), and mean absolute percentage error (MAPE).

6. Interpretation: Interpret the model results and use them to inform decision-making. It is important to keep in mind that the model results are not a crystal ball and should be used in conjunction with other factors and expert judgment.

7. Updating the model: As new data becomes available, the model should be updated and refined to ensure it remains accurate and relevant.

The equations used in the VARMA-NN model can be represented as:

### VARMA model:

$$\sum_{i=1}^{p} \Phi_i L^i \Delta Y_t = \sum_{j=1}^{q} \Theta_j L^j \epsilon_t$$

where $\Delta Y_t$ is the differenced multivariate time series, $\Phi_i$ and $\Theta_j$ are the autoregressive and moving average coefficient matrices, respectively, and $\epsilon_t$ is the white noise error term at time $t$.

### Neural network model:

$$Y_t = f(WX_t + b)$$

where $Y_t$ is the predicted value at time $t$, $X_t$ is the input vector at time $t$, $W$ is the weight matrix, $b$ is the bias vector, and $f$ is the activation function.

### Hybrid model:

$$Y_t = \alpha Y_t^{VARMA} + (1 - \alpha) Y_t^{NN}$$

where $Y_t^{ARIMA}$ is the prediction from the ARIMA model, $Y_t^{NN}$ is the prediction from the neural network model, and $\alpha$ is the weight assigned to the ARIMA prediction.

The specific formulas for each indicator are as follows:

The moving average indicator ($MA$): The formula for $MA$ with a window size of $k$ can be written as:
$$Y_{MA,t} = \frac{1}{k} \sum_{i=t-k+1}^{t} X_i$$

where $X_i$ is the value of the variable at time $i$.

The relative strength index ($RSI$): The formula for $RSI$ with a window size of $k$ can be written as:
$$Y_{RSI,t} = 100 - \frac{100}{1 + RS}$$

where $RS$ is the relative strength at time $t$, which is calculated as:

$$RS = \frac{\sum_{i=t-k+1}^{t} Max(X_i - X_{i-1}, 0)}{\sum_{i=t-k+1}^{t} |X_i - X_{i-1}|}$$

The stochastic oscillator ($SO$): The formula for $SO$ with a window size of $k$ can be written as:
$$Y_{SO,t} = \frac{X_t - Min_{k}(X)}{Max_{k}(X) - Min_{k}(X)} \times 100$$

where $Min_{k}(X)$ and $Max_{k}(X)$ are the minimum and maximum values of the variable over the past $k$ periods, respectively.

The Google Trend indicator $f_{GT}(Q_t)$: The formula for $f_{GT}(Q_t)$ is:
$$Y_{GT,t} = f_{GT}(Q_t)$$

where $Q_t$ represents the search query related to Bitcoin at time $t$, and $f_{GT}$ is a function that processes the search data to generate a Google Trends score.

Overall, the VARMA-NN model offers a powerful and flexible approach to time series forecasting, particularly for multivariate data with both linear and non-linear dependencies. It combines the strengths of both the VARMA and neural network models, allowing it to capture complex relationships between variables.

However, it is important to note that the model is not a one-size-fits-all solution and must be tailored to the specific data and problem at hand. It also requires a significant amount of data and computational resources to train and optimize, so it may not be suitable for all applications.

Nonetheless, the VARMA-NN model represents a significant advancement in time series forecasting and has the potential to greatly improve our ability to predict future trends and outcomes. Further research and development in this area are likely to yield even more powerful and effective forecasting methods in the future.

### Seasonal decomposition of time series (STL) 

- Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175.

- Box, G. E. P., & Jenkins, G. M. (1976). Time series analysis: Forecasting and control. San Francisco, CA: Holden-Day.

- Hyndman, R. J., & Athanasopoulos, G. (2018). Forecasting: Principles and practice. OTexts.

- Chollet, F. (2018). Deep learning with Python. Manning Publications.

- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.

## Build a new dataset based on the model and time entry

In [563]:
import tensorflow as tf

# Define the function to generate new data
def generate_new_data(model, time_input):
    with open('models/best_ltsm_model.pkl', 'rb') as file:
        n_features = len(df.columns)
        model = pickle.load(file)
        # Convert the time input to the same format as the timestamps used in the training data
        time_input = pd.Timestamp(time_input)

        # Create a numpy array with the same shape as the input data used during training
        input_data = np.empty((1, n_steps, n_features))
        input_data[:] = np.nan

        # Replace the values in the numpy array for the timestamp corresponding to the time input with 0
        index = df.index.get_loc(time_input)
        input_data[0, -n_steps+index:, :] = 0

        # Use the loaded model to make a prediction on the numpy array
        predictions_scaled = model.predict(input_data).flatten()

        # Convert the predicted values back to their original scale
        predictions = scaler.inverse_transform(predictions_scaled.reshape(-1, n_features))[0]

        # Return the predicted values for each feature
        return predictions

In [565]:

new_data_df = pd.DataFrame()

with open('models/best_ltsm_model.pkl', 'rb') as file:
    model = pickle.load(file)
    new_data = generate_new_data(model, '2022-03-01')
    new_data_df = pd.DataFrame([new_data], columns=df.columns)
new_data_df

Keras model archive loading:
File Name                                             Modified             Size
config.json                                    2023-02-28 15:50:00         1797
metadata.json                                  2023-02-28 15:50:00           64
variables.h5                                   2023-02-28 15:50:00       160928
Keras weights file (<HDF5 file "variables.h5" (mode r)>) loading:
...layers
......dense
.........vars
............0
............1
......lstm
.........cell
............vars
...............0
...............1
...............2
.........vars
...metrics
......mean
.........vars
............0
............1
...optimizer
......vars
.........0
.........1
.........10
.........2
.........3
.........4
.........5
.........6
.........7
.........8
.........9
...vars
Keras model archive loading:
File Name                                             Modified             Size
config.json                                    2023-02-28 15:50:00         1797
metadat

In [566]:
new_data_df

Unnamed: 0,Price,hash_rate,transaction_volume,mining_difficulty,inflation_rate,bitcoin_trend
0,,,,,,


### 1.2. VARMAX

In [429]:
import pandas as pd
import numpy as np
from statsmodels.tsa.statespace.varmax import VARMAX
from sklearn.preprocessing import MinMaxScaler

# Load the data
df = btc_df = clean_and_transform_data(read_data("datasets/btc.csv"), read_data("datasets/btc_google_trend.csv"))

# Set the date column as the index
df.set_index('time', inplace=True)

# Scale the data
scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(df)

# Define the number of time steps for the input data
n_steps = 3

# Split the data into training and testing sets
train_size = int(len(scaled_data) * 0.8)
train_data = scaled_data[:train_size,:]
test_data = scaled_data[train_size-n_steps:]

# Define a function to prepare the input and output data for the model
def prepare_data(data, n_steps):
    X, y = [], []
    for i in range(n_steps, len(data)):
        X.append(data[i-n_steps:i,:])
        y.append(data[i,:])
    X, y = np.array(X), np.array(y)
    return X, y

train_X, train_y = prepare_data(train_data, n_steps)
test_X, test_y = prepare_data(test_data, n_steps)

# Train the VARMAX model
model = VARMAX(endog=train_y, exog=train_X, order=(1,1))
result = model.fit()

# Define a function to make predictions for any variable
def predict_variable(variable, time):
    # Prepare the input data
    input_data = df.loc[time-np.timedelta64(n_steps-1, 'D'):time,:]
    input_data_scaled = scaler.transform(input_data)
    input_data_reshaped = input_data_scaled.reshape(1, n_steps, train_X.shape[2])
    input_data_reshaped = input_data_reshaped[:, :, :-1]

    # Make predictions for the variable
    predictions_scaled = result.forecast(exog=input_data_reshaped, steps=n_steps)
    predictions = scaler.inverse_transform(predictions_scaled)
    
    return predictions[0][df.columns.get_loc(variable)]

# Example usage:
predicted_price = predict_variable("Price", pd.Timestamp("2022-01-01"))
print(f"Predicted BTC price on 2022-01-01: {predicted_price:.2f}")


  df = btc_df = clean_and_transform_data(read_data("datasets/btc.csv"), read_data("datasets/btc_google_trend.csv"))
  warn('Estimation of VARMA(p,q) models is not generically robust,'


ValueError: exog is not 1d or 2d