# **Predict Bitcoin Close Prices using Foundation Models**


## Introduction


This project is focused on **Time Series Forecasting** using the **Tiny Time Mixer (TTM)** foundation model provided by IBM. We will download and prepare the dataset, set up the environment for the `tsfm` model, and perform forecasting tasks.

Refer to the following links for detailed documentation and resources:

- [IBM Granite TSFM Repository](https://github.com/ibm-granite/granite-tsfm/tree/main)
- [IBM Granite Time Series Documentation](https://www.ibm.com/granite/docs/models/time-series/?utm_source=skills_network&utm_content=in_lab_content_link&utm_id=Lab-TSFM_FINAL-v1_1729695227)
- [IBM Foundation Model Time Series Forecasting Tutorial](https://developer.ibm.com/tutorials/awb-foundation-model-time-series-forecasting/#step-1-set-up-your-environment3) by Joshua Noble.


![Forecast Process](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/vpEkXD--XRkEbnkiZZOfYw/image-1%20-1-.jpg)


## <a id='objectives'></a>[Objectives](#toc)

By the end of this project, you will be able to:

1. **Install and set up the environment** for time series forecasting using IBM’s Tiny Time Mixer (TTM) model.
   
2. **Download and prepare the dataset**: Download the Bitcoin historical data and perform the necessary data preprocessing.

3. **Understand the architecture of the TTM model**: Learn how the TTM model is designed for efficient time series forecasting, leveraging its specialized architecture for temporal data.

4. **Train and evaluate the model**: Train the TTM model using the processed dataset, then use it to forecast Bitcoin prices on test data, and visualize the results.

5. **Perform model evaluation**: Analyze the performance of the TTM model by comparing actual and predicted prices using various metrics and visualizations.

6. **Customize and fine-tune**: Explore options for fine-tuning the model and experimenting with different time series datasets for extended forecasting capabilities.

7. **Leverage the IBM Granite repository**: Gain practical experience with IBM’s Granite repository for handling real-world forecasting tasks, enabling you to apply the model to other time series forecasting projects.


----


## <a id='setup'></a>[Setup](#toc)

For this project, we will be using the following libraries:

*  `pandas` for data manipulation and analysis, specifically for handling time series data.
*  `numpy` for numerical computations and array operations.
*  `matplotlib` for visualizing time series trends and the forecasting results.
*  `tsfm_public` (installed from IBM's Granite TSFM repository), which provides the **Tiny Time Mixer (TTM)** model for time series forecasting.
*  `os` for interacting with the file system and setting up the project environment.

Ensure that these libraries are installed and properly configured before proceeding with the forecasting tasks.


### <a id='installing-required-libraries'></a>[Installing required libraries](#toc)



This step can take **several minutes**. Please be patient.



In [None]:
import requests

# URL of the models ZIP file
url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/8Ptx0Quj9P2MoYqLrv5X7g/models.zip'

# Download the models file and stop early if the server returns an HTTP error
response = requests.get(url)
response.raise_for_status()

# Save the file locally
zip_filename = 'models.zip'
with open(zip_filename, 'wb') as f:
    f.write(response.content)

In [None]:
!unzip -o models.zip

### <a id='cloning-the-ibm-tiny-time-mixer-ttm-repository'></a>[Cloning the IBM Tiny Time Mixer (TTM) repository](#toc)

We need to clone the **IBM Tiny Time Mixer (TTM)** repository, which contains the pre-trained model and utilities for time series forecasting.


In [None]:
# Clone the ibm/tsfm
! [ -d "tsfm" ] && rm -rf tsfm
! git clone --depth 1 --branch v0.2.9 https://github.com/IBM/tsfm.git

Next, we install the remaining required libraries. This takes about **6 minutes**, so please be patient.


In [None]:
# Change directory. Move inside the tsfm repo.
%cd tsfm

# Install the tsfm library
! pip install ".[notebooks]" seaborn==0.13.2

%cd ../

print("All the required libraries are installed.")

In [None]:
!pip install --upgrade urllib3

### <a id='importing-required-libraries'></a>[Importing required libraries](#toc)

We will now import all the necessary libraries for this project, which include libraries for numerical computations, deep learning, and time series forecasting.


In [None]:
import os
import math
import tempfile
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import OneCycleLR
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments, set_seed
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import mean_squared_error

# TSFM libraries
from tsfm_public.toolkit.time_series_forecasting_pipeline import TimeSeriesForecastingPipeline
from tsfm_public.models.tinytimemixer import TinyTimeMixerForPrediction
from tsfm_public.toolkit.callbacks import TrackingCallback

## <a id='data-loading-and-preprocessing'></a>[Data loading and preprocessing](#toc)

We use the pandas library to load the Bitcoin historical dataset, which contains minute-level trading data for Bitcoin. The data is stored in a CSV file; the cell below reads it from `/kaggle/input/btc-usd/btcusd-data.csv`, so adjust the path if your copy is stored elsewhere (for example, as `btcusd_1-min_data.csv`).

We will now load the Bitcoin dataset and inspect the first few rows of the data using the pandas `.head()` method.


In [None]:
bitcoin_data = pd.read_csv('/kaggle/input/btc-usd/btcusd-data.csv')
bitcoin_data.head()

### <a id='dataset-description'></a>[Dataset description](#toc)

The dataset contains historical Bitcoin prices. Each row holds the trading information for one specific minute: the timestamp and several price metrics. The columns are described in the table below, and the `info()` call that follows gives a concise summary of the dataset, including the number of non-null entries, data types, and memory usage.

| Column    | Type   | Description                                                                                       |
|-----------|--------|---------------------------------------------------------------------------------------------------|
| Timestamp | Integer| Unix timestamp representing the specific time of recording. Later converted to human-readable format. |
| Open      | Float  | The price of Bitcoin at the start of the minute.                                                  |
| High      | Float  | The highest price that Bitcoin reached during the minute.                                         |
| Low       | Float  | The lowest price that Bitcoin fell to during the minute.                                          |
| Close     | Float  | The price of Bitcoin at the end of the minute, typically the target variable for forecasting.     |
| Volume    | Float  | The total volume of Bitcoin traded during that minute.                                            |


In [None]:
bitcoin_data.info()

In [None]:
# Check the number of entries in the dataset
len(bitcoin_data)

### <a id='checking-for-missing-values'></a>[Checking for missing values](#toc)

To ensure the quality of our data, we need to check for any missing values in the dataset. This step is crucial because missing data can negatively impact the performance of our time series forecasting model.

**Handling Missing Data:** Missing values are common in time series datasets. We can use the `isna()` function to detect missing entries, and `sum()` to count how many missing values exist in each column.


In [None]:
bitcoin_data.isna().sum()

We observe that there is 1 missing value in the `Timestamp` column, while all other columns are complete. We will handle this missing timestamp appropriately. But first, let's have a look at the `Timestamp` column.


In [None]:
bitcoin_data['Timestamp']

The output of accessing the `Timestamp` column shows that the last row (`6694280`) contains a missing value (`NaN`).


### <a id='handling-missing-timestamps'></a>[Handling missing timestamps](#toc)

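To handle the missing value in the `Timestamp` column, we use the `ffill()` method, which propagates the last valid observation forward. Note that for a timestamp this copies the previous row's value, so the last row ends up with the same timestamp as the row before it rather than the true next minute. Because the data is later averaged into hourly buckets, this single duplicate has a negligible effect.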


In [None]:
bitcoin_data['Timestamp'] = bitcoin_data['Timestamp'].ffill()
bitcoin_data['Timestamp']
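
As a small illustration of what forward-filling does to a timestamp column, the sketch below uses a made-up three-row series (the values are illustrative and not taken from the Bitcoin dataset). The missing last entry receives a copy of the previous value, which removes the `NaN` but creates one repeated timestamp.

```python
import numpy as np
import pandas as pd

# Toy Unix timestamps one minute apart; the last entry is missing (illustrative values only)
toy_timestamps = pd.Series([60.0, 120.0, np.nan])

# Forward-fill copies the last valid value into the gap
filled = toy_timestamps.ffill()

print(filled.tolist())             # → [60.0, 120.0, 120.0]
print(filled.duplicated().sum())   # → 1 (one repeated timestamp created by the fill)
```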

Just to be sure, we can check if there are any remaining `NaN` values in the dataset.


In [None]:
bitcoin_data.isna().sum()

### <a id='converting-timestamp-to-datetime-format'></a>[Converting 'Timestamp' to DateTime format](#toc)

The `Timestamp` column is currently in UNIX time format. To make it easier to work with and understand, we will convert the `Timestamp` column into a human-readable datetime format using the `pd.to_datetime` function.


In [None]:
bitcoin_data['Timestamp'] = pd.to_datetime(bitcoin_data['Timestamp'], unit='s')
bitcoin_data
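
To make the `unit='s'` argument concrete, here is a minimal sketch on a few hand-picked Unix timestamps (illustrative values, not rows from the dataset). Each number is interpreted as seconds elapsed since 1970-01-01 00:00:00 UTC.

```python
import pandas as pd

# Seconds since the Unix epoch: zero, one hour later, one day later
unix_seconds = pd.Series([0, 3600, 86400])

# unit="s" tells pandas the integers are seconds (not milliseconds or nanoseconds)
converted = pd.to_datetime(unix_seconds, unit="s")

print(converted.tolist())
```

If the source data stored milliseconds instead, the same call would need `unit="ms"`; passing the wrong unit silently produces dates that are far off.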

### <a id='resampling-the-dataset-to-reduce-size'></a>[Resampling the dataset to reduce size](#toc)

Since the original dataset contains minute-level data, it can be too large and time-consuming to train on. To address this issue, we will **resample** the dataset to **hourly data** by averaging the values over each hour. This will reduce the size of the dataset and speed up the training process, while still retaining the essential time-series patterns.

The `resample()` function in pandas allows us to aggregate data over specific time intervals. In this case, we are resampling the data to hourly intervals (`'1h'`) using the `Timestamp` column as the time reference. The `mean()` function is applied to compute the average value for each hour.

- **Input**: The `Timestamp` column to use for resampling, along with the `'1h'` interval.
- **Output**: A pandas DataFrame containing hourly-aggregated Bitcoin data.

Additionally, we use the following methods:
- **dropna()**: To remove any rows containing missing values after resampling.
- **reset_index()**: To reset the index after resampling and ensure that the DataFrame structure is intact.

This resampling step significantly reduces the dataset size, making it more efficient for training the time series forecasting model without losing key temporal patterns.


In [None]:
bt_data_resampled = bitcoin_data.resample('1h', on='Timestamp').mean().dropna().reset_index()
bt_data_resampled
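
The effect of this chain is easier to see on a tiny synthetic frame. The sketch below builds two hours of made-up minute-level data (the `toy` frame and its values are assumptions for illustration) and applies the same `resample`, `mean`, `dropna`, `reset_index` sequence.

```python
import numpy as np
import pandas as pd

# Two hours of synthetic minute-level data: Close simply counts 0, 1, ..., 119
toy = pd.DataFrame({
    "Timestamp": pd.date_range("2024-01-01 00:00", periods=120, freq="min"),
    "Close": np.arange(120, dtype=float),
})

# Hourly buckets, mean per bucket, drop empty buckets, restore Timestamp as a column
toy_hourly = toy.resample("1h", on="Timestamp").mean().dropna().reset_index()

print(toy_hourly)
```

The 120 input rows collapse into 2 hourly rows, with `Close` equal to the mean of each hour (29.5 and 89.5).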

### <a id='checking-for-missing-values-and-dataset-size-after-resampling'></a>[Checking for missing values and dataset size after resampling](#toc)

After resampling the dataset to hourly intervals, we perform a check to ensure there are no missing values in the resampled data.


In [None]:
print(f"Checking for NA values after resampling:\n{bt_data_resampled.isna().sum()}\n")
print(f"Number of entries after resampling: {len(bt_data_resampled)}")

As seen from the output, there are no missing values in the resampled dataset.

Dataset size comparison:

| Dataset Level       | Number of Entries | Description                                 |
|---------------------|-------------------|---------------------------------------------|
| Before Resampling   | 6,694,281         | Original dataset at minute-level granularity. |
| After Resampling    | 111,586           | Resampled dataset at hourly-level granularity. |


## <a id='data-preparation'></a>[Data preparation](#toc)


### <a id='preparing-data-for-zero-shot-prediction-and-fine-tuning'></a>[Preparing data for zero-shot prediction and fine-tuning](#toc)

Next, we prepare our data to perform **zero-shot prediction** and later **fine-tuning**. In this context, we are defining the key columns from the dataset that will be used for the model's prediction task.

- **Timestamp column**: We use the `"Timestamp"` column as the time reference for our time series data.

- **Target column**: The `["Close"]` column will be our **target variable**. This represents the Bitcoin close price, which is the value we want to predict using our model.

- **Observable columns**: The columns `["Open", "High", "Low"]` are used as **observable variables**. The `Volume` column is left out of the observable columns in this project, on the assumption that it is less directly informative about the close price than the price-related variables.

#### What is zero-shot learning?

**Zero-shot learning** refers to the ability of a model to make predictions on tasks it has not been specifically trained for. In this context, the **Tiny Time Mixer** model can predict the **Bitcoin close price** without having seen any previous examples from this specific dataset. 

By defining the timestamp, target, and observable columns, we are setting up the data for the next phase, where we will apply zero-shot prediction and fine-tuning with the **Tiny Time Mixer** model.


In [None]:
timestamp_column = "Timestamp"
target_columns = ["Close"]
observable_columns = ["Open","High","Low"]

In [None]:
# Set seed for reproducibility
SEED = 42
set_seed(SEED)

# Forecasting parameters
context_length = 512 # TTM can use 512 time points into the past
forecast_length = 96 # TTM can predict 96 time points into the future

### <a id='splitting-the-dataset-for-training-validation-and-testing'></a>[Splitting the dataset for training, validation, and testing](#toc)

We split our resampled dataset into **training**, **validation**, and **testing** sets. This is essential for evaluating the model's performance at each stage of training.

Explanation of the split:

- **Training Set**: The first 80% of the data is used to train the model.
- **Validation Set**: The next 10% of the data (from 80% to 90%) is used for validation. The start index is shifted back by `context_length` so that the first validation forecast has a full context window of historical data.
- **Test Set**: The final 10% of the data is reserved for testing. As with the validation set, the start index is shifted back by `context_length`.

By defining this split configuration, we ensure that the model is trained and evaluated using separate time periods, which allows for reliable performance assessment.


In [None]:
from tsfm_public import (
    TimeSeriesPreprocessor,
    TinyTimeMixerForPrediction,
    get_datasets,
)

# Get the length of the resampled data
data_length = len(bt_data_resampled)

# Define the indices for the train, validation, and test splits
train_start_index = 0
train_end_index = round(data_length * 0.8)  # First 80% for training

# Shift the start of evaluation back by the context length for proper prediction
eval_start_index = round(data_length * 0.8) - context_length  # Next 10% for validation
eval_end_index = round(data_length * 0.9)

# Same adjustment for the test set
test_start_index = round(data_length * 0.9) - context_length  # Final 10% for testing
test_end_index = data_length

# Store the split configuration for easy access
split_config = {
    "train": [train_start_index, train_end_index],
    "valid": [eval_start_index, eval_end_index],
    "test": [test_start_index, test_end_index],
}

In [None]:
print(f"train_start_index: {train_start_index}, train_end_index: {train_end_index}")
print(f"eval_start_index: {eval_start_index}, eval_end_index: {eval_end_index}")
print(f"test_start_index: {test_start_index}, test_end_index: {test_end_index}")
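
As a sanity check on the index arithmetic, the sketch below wraps the same calculation in a small helper and applies it to the row count reported after resampling (111,586). The helper name `make_split_config` and its parameters are hypothetical and are not part of the `tsfm_public` library.

```python
def make_split_config(data_length, context_length, train_end_frac=0.8, valid_end_frac=0.9):
    """Hypothetical helper mirroring the split arithmetic used in this notebook."""
    train_end = round(data_length * train_end_frac)
    valid_end = round(data_length * valid_end_frac)
    return {
        "train": [0, train_end],
        # Start early by context_length so the first forecast has a full context window
        "valid": [train_end - context_length, valid_end],
        "test": [valid_end - context_length, data_length],
    }


config = make_split_config(data_length=111_586, context_length=512)
print(config)
```

Note that the validation and test ranges each overlap the preceding split by `context_length` rows. Those rows serve only as input context, not as forecast targets.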

### <a id='specifying-columns-and-configuring-the-timeseriespreprocessor'></a>[Specifying columns and configuring the TimeSeriesPreprocessor](#toc)

We now define the key columns for the time series forecasting task and set up the `TimeSeriesPreprocessor` to standardize the data, which is essential for accurate predictions.

**Column specifiers**:

- **timestamp_column**: The `Timestamp` column, which marks the time for each data entry.
- **target_columns**: The `Close` column, representing the value we aim to predict.
- **observable_columns**: The `Open`, `High`, and `Low` columns, providing additional context for the model.

#### What is TimeSeriesPreprocessor?

The `TimeSeriesPreprocessor` is a utility that prepares the time series data for the model by scaling it and setting context and prediction lengths. Specifically, it:

- **Scales the data** using a standard scaler to normalize values.
- **Sets context length** to 512 (past time points) and **prediction length** to 96 (future time points).
- **Skips categorical encoding** (`encode_categorical=False`), since all of our columns are numerical.

After configuring the `TimeSeriesPreprocessor`, we use it to generate training, validation, and test datasets, ready for model training and evaluation.


In [None]:
# Define the column specifiers for the time series data
column_specifiers = {
    "timestamp_column": timestamp_column,  # Time reference column
    "target_columns": target_columns,  # Target variable (Close price)
    "observable_columns": observable_columns  # Observable variables (Open, High, Low prices)
}

# Configure the TimeSeriesPreprocessor
tsp = TimeSeriesPreprocessor(
    **column_specifiers,
    context_length=context_length,  # Model will use 512 past time points
    prediction_length=forecast_length,  # Model will predict 96 future time points
    scaling=True,  # Apply scaling to normalize the data
    encode_categorical=False,  # No categorical encoding needed
    scaler_type="standard"  # Use standard scaling
)

# Generate training, validation, and test datasets
# get_datasets returns PyTorch datasets for training, validation, and testing
train_dataset, valid_dataset, test_dataset = get_datasets(
    tsp, bt_data_resampled, split_config
)
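
Standard scaling maps a column to zero mean and unit variance. The sketch below shows the idea with plain NumPy on made-up prices. It is not the `TimeSeriesPreprocessor` implementation, and fitting the statistics on the training slice only is an assumption made here to illustrate how leakage from later data can be avoided.

```python
import numpy as np

# Made-up close prices; the first six play the role of the training split
prices = np.array([100.0, 102.0, 101.0, 105.0, 107.0, 103.0, 110.0, 108.0])
train = prices[:6]

# Fit the scaling statistics on the training slice only
mean, std = train.mean(), train.std()

# Apply the same transform to every row, then undo it to recover the original scale
scaled = (prices - mean) / std
restored = scaled * std + mean

print(scaled[:6].mean(), scaled[:6].std())
```

Because forecasts are produced on the scaled values, any error metric computed on them is in scaled units unless the transform is inverted first.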

## <a id='loading-and-evaluation-of-zero-shot-model'></a>[Loading and evaluation of zero-shot model](#toc)

In this step, we load the pretrained **Tiny Time Mixer** model for zero-shot prediction. A **zero-shot model** allows us to make predictions without any additional fine-tuning, leveraging the knowledge it has already acquired from pretraining on similar tasks.

**Pretrained model:**

- The `TinyTimeMixerForPrediction` class provides a pre-built architecture for time series forecasting, and we use the `from_pretrained` method to load the model. 
- Here, to save time, we load a saved copy of the model from the local `models/zero_shot_model` directory (extracted from `models.zip` earlier) instead of downloading it.


In [None]:
# The commented-out line below loads the Tiny Time Mixer model for zero-shot predictions
# from IBM's pre-trained repository ("ibm/TTM") with a revision of "main".
# It also uses a prediction filter length of 24, meaning only 24 predicted values are shown per forecast window.

# zeroshot_model = TinyTimeMixerForPrediction.from_pretrained("ibm/TTM", revision="main", prediction_filter_length=24)

zeroshot_model = TinyTimeMixerForPrediction.from_pretrained("models/zero_shot_model")

### <a id='evaluating-the-zero-shot-model'></a>[Evaluating the zero-shot model](#toc)

Once the pretrained zero-shot model is loaded, we use the `Trainer` class from the Hugging Face `transformers` library to evaluate the model's performance on the test dataset. `Trainer` simplifies model training and evaluation by handling tasks such as forward passes, loss calculation, and metric evaluation.

Trainer configuration:

- **Model**: We pass the pretrained `zeroshot_model` to the `Trainer` class, which provides a high-level API for training and evaluating models.
- **Test dataset**: The evaluation is performed on the test dataset that we generated earlier using the `get_datasets` method.

Evaluation process:

The `evaluate()` method is used to assess the model's performance on the test set. Since this is a zero-shot model, the evaluation will give us insights into how well the model performs out-of-the-box, without additional training or customization.

This step helps us understand the initial forecasting capability of the model on our dataset.

Evaluation will take **~2 minutes** to generate forecasts.


In [None]:
# Wrap the zero-shot model in a Trainer so that we can call evaluate() on it
zeroshot_trainer = Trainer(
    model=zeroshot_model,
)

zeroshot_trainer.evaluate(test_dataset)

### <a id='setting-up-the-time-series-forecasting-pipeline'></a>[Setting up the time series forecasting pipeline](#toc)

To make predictions using the zero-shot model, we configure a **Time Series Forecasting Pipeline**. This pipeline streamlines the process of generating forecasts by integrating the model and specifying key configurations.


#### Purpose:

This pipeline simplifies the forecasting workflow, enabling us to efficiently generate predictions from our zero-shot model on the time series data. By configuring this pipeline, we prepare the model for making future predictions based on the input time series data.


In [None]:
# Set up the time series forecasting pipeline using the zero-shot model
zs_forecast_pipeline = TimeSeriesForecastingPipeline(
    # Use the zero-shot Tiny Time Mixer model
    model=zeroshot_model,
    # Specify the device (CPU in this case)
    device="cpu",
    # Time reference column
    timestamp_column=timestamp_column,
    # No additional identifier columns
    id_columns=[],
    # Target variable (Close price)
    target_columns=target_columns,
    # Frequency of time series data (hourly)
    freq="1h"                     
)

## <a id='making-predictions-with-the-forecasting-pipeline'></a>[Making predictions with the forecasting pipeline](#toc)

In this step, we generate predictions using the **Time Series Forecasting Pipeline**. However, instead of recalculating the forecast, we will load a pre-generated forecast from a pickle file for efficiency.

#### Explanation of the process:

- **Commented code**: 
    - The commented code is used to generate predictions by passing the preprocessed test dataset to the forecasting pipeline.
    - This approach would allow the zero-shot model to make predictions directly on the test data. Do not run it here, as it will take a lot of time.

- **Loading pre-generated forecast**: 
    - Instead of running the pipeline again, we will load a pre-generated forecast from the `zs_forecast.pkl` file using `pd.read_pickle()`.

By loading the pre-generated forecast, we can immediately inspect the predictions made by the model.


In [None]:
# Run the forecasting pipeline on the preprocessed test dataset
# This generates predictions for the 'Close' prices and stores the result in zs_forecast
# zs_forecast contains both predicted ('Close_prediction') and actual ('Close') values

# zs_forecast = zs_forecast_pipeline(tsp.preprocess(bt_data_resampled[test_start_index:test_end_index]))

### <a id='understanding-the-forecast-process-and-missing-values'></a>[Understanding the forecast process and missing values](#toc)

#### How the forecast works

The **Tiny Time Mixer (TTM)** model is used to forecast Bitcoin's `Close` prices. The model takes historical data to predict the next 24 time steps (hours), and stores these predictions in the `Close_prediction` column. Meanwhile, actual observed values for 96 time steps are stored in the `Close` column.

#### Step-by-step forecast process:

1. **Input data (t1 to t512)**: 
   - The model looks at the last **512 time points** (from `t1` to `t512`) as context, including values like `Open`, `High`, `Low`, and `Close`.

2. **Model prediction (t513 to t608)**: 
   - Based on the past 512 data points, the model predicts **96 future points** (from `pt513` to `pt608`). These are the forecasted `Close` prices, which will be stored in the `Close_prediction` column.

3. **Filtered prediction (t513 to t536)**: 
   - The model only shows **24 out of the 96 predicted points** (`pt513` to `pt536`) in the `Close_prediction` column because `prediction_filter_length=24` was set when the model was loaded. The rest of the 96 points are not shown.

4. **Compare predicted t513 to t536 with actual t513 to t536**:
   - The **predicted values** (from `pt513` to `pt536`) are compared with the **actual values** (from `t513` to `t536`, that is, the first 24 of the 96 stored time steps). This allows for evaluation of the model’s accuracy for those 24 steps.

#### Window shifts forward for the next prediction:

After the first prediction, the window shifts by one timestamp:

- **New input (t2 to t513)**: The model now looks at data from `t2` to `t513` and predicts the future steps.
- **New prediction (t514 to t609)**: It predicts **96 future points** (from `pt514` to `pt609`).
- **Filtered prediction (t514 to t537)**: Only the first **24 predicted points** (`pt514` to `pt537`) are shown in `Close_prediction` and are compared with actual values.

#### Process repeats:

This sliding window repeats:

- The model shifts one time point forward each time.
- For each window, it predicts **96 values** and shows only **24 predictions**, comparing them against the actual values for the same time range.

#### When does forecasting stop?

- The forecasting process continues **until the end of the dataset**. Once the model reaches the end of the available historical data, it can no longer create a 512-point context window to predict further time steps.
- For example, if the dataset has 10,000 time steps, the model will make predictions up to **t10,096** (with the last window using the 512 points from `t9,489` to `t10,000` as context).
- After the model uses the final 512 data points available for input, it will no longer be able to predict the future since there’s no more data to feed into the model for additional context. At this point, the forecasting process stops.
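
The sliding-window bookkeeping described above can be sketched in a few lines. The helper `forecast_windows` is hypothetical and mirrors the description rather than the pipeline's internals. It uses 0-based, half-open index ranges (like Python slices), so the numbers are shifted by one relative to the `t1`, `t2`, ... notation.

```python
def forecast_windows(n_points, context_length, forecast_length):
    """Yield (context, forecast) index ranges for a window that slides one step at a time.

    Ranges are 0-based and half-open, like Python slices. Hypothetical helper
    written for illustration; not part of the tsfm library.
    """
    for start in range(n_points - context_length + 1):
        context = (start, start + context_length)
        forecast = (context[1], context[1] + forecast_length)
        yield context, forecast


windows = list(forecast_windows(n_points=10_000, context_length=512, forecast_length=96))
first_context, first_forecast = windows[0]
last_context, last_forecast = windows[-1]

print(len(windows))                   # number of forecast origins
print(first_context, first_forecast)  # first window
print(last_context, last_forecast)    # last window; its forecast lies past the data
```

Because the ranges are half-open, `(512, 608)` covers exactly 96 rows. The last window's forecast range starts at index 10,000, which is already beyond the final data row, so no actual values exist for it.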


![Forecast Process](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/SAv1DljdRu3JMRbLZV1TSg/Forecast%20Process.png)


In [None]:
zs_forecast = pd.read_pickle("models/zs_forecast.pkl")
zs_forecast

#### Why do `NaN` values occur toward the end of the forecast?

The `NaN` values in the `Close` column (actual values) occur because the actual data for those future time steps lies beyond the end of the dataset. Here’s how it works:

- **Older rows (further back in the past)**: For rows that are further back in time, actual `Close` prices are available for all 96 hours after the timestamp, so these rows contain no `NaN` values.
- **Recent rows (closer to the present)**: As you approach the end of the dataset, fewer future `Close` prices are available, leading to more `NaN` values. The most recent rows may have all 96 values as `NaN`, simply because those future time steps haven’t occurred yet, and thus, no actual data is available.

For example, consider a row with a timestamp of `2024-09-24 00:00:00`. If only a few future hours have passed, most of the 96 actual `Close` values will be `NaN` because we haven’t yet observed those future hours.

#### Is this expected?

Yes, this is completely expected. In time series forecasting, the model predicts future values, but actual data for those time steps is only available as time progresses. The increasing number of `NaN` values in the `Close` column is a natural result of forecasting into the future, where the actual data isn’t available yet.

#### Key takeaway

The model predicts the next 24 hours of `Close` prices based on historical data, while actual values for up to 96 hours are stored for comparison. If the dataset is extended as those future hours occur, the `NaN` values can be replaced with real data, allowing a fuller comparison of predicted vs. actual values.
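
The pattern of trailing `NaN` values follows from simple arithmetic. The sketch below assumes a hypothetical series of 1,000 hourly rows and a 96-step horizon, and counts how many of the 96 actual values would fall past the end of the data for a forecast made at a given row. The helper `missing_actuals` is an illustration and is not part of the `tsfm` library.

```python
def missing_actuals(origin_index, n_points, horizon):
    """Count horizon steps after `origin_index` (0-based) that lie beyond the data.

    Hypothetical helper for illustration only.
    """
    available = max(0, min(horizon, n_points - 1 - origin_index))
    return horizon - available


n_points, horizon = 1_000, 96

# An early row, the last row with a complete horizon, a late row, and the final row
for origin in (500, 903, 950, 999):
    print(origin, missing_actuals(origin, n_points, horizon))
```

Rows up to index 903 have all 96 actual values available, and from there the number of missing values grows by one per row until the final row, where all 96 are missing.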


## <a id='comparing-predicted-vs-actual-values'></a>[Comparing predicted vs actual values](#toc)

When we look at the forecasted results, it's important to note that we have a window of 24 timestamps for predictions in the forecast object. This is because the model was configured to predict 24 time steps into the future for each prediction.

To compare the model's predictions to the actual values, we can plot both the predicted and actual **Close** prices for a given time window. In this example, we use **Row 11** from the forecast data to visualize the predicted versus actual values.


### <a id='plotting-the-results'></a>[Plotting the results](#toc)
- By plotting the **Predicted Values** and **Actual Values**, we can visually assess the accuracy of the model's predictions over the 24-timestep forecast window.


In [None]:
# Create a DataFrame to compare the predicted and actual Close prices for Row 11
fcast_df = pd.DataFrame({
    "pred": zs_forecast.loc[11]['Close_prediction'],  # Predicted values for the next 24 time steps
    "actual": zs_forecast.loc[11]['Close'][:24]       # Actual values for the same time period
})

# Plot the predicted vs actual Close prices
ax = fcast_df.plot()

# Set labels and title for the plot
ax.set_xlabel("Time Steps")  # Label for the X-axis (time steps)
ax.set_ylabel("Close Price")  # Label for the Y-axis (Close price in USD)
ax.set_title("Predicted vs Actual Close Price for Row 11")  # Title of the plot

#### Row 11 analysis (above image)

The above image shows the predicted vs actual `Close` prices. In this case, the predictions for the next 24 time steps (hours) are compared to the actual values for the same period.

- **Observation**: The actual values (orange line) fluctuate more significantly than the predicted values (blue line), which remain relatively steady. The model's predictions are higher and do not capture the sharp drops seen in the actual prices.
  
- **Explanation**: This discrepancy can be attributed to the fact that the model is trained to predict general trends and may not capture short-term fluctuations precisely. For **Row 11**, it appears the model missed the volatility of the actual data, resulting in a notable difference between the predicted and actual values.


### <a id='evaluating-the-forecast-results-over-a-specific-time-horizon'></a>[Evaluating the forecast results over a specific time horizon](#toc)

To evaluate the model's performance over a specific time horizon, we can compare the predicted values to the actual values at a specified number of hours into the future. This custom function, `compare_forecast`, helps automate this comparison by focusing on a single prediction point, such as the value forecasted for the next day.

#### Explanation of the function:

- **Inputs**:
  
  - `forecast`: The forecast DataFrame containing the predictions and actual values.
  - `date_col`: The name of the column containing the date or timestamp information.
  - `prediction_col`: The column containing the model's forecasted values.
  - `actual_col`: The column containing the actual observed values.
  - `hours_out`: The number of hours into the future to compare (e.g., 24 hours for next-day prediction).
  
- **Process**:
  
  - The function initializes two lists (`actual` and `pred`) to store the actual and predicted values for the specified horizon (e.g., 24 hours).
  - It loops through each row of the forecast data and extracts the value corresponding to `hours_out` from both the `prediction_col` and the `actual_col`.
  - The result is a new DataFrame (`comparisons`) containing the timestamps, actual values, and predicted values for the specified horizon.
  
#### Output:

The function returns a DataFrame containing the timestamp, actual values, and predicted values for comparison at the specified future timestamp. This allows us to directly compare the model's forecast against the true observed value for a given prediction horizon.
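
To show the data layout this kind of comparison relies on, the sketch below builds a made-up forecast frame in which each cell of `Close_prediction` and `Close` holds a list of values, one per forecast step. The frame and its numbers are assumptions for illustration; only the column names match the forecast used in this notebook. The value for a horizon of `hours_out` sits at list index `hours_out - 1`.

```python
import pandas as pd

nan = float("nan")

# Toy forecast: three rows, each holding a 3-step prediction and the matching actuals
toy_forecast = pd.DataFrame({
    "Timestamp": pd.date_range("2024-01-01", periods=3, freq="h"),
    "Close_prediction": [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2], [3.0, 3.1, 3.2]],
    "Close": [[1.5, 1.6, 1.7], [2.5, 2.6, nan], [3.5, nan, nan]],
})

hours_out = 2  # compare the value forecast two steps ahead

# Horizon h lives at position h - 1 in each per-row list
pred_at_h = [row[hours_out - 1] for row in toy_forecast["Close_prediction"]]
actual_at_h = [row[hours_out - 1] for row in toy_forecast["Close"]]

comparison = pd.DataFrame({
    "Timestamp": toy_forecast["Timestamp"],
    "actual": actual_at_h,
    "pred": pred_at_h,
}).dropna(subset=["actual", "pred"])

print(comparison)
```

The last toy row has no actual value two steps ahead, so it is dropped and only two rows remain for comparison.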


In [None]:
def compare_forecast(forecast, date_col, prediction_col, actual_col, hours_out):
    comparisons = pd.DataFrame()  # Initialize a new DataFrame to store comparisons
    comparisons[date_col] = forecast[date_col]  # Store the date or timestamp column

    # Initialize lists to store actual and predicted values
    actual = []
    pred = []

    # Loop through the forecast data
    for i in range(len(forecast)):
        # Append the prediction for the specified hour (e.g., 24 hours into the future)
        pred.append(forecast[prediction_col].values[i][hours_out - 1])  
        # Append the actual value for the same hour
        actual.append(forecast[actual_col].values[i][hours_out - 1])

    # Add the actual and predicted values to the comparisons DataFrame
    comparisons['actual'] = actual
    comparisons['pred'] = pred

    return comparisons  # Return the comparisons DataFrame

The function above aligns the forecasted values with the true observed values at a chosen horizon, so we can calculate the **Root Mean Squared Error (RMSE)** to assess the model's performance.
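
As a quick illustration of the indexing, the sketch below applies the same horizon-selection logic to a hypothetical two-row toy frame. The helper name `select_horizon` and the toy values are assumptions made for illustration; it is meant as a compact equivalent of the loop in `compare_forecast`, not a replacement for it.

```python
import pandas as pd


def select_horizon(forecast, date_col, prediction_col, actual_col, hours_out):
    # Each cell of prediction_col / actual_col is assumed to hold a sequence
    # with one value per forecast step; step `hours_out` sits at index hours_out - 1.
    return pd.DataFrame({
        date_col: forecast[date_col].values,
        "actual": [row[hours_out - 1] for row in forecast[actual_col]],
        "pred": [row[hours_out - 1] for row in forecast[prediction_col]],
    })


# Hypothetical toy forecast with a 3-step horizon per row
toy = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2024-01-01 00:00", "2024-01-01 01:00"]),
    "Close_prediction": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    "Close": [[1.5, 2.5, 3.5], [4.5, 5.5, 6.5]],
})

# Select the value two steps ahead from every row
two_steps_out = select_horizon(toy, "Timestamp", "Close_prediction", "Close", 2)
print(two_steps_out)
```

With `hours_out=2`, each row contributes the second element of its prediction and actual sequences, which is the same `hours_out - 1` indexing used in the loop.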

#### Process:

1. **Extract predictions and actual values**: We compare the predictions and actual values for 12 hours out by using the `compare_forecast` function.
2. **Handle missing data**: Any rows where either `pred` or `actual` values contain `NaN` are dropped to ensure that the RMSE calculation only considers valid data points.
3. **Calculate RMSE**: The **Root Mean Squared Error** is computed to evaluate how close the predicted values are to the actual values. RMSE is chosen because it is expressed in the same units as the target series and, unlike percentage-based metrics, remains well defined when the data contains small or negative values.
4. **Plot Results**: The predicted and actual values are plotted against time, allowing us to visually compare the performance of the model.

#### Why RMSE?

Other metrics such as **Mean Absolute Error (MAE)** or **Mean Absolute Percentage Error (MAPE)** can also be used. However, MAPE divides each error by the actual value, so it becomes unstable when the dataset contains very small or negative values. RMSE avoids that division, and because it squares the errors, it penalizes larger errors more heavily than MAE does.
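
The difference between these metrics can be seen on a handful of made-up numbers. The sketch below is a minimal illustration using only the standard library; the `actual` and `pred` values are hypothetical and chosen so that two of the actual values lie close to zero.

```python
import math

# Hypothetical values; two actuals are close to zero (one is negative)
actual = [0.01, -0.02, 0.50, 1.00]
pred = [0.03, 0.01, 0.45, 1.20]

errors = [p - a for a, p in zip(actual, pred)]

# Mean Absolute Error: average magnitude of the errors
mae = sum(abs(e) for e in errors) / len(errors)

# Root Mean Squared Error: squaring weights the largest error most heavily
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

# Mean Absolute Percentage Error: divides by the actual value,
# so near-zero actuals inflate the result
mape = sum(abs(e / a) for a, e in zip(actual, errors)) / len(errors) * 100

print(f"MAE:  {mae:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"MAPE: {mape:.1f}%")
```

Here the two near-zero actual values dominate MAPE even though their absolute errors are the smallest, while RMSE is driven mostly by the single largest error.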

The plot gives us a visual representation of how well the model's 12-hours-ahead predictions track the actual Bitcoin **Close** prices.


In [None]:
# Get the predictions and actual values for 12 hours out
twelve_hours_out_predictions = compare_forecast(zs_forecast, "Timestamp", "Close_prediction", "Close", 12)

# Drop rows where either 'pred' or 'actual' contains NaN values to ensure valid data
out = twelve_hours_out_predictions.dropna(subset=["actual", "pred"])

# Calculate Root Mean Squared Error (RMSE) as the square root of the MSE
# (the `squared=False` argument is deprecated in recent scikit-learn releases)
rms = '{:.10f}'.format(mean_squared_error(out['actual'], out['pred']) ** 0.5)

# Print the RMSE result
print(f"Root Mean Squared Error (RMSE): {rms}")

# Plot the predicted vs actual Close prices over time
out.plot(x="Timestamp", y=["pred", "actual"], figsize=(20, 5), title=str(rms))

#### Overall trend analysis (above image)

The above image plots the **predicted vs actual `Close` prices over time**, across the entire dataset. The **Root Mean Squared Error (RMSE)** for this forecast is `0.0663050094`.

- **Observation**: Although there are differences between the predicted and actual values at each time step, the model does a good job of capturing the **overall trend** of the data. The predicted values (blue line) closely follow the actual values (orange line), with both showing similar directional changes over time.

- **Explanation**: Despite missing some short-term fluctuations (like in Row 11), the model captures the broader **upward and downward trends** over longer periods. This indicates that while the model may not be perfect for short-term predictions, it still has the ability to capture the **general direction** of the price movements over time.

#### Key takeaway

- The model may struggle with predicting exact short-term price movements, as seen in **Row 11**, where the predicted values are quite different from the actual ones. 
- However, when evaluated over the entire dataset, the model successfully follows the general trend, even if there are some discrepancies at individual points. This shows that while the model may not always capture short-term volatility, it performs well in forecasting the overall movement of Bitcoin's `Close` prices.
  
Overall, this is expected behavior in time series forecasting, where capturing the broader trend is often more important than predicting exact values for each time step.


## <a id='fine-tuning-the-ttm-model'></a>[Fine tuning the TTM model](#toc)

Now let's fine-tune the model weights using our training data and evaluate the model on the test set. To do this, we need to configure the **Training Arguments**.


### <a id='setting-up-parameters-for-fine-tuning'></a>[Setting up parameters for fine tuning](#toc)

#### Key parameters:

- **Learning Rate**: The learning rate controls how much the model weights are updated during each iteration. In this case, we set it to `0.0001`.
- **Number of Epochs**: The model will be trained for `10` epochs.
- **Batch Size**: Both the training and evaluation batch size are set to `32`, determining how many samples are processed at a time.
- **Best-Model Selection**: The best checkpoint, chosen by the lowest evaluation loss (`eval_loss`), is loaded at the end of training, ensuring we use the best-performing model for predictions. Early stopping itself is configured separately through a callback.

#### Additional configuration:

- **Logging and Saving**: Metrics are logged and a model checkpoint is saved at the end of each epoch. Setting `save_total_limit=1` prunes older checkpoints to save space; because `load_best_model_at_end=True`, the best checkpoint (based on `eval_loss`) is retained.
- **Parallelism**: The `dataloader_num_workers=8` parameter is set to improve the efficiency of data loading during training.

These training arguments ensure that the model is fine-tuned efficiently, with mechanisms in place to monitor and save the best model based on its performance.


In [None]:
OUT_DIR = ""

# Important parameters for fine-tuning
learning_rate = 0.0001  # Set the learning rate for weight updates
num_epochs = 10  # Number of training epochs
batch_size = 32  # Number of samples processed in each batch

# Configure the TrainingArguments for fine-tuning
finetune_forecast_args = TrainingArguments(
    output_dir=os.path.join(OUT_DIR, "output"),  # Directory to save model checkpoints and outputs
    overwrite_output_dir=True,  # Overwrite existing output directory
    learning_rate=learning_rate,  # Learning rate for the optimizer
    num_train_epochs=num_epochs,  # Total number of training epochs
    do_eval=True,  # Perform evaluation at the end of each epoch
    eval_strategy="epoch",  # Evaluate at the end of every epoch
    per_device_train_batch_size=batch_size,  # Training batch size
    per_device_eval_batch_size=batch_size,  # Evaluation batch size
    dataloader_num_workers=8,  # Number of workers for data loading
    save_strategy="epoch",  # Save model at the end of every epoch
    logging_strategy="epoch",  # Log metrics at the end of every epoch
    save_total_limit=1,  # Limit stored checkpoints to save space (the best checkpoint is retained)
    logging_dir=os.path.join(OUT_DIR, "logs"),  # Directory to store logs
    load_best_model_at_end=True,  # Load the best model based on evaluation loss at the end
    metric_for_best_model="eval_loss",  # Monitor evaluation loss to choose the best model
    greater_is_better=False,  # For loss metrics, lower is better
)

### <a id='loading-the-fine-tuned-forecast-model'></a>[Loading the fine-tuned forecast model](#toc)

To save time, we do not fine-tune from the original pretrained weights here. Instead, we use `from_pretrained()` to load a **Tiny Time Mixer (TTM)** model that was already fine-tuned in an earlier run. This initializes the model with the weights that were adjusted during that earlier fine-tuning process.

#### Purpose:

- **Fine-tuned Model**: Unlike the zero-shot model, which makes predictions without further training, the fine-tuned model has been trained on our specific time series data. This allows it to generate more accurate forecasts that better reflect the patterns in the Bitcoin dataset.
  
This step is crucial before configuring the training process, as we need a model instance to pass to the **Trainer**.


In [None]:
finetune_forecast_model = TinyTimeMixerForPrediction.from_pretrained("models/finetuned_forecast_model")

Once the fine-tuned model is loaded, we set up important components for training, such as early stopping, tracking, the optimizer, and the learning rate scheduler.

To prevent the model from overfitting during fine-tuning, we implement **early stopping**. Additionally, we configure the **optimizer** and **scheduler** that will be used during training to adjust the model's learning rate dynamically.
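
To make the early stopping rule concrete, here is a simplified sketch of the patience logic using the same settings as this notebook (a patience of 2 and a threshold of 0.001). The function `epochs_run` and the loss values are hypothetical, and the bookkeeping is a simplification rather than the Hugging Face implementation.

```python
def epochs_run(eval_losses, patience=2, threshold=0.001):
    """Return the epoch (1-based) at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(eval_losses, start=1):
        if best - loss > threshold:
            # Improvement larger than the threshold: record it and reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(eval_losses)


# Hypothetical evaluation losses, one per epoch
losses = [0.50, 0.40, 0.3995, 0.3999, 0.30]
stopped_at = epochs_run(losses)
print(f"Training stops after epoch {stopped_at}")
```

In this toy run, epochs 3 and 4 improve on the best loss by less than the threshold, so training stops after epoch 4 and the lower loss at epoch 5 is never reached. This is the trade-off a small patience value makes.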


In [None]:
# Create the early stopping callback to prevent overfitting
early_stopping_callback = EarlyStoppingCallback(
    early_stopping_patience=2,  # Stop training after 2 epochs of no improvement
    early_stopping_threshold=0.001,  # Minimum improvement required to continue training
)

# Callback for tracking model training
tracking_callback = TrackingCallback()

# Configure the optimizer (AdamW) and learning rate scheduler (OneCycleLR)
optimizer = AdamW(finetune_forecast_model.parameters(), lr=learning_rate)  # Optimizer for weight updates
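scheduler = OneCycleLR(
    optimizer,
    learning_rate,  # Peak (maximum) learning rate reached during the cycle
    epochs=num_epochs,  # Number of epochs used to compute the total step count
    steps_per_epoch=math.ceil(len(train_dataset) / (batch_size)),  # Optimizer steps per epoch
)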

Once we have configured all the training parameters, we can begin fine-tuning our model using the **Trainer** class from the Hugging Face library.


In [None]:
# Configure the Trainer for fine-tuning the model
finetune_forecast_trainer = Trainer(
    model=finetune_forecast_model,  # Fine-tuned version of the Tiny Time Mixer model
    args=finetune_forecast_args,  # Training arguments (learning rate, batch size, epochs, etc.)
    train_dataset=train_dataset,  # Training dataset
    eval_dataset=valid_dataset,  # Validation dataset for evaluation during training
    callbacks=[early_stopping_callback, tracking_callback],  # Early stopping and tracking callbacks
    optimizers=(optimizer, scheduler),  # Optimizer (AdamW) and learning rate scheduler (OneCycleLR)
)

After setting up the model, optimizer, scheduler, and trainer, training would normally start by calling the `train()` method. To save time, the call is commented out below: we use the model that has already been fine-tuned and load its saved forecasts in the later steps.


In [None]:
# finetune_forecast_trainer.train()

After fine-tuning the Tiny Time Mixer model, we configure the **Time Series Forecasting Pipeline** to generate predictions using the newly fine-tuned model. The pipeline automates the forecasting process, enabling us to efficiently make predictions based on the input data.


In [None]:
forecast_pipeline = TimeSeriesForecastingPipeline(
    model=finetune_forecast_model,
    device="cpu",
    timestamp_column=timestamp_column,
    id_columns=[],
    target_columns=target_columns,
    observable_columns=observable_columns,
    freq="1h"
)

### <a id='evaulating-fine-tuned-forecast'></a>[Evaluating fine-tuned forecast](#toc)

After fine-tuning the model, we can evaluate its performance on the test dataset. The commented-out section shows how we would run the evaluation process using the **Trainer** class, but for efficiency, we load pre-generated evaluation results from a pickle file using `pd.read_pickle()` to save time and computational resources.

By loading the fine-tuned forecast, we can immediately evaluate the model's predictions and compare them to the actual values.


In [None]:
# finetune_forecast_trainer.evaluate(tsp.preprocess(bt_data_resampled[test_start_index:test_end_index]))

In [None]:
forecast_finetuned = pd.read_pickle("models/forecast_finetuned.pkl")
forecast_finetuned

To assess the performance of the fine-tuned model, we can compare its predictions with the actual values for a specific row in the test dataset. In this case, we focus on **Row 11**, and we plot the predicted and actual **Close** prices for the next 24 time steps.


In [None]:
# Create a DataFrame to compare the predicted and actual Close prices for Row 11
fcast_df = pd.DataFrame({
    "pred": forecast_finetuned.loc[11]['Close_prediction'],  # Predicted Close prices for the next 24 time steps
    "actual": forecast_finetuned.loc[11]['Close'][:24]       # Actual Close prices for the same period
})

# Plot the predicted vs actual Close prices
ax = fcast_df.plot()

# Set labels and title for the plot
ax.set_xlabel("Time Steps")  # X-axis represents the time steps (hours)
ax.set_ylabel("Close Price")  # Y-axis represents the Close price in USD
ax.set_title("Predicted vs Actual Close Price for Row 11")  # Title of the plot

We can further evaluate the model by comparing the predicted values with the actual values at a specific time horizon, in this case, **12 hours out**. The following steps involve comparing the predictions, calculating the **Root Mean Squared Error (RMSE)**, and plotting the predictions against the actual values.
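
The 12-hour horizon is only one possible choice. As a rough sketch of how the error could be inspected at every horizon at once, the example below computes one RMSE value per forecast step on hypothetical toy data. The function `rmse_by_horizon` and the nested-list layout are assumptions for illustration and are not part of the `tsfm` library.

```python
import math


def rmse_by_horizon(pred_rows, actual_rows):
    """Return one RMSE per forecast step, skipping pairs that contain NaN."""
    n_steps = len(pred_rows[0])
    results = []
    for h in range(n_steps):
        pairs = [
            (p[h], a[h])
            for p, a in zip(pred_rows, actual_rows)
            if not (math.isnan(p[h]) or math.isnan(a[h]))
        ]
        mse = sum((p - a) ** 2 for p, a in pairs) / len(pairs)
        results.append(math.sqrt(mse))
    return results


# Hypothetical forecasts: 2 rows, 3 steps each; the last actual value is missing
pred_rows = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
actual_rows = [[1.0, 1.5, 3.0], [2.0, 2.5, float("nan")]]

horizon_rmse = rmse_by_horizon(pred_rows, actual_rows)
print(horizon_rmse)
```

Errors that grow with the horizon, as in this toy data, are a common pattern in multi-step forecasting, which is why the horizon should be stated alongside any reported RMSE.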


In [None]:
# Get the predictions and actual values for 12 hours out (you can change this to other values if needed)
forecast_predictions = compare_forecast(forecast_finetuned, "Timestamp", "Close_prediction", "Close", 12)

# Drop rows where either 'pred' or 'actual' contains NaN values to ensure valid data for RMSE calculation
forecast_out = forecast_predictions.dropna(subset=["actual", "pred"])

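# Calculate Root Mean Squared Error (RMSE) between predicted and actual values as the square root of the MSE
# (the `squared=False` argument is deprecated in recent scikit-learn releases)
rms = '{:.10f}'.format(mean_squared_error(forecast_out['actual'], forecast_out['pred']) ** 0.5)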

# Print the calculated RMSE
print(f"Root Mean Squared Error (RMSE): {rms}")

# Plot the predicted vs actual values over time, with RMSE in the title
forecast_out.plot(x="Timestamp", y=["pred", "actual"], figsize=(20, 5), title=f"RMSE: {rms}")
plt.show()

## <a id='conclusions-from-zero-shot-and-fine-tuned-model-performance'></a>[Conclusions from zero-shot and fine-tuned model performance](#toc)

### Zero-shot learning (first image)

In the first image, we observe the performance of the model before any fine-tuning, known as **zero-shot learning**. The **Root Mean Squared Error (RMSE)** is shown to be approximately **0.0663**, which indicates that the model performs reasonably well out-of-the-box, capturing the general trend of the actual values over time. However, several key observations can be made:

- The **predicted values** (blue line) closely follow the **actual values** (orange line), especially in areas with less volatility.
- However, in regions with sharp changes or fluctuations, the model struggles to accurately predict the peaks and troughs of the actual values.
- Despite this, the model captures the overall **directional trend** well, indicating that the underlying architecture of the model is able to forecast the broader patterns in the data.

### Fine-tuned model (second image)

In the second image, we see the results after **fine-tuning** the model on the training dataset. The **RMSE** has improved to **0.0655**, a reduction of roughly 1% relative to the zero-shot result. Key observations include:

- The fine-tuned model reduces the prediction error slightly, mainly in areas where the zero-shot model showed a gap between the predicted and actual values.
- The **predicted line** (blue) aligns somewhat more closely with the **actual values** (orange), particularly in regions with more volatility and sudden changes in the time series data.
- Fine-tuning allows the model to adapt to the specific characteristics of the dataset, which can improve its ability to capture fluctuations that the zero-shot model missed.

### Overall conclusion

The fine-tuned model outperforms the zero-shot model in terms of **RMSE** and visual alignment of predicted and actual values, although the margin is small. The fine-tuning process allowed the model to:

- **Reduce prediction error**, with the most noticeable improvements in volatile regions.
- **Better capture short-term fluctuations** in the time series while maintaining accuracy on long-term trends.
- Achieve an overall **better fit** to the actual data, as indicated by the reduction in RMSE from **0.0663** to **0.0655**.

In conclusion, fine-tuning the Tiny Time Mixer model yields a modest but measurable improvement on this dataset, making it somewhat better suited to real-world forecasting tasks that involve complex and volatile time series data.
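
To put the RMSE change in perspective, the short calculation below expresses it in absolute and relative terms, using the two values reported above.

```python
zero_shot_rmse = 0.0663   # RMSE reported for the zero-shot model
fine_tuned_rmse = 0.0655  # RMSE reported for the fine-tuned model

absolute_gain = zero_shot_rmse - fine_tuned_rmse
relative_gain_pct = absolute_gain / zero_shot_rmse * 100

print(f"Absolute RMSE reduction: {absolute_gain:.4f}")       # 0.0008
print(f"Relative RMSE reduction: {relative_gain_pct:.2f}%")  # 1.21%
```

A relative reduction of about 1.2% is worth keeping in mind when weighing the cost of fine-tuning against its benefit for a given dataset.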


## [Time series forecasting using the TSFM model](#toc)




We will generate a synthetic time series dataset and use it to train the **Tiny Time Mixer (TTM)** model for time series forecasting. The model will be fine-tuned to predict future values based on historical data, and we will evaluate its performance using metrics such as **Root Mean Squared Error (RMSE)**.


### Objectives
By the end of this exercise, you will be able to:
1. **Create a time series dataset** using sine wave data.
2. **Set up a time series forecasting model** using the **Tiny Time Mixer (TTM)** model.
3. **Split the dataset** into training, validation, and testing sets for model training.
4. **Train the model** using the **Trainer** class from Hugging Face.
5. **Make predictions** on unseen test data and evaluate the performance using **RMSE**.

### Dataset description

The dataset is generated using a combination of a sine wave and random noise to simulate fluctuations over time. The dataset consists of the following columns:

- **Timestamp**: Represents hourly intervals starting from '2023-01-01'.
- **Value**: A time series value generated from a sine function with added noise.
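
Because the noise is random, the generated values change on every run unless the random generator is seeded. The sketch below shows one possible way to make the series reproducible with a local NumPy generator; the helper `make_values` and the seed value are assumptions for illustration, not part of the original exercise.

```python
import numpy as np


def make_values(seed, n_points=1000):
    # A local generator keeps the noise reproducible without touching global state
    rng = np.random.default_rng(seed)
    signal = np.sin(np.linspace(0, 100, n_points))
    noise = rng.normal(0, 0.1, n_points)
    return signal + noise


values_a = make_values(seed=42)
values_b = make_values(seed=42)
print(np.array_equal(values_a, values_b))  # same seed, same series

# One sine cycle spans 2*pi units of the 0..100 axis
step = 100 / (1000 - 1)
points_per_cycle = 2 * np.pi / step
print(f"Points per cycle: {points_per_cycle:.1f}")
```

Each cycle covers roughly 63 hourly points, so a history of a few hundred points contains several full periods of the underlying signal.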


In [None]:
timestamps = pd.date_range('2023-01-01', periods=1000, freq='h')
values = np.sin(np.linspace(0, 100, 1000)) + np.random.normal(0, 0.1, 1000)

# Create the DataFrame
df = pd.DataFrame({'Timestamp': timestamps, 'Value': values})
df

### [Split dataset for time series forecasting](#toc)


#### **Objective:**

In this exercise, you will learn how to split a time series dataset into train, validation, and test sets. These splits are critical for training a machine learning model, validating its performance, and testing how well the model generalizes to unseen data.

#### **Instructions:**

1. **Define parameters:**
   You will first define the necessary parameters for your time series model. This includes specifying the column that contains timestamps and the column that contains the target values you want to predict.

   - The column that contains the timestamps is `"Timestamp"`.
   - The column that contains the target values is `"Value"`.

2. **Set the random seed:**
   To ensure reproducibility in your results, set a seed value using `set_seed(SEED)` where `SEED` is predefined.

3. **Set forecasting parameters:**
   - `context_length`: For this exercise, set the context length to 512 time points.
   - `forecast_length`: Set the forecast length to 96 time points.

4. **Dataset length:**
   Retrieve the length of the dataset using `data_length = len(df)`.

5. **Dataset splitting:**
   Define indices to split the dataset into training (70%), validation (10%), and testing (20%) sets. 
   
6. **Split configuration:**
   Store the indices for each dataset split in the `split_config` dictionary. You’ll need to ensure that your validation and test sets include the context length so that the model can make predictions.

7. **Output:**
   Print the start and end indices for the train, validation, and test sets to verify the dataset splits.
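
To see why the validation and test splits need to reach back by the context length, the sketch below counts how many samples each split can provide. It assumes that every sample is one sliding window of `context_length` past rows followed by `forecast_length` future rows; the helper `window_count` is hypothetical, and the number of samples actually produced by the library is not confirmed here.

```python
def window_count(n_rows, context_length, forecast_length):
    # One sample needs context_length past rows plus forecast_length future rows
    return max(0, n_rows - context_length - forecast_length + 1)


data_length = 1000
context_length = 512
forecast_length = 96

train_rows = round(data_length * 0.7)
valid_rows_plain = round(data_length * 0.8) - round(data_length * 0.7)
valid_rows_padded = valid_rows_plain + context_length
test_rows_padded = data_length - round(data_length * 0.8) + context_length

print(window_count(train_rows, context_length, forecast_length))         # 93
print(window_count(valid_rows_plain, context_length, forecast_length))   # 0
print(window_count(valid_rows_padded, context_length, forecast_length))  # 5
print(window_count(test_rows_padded, context_length, forecast_length))   # 105
```

Under this assumption, a plain 10% validation slice of 100 rows is shorter than a single window and yields no samples, whereas starting the slice `context_length` rows earlier makes evaluation possible.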


In [None]:
# Step 1: Set up parameters
timestamp_column = "Timestamp"
target_column = ["Value"]

# Set seed for reproducibility
SEED = 42
set_seed(SEED)

# Forecasting parameters
context_length = 512  # Use 512 time points from the past
forecast_length = 96   # Predict 96 time points into the future

# Step 2: Get the length of the dataset
data_length = len(df)

# Step 3: Define the indices for the train, validation, and test splits
train_start_index = 0
train_end_index = round(data_length * 0.7)  # First 70% for training
# Validation covers the next 10%; start context_length earlier so the first window has full history
eval_start_index = round(data_length * 0.7) - context_length
eval_end_index = round(data_length * 0.8)
# Test covers the final 20%; again start context_length earlier to supply history
test_start_index = round(data_length * 0.8) - context_length
test_end_index = data_length

# Store the split configuration
split_config = {
    "train": [train_start_index, train_end_index],
    "valid": [eval_start_index, eval_end_index],
    "test": [test_start_index, test_end_index],
}

# Print the split indices
print(f"Train: {train_start_index} to {train_end_index}")
print(f"Validation: {eval_start_index} to {eval_end_index}")
print(f"Test: {test_start_index} to {test_end_index}")

### [Preprocess and create dataset](#toc)


#### **Objective:**

You will preprocess the time series data and create the train, validation, and test datasets. Preprocessing ensures the data is in the correct format and scaled appropriately for model training. You will also split the preprocessed data into the train, validation, and test sets based on the configuration from the previous exercise.

#### **Instructions:**

1. **Define column specifiers:**
   
   The first step is to define the column specifiers that tell the preprocessor which columns to use for timestamps and target values.
   
   - `timestamp_column`: The column containing timestamps (e.g., `"Timestamp"`).
   - `target_columns`: The column containing the target values you want to predict (e.g., `"Value"`).

   You will store these in a dictionary named `column_specifiers`.

2. **Initialize the preprocessor:**
   
   You will use the `TimeSeriesPreprocessor` to initialize the preprocessing pipeline. The preprocessor will ensure that the data is appropriately scaled and structured for model training.

3. **Preprocess the data:**
   
   Using the `get_datasets()` function, apply the preprocessor to the dataset (`df`) and split the data into train, validation, and test sets based on the `split_config` created in the previous exercise. The function will return three datasets:
   
   - `train_dataset`: Contains the training data.
   - `valid_dataset`: Contains the validation data.
   - `test_dataset`: Contains the test data.


In [None]:
# Step 1: Define the column specifiers for time series data
column_specifiers = {
    "timestamp_column": timestamp_column,
    "target_columns": target_column,
}

# Step 2: Initialize the preprocessor
tsp = TimeSeriesPreprocessor(
    **column_specifiers,
    context_length=context_length,
    prediction_length=forecast_length,
    scaling=True,
    encode_categorical=False,
    scaler_type="standard",
)

# Step 3: Preprocess the data and get the train, validation, and test datasets
train_dataset, valid_dataset, test_dataset = get_datasets(
    tsp, df, split_config
)

### [Train the model](#toc)


You will train a `TinyTimeMixerForPrediction` model on the preprocessed train dataset. You will set up the training configuration, define the optimizer and scheduler, and train the model using the `Trainer` from Hugging Face's Transformers library.

#### **Instructions:**

1. **Set important parameters:**
   You will start by defining key parameters for model training such as learning rate, number of epochs and batch size.

2. **Set up training arguments:**
   Define the training configuration using `TrainingArguments`. Key arguments include:
   
   - `output_dir`: Directory where model outputs will be saved.
   - `learning_rate`: The learning rate for the training process.
   - `num_train_epochs`: Number of epochs for training (set to 10).
   - `eval_strategy`: Perform evaluation at the end of each epoch.
   - `save_strategy`: Save model checkpoints after each epoch.
   - `logging_dir`: Directory where training logs will be saved.
   - `metric_for_best_model`: Use `eval_loss` to track and save the best model.
     

3. **Load the model:**
   Load the `TinyTimeMixerForPrediction` model from IBM’s pretrained model repository. This will serve as the starting point for training.

4. **Set up callbacks:**
   Create two callbacks to monitor the training process:
   
   - `early_stopping_callback`: Stops training when `eval_loss` has not improved for 2 consecutive epochs (`early_stopping_patience=2`).
   - `tracking_callback`: Tracks the training process.

5. **Optimizer and scheduler:**
   Define the optimizer (`AdamW`) and the learning rate scheduler (`OneCycleLR`).

6. **Set up the trainer:**
   Use the `Trainer` class to bring together the model, training arguments, optimizer, scheduler, datasets, and callbacks for training.

7. **Train the model:**
   Train the model using the `.train()` method.
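
The value passed to `OneCycleLR` is the peak learning rate, not a constant one. The sketch below approximates the default cosine schedule documented for PyTorch's `OneCycleLR` (warm-up over the first 30% of steps, starting at the peak divided by 25). It is an independent re-implementation for illustration only, the function `one_cycle_lr` is hypothetical, and the sample count of 93 is an assumed stand-in for `len(train_dataset)`.

```python
import math


def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Approximate learning rate at a 0-based optimizer step (cosine one-cycle)."""
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_end = pct_start * total_steps - 1  # last step of the warm-up phase

    def cos_anneal(start, end, pct):
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1)

    if step <= warmup_end:
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    decay_steps = (total_steps - 1) - warmup_end
    return cos_anneal(max_lr, min_lr, (step - warmup_end) / decay_steps)


learning_rate = 0.0001
num_epochs = 10
batch_size = 32
n_train_samples = 93  # hypothetical stand-in for len(train_dataset)

steps_per_epoch = math.ceil(n_train_samples / batch_size)
total_steps = num_epochs * steps_per_epoch

schedule = [one_cycle_lr(s, total_steps, learning_rate) for s in range(total_steps)]
print(steps_per_epoch, total_steps)
print(f"start={schedule[0]:.2e}, peak={max(schedule):.2e}, end={schedule[-1]:.2e}")
```

The schedule assumes training runs for all `total_steps`. If early stopping ends training sooner, the learning rate never reaches the final decay phase.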


In [None]:
OUT_DIR = "ttm_trained_models_practice/"

# Important parameters
learning_rate = 0.0001
num_epochs = 10
batch_size = 32

# Step 1: Set up training arguments
train_forecast_args = TrainingArguments(
    output_dir=os.path.join(OUT_DIR, "output"),
    overwrite_output_dir=True,
    learning_rate=learning_rate,
    num_train_epochs=num_epochs,
    do_eval=True,
    eval_strategy="epoch",
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    dataloader_num_workers=8,
    save_strategy="epoch",
    logging_strategy="epoch",
    save_total_limit=1,
    logging_dir=os.path.join(OUT_DIR, "logs"),
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Step 2: Load the model for training
train_forecast_model = TinyTimeMixerForPrediction.from_pretrained(
    "ibm/TTM", revision="main", prediction_filter_length=24
)

# Step 3: Create early stopping callback and tracking callback
early_stopping_callback = EarlyStoppingCallback(
    early_stopping_patience=2,
    early_stopping_threshold=0.001,
)
tracking_callback = TrackingCallback()

# Optimizer and scheduler
optimizer = AdamW(train_forecast_model.parameters(), lr=learning_rate)
scheduler = OneCycleLR(
    optimizer,
    learning_rate,
    epochs=num_epochs,
    steps_per_epoch=math.ceil(len(train_dataset) / batch_size),
)

# Step 4: Set up the Trainer for training
train_forecast_trainer = Trainer(
    model=train_forecast_model,
    args=train_forecast_args,
    train_dataset=train_dataset,
    eval_dataset=valid_dataset,
    callbacks=[early_stopping_callback, tracking_callback],
    optimizers=(optimizer, scheduler),
)

# Step 5: Train the model
train_forecast_trainer.train()

### [Make predictions and evaluate the model](#toc)


You will use the trained model to make predictions on the test dataset. You will then evaluate the model’s performance by comparing the predicted values with the actual values using the Root Mean Squared Error (RMSE) metric.

#### **Instructions:**

1. **Preprocess the test data:**
   
   - Use the `TimeSeriesPreprocessor` (tsp) to preprocess the test portion of the dataset. 
   - Select the appropriate data points based on the test indices defined earlier (`test_start_index` to `test_end_index`).
   - The preprocessed data will be used to generate forecasts.

2. **Set up the forecasting pipeline:**
   
   - Create a `TimeSeriesForecastingPipeline` using the trained model.
   - Set the `device` to `"cpu"` to perform inference on the CPU.
   - Define the column names for required columns.
   - Set the frequency of the time series for hourly data.

3. **Make forecasts on the test dataset:**

   - Pass the preprocessed test data to the forecasting pipeline to generate predictions.

4. **Compare forecasts and calculate RMSE:**

   - Compare the predicted values from the model with the actual values in the test set using the `compare_forecast()` function.
   - Calculate the Root Mean Squared Error (RMSE) to quantify the error between the predicted and actual values.

5. **Plot predictions vs actual values:**
   
   - Plot the predicted values against the actual values over time using `matplotlib`.


In [None]:
# Step 1: Preprocess the test data
test_data = tsp.preprocess(df[test_start_index:test_end_index])

# Step 2: Set up the forecasting pipeline with the trained model
forecast_pipeline = TimeSeriesForecastingPipeline(
    model=train_forecast_model,  # Use the trained model here
    device="cpu",
    timestamp_column=timestamp_column,
    id_columns=[],
    target_columns=target_column,
    freq="1h"
)

# Step 3: Make forecasts on the test dataset
forecasts = forecast_pipeline(test_data)

# Step 4: Compare forecasts and calculate RMSE
forecast_predictions = compare_forecast(forecasts, "Timestamp", "Value_prediction", "Value", 12)

# Drop rows with NaN values
forecast_out = forecast_predictions.dropna(subset=["actual", "pred"])

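# Calculate RMSE as the square root of the MSE
# (the `squared=False` argument is deprecated in recent scikit-learn releases)
rms = '{:.10f}'.format(mean_squared_error(forecast_out['actual'], forecast_out['pred']) ** 0.5)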
print(f"Root Mean Squared Error (RMSE): {rms}")

# Step 5: Plot the predictions vs actual values
forecast_out.plot(x="Timestamp", y=["pred", "actual"], figsize=(20, 5), title=f"RMSE: {rms}")
plt.show()