In [None]:
#| hide
!pip install -Uqq nixtla

In [None]:
#| hide 
from nixtla.utils import in_colab

In [None]:
#| hide 
IN_COLAB = in_colab()

In [None]:
#| hide
if not IN_COLAB:
    from nixtla.utils import colab_badge
    from dotenv import load_dotenv

# Improve Forecast Accuracy with TimeGPT

In this notebook, we demonstrate how to use TimeGPT for forecasting and explore four common strategies to improve forecast accuracy. We use hourly electricity price data from Germany as the example dataset. Before running the notebook, instantiate a `NixtlaClient` object with your `api_key` in the code snippet below.

### Result Summary

| Steps | Description                  | MAE  | MAE Improvement (%) | RMSE | RMSE Improvement (%) |
|-------|------------------------------|------|---------------------|------|----------------------|
| 0     | Benchmark Model              | 18.5 | N/A                 | 20.0 | N/A                  |
| 1     | Add Fine-Tuning Steps        | 12.0 | 35.14%              | 13.3 | 33.50%               |
| 2     | Adjust Fine-Tuning Loss      | 9.2  | 50.27%              | 12.0 | 40.00%               |
| 3     | Add Exogenous Variables      | 10.1 | 45.41%              | 11.4 | 43.00%               |
| 4     | Switch to Long-Horizon Model | 6.4  | 65.41%              | 7.7  | 61.50%               |


In [None]:
#| echo: false
if not IN_COLAB:
    load_dotenv()
    colab_badge('docs/frequently-asked-questions/01_how_to_improve_forecast_accuracy')

First, we install and import the required packages, initialize the Nixtla client and create a function for calculating evaluation metrics.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import mean_absolute_error, root_mean_squared_error
from utilsforecast.plotting import plot_series
from nixtla import NixtlaClient

In [None]:
nixtla_client = NixtlaClient(
    # Replace the placeholder with your own key; never commit a real key to a notebook.
    api_key = 'my_api_key_provided_by_nixtla'
)

In [None]:
def evaluate_performance(y_true, y_pred):
    mae = mean_absolute_error(y_true, y_pred)
    rmse = root_mean_squared_error(y_true, y_pred)
    return mae, rmse
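
To make the two metrics concrete, here is a minimal sketch that computes them by hand on a few made-up numbers. The helper names `mae_manual` and `rmse_manual` and the toy values are illustrative assumptions, not part of the dataset or of scikit-learn.

```python
import math

def mae_manual(y_true, y_pred):
    """Mean absolute error: the average of |actual - forecast|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse_manual(y_true, y_pred):
    """Root mean squared error: square root of the average squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical prices and forecasts (absolute errors: 1, 2, 0, 8).
actual = [40.0, 42.0, 45.0, 50.0]
forecast = [41.0, 40.0, 45.0, 58.0]

toy_mae = mae_manual(actual, forecast)    # (1 + 2 + 0 + 8) / 4
toy_rmse = rmse_manual(actual, forecast)  # sqrt((1 + 4 + 0 + 64) / 4)
print(toy_mae, toy_rmse)
```

Because RMSE squares each error before averaging, the single large miss of 8 pulls it well above the MAE. This is why the two metrics can rank forecasts differently.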

## 1. Load the Dataset
In this notebook, we use hourly electricity prices as the example dataset. It consists of 5 time series, each with approximately 1700 data points. For demonstration purposes, we focus on the German electricity price series (`unique_id == "DE"`). We split the series and set aside the last 48 steps (2 days) as the test set.

In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv')
df['ds'] = pd.to_datetime(df['ds'])
df_sub = df.query('unique_id == "DE"')

In [None]:
df_train = df_sub.query('ds < "2017-12-29"')
df_test = df_sub.query('ds >= "2017-12-29"')
df_train.shape, df_test.shape
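
The split above relies on a timestamp cutoff rather than a row count. As a rough sanity check, the sketch below rebuilds an hourly index with the standard library and confirms that a cutoff of 2017-12-29 leaves exactly 48 test steps. It assumes the series ends at 2017-12-30 23:00, which is what a 2-day test set implies. The start date is arbitrary.

```python
from datetime import datetime, timedelta

# Hypothetical hourly timestamps from Dec 25 00:00 to Dec 30 23:00 (6 days).
start = datetime(2017, 12, 25)
timestamps = [start + timedelta(hours=i) for i in range(6 * 24)]

cutoff = datetime(2017, 12, 29)
train_ts = [t for t in timestamps if t < cutoff]   # strictly before the cutoff
test_ts = [t for t in timestamps if t >= cutoff]   # cutoff and later

print(len(train_ts), len(test_ts))
# Every training timestamp precedes every test timestamp, so nothing leaks.
print(max(train_ts) < min(test_ts))
```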

In [None]:
plot_series(
    df_train[['unique_id', 'ds', 'y']][-200:],
    forecasts_df=df_test[['unique_id', 'ds', 'y']].rename(columns={'y': 'test'}),
)

## 2. Benchmark Forecasting using TimeGPT
We use TimeGPT to generate a zero-shot forecast (no fine-tuning) for the time series. As the plot shows, TimeGPT captures the overall trend reasonably well but falls short in modeling the short-term fluctuations and cyclical patterns in the actual data. Over the test period, the model achieves a Mean Absolute Error (MAE) of 18.5 and a Root Mean Squared Error (RMSE) of 20.0. This forecast serves as the baseline for the comparisons that follow.

In [None]:
fcst_timegpt = nixtla_client.forecast(df = df_train[['unique_id','ds','y']],
                                      h=2*24,
                                      target_col = 'y',
                                      level = [90, 95])

In [None]:
mae, rmse = evaluate_performance(df_test['y'], fcst_timegpt['TimeGPT'])
mae, rmse

In [None]:
plot_series(df_sub.iloc[-150:], forecasts_df= fcst_timegpt, level = [90])

## 3. Methods to Improve Forecast Accuracy
### 3a. Add Finetune Steps
The first approach is to increase the number of fine-tuning steps with the `finetune_steps` parameter. Fine-tuning adjusts the weights of the TimeGPT model so that it fits your own data more closely, which lets it learn the nuances of your time series. With 30 fine-tuning steps, the MAE decreases to 12 and the RMSE drops to 13.2.

In [None]:
fcst_finetune_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                          h=24*2,
                                          finetune_steps = 30,
                                          level=[90, 95])

In [None]:
mae, rmse = evaluate_performance(df_test['y'], fcst_finetune_df['TimeGPT'])
mae, rmse

In [None]:
plot_series(df_sub[-200:], forecasts_df= fcst_finetune_df, level = [90])

### 3b. Finetune with Different Loss Function
The second way to reduce forecast error is to change the loss function used during fine-tuning, which you select with the `finetune_loss` parameter. Here we fine-tune with `finetune_loss='mae'` while keeping 30 fine-tuning steps. With this change, the MAE decreases to 10 and the RMSE drops to 11.4.
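
One intuition for why the choice of loss matters is a general property of these losses, not a description of TimeGPT's internals. A squared-error loss is pulled toward large spikes, while an absolute-error loss is more robust to them. The sketch below uses made-up prices with one spike and compares the best constant prediction under each loss. The helper names are illustrative.

```python
from statistics import mean, median

def mae_loss(values, c):
    """Average absolute error of the constant prediction c."""
    return sum(abs(v - c) for v in values) / len(values)

def mse_loss(values, c):
    """Average squared error of the constant prediction c."""
    return sum((v - c) ** 2 for v in values) / len(values)

# Hypothetical hourly prices with a single spike.
prices = [30.0, 32.0, 31.0, 29.0, 120.0]

c_mean = mean(prices)      # minimizes the squared error; dragged up by the spike
c_median = median(prices)  # minimizes the absolute error; stays near typical prices

print(c_mean, c_median)
print(mae_loss(prices, c_median), mae_loss(prices, c_mean))
print(mse_loss(prices, c_mean), mse_loss(prices, c_median))
```

Which loss works best depends on the series and on the metric you care about, so it is worth comparing a few options on a held-out test set.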

In [None]:
fcst_finetune_mae_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                          h=24*2,
                                          finetune_steps = 30,
                                          finetune_loss = 'mae',
                                          level=[90, 95])

In [None]:
mae, rmse = evaluate_performance(df_test['y'], fcst_finetune_mae_df['TimeGPT'])
mae, rmse

In [None]:
plot_series(df_sub[-200:], forecasts_df= fcst_finetune_mae_df, level = [90])

### 3c. Forecast with Exogenous Variables
Exogenous variables are external factors or predictors that are not part of the target time series but can influence its behavior. Incorporating these variables can provide the model with additional context, improving its ability to understand complex relationships and patterns in the data.

To use exogenous variables in TimeGPT, pair each point in your input time series with the corresponding external data. If future values of these variables are available for the forecast period, pass them with the `X_df` parameter. Otherwise, you can omit this parameter and still see improvements using only historical values. In the example below, we incorporate 8 historical exogenous variables along with their values during the test period, which reduces the MAE and RMSE to 9.2 and 11.9, respectively.
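
Before calling the API with future exogenous values, it can help to check that the two frames line up. The sketch below is a hypothetical helper, `check_future_exog`, which is not part of the `nixtla` package. It assumes the future frame should hold one row per forecast step for each series, carry the same exogenous columns as the training frame, and exclude the target. The toy frames and the column name `Exogenous1` are made up for illustration.

```python
import pandas as pd

def check_future_exog(train_df, future_df, h,
                      id_col="unique_id", time_col="ds", target_col="y"):
    """Return a list of alignment problems; an empty list means none were found."""
    problems = []
    exog_cols = [c for c in train_df.columns if c not in (id_col, time_col, target_col)]
    missing = [c for c in exog_cols if c not in future_df.columns]
    if missing:
        problems.append(f"missing future columns: {missing}")
    if target_col in future_df.columns:
        problems.append("future frame still contains the target column")
    # Each series needs exactly h future rows.
    rows_per_series = future_df.groupby(id_col)[time_col].count()
    wrong_length = rows_per_series[rows_per_series != h]
    if len(wrong_length) > 0:
        problems.append(f"series without exactly {h} future rows: {list(wrong_length.index)}")
    # Future rows must start after the last training timestamp.
    last_train = train_df.groupby(id_col)[time_col].max()
    first_future = future_df.groupby(id_col)[time_col].min()
    for uid, first_ts in first_future.items():
        if uid in last_train.index and first_ts <= last_train[uid]:
            problems.append(f"future rows for {uid} start inside the training period")
    return problems

# Toy frames: 48 historical hours followed by 48 future hours.
hist_ds = pd.date_range("2017-12-27", periods=48, freq="h")
fut_ds = pd.date_range("2017-12-29", periods=48, freq="h")
train_toy = pd.DataFrame({"unique_id": "DE", "ds": hist_ds, "y": 50.0, "Exogenous1": 1.0})
future_toy = pd.DataFrame({"unique_id": "DE", "ds": fut_ds, "Exogenous1": 1.0})

problems_ok = check_future_exog(train_toy, future_toy, h=48)
problems_short = check_future_exog(train_toy, future_toy.iloc[:24], h=48)
print(problems_ok)
print(problems_short)
```

In this notebook, the equivalent call would be `check_future_exog(df_train, future_ex_vars_df, h=48)`.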

In [None]:
df_train.head()

In [None]:
future_ex_vars_df = df_test.drop(columns = ['y'])
future_ex_vars_df.head()

In [None]:
fcst_ex_vars_df = nixtla_client.forecast(df=df_train,
                                         X_df=future_ex_vars_df,
                                         h=24*2,
                                         level=[90, 95])

In [None]:
mae, rmse = evaluate_performance(df_test['y'], fcst_ex_vars_df['TimeGPT'])
mae, rmse

In [None]:
plot_series(df_sub[-200:], forecasts_df= fcst_ex_vars_df, level = [90])

### 3d. TimeGPT for Long Horizon Forecasting
When the forecast horizon is long, the predictions may be less accurate. TimeGPT performs best with horizons shorter than one complete cycle of the time series. For longer horizons, switching to the `timegpt-1-long-horizon` model can yield better results. You select this model with the `model` parameter.

In the electricity price series used here, one cycle is 24 steps (one day). Since we are forecasting two days (48 steps) ahead, using `timegpt-1-long-horizon` significantly improves accuracy, reducing the MAE to 5.7 and the RMSE to 7.0.
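
The rule of thumb above can be written as a small helper. This is only a sketch: `pick_model` is a hypothetical function, the strict `h > season_length` threshold is one reading of the guideline, and `'timegpt-1'` is assumed to be the name of the default model.

```python
def pick_model(h, season_length):
    """Suggest a model name from the horizon and the length of one seasonal cycle."""
    if h > season_length:
        # Horizon spans more than one full cycle.
        return "timegpt-1-long-horizon"
    # Assumed name of the default model.
    return "timegpt-1"

two_day_choice = pick_model(h=48, season_length=24)   # two days of hourly data
half_day_choice = pick_model(h=12, season_length=24)  # half a day of hourly data
print(two_day_choice, half_day_choice)
```

The returned string could then be passed as the `model` argument of `nixtla_client.forecast`. Treat the suggestion as a starting point and confirm it on a held-out test set.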

In [None]:
fcst_long_df = nixtla_client.forecast(df=df_train[['unique_id', 'ds', 'y']],
                                          h=24*2,
                                          model = 'timegpt-1-long-horizon',
                                          level=[90, 95])

In [None]:
mae, rmse = evaluate_performance(df_test['y'], fcst_long_df['TimeGPT'])
mae, rmse

In [None]:
plot_series(df_sub[-200:], forecasts_df= fcst_long_df, level = [90])

## 4. Conclusion and Next Steps

In this notebook, we demonstrated four effective strategies for enhancing forecast accuracy with TimeGPT:

1. **Increasing the number of fine-tuning steps.**
2. **Adjusting the fine-tuning loss function.**
3. **Incorporating exogenous variables.**
4. **Switching to the long-horizon model for extended forecasting periods.**

We encourage you to experiment with these hyperparameters to identify the optimal settings that best suit your specific needs. Additionally, please refer to our documentation for further features, such as **model explainability** and more.

In the examples provided, after applying these methods, we observed significant improvements in forecast accuracy metrics, as summarized below.

### Result Summary

| Steps | Description                  | MAE  | MAE Improvement (%) | RMSE | RMSE Improvement (%) |
|-------|------------------------------|------|---------------------|------|----------------------|
| 0     | Benchmark Model              | 18.5 | N/A                 | 20.0 | N/A                  |
| 1     | Add Fine-Tuning Steps        | 12.0 | 35.14%              | 13.3 | 33.50%               |
| 2     | Adjust Fine-Tuning Loss      | 9.2  | 50.27%              | 12.0 | 40.00%               |
| 3     | Add Exogenous Variables      | 10.1 | 45.41%              | 11.4 | 43.00%               |
| 4     | Switch to Long-Horizon Model | 6.4  | 65.41%              | 7.7  | 61.50%               |
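
The improvement columns express each error as a relative reduction versus the benchmark row. A minimal sketch of that calculation follows, using the benchmark and fine-tuning rows from the table. The function name `pct_improvement` is illustrative.

```python
def pct_improvement(baseline, value):
    """Relative reduction of an error metric versus the baseline, in percent."""
    return (baseline - value) / baseline * 100

# Benchmark row versus the "Add Fine-Tuning Steps" row.
mae_gain = pct_improvement(18.5, 12.0)
rmse_gain = pct_improvement(20.0, 13.3)

print(f"MAE improvement: {mae_gain:.2f}%")
print(f"RMSE improvement: {rmse_gain:.2f}%")
```

Every improvement is measured against the benchmark, not against the previous row, so the rows are not cumulative.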


