# Forecasting Using a Large Language Model (Amazon Chronos-2)

## Install Required Packages and Import Libraries
### We are using the Chronos-2 model from Amazon

In [None]:
!pip install 'pandas[pyarrow]'
!pip install "chronos-forecasting>=2.0"
!pip install matplotlib

In [None]:
import pandas as pd  
from chronos import Chronos2Pipeline

## Load Historical Time Series Data
### In this example, we use hourly room temperature data
### The dataset should have at least three columns:
    - key/id - the item for which the forecast is made
    - datetime - the timestamp column
    - target - the value to be predicted
### In the example below, we modify the time series data to add a key/id field and remove unused fields
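
The three-column layout described above can be checked before any forecasting is attempted. The following is a minimal sketch, assuming the column names `key`, `datetime`, and `target` used later in this notebook. The helper `check_schema` and the tiny synthetic frame are hypothetical and only stand in for the real CSV.

```python
import pandas as pd

# Column names assumed from the rest of this notebook.
REQUIRED_COLUMNS = ["key", "datetime", "target"]

def check_schema(df, required=REQUIRED_COLUMNS):
    """Return the required columns that are missing from df (hypothetical helper)."""
    return [col for col in required if col not in df.columns]

# Small synthetic frame; the temperature values are made up for illustration.
example_df = pd.DataFrame({
    "key": ["room"] * 3,
    "datetime": pd.date_range("2024-01-01", periods=3, freq=pd.Timedelta(hours=1)),
    "target": [21.5, 21.7, 21.4],
})

missing = check_schema(example_df)
print("Missing columns:", missing)
```

An empty list means the frame has the expected layout. Any names returned need to be added or renamed first.
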

In [None]:
mydf = pd.read_csv("room_temperature.csv")
mydf.head()

In [None]:
mydf = mydf.drop(columns=['id', 'Datetime1'])
mydf["key"] = "room"
mydf.head()
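
Since the data is described as hourly, it can be worth confirming that the timestamps really are evenly spaced before forecasting. The sketch below is one way to do that with pandas. The helper `has_regular_spacing` is hypothetical, and the one-hour step is an assumption taken from the description of the dataset.

```python
import pandas as pd

def has_regular_spacing(timestamps, step=pd.Timedelta(hours=1)):
    """Return True if consecutive sorted timestamps are exactly `step` apart."""
    ts = pd.to_datetime(pd.Series(timestamps)).sort_values()
    gaps = ts.diff().dropna()  # the first diff is NaT, so drop it
    return bool((gaps == step).all())

# Synthetic hourly timestamps, plus a copy with one hour removed.
regular = pd.Series(
    pd.date_range("2024-01-01", periods=5, freq=pd.Timedelta(hours=1))
)
with_gap = regular.drop(index=2)

print(has_regular_spacing(regular))
print(has_regular_spacing(with_gap))
```

In this notebook the check would be called as `has_regular_spacing(mydf["datetime"])`, assuming the timestamp column is named `datetime` as in the prediction call.
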

## Prepare Context Data
### Context data is the history the model uses to learn patterns and make the forecast
### In this example, out of 6676 rows, we take the first 6652 rows as context data

In [None]:
mydf.count()

In [None]:
context_df = mydf.head(6652)

In [None]:
context_df.head()

## Prepare Feature Data
### Feature data (future_df) holds the rows for which the model makes a prediction or forecast
### In this example, out of 6676 rows, we take the last 24 rows (24 hours) as feature data
### Since the model has to make the forecast, we remove the target field from the feature data
### We still keep test_df (a separate data frame from the feature data), as it holds the last 24 rows with the target field and can be used to compare actual vs. prediction

In [None]:
test_df = mydf.tail(24).copy()  # copy so later column assignments do not act on a slice of mydf

future_df = test_df.drop(columns="target")
future_df.head()
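
The split above hardcodes 6652 and 24, which only add up for a frame of 6676 rows. A minimal sketch of the same split driven by the forecast horizon alone is shown below. The helper `split_for_forecast` is hypothetical, and the demo frame is synthetic.

```python
import pandas as pd

def split_for_forecast(df, horizon, target_col="target"):
    """Split df into (context, test, future) using the last `horizon` rows.

    context: all rows except the last `horizon`
    test:    the last `horizon` rows, target included (for later comparison)
    future:  the same rows as test, with the target column removed
    """
    if not 0 < horizon < len(df):
        raise ValueError("horizon must be between 1 and len(df) - 1")
    context = df.iloc[:-horizon]
    test = df.iloc[-horizon:].copy()
    future = test.drop(columns=target_col)
    return context, test, future

# Synthetic 30-row hourly frame standing in for the real data.
demo = pd.DataFrame({
    "key": ["room"] * 30,
    "datetime": pd.date_range("2024-01-01", periods=30, freq=pd.Timedelta(hours=1)),
    "target": list(range(30)),
})

context, test, future = split_for_forecast(demo, horizon=24)
print(len(context), len(test), list(future.columns))
```

With the notebook's data, `split_for_forecast(mydf, horizon=24)` would give the same 6652/24 split without depending on the total row count.
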

## Load Model Weights from Hugging Face and Make Predictions (alternatively, you can deploy the model to Amazon SageMaker and run inference there)
### Prediction takes the following parameters
    - context data frame (context_df) - data frame holding the context data
    - feature data frame (future_df) - data frame with the rows for which the prediction is made
    - prediction_length - forecast horizon. In this example we need predictions for 24 rows, so the prediction length is 24.
    - quantile_levels - from 0.1 to 0.9. In this example we take three: [0.1, 0.5, 0.9]
    - id_column - column identifying the item for which the forecast is made
    - timestamp_column - time series (timestamp) column
    - target - column(s) with the time series values to predict
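
To make the `quantile_levels` parameter more concrete, the sketch below illustrates what the levels 0.1, 0.5, and 0.9 mean, using NumPy on made-up samples. It does not call Chronos-2 and says nothing about how the model computes its quantiles. It only shows that the 0.5 quantile is the median and that the 0.1 and 0.9 quantiles bracket roughly 80% of the values.

```python
import numpy as np

# Made-up "temperature" samples; not model output.
rng = np.random.default_rng(0)
samples = rng.normal(loc=22.0, scale=1.0, size=10_000)

# Empirical quantiles at the three levels used in this notebook.
q10, q50, q90 = np.quantile(samples, [0.1, 0.5, 0.9])

# Share of samples between the 0.1 and 0.9 quantiles (about 0.8).
inside = np.mean((samples >= q10) & (samples <= q90))

print(f"q10={q10:.2f}, q50={q50:.2f}, q90={q90:.2f}, share inside={inside:.3f}")
```

Read this way, the 0.1 and 0.9 columns of a forecast can be treated as the lower and upper bounds of an approximate 80% prediction interval around the median.
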

In [None]:
pipeline = Chronos2Pipeline.from_pretrained("amazon/chronos-2")

In [None]:
pred_df = pipeline.predict_df(
    context_df,
    future_df=future_df,
    prediction_length=24,  
    quantile_levels=[0.1, 0.5, 0.9],  
    id_column="key",  
    timestamp_column="datetime", 
    target="target" 
)

In [None]:
pred_df.head()

In [None]:
test_df.head()

## Compare Actual vs Prediction
### We merge test_df and pred_df into one data frame (df_plot) to compare actual vs prediction
### In this example, we compare target (actual) against the prediction. You can also compare against quantiles such as 0.5 or 0.9

In [None]:
import matplotlib.pyplot as plt

# Ensure datetime columns are datetime type
pred_df["datetime"] = pd.to_datetime(pred_df["datetime"])
test_df["datetime"] = pd.to_datetime(test_df["datetime"])

# Merge dataframes on datetime and key
df_plot = pd.merge(
    test_df,
    pred_df,
    on=["datetime", "key"],
    how="inner"
)

# Plot
plt.figure()
plt.plot(df_plot["datetime"], df_plot["target"], label="Actual")
plt.plot(df_plot["datetime"], df_plot["predictions"], label="Prediction")

plt.xlabel("Datetime")
plt.ylabel("Value")
plt.legend()
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()


In [None]:
df_plot.head()
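
Beyond the visual comparison, the merged frame can also be summarized numerically. The following is a minimal sketch that computes the mean absolute error, the root mean squared error, and the share of actual values falling inside a quantile interval. The `target` and `predictions` column names come from the plot above. The quantile column names `"0.1"` and `"0.9"` are an assumption about how `pred_df` labels its quantile columns, and the demo values are made up.

```python
import numpy as np
import pandas as pd

def forecast_errors(df, actual="target", predicted="predictions"):
    """Return MAE and RMSE between the actual and predicted columns."""
    err = df[predicted] - df[actual]
    return {
        "mae": float(err.abs().mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
    }

def interval_coverage(df, actual="target", lower="0.1", upper="0.9"):
    """Return the fraction of actual values inside [lower, upper].

    The quantile column names are assumed; adjust them to match pred_df.
    """
    inside = (df[actual] >= df[lower]) & (df[actual] <= df[upper])
    return float(inside.mean())

# Made-up frame with the same shape as df_plot.
demo_plot = pd.DataFrame({
    "target":      [20.0, 21.0, 22.0, 23.0],
    "predictions": [20.5, 21.0, 21.0, 23.5],
    "0.1":         [19.5, 20.0, 20.0, 23.2],
    "0.9":         [21.5, 22.0, 21.5, 24.5],
})

metrics = forecast_errors(demo_plot)
coverage = interval_coverage(demo_plot)
print(metrics, coverage)
```

With the notebook's data, the same helpers would be called as `forecast_errors(df_plot)` and `interval_coverage(df_plot)`, after checking the actual column names with `df_plot.columns`.
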