# LLM Co-pilot
This walkthrough covers the `functime.llm` module, which contains namespaced Polars DataFrame methods for using Large Language Models (LLMs) with functime forecasts.

Let's use OpenAI's GPT models to analyze commodity price forecasts created by a functime forecaster. By default we use `gpt-3.5-turbo`.

### Load data

In [9]:
import os
os.environ["OPENAI_API_KEY"] = "..."  # Your API key here
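
The `"..."` above is only a placeholder. As a minimal sketch, a small guard can fail early when the key is missing or still set to that placeholder. The helper name `require_api_key` is hypothetical and not part of functime or the OpenAI client.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from `env`, or raise if it is missing or a placeholder.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    # Treat the tutorial's "..." placeholder the same as an unset key.
    if not key or key == "...":
        raise RuntimeError(f"{name} is not set; set it before calling the LLM methods")
    return key
```

Calling `require_api_key()` once before the first LLM request surfaces a configuration problem before any forecasting work is done.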

In [2]:
%%capture
import polars as pl

from functime.cross_validation import train_test_split
from functime.forecasting import knn
import functime.llm     # We must import this to register the `llm` namespace on pl.DataFrame

In [3]:
y = pl.read_parquet("../../data/commodities.parquet")
entity_col, time_col, target_col = y.columns
test_size = 30
freq = "1mo"
y_train, y_test = train_test_split(test_size)(y)
print("🎯 Target variable (y) -- train set:")
y_train.collect()

🎯 Target variable (y) -- train set:


| commodity_type | time | price |
| --- | --- | --- |
| str | datetime[ns] | f64 |
| "Palm kernel oi…" | 1996-01-01 00:00:00 | 686.0 |
| "Palm kernel oi…" | 1996-02-01 00:00:00 | 716.0 |
| "Palm kernel oi…" | 1996-03-01 00:00:00 | 715.0 |
| "Palm kernel oi…" | 1996-04-01 00:00:00 | 755.0 |
| "Palm kernel oi…" | 1996-05-01 00:00:00 | 775.0 |
| "Palm kernel oi…" | 1996-06-01 00:00:00 | 762.0 |
| "Palm kernel oi…" | 1996-07-01 00:00:00 | 734.0 |
| "Palm kernel oi…" | 1996-08-01 00:00:00 | 725.0 |
| "Palm kernel oi…" | 1996-09-01 00:00:00 | 707.0 |
| "Palm kernel oi…" | 1996-10-01 00:00:00 | 693.0 |

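The split used above can be pictured with a small standard-library sketch. It assumes the split holds out the final `test_size` observations of each series in time order. The function `split_last_n` is hypothetical and is not functime's code. The rows are the first six shown in the table, with the entity label kept as truncated in the display.

```python
from collections import defaultdict


def split_last_n(rows, test_size):
    """Split (entity, time, value) rows so each entity's last `test_size` points form the test set."""
    by_entity = defaultdict(list)
    for entity, time, value in rows:
        by_entity[entity].append((time, value))

    train, test = [], []
    for entity, series in by_entity.items():
        series.sort(key=lambda point: point[0])  # chronological order
        cut = max(len(series) - test_size, 0)
        train += [(entity, t, v) for t, v in series[:cut]]
        test += [(entity, t, v) for t, v in series[cut:]]
    return train, test


rows = [
    ("Palm kernel oi…", "1996-01-01", 686.0),
    ("Palm kernel oi…", "1996-02-01", 716.0),
    ("Palm kernel oi…", "1996-03-01", 715.0),
    ("Palm kernel oi…", "1996-04-01", 755.0),
    ("Palm kernel oi…", "1996-05-01", 775.0),
    ("Palm kernel oi…", "1996-06-01", 762.0),
]
train, test = split_last_n(rows, test_size=2)
```

Splitting per entity rather than over the whole frame keeps every series represented in both sets, which matters when the frame stacks many commodities.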

We'll make a prediction using a knn forecaster.

In [4]:
# Univariate time-series fit using the previous 24 monthly observations as lags
forecaster = knn(freq=freq, lags=24)
forecaster.fit(y=y_train)
y_pred = forecaster.predict(fh=test_size)
y_pred.head()

| commodity_type | time | price |
| --- | --- | --- |
| str | datetime[μs] | f64 |
| "Cocoa" | 2020-10-01 00:00:00 | 2.41 |
| "Cocoa" | 2020-11-01 00:00:00 | 2.42 |
| "Cocoa" | 2020-12-01 00:00:00 | 2.408 |
| "Cocoa" | 2021-01-01 00:00:00 | 2.372 |
| "Cocoa" | 2021-02-01 00:00:00 | 2.322 |


We'll also provide a short description of the dataset to aid the LLM in its analysis.

In [5]:
dataset_context = "This dataset comprises forecasted commodity prices from 2020 to 2023."

### Analyze Forecasts

Let's take a look at aluminum and European banana prices. You can select one or more entities (time series) to analyze through the `basket` parameter.

In [6]:
analysis = y_pred.llm.analyze(
    context=dataset_context,
    basket=["Aluminum", "Banana, Europe"]
)
print("📊 Analysis:\n", analysis)

📊 Analysis:
 - The Aluminum price shows a slight decreasing trend from October 2020 to February 2021, with a decrease of 6.3%. However, from February 2021 to March 2021, there is a sudden increase of 1.2%. Overall, the Aluminum price remains relatively stable with small fluctuations throughout the period.
- The Banana price in Europe exhibits a decreasing trend from October 2020 to August 2021, with a decrease of 7.7%. From August 2021 to October 2021, there is a slight increase of 2.2%. From October 2021 to November 2021, there is a significant increase of 5.9%, followed by a steady increase until March 2023. The Banana price shows a strong upward trend, with an overall increase of 31.6%.
- Both Aluminum and Banana prices show seasonality, with similar patterns repeating each year. The lowest Aluminum prices occur in February, while the highest prices occur in December. For Banana prices, the lowest prices occur in July, and the highest prices occur in March.
- An anomaly can be obser
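
The analysis quotes several percentage changes, and it can be worth recomputing such figures from the forecast itself rather than taking the LLM's arithmetic on trust. As a minimal sketch, using the five Cocoa forecast values printed earlier in this walkthrough (the helper `percent_change` is hypothetical):

```python
def percent_change(first, last):
    """Relative change from `first` to `last`, expressed in percent."""
    return (last - first) / first * 100.0


# Cocoa forecast, October 2020 to February 2021, from the `y_pred.head()` output.
cocoa_forecast = [2.41, 2.42, 2.408, 2.372, 2.322]
change = percent_change(cocoa_forecast[0], cocoa_forecast[-1])
print(f"Cocoa, Oct 2020 to Feb 2021: {change:.2f}%")
```

For these values the change is about -3.65%. The same check applied to the Aluminum and Banana series would confirm or contradict the figures quoted in the analysis.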

### Compare Forecasts

Let's now compare the previous selection with a new one. We'll refer to these as baskets A and B.

In [7]:
basket_a = ["Aluminum", "Banana, Europe"]
basket_b = ["Chicken", "Cocoa"]

Now compare!

In [8]:
comparison = y_pred.llm.compare(
    basket=basket_a,
    other_basket=basket_b
)
print("📊 Comparison:\n", comparison)

📊 Comparison:
 The provided time series data consists of two dataframes: "This" and "Other". Let's compare and contrast these dataframes in terms of trend, seasonality, and anomalies.

**Trend:**
- Aluminum in the "This" dataframe shows a decreasing trend over time, with a decrease of 12.01% from October 2020 to March 2023.
- Chicken in the "Other" dataframe does not show a clear trend, fluctuating within a narrow range over the given time period.

**Seasonality:**
- The "This" dataframe does not exhibit any apparent seasonality in either Aluminum or Banana, Europe.
- The "Other" dataframe also does not show any significant seasonality in Chicken or Cocoa.

**Anomalies:**
- In the "This" dataframe, there are no clear anomalies in the Aluminum or Banana, Europe data.
- Similarly, in the "Other" dataframe, there are no notable anomalies in the Chicken or Cocoa data.

In summary, the Aluminum prices in the "This" dataframe exhibit a decreasing trend over time, while the Chicken prices in 