# Getting Started with Time Series Models on IBM WatsonX

This notebook demonstrates how to use the WatsonX Python SDK to run inference against a time series model hosted remotely on [WatsonX](https://www.ibm.com/products/watsonx-ai).

### Install dependencies

> **NOTE**: When running this recipe in [Colab](https://colab.research.google.com/), you may see an error about dependency conflicts with `google-colab 1.0.0`. You can safely ignore this error.

In [None]:
!pip install git+https://github.com/ibm-granite-community/utils
!pip install ibm-watsonx-ai

In [None]:
import pandas as pd
from ibm_granite_community.notebook_utils import get_env_var
from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models import TSModelInference
from ibm_watsonx_ai.foundation_models.schema import TSForecastParameters

### Provide the environment variables

There are three ways to provide the required environment variables, listed here in order of precedence:

1. Directly as an environment variable in the Python environment where the Jupyter notebook is running.
2. As a Google Colab secret, if you are running the notebook in Colab.
3. Supplied by the user in a prompt during execution of the notebook.
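
The precedence order above can be sketched roughly as follows. This is not the implementation of `get_env_var`; `resolve_setting` is a hypothetical helper written only to illustrate the lookup order, and it assumes that Colab secrets are read through `google.colab.userdata`.

```python
import os
from getpass import getpass


def resolve_setting(name, prompt=getpass):
    """Hypothetical helper: look up `name` in the order described above."""
    # 1. An environment variable takes precedence.
    value = os.environ.get(name)
    if value:
        return value
    # 2. Next, try a Colab secret (the module is only importable inside Colab).
    try:
        from google.colab import userdata

        value = userdata.get(name)
        if value:
            return value
    except Exception:
        pass  # not running in Colab, or the secret is not defined
    # 3. Finally, ask the user interactively.
    return prompt(f"Enter {name}: ")
```

Passing the prompt function as a parameter keeps the sketch easy to exercise outside a notebook, for example `resolve_setting("WATSONX_URL", prompt=input)`.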

#### Provide your API Key

Obtain your `WATSONX_APIKEY` by generating a [Platform API Key](https://www.ibm.com/docs/en/watsonx/watsonxdata/1.0.x?topic=started-generating-api-keys) on the watsonx.data web client.

#### Provide your Project Id

Get your `WATSONX_PROJECT_ID` from the [WatsonX](https://www.ibm.com/watsonx) web client by following [these instructions](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-project-id.html?context=wx).

#### Provide your base WatsonX URL

Get your `WATSONX_URL` by viewing the details for the service instance from the Cloud Pak for Data web client, as described in [these watsonx.ai setup instructions](https://ibm.github.io/watsonx-ai-python-sdk/setup_cpd.html).

As an example, your `WATSONX_URL` may be `https://us-south.ml.cloud.ibm.com` for the Dallas zone.
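
As a small illustration, the sketch below builds a base URL from a region code. It assumes that other regions follow the same `https://<region>.ml.cloud.ibm.com` pattern as the Dallas example; `watsonx_url_for_region` is a hypothetical helper, and the URL shown for your own service instance remains the authoritative value.

```python
def watsonx_url_for_region(region):
    """Build a base URL from a region code, assuming the pattern of the Dallas example."""
    region = region.strip().lower()
    # Reject anything that does not look like a simple region code.
    if not region or not all(ch.isalnum() or ch == "-" for ch in region):
        raise ValueError(f"unexpected region code: {region!r}")
    return f"https://{region}.ml.cloud.ibm.com"


print(watsonx_url_for_region("us-south"))  # https://us-south.ml.cloud.ibm.com
```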

In [None]:
credentials = Credentials(
    api_key=get_env_var("WATSONX_APIKEY"),
    url=get_env_var("WATSONX_URL"),
)
client = APIClient(credentials)
client.set.default_project(get_env_var("WATSONX_PROJECT_ID"))

In [None]:
for model in client.foundation_models.get_time_series_model_specs()["resources"]:
    print("--------------------------------------------------")
    print(f'model_id: {model["model_id"]}')
    print(f'functions: {model["functions"]}')
    print(f'long_description: {model["long_description"]}')
    print(f'label: {model["label"]}')

In [None]:
ts_model_id = client.foundation_models.TimeSeriesModels.GRANITE_TTM_512_96_R2

ts_model = TSModelInference(model_id=ts_model_id, api_client=client)
context_length = 512

### Download the data

We'll work with a [bike sharing dataset](https://archive.ics.uci.edu/dataset/275/bike+sharing+dataset) available from the UCI Machine Learning Repository. This dataset includes the count of rental bikes between the years 2011 and 2012 in the Capital Bikeshare system, along with the corresponding weather and seasonal information.

You can download the dataset to a temporary directory by running the following commands. Later you can clean up any downloaded files by removing the `temp` folder.

In [None]:
%%bash
BIKE_SHARING=bike+sharing+dataset.zip
# Alternative to the wget line below, if wget is unavailable:
# curl https://archive.ics.uci.edu/static/public/275/$BIKE_SHARING -o $BIKE_SHARING
# Download and unzip only if the temp directory does not already exist.
test -d temp || ( \
  mkdir -p temp && \
  cd temp && \
  wget https://archive.ics.uci.edu/static/public/275/$BIKE_SHARING -O $BIKE_SHARING && \
  unzip -o $BIKE_SHARING && \
  rm -f $BIKE_SHARING && \
  cd - \
) && ls -l temp/
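
If `wget` and `unzip` are not available (for example on Windows), the same steps can be approximated with the Python standard library. This is a minimal sketch rather than part of the original recipe; `fetch_and_extract` is a hypothetical helper, and the URL is the one used in the shell cell above.

```python
import tempfile
import urllib.request
import zipfile
from pathlib import Path

DATASET_URL = "https://archive.ics.uci.edu/static/public/275/bike+sharing+dataset.zip"


def fetch_and_extract(url, dest_dir="temp"):
    """Download a zip archive and unpack it into dest_dir.

    Like the shell version, the download is skipped when dest_dir already exists.
    Returns the sorted names of the entries in dest_dir.
    """
    dest = Path(dest_dir)
    if not dest.is_dir():
        dest.mkdir(parents=True)
        with tempfile.TemporaryDirectory() as scratch:
            archive = Path(scratch) / "dataset.zip"
            urllib.request.urlretrieve(url, archive)
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


# Usage (requires network access):
# print(fetch_and_extract(DATASET_URL))
```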

In [None]:
DATA_FILE_PATH = "temp/hour.csv"

### Read in the data

We parse the CSV into a pandas dataframe, add each row's hour to its date to build full hourly timestamps, and convert those timestamps to ISO 8601 strings. The forecast request later uses the first `context_length` rows as a single input window.

In [None]:
timestamp_column = "dteday"
target_columns = ["casual", "registered"]

# Read in the data from the downloaded file.
input_df = pd.read_csv(DATA_FILE_PATH, parse_dates=[timestamp_column])

# The date column has no time component; add each row's hour (`hr`) to build full hourly timestamps.
input_df[timestamp_column] = input_df[timestamp_column] + pd.to_timedelta(input_df["hr"], unit="h")

# Convert the timestamps to ISO 8601 strings.
input_df[timestamp_column] = input_df[timestamp_column].apply(lambda x: x.isoformat())

# Show the last few rows of the dataset.
input_df.tail()
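
To make the timestamp handling concrete, here is a standard-library sketch of the same idea: combine a date with an hour offset, then render the result as an ISO 8601 string. The sample rows are illustrative values, not taken from the dataset, and `to_hourly_iso` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def to_hourly_iso(date_str, hour):
    """Combine a 'YYYY-MM-DD' date and an integer hour (0-23) into an ISO 8601 string."""
    return (datetime.fromisoformat(date_str) + timedelta(hours=hour)).isoformat()


# Illustrative (date, hour) pairs in the same shape as the `dteday` and `hr` columns.
rows = [("2011-01-01", 0), ("2011-01-01", 1), ("2011-01-01", 23)]
stamps = [to_hourly_iso(d, h) for d, h in rows]
print(stamps)
# ['2011-01-01T00:00:00', '2011-01-01T01:00:00', '2011-01-01T23:00:00']
```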

In [None]:
forecasting_params = TSForecastParameters(
    timestamp_column=timestamp_column, freq="1h", target_columns=target_columns, prediction_length=20
)

In [None]:
results = ts_model.forecast(data=input_df[:context_length], params=forecasting_params)
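
The call above sends the first `context_length` rows and asks for `prediction_length=20` steps at an hourly frequency. The sketch below works out which timestamps such a forecast horizon would cover, assuming the forecast starts one step after the last context row. That starting point is an assumption; the timestamps returned by the service are authoritative. `forecast_timestamps` is a hypothetical helper and the input timestamp is illustrative.

```python
from datetime import datetime, timedelta


def forecast_timestamps(last_context_iso, prediction_length, step=timedelta(hours=1)):
    """List the ISO 8601 timestamps for the steps that follow the last context row."""
    last = datetime.fromisoformat(last_context_iso)
    return [(last + step * k).isoformat() for k in range(1, prediction_length + 1)]


horizon = forecast_timestamps("2011-01-22T23:00:00", prediction_length=20)
print(len(horizon), horizon[0], horizon[-1])
# 20 2011-01-23T00:00:00 2011-01-23T19:00:00
```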

In [None]:
pd.DataFrame(results["results"][0], columns=[timestamp_column] + target_columns)