feat: add tigemen docs (#3196)

* feat: add timegen nbs
* feat: add codeowner
* feat: add prerequisites to all tutorials
* feat: ci errors
* fix: rm outputs

---------

Co-authored-by: Kriti <53083330+fkriti@users.noreply.github.com>
Azure · May 20, 2024 · ccc9f8d · ccc9f8d
1 parent 9a48258
commit ccc9f8d
Showing 6 changed files with 1,596 additions and 0 deletions.
diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
@@ -17,6 +17,7 @@
 /sdk/python/foundation-models/cohere/cohere-aisearch-langchain-rag.ipynb @stewart-co @kseniia-cohere  
 sdk/python/foundation-models/cohere/command_faiss_langchain.ipynb @stewart-co @kseniia-cohere  
 sdk/python/foundation-models/cohere/command_tools-langchain.ipynb @stewart-co @kseniia-cohere
+/sdk/python/foundation-models/nixtla/ @AzulGarza
 
 #### files referenced in docs (DO NOT EDIT, except for Docs team!!!) #############################################################################################
 /cli/assets/component/train.yml @sdgilley @msakande @Blackmist @ssalgadodev @lgayhardt @fbsolo-ms1  

diff --git a/sdk/python/foundation-models/nixtla/01_quickstart_forecast.ipynb b/sdk/python/foundation-models/nixtla/01_quickstart_forecast.ipynb
@@ -0,0 +1,112 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Prerequisites\n",
+    "\n",
+    "Please make sure to follow these steps to start using TimeGEN: \n",
+    "\n",
+    "* Register for a valid Azure account with a subscription\n",
+    "* Make sure you have access to [Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/what-is-ai-studio?tabs=home)\n",
+    "* Create a project and resource group\n",
+    "* Select `TimeGEN-1`.\n",
+    "\n",
+    "    > Note that some models may not be available in every Azure AI and Azure Machine Learning region. In those cases, you can create a workspace or project in a region where the model is available and then consume it through a connection from a different region. To learn more about using connections, see [Consume models with connections](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deployments-connections).\n",
+    "\n",
+    "* Deploy with \"Pay-as-you-go\"\n",
+    "\n",
+    "Once the deployment succeeds, you should be assigned an API endpoint and a security key for inference.\n",
+    "\n",
+    "To complete this tutorial, you will need to:\n",
+    "\n",
+    "* Install `nixtla` and `pandas`:\n",
+    "\n",
+    "    ```bash\n",
+    "    pip install nixtla pandas\n",
+    "    ```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Quickstart\n",
+    "\n",
+    "To forecast with TimeGEN, call the `forecast` method with your DataFrame, the forecast horizon `h`, and the names of your time and target columns. You can then plot the predictions with the `plot` method."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "from nixtla import NixtlaClient"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Instantiate the Nixtla Client\n",
+    "nixtla_client = NixtlaClient(\n",
+    "    base_url=\"your azure ai endpoint\",\n",
+    "    api_key=\"your api_key\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Read the data\n",
+    "df = pd.read_csv(\n",
+    "    \"https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv\"\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Forecast\n",
+    "forecast_df = nixtla_client.forecast(\n",
+    "    df=df,\n",
+    "    h=12,\n",
+    "    time_col=\"timestamp\",\n",
+    "    target_col=\"value\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Plot predictions\n",
+    "nixtla_client.plot(\n",
+    "    df=df, forecasts_df=forecast_df, time_col=\"timestamp\", target_col=\"value\"\n",
+    ")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "python3",
+   "language": "python",
+   "name": "python3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
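
The quickstart cells above hard-code the endpoint and key, and they assume the CSV exposes `timestamp` and `value` columns. The following is a minimal pre-flight sketch, not part of this commit or of the `nixtla` package. The environment variable names `TIMEGEN_ENDPOINT` and `TIMEGEN_API_KEY` and the helpers `load_credentials` and `check_frame` are hypothetical, and the demo frame is synthetic so the block runs offline.

```python
import os

import pandas as pd


def load_credentials(env=os.environ):
    """Read endpoint and key from the environment (variable names are assumptions)."""
    wanted = {"base_url": "TIMEGEN_ENDPOINT", "api_key": "TIMEGEN_API_KEY"}
    creds = {arg: env.get(var) for arg, var in wanted.items()}
    missing = [wanted[arg] for arg, val in creds.items() if not val]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return creds


def check_frame(df, time_col="timestamp", target_col="value"):
    """Return a time-sorted copy with a parsed time column, or raise if unusable."""
    for col in (time_col, target_col):
        if col not in df.columns:
            raise ValueError(f"column {col!r} not found; got {list(df.columns)}")
    out = df.copy()
    out[time_col] = pd.to_datetime(out[time_col])
    if out[target_col].isna().any():
        raise ValueError(f"column {target_col!r} contains missing values")
    return out.sort_values(time_col).reset_index(drop=True)


# Synthetic stand-in for the air passengers CSV (values are made up).
demo = pd.DataFrame(
    {
        "timestamp": ["1949-03-01", "1949-01-01", "1949-02-01"],
        "value": [132, 112, 118],
    }
)
clean = check_frame(demo)

# Placeholder values passed explicitly so the sketch does not depend on the shell.
creds = load_credentials(
    {"TIMEGEN_ENDPOINT": "https://example.invalid", "TIMEGEN_API_KEY": "placeholder"}
)
```

With real values in the environment, `NixtlaClient(**creds)` would take the place of the hard-coded constructor call, and `clean` would be passed as `df` to `forecast`, as in the notebook.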
diff --git a/sdk/python/foundation-models/nixtla/02_finetuning.ipynb b/sdk/python/foundation-models/nixtla/02_finetuning.ipynb
@@ -0,0 +1,181 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "a3e70828-d972-4231-aa21-89e5ede59366",
+   "metadata": {},
+   "source": [
+    "# Prerequisites\n",
+    "\n",
+    "Please make sure to follow these steps to start using TimeGEN: \n",
+    "\n",
+    "* Register for a valid Azure account with a subscription\n",
+    "* Make sure you have access to [Azure AI Studio](https://learn.microsoft.com/en-us/azure/ai-studio/what-is-ai-studio?tabs=home)\n",
+    "* Create a project and resource group\n",
+    "* Select `TimeGEN-1`.\n",
+    "\n",
+    "    > Note that some models may not be available in every Azure AI and Azure Machine Learning region. In those cases, you can create a workspace or project in a region where the model is available and then consume it through a connection from a different region. To learn more about using connections, see [Consume models with connections](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deployments-connections).\n",
+    "\n",
+    "* Deploy with \"Pay-as-you-go\"\n",
+    "\n",
+    "Once the deployment succeeds, you should be assigned an API endpoint and a security key for inference.\n",
+    "\n",
+    "To complete this tutorial, you will need to:\n",
+    "\n",
+    "* Install `nixtla` and `pandas`:\n",
+    "\n",
+    "    ```bash\n",
+    "    pip install nixtla pandas\n",
+    "    ```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "da753996-54f8-4244-a34e-7316b0c01827",
+   "metadata": {},
+   "source": [
+    "# Fine-tuning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "75a62889-d81e-462e-b235-c1eba1096da9",
+   "metadata": {},
+   "source": [
+    "Fine-tuning lets you use TimeGEN more effectively. Foundation models such as TimeGEN are pre-trained on vast amounts of data and capture wide-ranging features and patterns. They can then be specialized for a specific context or domain. During fine-tuning, the model's parameters are adjusted on your data, tailoring its pre-existing knowledge to the requirements of the new task. Fine-tuning thus bridges TimeGEN's broad capabilities and the specifics of your task.\n",
+    "\n",
+    "Concretely, fine-tuning performs a set number of training iterations on your input data to minimize the forecasting error. The forecasts are then produced with the updated model. To control the number of iterations, use the `finetune_steps` argument of the `forecast` method.\n",
+    "\n",
+    "To complete this tutorial, you will need to:\n",
+    "\n",
+    "* Install `nixtla` and `pandas`:\n",
+    "\n",
+    "    ```bash\n",
+    "    pip install nixtla pandas\n",
+    "    ```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "10ec4f03",
+   "metadata": {},
+   "source": [
+    "## 1. Import packages\n",
+    "First, we import the required packages and initialize the Nixtla client."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "98942108-d427-42d6-81f8-fa0bb5859395",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "from nixtla import NixtlaClient"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "64178d1c-957e-4a04-ab64-fde332b1840c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "nixtla_client = NixtlaClient(\n",
+    "    base_url=\"your azure ai endpoint\",\n",
+    "    api_key=\"your api_key\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8c2e5387",
+   "metadata": {},
+   "source": [
+    "## 2. Load data"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b78cc83e-7d34-4c37-906d-8c7ed1a977fb",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "df = pd.read_csv(\n",
+    "    \"https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/air_passengers.csv\"\n",
+    ")\n",
+    "df.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "09be4766",
+   "metadata": {},
+   "source": [
+    "## 3. Fine-tuning"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a683abc7-190c-40a6-a4e8-41a4c64bd773",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "timegpt_fcst_finetune_df = nixtla_client.forecast(\n",
+    "    df=df,\n",
+    "    h=12,\n",
+    "    finetune_steps=10,\n",
+    "    time_col=\"timestamp\",\n",
+    "    target_col=\"value\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "545ffdac-f166-417b-993f-78f51b0db6a1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "nixtla_client.plot(\n",
+    "    df,\n",
+    "    timegpt_fcst_finetune_df,\n",
+    "    time_col=\"timestamp\",\n",
+    "    target_col=\"value\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "62fc9cba-7c6e-4aef-9c68-e05d4fe8f7ba",
+   "metadata": {},
+   "source": [
+    "In this code, `finetune_steps=10` means the model will go through 10 iterations of training on your time series data.\n",
+    "\n",
+    "Fine-tuning usually takes some trial and error. You may need to adjust `finetune_steps` based on your needs and the complexity of your data. It's recommended to monitor the model's performance as you change this setting and adjust as needed. More `finetune_steps` may lead to longer training times and a higher risk of overfitting.\n",
+    "\n",
+    "Remember, fine-tuning is a powerful feature, but it should be used thoughtfully and carefully."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8c546351",
+   "metadata": {},
+   "source": [
+    "For a detailed guide on using a specific loss function for fine-tuning, check out the [Fine-tuning with a specific loss function](https://docs.nixtla.io/docs/tutorials-fine_tuning_with_a_specific_loss_function) tutorial."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "python3",
+   "language": "python",
+   "name": "python3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
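
The fine-tuning notebook recommends monitoring performance while adjusting `finetune_steps`, but does not show how. One common way to do that is a holdout check: withhold the last `h` observations, forecast them for each candidate setting, and compare the mean absolute error. The sketch below assumes that approach; `holdout_mae` and `naive_forecast` are hypothetical helpers, and `naive_forecast` is an offline stand-in that is not TimeGEN and ignores the `steps` argument, so the block runs without an endpoint.

```python
import pandas as pd


def holdout_mae(df, forecast_fn, h, steps_grid, target_col="value"):
    """Score each finetune_steps candidate by MAE on the last h observations."""
    train, test = df.iloc[:-h], df.iloc[-h:]
    actual = test[target_col].to_numpy(dtype=float)
    scores = {}
    for steps in steps_grid:
        # forecast_fn must return h predictions for the withheld period.
        preds = pd.Series(forecast_fn(train, h, steps), dtype=float).to_numpy()
        scores[steps] = float(abs(actual - preds).mean())
    return scores


def naive_forecast(train, h, steps):
    """Offline stand-in, NOT TimeGEN: repeats the last value and ignores `steps`."""
    return [train["value"].iloc[-1]] * h


# To score the real model, wrap the client call from the notebook instead, e.g.
#   lambda train, h, steps: nixtla_client.forecast(
#       df=train, h=h, finetune_steps=steps,
#       time_col="timestamp", target_col="value",
#   )[pred_col]
# where pred_col is the prediction column of the returned frame
# (its name is not stated in this commit, so it is left as an assumption).

# Made-up series, only to exercise the helper.
demo = pd.DataFrame(
    {
        "timestamp": pd.date_range("2020-01-01", periods=6, freq="MS"),
        "value": [10.0, 12.0, 14.0, 16.0, 18.0, 20.0],
    }
)
scores = holdout_mae(demo, naive_forecast, h=2, steps_grid=[0, 10])
best_steps = min(scores, key=scores.get)
```

Because the stand-in ignores `steps`, both candidates score the same here; with the real client, each candidate triggers its own fine-tuning run, so a small grid keeps cost and latency down.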