Jupyter Notebooks for NeuralChat (#277)
* Jupyter Notebooks for NeuralChat

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

* update build and deploy chatbot

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

* added NeuralChat optimization notebooks.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* Update tts.py

* Add Notebooks for finetuning chatbot on various platforms (#309)

* fix config

Signed-off-by: XuhuiRen <xuhui.ren@intel.com>

* add notebook

Signed-off-by: XuhuiRen <xuhui.ren@intel.com>

---------

Signed-off-by: XuhuiRen <xuhui.ren@intel.com>

* fix as suggestions

Signed-off-by: XuhuiRen <xuhui.ren@intel.com>

* Update tts.py

* Update build_chatbot_on_spr.ipynb

* Update build_chatbot_on_spr.ipynb

* Update tts.py

* update notebook

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

* update notebook

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

* fix pylint issue

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>

---------

Signed-off-by: Lv, Liang1 <liang1.lv@intel.com>
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Signed-off-by: XuhuiRen <xuhui.ren@intel.com>
Co-authored-by: Ye, Xinyu <xinyu.ye@intel.com>
Co-authored-by: Liangyx2 <106130696+Liangyx2@users.noreply.github.com>
Co-authored-by: Haihao Shen <haihao.shen@intel.com>
Co-authored-by: XuhuiRen <44249229+XuhuiRen@users.noreply.github.com>
Co-authored-by: XuhuiRen <xuhui.ren@intel.com>
6 people committed Sep 14, 2023
1 parent 276f889 commit 52f9f74
Showing 26 changed files with 3,293 additions and 1,176 deletions.
@@ -513,7 +513,7 @@ def concatenate_data(dataset, max_seq_length):
data_collator=data_collator,
)
else:
from optimum.habana import GaudiConfig, GaudiTrainer # pylint: disable=E0611
from optimum.habana import GaudiConfig, GaudiTrainer # pylint: disable=E0611 E0401

gaudi_config = GaudiConfig()
gaudi_config.use_fused_adam = True
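
For context, here is a minimal sketch of how the `GaudiConfig`/`GaudiTrainer` pieces from optimum-habana are typically wired into a fine-tuning loop. This is not the repository's exact code: the model name, training arguments, and toy dataset below are illustrative placeholders for the objects built in the elided parts of this file.

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments  # pylint: disable=E0611 E0401

# Placeholder model purely for illustration.
model_name = "EleutherAI/gpt-neo-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tiny toy dataset standing in for the concatenate_data() pipeline above.
train_dataset = Dataset.from_dict({"text": ["Intel Gaudi accelerators run deep learning workloads."] * 8})
train_dataset = train_dataset.map(lambda ex: tokenizer(ex["text"], truncation=True), batched=True)

# Habana-specific configuration; use_fused_adam matches the line added in this diff.
gaudi_config = GaudiConfig()
gaudi_config.use_fused_adam = True

training_args = GaudiTrainingArguments(
    output_dir="./finetuned_model",
    use_habana=True,      # run on HPU devices
    use_lazy_mode=True,   # Habana lazy-mode graph execution
    num_train_epochs=1,
)

trainer = GaudiTrainer(
    model=model,
    gaudi_config=gaudi_config,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```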
42 changes: 34 additions & 8 deletions intel_extension_for_transformers/neural_chat/README.md
@@ -170,12 +170,38 @@ The table below displays the validated model list in NeuralChat for both inferen

## Jupyter Notebooks

Check out the latest notebooks to know how to build and customize a chatbot on different platforms.

| **Notebook** | **Description** |
| :----------: | :-------------: |
| [build chatbot on Intel Xeon Platforms](./docs/notebooks/chatbot_on_intel_cpu.ipynb) | create a chatbot on Intel Xeon Platforms|
| [build chatbot on Intel Habana Platforms](./docs/notebooks/chatbot_on_intel_habana_hpu.ipynb) | create a chatbot on Intel Habana Platforms|
| [build chatbot on Nvidia GPU Platforms](./docs/notebooks/chatbot_on_nv_gpu.ipynb) | create a chatbot on Nvidia GPU Platforms|
| [finetune on Nvidia GPU Platforms](./examples/instruction_tuning/finetune_on_Nvidia_GPU.ipynb) | fine-tune LLaMA2 and MPT on Nvidia GPU Platforms|
Welcome! These Jupyter notebooks show how to build and customize chatbots across a wide range of platforms, including Intel Xeon CPU (ICX and SPR), Intel XPU, Intel Habana Gaudi1/Gaudi2, and Nvidia GPU. Dive into the detailed guides below to learn how to develop chatbots on these computing platforms; a minimal usage example of the underlying API follows the table.

| Chapter | Section | Description | Notebook Link |
| ------- | --------------------------------------------- | ---------------------------------------------------------- | ------------------------------------------------------- |
| 1 | Building a Chatbot on Different Platforms | | |
| 1.1 | Building a Chatbot on Intel CPU ICX | Learn how to create a chatbot on ICX. | [Notebook](./docs/notebooks/build_chatbot_on_icx.ipynb) |
| 1.2 | Building a Chatbot on Intel CPU SPR | Learn how to create a chatbot on SPR. | [Notebook](./docs/notebooks/build_chatbot_on_spr.ipynb) |
| 1.3 | Building a Chatbot on Intel XPU | Learn how to create a chatbot on XPU. | [Notebook](./docs/notebooks/build_chatbot_on_xpu.ipynb) |
| 1.4 | Building a Chatbot on Habana Gaudi1/Gaudi2 | Instructions for building a chatbot on Intel Habana Gaudi1/Gaudi2. | [Notebook](./docs/notebooks/build_chatbot_on_habana_gaudi.ipynb) |
| 1.5 | Building a Chatbot on Nvidia A100 | Learn how to create a chatbot on Nvidia A100 platforms. | [Notebook](./docs/notebooks/build_chatbot_on_nv_a100.ipynb) |
| 2 | Deploying Chatbots as Services on Different Platforms | | |
| 2.1 | Deploying a Chatbot on Intel CPU ICX | Instructions for deploying a chatbot on ICX. | [Notebook](./docs/notebooks/deploy_chatbot_on_icx.ipynb) |
| 2.2 | Deploying a Chatbot on Intel CPU SPR | Instructions for deploying a chatbot on SPR. | [Notebook](./docs/notebooks/deploy_chatbot_on_spr.ipynb) |
| 2.3 | Deploying a Chatbot on Intel XPU | Learn how to deploy a chatbot on Intel XPU. | [Notebook](./docs/notebooks/deploy_chatbot_on_xpu.ipynb) |
| 2.4 | Deploying a Chatbot on Habana Gaudi1/Gaudi2 | Instructions for deploying a chatbot on Intel Habana Gaudi1/Gaudi2. | [Notebook](./docs/notebooks/deploy_chatbot_on_habana_gaudi.ipynb) |
| 2.5 | Deploying a Chatbot on Nvidia A100 | Learn how to deploy a chatbot as a service on Nvidia A100 platforms. | [Notebook](./docs/notebooks/deploy_chatbot_on_nv_a100.ipynb) |
| 2.6 | Deploying a Chatbot with Load Balancing | Learn how to deploy a chatbot as a service with load balancing. | [Notebook](./docs/notebooks/chatbot_with_load_balance.ipynb) |
| 3 | Optimizing Chatbots on Different Platforms | | |
| 3.1 | AMP Optimization on SPR | Optimize your chatbot using Automatic Mixed Precision (AMP) on SPR platforms. | [Notebook](./docs/notebooks/amp_optimization_on_spr.ipynb) |
| 3.2 | AMP Optimization on Habana Gaudi1/Gaudi2 | Learn how to optimize your chatbot with AMP on Intel Habana Gaudi1/Gaudi2 platforms. | [Notebook](./docs/notebooks/amp_optimization_on_habana_gaudi.ipynb) |
| 3.3 | Weight-Only Optimization on Nvidia A100 | Optimize your chatbot using Weight-Only optimization on Nvidia A100. | [Notebook](./docs/notebooks/weight_only_optimization_on_nv_a100.ipynb) |
| 4 | Fine-Tuning Chatbots on Different Platforms | | |
| 4.1 | Single-Node Fine-Tuning on SPR | Fine-tune your chatbot on SPR platforms using a single node. | [Notebook](./docs/notebooks/single_node_finetuning_on_spr.ipynb) |
| 4.2 | Multi-Node Fine-Tuning on SPR | Fine-tune your chatbot on SPR platforms using multiple nodes. | [Notebook](./docs/notebooks/multi_node_finetuning_on_spr.ipynb) |
| 4.3 | Single-Card Fine-Tuning on Habana Gaudi1/Gaudi2 | Instructions for single-card fine-tuning on Intel Habana Gaudi1/Gaudi2. | [Notebook](./docs/notebooks/single_card_finetuning_on_habana_gaudi.ipynb) |
| 4.4 | Multi-Card Fine-Tuning on Habana Gaudi1/Gaudi2 | Learn how to perform multi-card fine-tuning on Intel Habana Gaudi1/Gaudi2. | [Notebook](./docs/notebooks/multi_card_finetuning_on_habana_gaudi.ipynb) |
| 4.5 | Fine-Tuning on Nvidia A100 | Fine-tune your chatbot on Nvidia A100 platforms. | [Notebook](./docs/notebooks/finetuning_on_nv_a100.ipynb) |
| 5 | Customizing Chatbots on Different Platforms | | |
| 5.1 | Using Plugins to Customize Chatbots | Customize your chatbot using plugins. | [Notebook](./docs/notebooks/customize_chatbot_with_plugins.ipynb) |
| 5.2 | Registering New Models to Customize Chatbots | | |
| 5.2.1 | Using Fine-Tuned Models to Customize Chatbots | Instructions for using fine-tuned models to customize chatbots. | [Notebook](./docs/notebooks/customize_chatbot_with_finetuned_models.ipynb) |
| 5.2.2 | Using Optimized Models to Customize Chatbots | Customize chatbots using optimized models. | [Notebook](./docs/notebooks/customize_chatbot_with_optimized_models.ipynb) |
| 5.2.3 | Using New LLM Models to Customize Chatbots | Learn how to use new LLM models for chatbot customization. | [Notebook](./docs/notebooks/customize_chatbot_with_new_llm_models.ipynb) |
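
For orientation, the snippet below shows the high-level `build_chatbot` API that these notebooks build on; it mirrors the AMP optimization notebook cells added in this commit, and the sample query is only an illustration.

```python
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat.config import PipelineConfig, AMPConfig

# Build a chatbot pipeline with BF16 automatic mixed precision enabled,
# the same pattern used in the AMP optimization notebooks.
config = PipelineConfig(optimization_config=AMPConfig())
chatbot = build_chatbot(config)

# Any prompt string works here; this one matches the notebook cells.
response = chatbot.predict(query="Tell me about Intel Xeon Scalable Processors.")
print(response)
```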


6 changes: 3 additions & 3 deletions intel_extension_for_transformers/neural_chat/config.py
@@ -81,7 +81,7 @@ class ModelArguments:
},
)
use_fast_tokenizer: bool = field(
default=True,
default=False,
metadata={
"help": "Whether to use one of the fast tokenizer (backed by the tokenizers library) or not."
},
@@ -312,7 +312,7 @@ class FinetuningArguments:
},
)
lora_all_linear: bool = field(
default=False,
default=True,
metadata={"help": "if True, will add adaptor for all linear for lora finetuning"},
)
task: Optional[str] = field(
@@ -322,7 +322,7 @@
},
)
do_lm_eval: bool = field(
default=False,
default=True,
metadata={"help": "whether to run the LM evaluation with EleutherAI/lm-evaluation-harness"},
)
lm_eval_tasks: Optional[List[str]] = field(
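
As a hedged illustration (not code taken from this repository), dataclass fields like these are commonly parsed with `transformers.HfArgumentParser`, so the new defaults above are what users get unless they override them on the command line:

```python
from dataclasses import dataclass, field

from transformers import HfArgumentParser


@dataclass
class FinetuningArguments:
    # Mirrors the defaults changed in this commit.
    lora_all_linear: bool = field(
        default=True,
        metadata={"help": "if True, will add adaptor for all linear for lora finetuning"},
    )
    do_lm_eval: bool = field(
        default=True,
        metadata={"help": "whether to run the LM evaluation with EleutherAI/lm-evaluation-harness"},
    )


if __name__ == "__main__":
    # Command-line flags override the defaults; omitted flags keep them.
    parser = HfArgumentParser(FinetuningArguments)
    (finetune_args,) = parser.parse_args_into_dataclasses()
    print(finetune_args)
```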
@@ -0,0 +1,61 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AMP Optimization of Chatbot on Habana's Gaudi processors(HPU)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prepare Environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**IMPORTANT:** Please note Habana's Gaudi processors(HPU) requires docker environment for running. User needs to manually execute below steps to build docker image and run docker container for inference on Habana HPU. The Jupyter notebook server should be started in the docker container and then run this Jupyter notebook. \n",
"\n",
"```bash\n",
"git clone https://github.com/intel/intel-extension-for-transformers.git\n",
"cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docker/\n",
"docker build --build-arg UBUNTU_VER=22.04 -f Dockerfile -t neuralchat . --target hpu\n",
"docker run -it --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host neuralchat:latest\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## BF16 Optimization"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot\n",
"from intel_extension_for_transformers.neural_chat.config import PipelineConfig, AMPConfig\n",
"config = PipelineConfig(optimization_config=AMPConfig())\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
"print(response)"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
@@ -0,0 +1,94 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AMP Optimization of Chatbot on 4th Generation of Intel® Xeon® Scalable Processors Sapphire Rapids"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prepare Environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install intel extension for transformers:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install intel-extension-for-transformers"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install Requirements:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%cd ../../\n",
"!pip install -r requirements.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## BF16 Optimization"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from intel_extension_for_transformers.neural_chat import build_chatbot\n",
"from intel_extension_for_transformers.neural_chat.config import PipelineConfig, AMPConfig\n",
"config = PipelineConfig(optimization_config=AMPConfig())\n",
"chatbot = build_chatbot(config)\n",
"response = chatbot.predict(query=\"Tell me about Intel Xeon Scalable Processors.\")\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "py39",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
