NeuralChat is a customizable chat framework that lets users build their own chatbot within minutes on multiple architectures. This notebook demonstrates how to build a talking chatbot on 4th Generation Intel® Xeon® Scalable Processors (Sapphire Rapids).

4th Generation Intel® Xeon® Scalable processors provide two instruction sets, AMX_BF16 and AMX_INT8, which accelerate bfloat16 and int8 operations respectively.
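Before running the rest of the notebook, it can be useful to confirm that the host actually exposes these instruction sets. The following is a minimal sketch, assuming a Linux host where the kernel reports CPU features in `/proc/cpuinfo` under the names `amx_tile`, `amx_bf16`, and `amx_int8`. The helper `amx_support` is a hypothetical name introduced here for illustration and is not part of NeuralChat.

```python
from pathlib import Path

# Flag names as reported by the Linux kernel in /proc/cpuinfo.
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def amx_support(cpuinfo_text: str) -> dict:
    """Return {flag: present?} for the AMX flags in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # The x86 feature line looks like: "flags\t\t: fpu vme ... amx_bf16 ..."
        if line.startswith("flags"):
            flags.update(line.partition(":")[2].split())
            break  # every logical CPU repeats the same line; one is enough
    return {name: name in flags for name in AMX_FLAGS}


cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # Linux only; other platforms skip the check
    print(amx_support(cpuinfo.read_text()))
```

If `amx_bf16` is reported as `False`, the BF16 examples in this notebook should still run, but they are unlikely to benefit from AMX acceleration.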

# Prepare Environment

Install Intel Extension for Transformers:

In [1]:
!pip install intel-extension-for-transformers

Defaulting to user installation because normal site-packages is not writeable


Install Requirements:

In [4]:
!git clone https://github.com/intel/intel-extension-for-transformers.git

fatal: destination path 'intel-extension-for-transformers' already exists and is not an empty directory.


In [5]:
%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/
!pip install -r requirements_cpu.txt
%cd ../../../

/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat
Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pytorch-extension.intel.com/release-whl/stable/cpu/us/
/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks


# Build your chatbot 💻

## Text Chat

Given a textual instruction, NeuralChat responds with text.

In [3]:
# BF16 Optimization
from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig
from intel_extension_for_transformers.transformers import MixedPrecisionConfig
config = PipelineConfig(optimization_config=MixedPrecisionConfig())
chatbot = build_chatbot(config)
response = chatbot.predict(query="Tell me about Intel Xeon Scalable Processors.")
print(response)


ModuleNotFoundError: No module named 'intel_extension_for_transformers'

## Text Chat With Retrieval Plugin

Users can also leverage the NeuralChat retrieval plugin for domain-specific chat by feeding it documents, as shown below:

In [5]:
%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval/
!pip install -r requirements.txt
%cd ../../../../../../

/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval
Defaulting to user installation because normal site-packages is not writeable
/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks


In [11]:
!mkdir docs
%cd docs
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.jsonl
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.txt
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/docs/sample.xlsx
%cd ..

mkdir: cannot create directory ‘docs’: File exists
/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks/docs
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   854  100   854    0     0  10878      0 --:--:-- --:--:-- --:--:-- 10948
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    59  100    59    0     0    675      0 --:--:-- --:--:-- --:--:--   686
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  8577  100  8577    0     0  54660      0 --:--:-- --:--:-- --:--:-- 54980
/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_trans
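
Since the retrieval plugin is pointed at a whole directory, a quick inventory of that directory can catch a failed download before the index is built. This is a minimal sketch using only the standard library. `count_by_extension` is a hypothetical helper, not a NeuralChat API, and it makes no claim about which file types the plugin accepts.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(folder: str) -> Counter:
    """Count regular files in `folder` (non-recursive) by lowercase extension."""
    return Counter(p.suffix.lower() for p in Path(folder).iterdir() if p.is_file())


docs_dir = Path("./docs")
if docs_dir.is_dir():  # only report when the directory from the cell above exists
    print(dict(count_by_extension(str(docs_dir))))
```

With the three sample files downloaded above, one file each would be expected for `.jsonl`, `.txt`, and `.xlsx`.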

In [12]:
from intel_extension_for_transformers.neural_chat import PipelineConfig
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import plugins
plugins.retrieval.enable = True
plugins.retrieval.args["input_path"] = "./docs/"
config = PipelineConfig(plugins=plugins)
chatbot = build_chatbot(config)
response = chatbot.predict("How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?")
print(response)

ModuleNotFoundError: No module named 'intel_extension_for_transformers'

## Voice Chat with ASR & TTS Plugin

In voice chat, users can choose among three modes: audio input with audio output, audio input with text output, or text input with audio output.

With the Python API, users select a voice chat mode by enabling or disabling the ASR and TTS plugins.
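
The mapping from mode to plugin switches can be sketched as follows. This is an illustration only: `voice_mode_flags` is a hypothetical helper, not part of NeuralChat, and the sketch assumes that ASR is needed exactly when the query is audio and TTS exactly when the reply should be audio.

```python
def voice_mode_flags(audio_in: bool, audio_out: bool) -> dict:
    """Map a voice chat mode to the plugin switches it needs.

    ASR (speech-to-text) is needed only when the query is audio;
    TTS (text-to-speech) is needed only when the reply should be audio.
    """
    return {"asr": audio_in, "tts": audio_out}


modes = {
    "audio in, audio out": voice_mode_flags(True, True),
    "audio in, text out": voice_mode_flags(True, False),
    "text in, audio out": voice_mode_flags(False, True),
}
for name, flags in modes.items():
    print(f"{name}: asr.enable={flags['asr']}, tts.enable={flags['tts']}")

# Applying a mode to NeuralChat's plugins object (requires the package):
#   plugins.asr.enable = flags["asr"]
#   plugins.tts.enable = flags["tts"]
```

The notebook's own voice chat cell corresponds to the first mode, with both plugins enabled.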

In [8]:
%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/
!pip install -r requirements.txt
%cd ../../../../../../

/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/pipeline/plugins/audio
Defaulting to user installation because normal site-packages is not writeable
Ignoring openjtalk: markers 'sys_platform != "linux"' don't match your environment
/home/u09eaec73fc2ef253e8aa0ff1cb68e0f/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks


In [14]:
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/speaker_embeddings/spk_embed_default.pt
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2816  100  2816    0     0  12204      0 --:--:-- --:--:-- --:--:-- 12190
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 48172  100 48172    0     0   171k      0 --:--:-- --:--:-- --:--:--  171k


In [15]:
from intel_extension_for_transformers.neural_chat import PipelineConfig
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import plugins
plugins.tts.enable = True
plugins.tts.args["output_audio_path"] = "./response.wav"
plugins.asr.enable = True

config = PipelineConfig(plugins=plugins)
chatbot = build_chatbot(config)
result = chatbot.predict(query="./sample.wav")
print(result)

ModuleNotFoundError: No module named 'intel_extension_for_transformers'

# Low Precision Optimization

## BF16
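
For intuition about what BF16 trades away: bfloat16 keeps the sign bit and all 8 exponent bits of a float32 but only the top 7 of its 23 mantissa bits. The sketch below emulates that by truncating the float32 bit pattern with the standard library. It is an illustration under a simplifying assumption (plain truncation, whereas real hardware and frameworks typically round to nearest), and `to_bfloat16` is a hypothetical helper, not something NeuralChat provides.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round-trip x through bfloat16 by truncation and return the result."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    bits &= 0xFFFF0000  # keep sign, 8 exponent bits, top 7 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


for value in (1.0, 3.14159265, 1e-3, 65504.0):
    print(f"{value!r:>12} -> {to_bfloat16(value)!r}")
```

The dynamic range matches float32, which is why BF16 usually works for inference without rescaling, while the relative precision drops to roughly two to three decimal digits.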

In [13]:
# BF16 Optimization
from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig
from intel_extension_for_transformers.transformers import MixedPrecisionConfig
config = PipelineConfig(optimization_config=MixedPrecisionConfig())
chatbot = build_chatbot(config)
response = chatbot.predict(query="Tell me about Intel Xeon Scalable Processors.")
print(response)

ModuleNotFoundError: No module named 'intel_extension_for_transformers'