NeuralChat is a customizable chat framework that lets users create their own chatbot within a few minutes on multiple architectures. This notebook demonstrates how to build a talking chatbot on 3rd Generation Intel® Xeon® Scalable Processors (code-named Ice Lake).

# Prepare Environment

Install Requirements:

In [None]:
!git clone https://github.com/intel/intel-extension-for-transformers.git

In [None]:
%cd ./intel-extension-for-transformers/
!pip install -r requirements.txt
%cd ./intel_extension_for_transformers/neural_chat/
!pip install -r requirements.txt

Install Intel Extension for Transformers:

In [None]:
%cd ../../
!pip install -v .

In [None]:
!pip uninstall torch -y
!pip install torch

In [None]:
!conda list
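
The full `conda list` output is long. A minimal sketch for checking only the packages this notebook relies on, assuming the import names `intel_extension_for_transformers`, `torch`, and `transformers`; the helper `check_packages` is hypothetical and not part of NeuralChat:

```python
import importlib.util


def check_packages(names):
    """Return {import_name: True/False} depending on whether each package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Import names assumed for this notebook; adjust them to your environment.
status = check_packages(["intel_extension_for_transformers", "torch", "transformers"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'MISSING'}")
```

Any package reported as missing would need to be installed before moving on to the chatbot cells.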

In [None]:
%cd ..

# Build your chatbot 💻

## Text Chat

Given a textual instruction, NeuralChat replies with a textual response.

Python Code:

In [1]:
from intel_extension_for_transformers.neural_chat import build_chatbot
chatbot = build_chatbot()
response = chatbot.predict("Tell me about Intel Xeon Scalable Processors.")
print(response)

2023-10-30 00:35:32.287861: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-30 00:35:32.293245: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-10-30 00:35:32.293258: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


Loading config settings from the environment...
Loading model meta-llama/Llama-2-7b-chat-hf


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Model loaded.
[INST] Tell me about Intel Xeon Scalable Processors. [/INST]  Intel Xeon Scalable Processors are a line of high-performance server processors designed for data center and enterprise computing applications. These processors are part of Intel's "Xeon" family, which offers a range of models with different performance and power efficiency characteristics.

Here are some key features and benefits of Intel Xeon Scalable Processors:

1. Performance: Xeon Scalable Processors are designed to deliver high levels of processing power and performance, making them well-suited for demanding workloads such as virtualization, cloud computing, and big data analytics.
2. Power Efficiency: These processors are built with advanced power management technologies, such as Intel's Turbo Boost 3.0 and Dynamic Voltage and Frequency Scaling (DVFS), which help to reduce power consumption and improve energy efficiency.
3. Security: Xeon Scalable Processors include a range of security features, such as
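
The printed response echoes the prompt inside the `[INST] ... [/INST]` markers before the model's answer. A minimal sketch for keeping only the answer, assuming the response is a plain string in the format shown above; `strip_prompt_echo` is a hypothetical helper, not a NeuralChat API:

```python
def strip_prompt_echo(response: str, marker: str = "[/INST]") -> str:
    """Drop everything up to and including the last prompt marker, if present."""
    _, found, answer = response.rpartition(marker)
    # When the marker is absent, rpartition returns ('', '', response).
    return answer.strip() if found else response.strip()


sample = (
    "[INST] Tell me about Intel Xeon Scalable Processors. [/INST]  "
    "Intel Xeon Scalable Processors are a line of high-performance server processors."
)
print(strip_prompt_echo(sample))
```

Using `rpartition` rather than `split` keeps the text after the last marker, which matters if the instruction itself happens to contain the marker string.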

CLI command:

In [2]:
!neuralchat predict --query "Tell me about Intel Xeon Scalable Processors."

Loading config settings from the environment...
Loading model meta-llama/Llama-2-7b-chat-hf
Loading checkpoint shards: 100%|██████████████████| 2/2 [00:01<00:00,  1.54it/s]
Model loaded.
[INST] Tell me about Intel Xeon Scalable Processors. [/INST]  Intel Xeon Scalable Processors are a line of high-performance server processors designed for data center and enterprise computing applications. These processors are part of Intel's "Xeon" family, which offers a range of models with different performance and power efficiency characteristics.

Here are some key features and benefits of Intel Xeon Scalable Processors:

1. Performance: Xeon Scalable Processors are designed to deliver high levels of processing power and performance, making them well-suited for demanding workloads such as virtualization, cloud computing, and big data analytics.
2. Power Efficiency: These processors are built with advanced power management technologies, such as Intel's Turbo Boost 3.0 and Dynamic Voltage and Freque

## Text Chat With Retrieval Plugin

Users can also use the NeuralChat retrieval plugin for domain-specific chat by feeding it some documents, as shown below:

In [3]:
from intel_extension_for_transformers.neural_chat import PipelineConfig
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import plugins

# Enable the retrieval plugin and point it at the folder of documents to index.
plugins.retrieval.enable = True
plugins.retrieval.args["input_path"] = "../../assets/docs/"

# Build the chatbot with the plugin configuration and ask a domain-specific question.
config = PipelineConfig(plugins=plugins, model_name_or_path='Intel/neural-chat-7b-v1-1')
chatbot = build_chatbot(config)
response = chatbot.predict("How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?")

2023-10-30 00:37:41,725 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: BAAI/bge-base-en-v1.5


create retrieval plugin instance...
plugin parameters:  {'input_path': '../../assets/docs/'}


2023-10-30 00:37:42,877 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device: cpu
2023-10-30 00:37:43,098 - chromadb.telemetry.product.posthog - INFO - Anonymized telemetry enabled. See                     https://docs.trychroma.com/telemetry for more information.


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

The local knowledge base has been successfully built!
Loading model Intel/neural-chat-7b-v1-1
Model loaded.
Chat with AI Agent.
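
Conceptually, a retrieval plugin indexes the documents, finds the passages most similar to the query, and passes them to the model as context. The sketch below illustrates that idea with a toy bag-of-words similarity in pure Python. It is not NeuralChat's implementation (the log above indicates the plugin loads a sentence-transformers embedding model and Chroma), and the function names and sample passages are hypothetical:

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, passages, top_k=1):
    """Return the top_k passages ranked by similarity to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(passages, key=lambda p: cosine(q, Counter(tokenize(p))), reverse=True)
    return ranked[:top_k]


# Made-up passages for illustration only.
passages = [
    "The retrieval plugin builds a local knowledge base from documents.",
    "Text-to-speech converts a textual response into audio.",
]
query = "How is the knowledge base built from documents?"
context = retrieve(query, passages)[0]

# The retrieved passage is prepended to the question before it reaches the model.
prompt = f"Context: {context}\nQuestion: {query}"
print(prompt)
```

A real retrieval pipeline would replace the word counts with dense embeddings and the sorted list with a vector store, but the flow of query, ranking, and prompt assembly stays the same.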


## Voice Chat with ASR & TTS Plugin

Voice chat supports three modes: audio input with audio output, audio input with text output, and text input with audio output.

In the Python API, you select a mode by enabling or disabling the ASR and TTS plugins.
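
The three modes map onto the two plugin switches. A minimal sketch of that mapping, using a `SimpleNamespace` stand-in for the `plugins` object so it runs without NeuralChat installed; the mode names and the helper `set_voice_mode` are hypothetical, and only the `asr.enable` and `tts.enable` attributes come from this notebook:

```python
from types import SimpleNamespace

# Hypothetical mode names -> (ASR enabled, TTS enabled).
VOICE_MODES = {
    "audio_to_audio": (True, True),
    "audio_to_text": (True, False),
    "text_to_audio": (False, True),
}


def set_voice_mode(plugins, mode):
    """Toggle the ASR and TTS plugins according to the requested mode."""
    asr_on, tts_on = VOICE_MODES[mode]
    plugins.asr.enable = asr_on
    plugins.tts.enable = tts_on
    return plugins


# Stand-in for `intel_extension_for_transformers.neural_chat.plugins`.
plugins = SimpleNamespace(asr=SimpleNamespace(enable=False), tts=SimpleNamespace(enable=False))
set_voice_mode(plugins, "audio_to_text")
print(plugins.asr.enable, plugins.tts.enable)
```

With the real `plugins` object, the same two assignments would be made before constructing `PipelineConfig`.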

In [4]:
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/speaker_embeddings/spk_embed_default.pt
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2816  100  2816    0     0  16769      0 --:--:-- --:--:-- --:--:-- 16862
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 48172  100 48172    0     0   249k 
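
The `huggingface/tokenizers` warning above most likely appears because the `!curl` commands fork the notebook process after a tokenizer has already used parallelism in an earlier cell. As the warning itself suggests, setting `TOKENIZERS_PARALLELISM` explicitly avoids it. A minimal sketch, assuming it runs in a cell before the chatbot is first built:

```python
import os

# Set explicitly, as the warning suggests; "false" disables tokenizer parallelism.
os.environ["TOKENIZERS_PARALLELISM"] = "false"
print(os.environ["TOKENIZERS_PARALLELISM"])
```

The warning is harmless for these downloads, so this step only reduces noise in the cell output.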

In [5]:
from intel_extension_for_transformers.neural_chat import PipelineConfig
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import plugins
plugins.tts.enable = True
plugins.tts.args["output_audio_path"] = "./response.wav"
plugins.asr.enable = True

config = PipelineConfig(plugins=plugins, model_name_or_path='Intel/neural-chat-7b-v1-1')
chatbot = build_chatbot(config)
result = chatbot.predict(query="./sample.wav")
print(result)

create tts plugin instance...
plugin parameters:  {'output_audio_path': './response.wav'}


2023-10-30 00:39:39,560 - speechbrain.pretrained.fetching - INFO - Fetch hyperparams.yaml: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/hyperparams.yaml.
2023-10-30 00:39:39,560 - speechbrain.pretrained.fetching - INFO - Fetch custom.py: Delegating to Huggingface hub, source speechbrain/spkrec-xvect-voxceleb.
2023-10-30 00:39:39,710 - speechbrain.pretrained.fetching - INFO - Fetch embedding_model.ckpt: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/embedding_model.ckpt.
2023-10-30 00:39:39,711 - speechbrain.pretrained.fetching - INFO - Fetch mean_var_norm_emb.ckpt: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/mean_var_norm_emb.ckpt.
2023-10-30 00:39:39,712 - speechbrain.pretrained.fetching - INFO - Fetch classifier.ckpt: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/classifier.ckpt.
2023-10-30 00:39:39,712 - speechbrain.pretrained.fetching - INFO - Fetch label_encoder.txt: Using existing fi

create asr plugin instance...
plugin parameters:  {}


2023-10-30 00:39:42,239 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: BAAI/bge-base-en-v1.5


create retrieval plugin instance...
plugin parameters:  {'input_path': '../../assets/docs/'}


2023-10-30 00:39:43,023 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device: cpu


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

The local knowledge base has been successfully built!
Loading model Intel/neural-chat-7b-v1-1
Model loaded.
generated text in 0.5537045001983643 seconds, and the result is: who is pat gelsinger
Chat with AI Agent.
assistant
I don't know this person.#---#
## --- ### --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- ## --- #
["assistant I don't know this person.# # ## ### ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## ## #."]
No customized speaker embedding or default embedding are found! Use the backup one
./response.wav


You can display the generated wav file using IPython.
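
A minimal sketch of playing the file, assuming IPython is installed and `./response.wav` was written by the TTS plugin in the previous cell; the helper `make_player` is hypothetical:

```python
import os


def make_player(path="./response.wav"):
    """Return an IPython audio widget for an existing audio file."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Audio file not found: {path}")
    # Imported lazily so the path check works even without IPython.
    from IPython.display import Audio
    return Audio(filename=path)


# In a notebook, make the call the last expression of a cell to render the player:
# make_player("./response.wav")
```

Checking the path first gives a clear error if the TTS step did not produce a file.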

In [6]:
from intel_extension_for_transformers.neural_chat import PipelineConfig
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import plugins
plugins.tts.enable = True
plugins.tts.args["output_audio_path"] = "./response.wav"
plugins.asr.enable = True

config = PipelineConfig(plugins=plugins, model_name_or_path='Intel/neural-chat-7b-v1-1')
chatbot = build_chatbot(config)
result = chatbot.predict(query="./sample.wav")
print(result)

create tts plugin instance...
plugin parameters:  {'output_audio_path': './response.wav'}


2023-10-30 00:40:59,520 - speechbrain.pretrained.fetching - INFO - Fetch hyperparams.yaml: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/hyperparams.yaml.
2023-10-30 00:40:59,521 - speechbrain.pretrained.fetching - INFO - Fetch custom.py: Delegating to Huggingface hub, source speechbrain/spkrec-xvect-voxceleb.
2023-10-30 00:40:59,675 - speechbrain.pretrained.fetching - INFO - Fetch embedding_model.ckpt: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/embedding_model.ckpt.
2023-10-30 00:40:59,676 - speechbrain.pretrained.fetching - INFO - Fetch mean_var_norm_emb.ckpt: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/mean_var_norm_emb.ckpt.
2023-10-30 00:40:59,676 - speechbrain.pretrained.fetching - INFO - Fetch classifier.ckpt: Using existing file/symlink in /tmp/speechbrain/spkrec-xvect-voxceleb/classifier.ckpt.
2023-10-30 00:40:59,676 - speechbrain.pretrained.fetching - INFO - Fetch label_encoder.txt: Using existing fi

create asr plugin instance...
plugin parameters:  {}


2023-10-30 00:41:02,335 - sentence_transformers.SentenceTransformer - INFO - Load pretrained SentenceTransformer: BAAI/bge-base-en-v1.5


create retrieval plugin instance...
plugin parameters:  {'input_path': '../../assets/docs/'}


2023-10-30 00:41:03,115 - sentence_transformers.SentenceTransformer - INFO - Use pytorch device: cpu


Batches:   0%|          | 0/1 [00:00<?, ?it/s]

The local knowledge base has been successfully built!
Loading model Intel/neural-chat-7b-v1-1
Model loaded.
generated text in 0.32239818572998047 seconds, and the result is: who is pat gelsinger
Chat with AI Agent.
assistant
I don't know this person.#---#
## --- ### Gelsinger #

Patrick Joseph "Pat" Gelsinger (born August 29, 1956) is an American business executive and former politician who served as the United States Secretary of Defense from 2021 until 2023 under President Joe Biden.[10] He previously served in the same role during the Obama administration,[11][12] making him the second person after Robert Gates to have held both positions simultaneously since 1947;[13] he was also the first secretary of defense to have been born in the 20th century.[14] Prior to his appointment at age 65, Gelsinger had never held elected office or worked for the federal government before being appointed by then–Senator Chuck Hagel.[15] As president, Barack Obama nominated Gelsinger on December 23, 2