NeuralChat is a customizable chat framework designed to let users create their own chatbot within a few minutes on multiple architectures. This notebook demonstrates how to deploy a talking chatbot as a service on 3rd Generation Intel® Xeon® Scalable Processors (Ice Lake).

# Prepare Environment

Install Intel® Extension for Transformers:

In [5]:
!pip install intel-extension-for-transformers

Install the NeuralChat requirements:

In [None]:
!git clone https://github.com/intel/intel-extension-for-transformers.git
%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/
!pip install -r requirements.txt
%cd ../../../

In [None]:
!pip uninstall torch -y
!pip install torch
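After reinstalling packages it can be useful to confirm which versions ended up in the environment. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name introduced here for illustration, not part of NeuralChat.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two packages installed in the cells above.
for name in ("intel-extension-for-transformers", "torch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports `not installed`, the kernel may be using a different environment from the one `pip` installed into.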

# Client-Server Architecture for Performance and Scalability

## Quick Start Local Server & Access Text Chat Service 

In [None]:
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/examples/deployment/textbot/backend/xeon/textbot.yaml

In [None]:
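import multiprocessing
from intel_extension_for_transformers.neural_chat import NeuralChatServerExecutor
import nest_asyncio

# Allow nested event loops, since the notebook kernel already runs one.
nest_asyncio.apply()

def start_service():
    # Start the text chat server with the downloaded config; logs go to neuralchat.log.
    server_executor = NeuralChatServerExecutor()
    server_executor(config_file="textbot.yaml", log_file="neuralchat.log")

# Run the server in a separate process so the notebook stays responsive.
multiprocessing.Process(target=start_service).start()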

❗ Note that the server is running in the background.

If you run this code in a command-line window, run the following code in a new terminal or session.
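Because the server starts in a background process, the client cell can run before the server is accepting connections. One way to guard against that is to poll the port first. The sketch below is an assumption-laden illustration: `wait_for_port` is a hypothetical helper, not a NeuralChat API, and an accepted TCP connection only shows that something is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass.

    Returns True once a connection is accepted, False if the deadline expires.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Open and immediately close a connection; success means a listener exists.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: wait briefly and try again.
            time.sleep(interval)
    return False
```

For example, calling `wait_for_port("127.0.0.1", 8000, timeout=300)` before the client cell would block until the text chat server is listening, assuming port 8000 matches the value configured in `textbot.yaml`.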

In [None]:
from intel_extension_for_transformers.neural_chat import TextChatClientExecutor
executor = TextChatClientExecutor()
result = executor(
    prompt="Tell me about Intel Xeon Scalable Processors.",
    server_ip="127.0.0.1", # master server ip
    port=8000 # master server entry point 
    )
print(result.text)

## Quick Start Local Server & Access Voice Chat Service

In [None]:
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/examples/deployment/talkingbot/server/backend/talkingbot.yaml

In [None]:
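import multiprocessing
from intel_extension_for_transformers.neural_chat import NeuralChatServerExecutor
import nest_asyncio

# Allow nested event loops, since the notebook kernel already runs one.
nest_asyncio.apply()

def start_service():
    # Start the voice chat server with the downloaded config; logs go to neuralchat.log.
    server_executor = NeuralChatServerExecutor()
    server_executor(config_file="talkingbot.yaml", log_file="neuralchat.log")

# Run the server in a separate process so the notebook stays responsive.
multiprocessing.Process(target=start_service).start()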

❗ Note that the server is running in the background.

If you run this code in a command-line window, run the following code in a new terminal or session.

In [None]:
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/assets/speaker_embeddings/spk_embed_default.pt

In [None]:
from intel_extension_for_transformers.neural_chat import VoiceChatClientExecutor
executor = VoiceChatClientExecutor()
result = executor(
    audio_input_path='sample.wav',
    audio_output_path='results.wav',
    server_ip="127.0.0.1", # master server ip
    port=8888 # master server entry point 
    )
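The voice client writes its reply to `results.wav` rather than printing text, so a quick sanity check on that file can confirm the round trip produced audio. The sketch below uses the standard-library `wave` module and assumes the output is an uncompressed PCM WAV file; `wav_summary` is a hypothetical helper introduced for illustration.

```python
import wave


def wav_summary(path):
    """Return (channels, sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return wav.getnchannels(), rate, frames / rate
```

Calling `wav_summary("results.wav")` after the client cell returns the channel count, sample rate, and duration of the generated speech. A duration of zero, or a `wave.Error`, would suggest the response was not written as expected.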
