# Open Terminal
Run the following commands to create the environment and install NeuralChat:

conda create -n itrex python=3.10 -y
conda activate itrex

pip install intel-extension-for-transformers

git clone https://github.com/intel/intel-extension-for-transformers.git

cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/

pip install -r requirements_cpu.txt

pip install -r requirements.txt

huggingface-cli login
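(enter your Hugging Face access token when prompted; the account needs access to the gated `meta-llama/Llama-2-7b-chat-hf` model used below)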

## Install Jupyter and ipykernel
python3 -m pip install jupyter ipykernel

## Add a kernel for this environment
python3 -m ipykernel install --name Neural-Chat --user

In the notebook, set the kernel to Neural-Chat.

## Prepare Dataset

In [None]:
!curl -OL https://raw.githubusercontent.com/tatsu-lab/stanford_alpaca/main/alpaca_data.json
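Before fine-tuning, it can be worth confirming that the download succeeded and that the file has the expected shape. The Alpaca dataset is a JSON list of records, each with `instruction`, `input`, and `output` fields. The sketch below is a minimal check under that assumption; the helper names `load_alpaca` and `check_alpaca_records` are hypothetical and not part of NeuralChat.

```python
import json

# Keys every Alpaca-style record is expected to carry.
REQUIRED_KEYS = {"instruction", "input", "output"}


def load_alpaca(path="alpaca_data.json"):
    """Load the downloaded dataset; raises if the file is missing or not valid JSON."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def check_alpaca_records(records):
    """Return the indices of records that are not dicts or lack a required key."""
    bad = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict) or not REQUIRED_KEYS.issubset(rec):
            bad.append(i)
    return bad
```

In a notebook cell, `check_alpaca_records(load_alpaca())` should return an empty list if the file downloaded correctly.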

## Fine Tune the Model

In [None]:
from transformers import TrainingArguments
from intel_extension_for_transformers.neural_chat.config import (
    ModelArguments,
    DataArguments,
    FinetuningArguments,
    TextGenerationFinetuningConfig,
)
from intel_extension_for_transformers.neural_chat.chatbot import finetune_model
model_args = ModelArguments(model_name_or_path="meta-llama/Llama-2-7b-chat-hf")
data_args = DataArguments(train_file="alpaca_data.json")
training_args = TrainingArguments(
    output_dir='./finetuned_model_path',
    do_train=True,
    do_eval=True,
    num_train_epochs=3,
    overwrite_output_dir=True,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    save_strategy="no",
    log_level="info",
    save_total_limit=2,
    bf16=True
)
finetune_args = FinetuningArguments()
finetune_cfg = TextGenerationFinetuningConfig(
    model_args=model_args,
    data_args=data_args,
    training_args=training_args,
    finetune_args=finetune_args,
)
finetune_model(finetune_cfg)
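To get a feel for how long the run configured above might take, it can help to estimate the number of optimizer steps before starting. The sketch below follows the usual batch arithmetic (batches per epoch, divided by the gradient accumulation steps, times the number of epochs). The function name `estimate_optimizer_steps` and the `num_devices` parameter are hypothetical, and the real count depends on how many examples end up in the training split, so treat the result as an approximation.

```python
import math


def estimate_optimizer_steps(num_examples, per_device_batch_size,
                             gradient_accumulation_steps=1,
                             num_epochs=1, num_devices=1):
    """Approximate the number of optimizer updates for a training run."""
    # Number of batches the dataloader yields in one pass over the data.
    batches_per_epoch = math.ceil(
        num_examples / (per_device_batch_size * num_devices)
    )
    # One optimizer update happens every `gradient_accumulation_steps` batches.
    updates_per_epoch = max(batches_per_epoch // gradient_accumulation_steps, 1)
    return math.ceil(num_epochs * updates_per_epoch)
```

For example, `estimate_optimizer_steps(len(records), 4, 1, 3)` uses the batch size, accumulation steps, and epoch count from the `TrainingArguments` above, where `records` is the loaded training data.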

## BF16 Optimization

In [None]:
# BF16 Optimization
from intel_extension_for_transformers.neural_chat import build_chatbot, PipelineConfig
from intel_extension_for_transformers.transformers import MixedPrecisionConfig
config = PipelineConfig(optimization_config=MixedPrecisionConfig())
chatbot = build_chatbot(config)
response = chatbot.predict(query="Tell me about Intel Xeon Scalable Processors.")
print(response)

## Text Chat With Retrieval Plugin

In [None]:
from intel_extension_for_transformers.neural_chat import PipelineConfig
from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import plugins
# Enable the retrieval plugin and point it at the folder of documents to index.
plugins.retrieval.enable = True
plugins.retrieval.args["input_path"] = "./docs/"
config = PipelineConfig(plugins=plugins)
chatbot = build_chatbot(config)
response = chatbot.predict("How many cores does the Intel® Xeon® Platinum 8480+ Processor have in total?")
print(response)

## Set Up Backend

In [None]:
!pip install intel-extension-for-transformers
!git clone https://github.com/intel/intel-extension-for-transformers.git
%cd ./intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/
!pip install -r requirements.txt
!sudo apt install -y numactl
!conda install astunparse ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses -y
!conda install jemalloc gperftools -c conda-forge -y
!pip install nest_asyncio

## Start Server

In [None]:
!curl -OL https://raw.githubusercontent.com/intel/intel-extension-for-transformers/main/intel_extension_for_transformers/neural_chat/examples/deployment/textbot/backend/xeon/textbot.yaml
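The cell above only downloads the server configuration; the backend still has to be started from it. The following is a minimal sketch of one way to do that from a notebook, assuming the package exposes a `NeuralChatServerExecutor` that accepts `config_file` and `log_file` arguments (check this against your installed version). The log file name `neuralchat.log` and the helper names `start_service` and `launch_in_background` are hypothetical. `nest_asyncio` (installed in the backend setup step) is applied so the server's event loop can coexist with the notebook's own loop.

```python
import multiprocessing


def start_service(config_file="textbot.yaml", log_file="neuralchat.log"):
    """Start the NeuralChat backend described by config_file (blocks while serving)."""
    # Assumption: NeuralChatServerExecutor is importable from this module path.
    # The import is deferred so this cell can be defined before the package loads.
    from intel_extension_for_transformers.neural_chat import NeuralChatServerExecutor

    server_executor = NeuralChatServerExecutor()
    server_executor(config_file=config_file, log_file=log_file)


def launch_in_background():
    """Run start_service in a separate process so the notebook stays usable."""
    import nest_asyncio

    # Patch asyncio so nested event loops are allowed inside Jupyter.
    nest_asyncio.apply()
    proc = multiprocessing.Process(target=start_service)
    proc.start()
    return proc
```

Calling `launch_in_background()` in a cell returns the process handle; `proc.terminate()` stops the server when you are done.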

## Set Up Frontend

Hugging Face Spaces makes ML applications more accessible to the community. Following that model, you can host the chatbot frontend as a Space on Hugging Face. Alternatively, you can deploy the frontend on your own server.

## Deploy on Hugging Face Spaces

### Create a new Space on Hugging Face
To create a new application Space on Hugging Face, visit [https://huggingface.co/new-space](https://huggingface.co/new-space) and follow the steps below.

![Create New Space](https://i.imgur.com/QyjqUd6.png)

The new space is like a new project that supports GitHub-style code repository management.

### Check configuration
We recommend using Gradio as the Space SDK, keeping the default values for the other settings.

For detailed information about the configuration settings, please refer to the [Hugging Face Spaces Config Reference](https://huggingface.co/docs/hub/spaces-config-reference).

### Setup application
We strongly recommend using the provided textbot frontend code, since it is the reference implementation already deployed on Hugging Face Spaces. To set up your application, copy the code files from the `intel_extension_for_transformers/neural_chat/examples/deployment/textbot/frontend` directory and adjust their configuration as needed (for example, the backend service URL in the `app.py` file, as shown below).

![Update backend URL](https://i.imgur.com/rQxPOV7.png)

Alternatively, you can clone the existing Space from [https://huggingface.co/spaces/Intel/NeuralChat-GNR-1](https://huggingface.co/spaces/Intel/NeuralChat-GNR-1).

![Clone Space](https://i.imgur.com/76N8m5B.png)

After cloning, update the backend service URL in the `app.py` file as well.

## Deploy frontend on your server

### Install the required Python dependencies

In [None]:
!pip install -r ./examples/deployment/textbot/frontend/requirements.txt

### Run the frontend

Launch the chatbot frontend on your server using the following commands:

In [None]:
# Use %cd (not !cd) so the directory change persists for the next command.
%cd ./examples/deployment/textbot/frontend/
!nohup python app.py &

This runs the chatbot application in the background on your server. The port is set by the `server_port=` argument at the end of the `app.py` file.

Once the application is running, you can find the access URL in the trace log:

```log
INFO | gradio_web_server | Models: meta-llama/Llama-2-7b-chat-hf
INFO | stdout | Running on local URL:  http://0.0.0.0:7860
```
The URL to access the chatbot frontend is http://SERVER_IP_ADDRESS:7860. Please remember to replace SERVER_IP_ADDRESS with your server's actual IP address.

![URL](https://i.imgur.com/La3tJ8d.png)
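If the page does not load, a quick way to tell a frontend problem from a network or firewall problem is to check whether the port accepts TCP connections. The sketch below uses only the Python standard library; the helper names `frontend_url` and `is_port_open` are hypothetical, and the default port 7860 is taken from the log output above.

```python
import socket


def frontend_url(server_ip, port=7860):
    """Build the URL at which the chatbot frontend should be reachable."""
    return f"http://{server_ip}:{port}"


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `is_port_open("SERVER_IP_ADDRESS", 7860)` from the machine where you open the browser (with your server's actual IP address substituted) shows whether the frontend is reachable from there.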

Please update the backend service URL in the `app.py` file.

![Update backend URL](https://i.imgur.com/gRtZHrJ.png)