
Local Chatbot (CPU)

This project runs an LLM-based question-answering chatbot on enterprise/private data using CPU only.

You can ask questions of your private .txt documents without an internet connection, using an open-source LLM.

Note: This project uses a quantized LLM designed to run on CPU only, so output quality may not match a SOTA LLM (Falcon or similar) and inference speed will depend on your available CPU compute.

Approach

[Architecture diagram]
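
At a high level, the .txt documents in data/ are embedded and stored in Milvus as vectors; at query time, the chunks most relevant to a question are retrieved and passed as context to a quantized, CPU-only LLM, and prompts and results are logged to Weights & Biases for evaluation.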


Environment Setup

Install conda and create an environment

conda create -n localChatbot python=3.9
conda activate localChatbot

To set up your environment to run the code here, install all requirements:

pip install -r requirements.txt

(Important) Before moving forward, you should have a Milvus instance up and running. See:

Milvus database setup guide
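
Once Milvus is up, you can sanity-check connectivity from Python (a minimal sketch, assuming pymilvus is installed and Milvus listens on the default localhost:19530):

```python
# Sanity-check that the Milvus instance is reachable before ingesting data.
# Assumes pymilvus is installed and Milvus uses the default host/port.
from pymilvus import connections, utility

connections.connect(alias="default", host="localhost", port="19530")
print("Existing collections:", utility.list_collections())
```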

Running the application

Data ingestion

The command below automatically loads the embedding model and saves vector embeddings of the .txt files present in the data/ directory:

python data_to_vector_ingestion.py 
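
For orientation, the ingestion flow looks roughly like the sketch below (illustrative only: the actual script, embedding model, and collection schema may differ, and the model and collection names here are assumptions):

```python
# Rough sketch of the ingestion step (not the repo's actual code).
# Assumes sentence-transformers and pymilvus are installed and Milvus is
# running locally; the model and collection names below are hypothetical.
from pathlib import Path

from pymilvus import (Collection, CollectionSchema, DataType, FieldSchema,
                      connections)
from sentence_transformers import SentenceTransformer

connections.connect(host="localhost", port="19530")

# Hypothetical embedding model (384-dim vectors).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Naive fixed-size chunking of every .txt file under data/.
chunks = []
for txt in Path("data").glob("*.txt"):
    text = txt.read_text(encoding="utf-8")
    chunks += [text[i:i + 500] for i in range(0, len(text), 500)]

embeddings = model.encode(chunks).tolist()

# Hypothetical collection schema: auto-generated ids, raw text, and vectors.
schema = CollectionSchema([
    FieldSchema("id", DataType.INT64, is_primary=True, auto_id=True),
    FieldSchema("text", DataType.VARCHAR, max_length=2048),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=384),
])
collection = Collection("local_chatbot_docs", schema)
collection.insert([chunks, embeddings])
collection.flush()
```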

LLM Application UI

The command below starts a Gradio prediction instance:

python app.py 

It may also ask for your W&B API key; follow the guidance printed in the terminal.
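
Conceptually, app.py wires a QA function into a Gradio UI and logs each exchange to W&B. A minimal sketch of that shape (the function and project names are hypothetical, and the retrieval/generation step is elided):

```python
# Minimal sketch of the app's shape (not the repo's actual code).
# Assumes gradio and wandb are installed.
import gradio as gr
import wandb

wandb.init(project="local-chatbot-cpu")  # prompts for an API key if needed

def answer_question(question: str) -> str:
    # Hypothetical pipeline: retrieve relevant chunks from Milvus, then
    # generate an answer with the quantized CPU-only LLM (elided here).
    answer = "..."  # placeholder for retrieval + generation
    wandb.log({"question": question, "answer": answer})
    return answer

demo = gr.Interface(fn=answer_question, inputs="text", outputs="text",
                    title="Local Chatbot (CPU)")
demo.launch()  # prints the local URL to open, e.g. http://127.0.0.1:7860
```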

Terminal output:

[Terminal output screenshot]

Now you can access the LLM app from your localhost:

  • Navigate to the address shown in the terminal and start asking your chatbot questions.

Screenshot

Gradio App

[Gradio app screenshot]

W&B Experiment Logs

[W&B experiment logs screenshot]

Performance Evaluation

  • Using Weights & Biases to keep track of all prompt results and manually reviewing output quality
  • Using BERTScore as a performance metric (see the sketch after this list)
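
For reference, scoring a generated answer against a reference answer with BERTScore looks like this (a minimal sketch, assuming the bert-score package is installed; the example strings are placeholders):

```python
# Minimal BERTScore example (illustrative; the repo's evaluation code may differ).
from bert_score import score

candidates = ["The chatbot's generated answer."]
references = ["A human-written reference answer."]

# P, R, F1 are tensors with one score per candidate/reference pair.
P, R, F1 = score(candidates, references, lang="en")
print(f"BERTScore F1: {F1.mean().item():.3f}")
```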

Author: @adityaadarsh
