# Docu-Chat on a CPU

This notebook shows how to build a personal assistant that has access to your documents (you provide the folder path) and that you can chat with on-device, using only a CPU. Because everything runs locally, it is completely private, and it is pretty fast to run.

In [3]:
chat_model = 'MBZUAI/LaMini-T5-738M'
embedding_model = 'sentence-transformers/all-MiniLM-L6-v2'
device = 'cpu'
docs_folder = 'source_documents/'
docx_address = 'C:/Users/rezabonyadi/Desktop/test_folder/'

# Read docs and embed

- First, we read the docx files recursively from the docx_address folder.
- We then save them as text files in docs_folder.
- Next, we embed the paragraphs, split into chunks of at most 500 characters (this step can be improved significantly).
- Finally, we save the embeddings in a vector db.
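
To make the first two steps concrete, here is a minimal sketch of how a recursive scan could map each docx file to a flat text file. It is not the code inside utils.build_datasets; the function name plan_text_outputs is hypothetical, and the flat naming is only inferred from the log output of the cell that follows (files from subfolders land directly in docs_folder).

```python
from pathlib import Path
import tempfile

def plan_text_outputs(docx_root, out_dir):
    """Map every .docx under docx_root (recursively) to a flat .txt path in out_dir.

    Hypothetical helper: it only plans the paths, it does not read or write content.
    """
    docx_root, out_dir = Path(docx_root), Path(out_dir)
    # rglob descends into subfolders; sorted() keeps the order deterministic.
    return [(src, out_dir / (src.stem + ".txt")) for src in sorted(docx_root.rglob("*.docx"))]

# Small demo on a throwaway folder with one nested file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "LLMops").mkdir()
    (root / "natural language programming.docx").touch()
    (root / "LLMops" / "operationalizing language models.docx").touch()
    plan = plan_text_outputs(root, "source_documents")
    planned_names = [dst.name for _, dst in plan]
```

One consequence of a flat layout like this is that two files with the same name in different subfolders would map to the same text file, so unique file names are the safer choice.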

In [5]:
from utils import build_datasets, prepare_chat_model, settings

build_datasets.get_docs(docx_address, docs_folder)
print('Done reading!')

build_datasets.build_embedding_dataset(docs_folder, embedding_model, device=device)
print('Done building db of your documents!')

2it [00:00, 14.09it/s]


extracting content of  C:/Users/rezabonyadi/Desktop/test_folder/creating fictional universe.docx
writing:  source_documents/creating fictional universe.txt
extracting content of  C:/Users/rezabonyadi/Desktop/test_folder/future of human computer interaction.docx
writing:  source_documents/future of human computer interaction.txt
extracting content of  C:/Users/rezabonyadi/Desktop/test_folder/natural language programming.docx
writing:  source_documents/natural language programming.txt
extracting content of  C:/Users/rezabonyadi/Desktop/test_folder/LLMops\addressing soft challenges llms.docx
writing:  source_documents/addressing soft challenges llms.txt
extracting content of  C:/Users/rezabonyadi/Desktop/test_folder/LLMops\operationalizing language models.docx
writing:  source_documents/operationalizing language models.txt
Done reading!
Loading documents from source_documents/
Loaded 5 documents from source_documents/
Split into 342 chunks of text (max. 500 characters each)
Done building db of your documents!
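
The log above reports chunks of at most 500 characters. The splitter actually used inside utils.build_datasets is not shown here, so the following is only a rough sketch of one way to enforce such a limit: a greedy packer that fills each chunk with whole words. The name split_into_chunks and the word-based strategy are assumptions for illustration.

```python
def split_into_chunks(text, max_chars=500):
    """Greedily pack whitespace-separated words into chunks of at most max_chars."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than max_chars is hard-split so the limit always holds.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The word does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks

sample = "language models are costly to run " * 40
chunks = split_into_chunks(sample, max_chars=500)
```

Smaller chunks give more precise retrieval but less context per hit; this is one of the places where the embedding step can be tuned.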

# Prepare chat engine and test it
- First, build a retriever (a connection to the vector db that retrieves relevant contexts using semantic search)
- Prepare the chat model
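
As a rough illustration of what semantic search means here, the sketch below ranks stored chunks by cosine similarity to a query vector and returns the best matches. The three-dimensional vectors are made up by hand; in the notebook the vectors come from the embedding model and the search is handled by the vector db, so the functions cosine_similarity and retrieve are illustrative stand-ins only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, indexed_chunks, k=2):
    """Return the k chunk texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Hand-made toy index: (chunk text, embedding vector).
toy_index = [
    ("LLMs can hallucinate facts.", [0.9, 0.1, 0.0]),
    ("World-building needs consistent rules.", [0.0, 0.2, 0.9]),
    ("Inference latency and cost matter.", [0.8, 0.3, 0.1]),
]
top = retrieve([1.0, 0.2, 0.0], toy_index, k=2)
```

The retrieved chunks are then passed to the chat model as context, which is why retrieval quality matters as much as the choice of language model.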

In [6]:
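# Reuse the device setting from the first cell for the embedding model.
retriver = prepare_chat_model.get_context_retriver(embedding_model, device=device)
# device=-1 follows the Hugging Face pipeline convention for running on the CPU.
qa_engine = prepare_chat_model.get_qa_engine(chat_model, retriver, input_max_length=512,
                                             pipeline_model="text2text-generation", device=-1)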
print('Retriver and Q&A engine ready!')


Retriver and Q&A engine ready!


In [8]:
def chatbot_response(user_input, qa_engine):
    # Run retrieval and generation; the engine returns the answer and the chunks it used.
    res = qa_engine(user_input)
    answer, docs = res['result'], res['source_documents']
    print(answer)
    # Pair each retrieved chunk with the file it was taken from, for later inspection.
    all_context = "\n\n".join(
        'Context: ' + doc.page_content + '\n \n Taken from: ' + doc.metadata['source']
        for doc in docs
    )

    return answer, all_context

user_input = 'What are some challenges with language models?'
response, context = chatbot_response(user_input, qa_engine)

The challenges with language models include running cost, latency, trust and safety considerations, hallucination, environmental footprints, and privacy challenges.
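
The function also returns the retrieved context, which is useful for checking where an answer came from. The sketch below shows one possible way to print a compact, numbered list of sources. FakeDoc is a hypothetical stand-in for the retrieved document objects (it only mimics their page_content and metadata fields), and format_sources is an illustrative helper, not part of the notebook's utils.

```python
from dataclasses import dataclass, field

@dataclass
class FakeDoc:
    """Hypothetical stand-in for a retrieved document (not the real class)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def format_sources(docs, max_chars=120):
    """Build a short, numbered listing of retrieved passages and their source files."""
    lines = []
    for n, doc in enumerate(docs, start=1):
        snippet = doc.page_content.strip().replace("\n", " ")
        if len(snippet) > max_chars:
            # Truncate long passages so the listing stays readable.
            snippet = snippet[:max_chars].rstrip() + "..."
        source = doc.metadata.get("source", "unknown")
        lines.append(f"[{n}] {snippet} (from {source})")
    return "\n".join(lines)

docs = [
    FakeDoc("Running cost and latency are common concerns.",
            {"source": "source_documents/operationalizing language models.txt"}),
    FakeDoc("Hallucination affects trust.", {}),
]
listing = format_sources(docs)
```

Calling print(listing) after each answer gives a quick way to spot answers that are not supported by any retrieved passage.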


# Fancier chat

In [10]:
from ipywidgets import interact
import ipywidgets as widgets

def chatbot_interactive(user_input):
    response, context = chatbot_response(user_input, qa_engine)
    return response

interact(chatbot_interactive, user_input=widgets.Text(value='Hello, how are you?', description='Input:'))

interactive(children=(Text(value='Hello, how are you?', description='Input:'), Output()), _dom_classes=('widge…

<function __main__.chatbot_interactive(user_input)>