# RAG = LangChain + llama-cpp + ChromaDB

### [Using Langchain and Open Source Vector DB Chroma for Semantic Search with OpenAI's LLM](https://blog.futuresmart.ai/using-langchain-and-open-source-vector-db-chroma-for-semantic-search-with-openais-llm)

In [1]:
!pip install langchain sentence_transformers chromadb unstructured -q


In [2]:
!pip install llama-cpp-python -q


## Documents Directory

In [3]:
!ls ./docs/pets

'Different Types of Pet Animals.txt'
'Health Care for Pets.txt'
'Nutrition Needs of Pet Animals.txt'
'The Emotional Bond Between Humans and Pets.txt'
'Training and Behaviour of Pets.txt'


## Loading and Splitting the Documents

In [5]:
from langchain.document_loaders import DirectoryLoader

directory = './docs/pets/'

def load_docs(directory):
    loader = DirectoryLoader(directory)
    documents = loader.load()
    return documents

documents = load_docs(directory)
len(documents)



5

In [6]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

def split_docs(documents, chunk_size=1000, chunk_overlap=20):
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    docs = text_splitter.split_documents(documents)
    return docs

docs = split_docs(documents)
print(len(docs))

5
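
Five documents produce five chunks here, which is consistent with every file being shorter than `chunk_size=1000` characters. As a rough illustration of what `chunk_size` and `chunk_overlap` control, here is a minimal sketch of a fixed-size sliding-window chunker. The helper `chunk_text` is hypothetical and deliberately simpler than `RecursiveCharacterTextSplitter`, which prefers to break on separators such as paragraph and line breaks before falling back to raw character counts.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=20):
    """Hypothetical fixed-size chunker; not LangChain's actual algorithm."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


short_text = "a" * 450   # shorter than chunk_size, so it stays in one piece
long_text = "b" * 2500   # longer than chunk_size, so it is split
print(len(chunk_text(short_text)))
print(len(chunk_text(long_text)))
```

With the defaults used in this notebook, consecutive chunks share their last and first 20 characters, which helps avoid cutting a sentence off from its context at a chunk boundary.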


## Embedding Text Using LangChain

In [7]:
from langchain.embeddings import SentenceTransformerEmbeddings
embeddings = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

## Creating Vector Store with Chroma DB

In [8]:
from langchain.vectorstores import Chroma
db = Chroma.from_documents(docs, embeddings)

## Retrieving Semantically Similar Documents

In [9]:
query = "What are the different kinds of pets people commonly own?"
matching_docs = db.similarity_search(query)

matching_docs[0]

Document(page_content='Pet animals come in all shapes and sizes, each suited to different lifestyles and home environments. Dogs and cats are the most common, known for their companionship and unique personalities. Small mammals like hamsters, guinea pigs, and rabbits are often chosen for their low maintenance needs. Birds offer beauty and song, and reptiles like turtles and lizards can make intriguing pets. Even fish, with their calming presence, can be wonderful pets.', metadata={'source': 'docs/pets/Different Types of Pet Animals.txt'})
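
`similarity_search` embeds the query and returns the stored chunks whose vectors are closest to it. The ranking idea can be sketched with cosine similarity over toy vectors. Everything below is an assumption for illustration: the three-dimensional vectors, the document names, and the helpers `cosine_similarity` and `top_k` are made up, and the metric Chroma actually uses depends on how the collection is configured.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy stand-ins for real sentence embeddings (hypothetical values).
toy_vectors = {
    "types_of_pets": [0.9, 0.1, 0.0],
    "pet_health": [0.1, 0.9, 0.1],
    "pet_training": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]
print(top_k(toy_query, toy_vectors))
```

Real embeddings from `all-MiniLM-L6-v2` have 384 dimensions rather than three, but the ranking step works the same way.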

## Persistence in Chroma DB

In [10]:
persist_directory = "chroma_db"

vectordb = Chroma.from_documents(
    documents=docs, embedding=embeddings, persist_directory=persist_directory
)

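# Flush the index to disk. Chroma 0.4.x and later persist automatically,
# and LangChain marks this call as deprecated.
vectordb.persist()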



## Using OpenAI Large Language Models (LLMs) with Chroma DB

import os

os.environ["OPENAI_API_KEY"] = "key"

from langchain.chat_models import ChatOpenAI

model_name = "gpt-3.5-turbo"

llm = ChatOpenAI(model_name=model_name)

## Using local LLM based on LlamaCpp

In [11]:
# Using llama-cpp
from huggingface_hub import hf_hub_download
from langchain_community.llms import LlamaCpp
from langchain_core.callbacks import CallbackManager, StreamingStdOutCallbackHandler

model_name = "taide/Llama3-TAIDE-LX-8B-Chat-Alpha1-4bit" # TAIDE
model_file = "taide-8b-a.3-q4_k_m.gguf"

model_path = hf_hub_download(model_name, filename=model_file)
print(model_file)

# Callbacks support token-wise streaming
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])

llm = LlamaCpp(
    model_path=model_path,
    temperature=0.75,
    max_tokens=2000,
    top_p=1,
    callback_manager=callback_manager,
    verbose=True,
)

llama_model_loader: loaded meta data with 23 key-value pairs and 291 tensors from /home/rkuo/.cache/huggingface/hub/models--taide--Llama3-TAIDE-LX-8B-Chat-Alpha1-4bit/snapshots/7f36bc501c4549bffa9e79d1e9a7fc36e8f50c7c/taide-8b-a.3-q4_k_m.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = models
llama_model_loader: - kv   2:                           llama.vocab_size u32              = 128256
llama_model_loader: - kv   3:                       llama.context_length u32              = 8192
llama_model_loader: - kv   4:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   5:                          llama.block_count u32              = 32
llama_model_loader: - kv   6:                  

taide-8b-a.3-q4_k_m.gguf


llm_load_vocab: missing pre-tokenizer type, using: 'default'
llm_load_vocab:                                             
llm_load_vocab: ************************************        
llm_load_vocab: GENERATION QUALITY WILL BE DEGRADED!        
llm_load_vocab: CONSIDER REGENERATING THE MODEL             
llm_load_vocab: ************************************        
llm_load_vocab:                                             
llm_load_vocab: special tokens definition check successful ( 256/128256 ).
llm_load_print_meta: format           = GGUF V3 (latest)
llm_load_print_meta: arch             = llama
llm_load_print_meta: vocab type       = BPE
llm_load_print_meta: n_vocab          = 128256
llm_load_print_meta: n_merges         = 280147
llm_load_print_meta: n_ctx_train      = 8192
llm_load_print_meta: n_embd           = 4096
llm_load_print_meta: n_head           = 32
llm_load_print_meta: n_head_kv        = 8
llm_load_print_meta: n_layer          = 32
llm_load_print_meta: n_rot            =

## Extracting Answers from Documents

In [12]:
from langchain.chains.question_answering import load_qa_chain
chain = load_qa_chain(llm, chain_type="stuff", verbose=True)

query = "What are the emotional benefits of owning a pet?"
matching_docs = db.similarity_search(query)
answer = chain.run(input_documents=matching_docs, question=query)
answer





> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

Pets offer more than just companionship; they provide emotional support, reduce stress, and can even help their owners lead healthier lives. The bond between pets and their owners is strong, and many people consider their pets as part of the family. This bond can be especially important in times of personal or societal stress, providing comfort and consistency.

Pet animals come in all shapes and sizes, each suited to different lifestyles and home environments. Dogs and cats are the most common, known for their companionship and unique personalities. Small mammals like hamsters, guinea pigs, and rabbits are often chosen for their low maintenance needs. Birds offer beauty and song, and reptile


llama_print_timings:        load time =    1418.21 ms
llama_print_timings:      sample time =      77.16 ms /   159 runs   (    0.49 ms per token,  2060.57 tokens per second)
llama_print_timings: prompt eval time =    7964.41 ms /   352 tokens (   22.63 ms per token,    44.20 tokens per second)
llama_print_timings:        eval time =   12455.64 ms /   159 runs   (   78.34 ms per token,    12.77 tokens per second)
llama_print_timings:       total time =   21478.23 ms /   511 tokens



> Finished chain.

> Finished chain.


" The emotional benefits of owning a pet are numerous and well-documented. Pets provide companionship, which can help to alleviate feelings of loneliness and isolation. This is especially true for older adults who may be more likely to experience social isolation.\nPets also serve as stress relievers. Interacting with pets can help to reduce levels of the stress hormone cortisol in the body. This can lead to improved overall health and well-being.\nIn addition to providing companionship and stress relief, pets can also play a role in promoting social interactions and reducing feelings of loneliness among individuals, particularly those who may be more prone to experiencing social isolation due to factors such as age, disability, or socioeconomic status. By fulfilling these roles, pets can have a profound impact on an individual's emotional well-being and overall quality of life"

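The verbose output above shows what the `stuff` chain type does: it places all retrieved chunks into a single prompt, one after another, ahead of the question. A minimal sketch of that assembly step follows. The instruction text is copied from the log above; the helper `build_stuff_prompt` is hypothetical, and the `Question:` / `Helpful Answer:` tail is an assumption, since the logged prompt is cut off before its end.

```python
INSTRUCTION = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer."
)


def build_stuff_prompt(chunks, question):
    """Hypothetical helper: join all chunks into one context block."""
    context = "\n\n".join(chunks)  # chunks are separated by blank lines
    # Assumed tail; the notebook's log is truncated before this point.
    return f"{INSTRUCTION}\n\n{context}\n\nQuestion: {question}\nHelpful Answer:"


example_chunks = [
    "Pets offer more than just companionship; they provide emotional support.",
    "Pet animals come in all shapes and sizes.",
]
print(build_stuff_prompt(example_chunks, "What are the emotional benefits of owning a pet?"))
```

Because every chunk goes into the same prompt, the total length grows with the number of retrieved chunks and has to fit within the model's context window (8192 tokens for the model loaded above, per `llama.context_length`).
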
## Utilizing the RetrievalQA Chain

In [13]:
from langchain.chains import RetrievalQA

retrieval_chain = RetrievalQA.from_chain_type(llm, chain_type="stuff", retriever=db.as_retriever())
retrieval_chain.run(query)

Llama.generate: prefix-match hit

 The emotional benefits of owning a pet are numerous and can significantly enrich an individual's life. Firstly, pets provide companionship, which can be particularly beneficial for individuals who live alone or have limited social interactions.

Secondly, interacting with pets can help reduce stress levels. Playing with dogs or cats, watching them play, or simply petting them can all help to lower stress and promote feelings of calm and relaxation.

Thirdly, the bond between an individual and their pet can provide a sense of security and belonging. This can be particularly beneficial for individuals who may feel isolated or alienated in other areas of their life.

In summary, the emotional benefits of owning a pet are substantial and can include companionship, stress reduction, a sense of security and belonging, and many other positive emotions and experiences.


llama_print_timings:        load time =    1418.21 ms
llama_print_timings:      sample time =      75.95 ms /   159 runs   (    0.48 ms per token,  2093.43 tokens per second)
llama_print_timings: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =   12384.43 ms /   159 runs   (   77.89 ms per token,    12.84 tokens per second)
llama_print_timings:       total time =   13305.75 ms /   160 tokens
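
The throughput figures in the `llama_print_timings` lines are token counts divided by elapsed time. The sketch below recomputes two of them from the first run's log; `tokens_per_second` is a hypothetical helper, not part of llama-cpp-python. In the second run the prompt evaluation covers only 1 token in 0.00 ms, which fits the `prefix-match hit` message: the prompt appears to have been reused from the previous call rather than evaluated again, so the reported rate is `inf`.

```python
def tokens_per_second(n_tokens, elapsed_ms):
    """Hypothetical helper: convert a token count and milliseconds to a rate."""
    if elapsed_ms <= 0:
        return float("inf")  # mirrors the 'inf' shown in the log for 0.00 ms
    return n_tokens / (elapsed_ms / 1000.0)


# Values taken from the first run's timing lines.
generation_rate = tokens_per_second(159, 12455.64)  # eval time
prompt_rate = tokens_per_second(352, 7964.41)       # prompt eval time
print(round(generation_rate, 2))
print(round(prompt_rate, 2))
```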


" The emotional benefits of owning a pet are numerous and can significantly enrich an individual's life. Firstly, pets provide companionship, which can be particularly beneficial for individuals who live alone or have limited social interactions.\n\nSecondly, interacting with pets can help reduce stress levels. Playing with dogs or cats, watching them play, or simply petting them can all help to lower stress and promote feelings of calm and relaxation.\n\nThirdly, the bond between an individual and their pet can provide a sense of security and belonging. This can be particularly beneficial for individuals who may feel isolated or alienated in other areas of their life.\n\nIn summary, the emotional benefits of owning a pet are substantial and can include companionship, stress reduction, a sense of security and belonging, and many other positive emotions and experiences."