# Local RAG Example

I am running a minimal RAG-based LLM application here to test the feasibility of running a local model on an old Intel Mac laptop with 8 GB of RAM.

In [1]:
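# ! pip install ollama
# ! pip install langchain
# ! pip install langchain-community
# ! pip install chromadb fastembed pypdf  # backends for Chroma, FastEmbedEmbeddings, and PyPDFLoader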

In [2]:
import os
import ollama

from langchain_community.vectorstores import Chroma
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import FastEmbedEmbeddings
from langchain.schema.output_parser import StrOutputParser
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema.runnable import RunnablePassthrough
from langchain.prompts import PromptTemplate
from langchain.vectorstores.utils import filter_complex_metadata

## Build the RAG retriever

In [3]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=100)
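
To make the two splitter parameters concrete, here is a minimal sketch of how `chunk_size` and `chunk_overlap` interact, using a plain sliding window over characters. This is a simplification and an assumption for illustration only: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraphs, lines, and words, so its chunks are usually shorter than `chunk_size` and its overlaps are not exact. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size character windows that overlap by chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# 3000 characters of dummy text, split with the same settings as the cell above
sample = "".join(str(i % 10) for i in range(3000))
demo_chunks = sliding_window_chunks(sample, chunk_size=1024, chunk_overlap=100)
print([len(c) for c in demo_chunks])
print(demo_chunks[0][-100:] == demo_chunks[1][:100])  # consecutive windows share 100 characters
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of indexing some text twice.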

In [13]:
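# Load the PDF (one Document per page), split it into chunks, and drop
# metadata values that the vector store cannot handle.
pdf_file_path = os.path.join("./data", "manual.pdf")
docs = PyPDFLoader(file_path=pdf_file_path).load()
chunks = text_splitter.split_documents(docs)
chunks = filter_complex_metadata(chunks)

# Embed the chunks and index them in a Chroma vector store.
vector_store = Chroma.from_documents(documents=chunks, embedding=FastEmbedEmbeddings())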

Fetching 5 files: 100%|██████████| 5/5 [00:00<00:00, 11516.49it/s]


In [14]:
# Check chunks
print(f"Total chunks: {len(chunks)}")
chunks[0]

Total chunks: 263


Document(metadata={'source': './data/manual.pdf', 'page': 0}, page_content='www.samsung.com English. 04/2024. Rev.1.1SM-A356B/DS\nSM-A356B/DS UDSM-A356ESM-A356E/DSSM-A556B/DSSM-A556B/DS UDSM-A556ESM-A556E/DS\nUSER GUIDE')

In [7]:
# Construct retriever: return at most 3 chunks whose relevance score is at least 0.5
retriever = vector_store.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={
        "k": 3,
        "score_threshold": 0.5,
    },
)
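
As a rough picture of what `k` and `score_threshold` do together, the sketch below ranks toy vectors by cosine similarity, drops anything below the threshold, and keeps at most `k` results. The vectors, document names, and the helper `top_k_above_threshold` are all made up for illustration. Using cosine similarity as the score is also an assumption: the actual relevance score depends on the vector store and its distance metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_above_threshold(query_vec, doc_vecs, k, score_threshold):
    """Return up to k document names scoring at least score_threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    kept = [(score, name) for score, name in scored if score >= score_threshold]
    kept.sort(reverse=True)  # highest score first
    return [name for _, name in kept[:k]]


# Hypothetical 2-dimensional "embeddings"; real ones have hundreds of dimensions
toy_docs = {
    "battery": [1.0, 0.0],
    "touch screen": [0.8, 0.6],
    "camera": [0.0, 1.0],
    "wifi": [-1.0, 0.0],
}
query_vec = [1.0, 0.0]
hits = top_k_above_threshold(query_vec, toy_docs, k=3, score_threshold=0.5)
print(hits)
```

Only two of the four toy documents clear the threshold, so fewer than `k` results come back. The same can happen with the real retriever: a question unrelated to the manual may return an empty list.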

In [15]:
# Check retriever
question = "what's wrong with my phone when it does not turn on"
context = retriever.invoke(question)  # get_relevant_documents() is deprecated in recent LangChain releases
context

[Document(metadata={'page': 150, 'source': './data/manual.pdf'}, page_content='Appendix\n151\nYour device does not turn on\nWhen the battery is completely discharged, your device will not turn on. Fully charge the \nbattery before turning on the device.\nThe touch screen responds slowly or improperly\n •If you attach a screen protector or optional accessories to the touch screen, the touch screen may not function properly.\n •If you are wearing gloves, if your hands are not clean while touching the touch screen, or if you tap the screen with sharp objects or your fingertips, the touch screen may malfunction.\n •The touch screen may malfunction in humid conditions or when exposed to water.\n •Restart your device to clear any temporary software problems.\n •Ensure that your device software is updated to the latest version.\n •If the touch screen is scratched or damaged, visit a Samsung Service Centre or an authorised service centre.\nYour device freezes or encounters a fatal problem'),
 

## Feed the question and the retrieved context to the prompt and model

You will need to download [Ollama](https://ollama.com/) to host your (hopefully open-source) LLM locally, and choose a model from their [library](https://ollama.com/library). I am choosing the smallest model so I can run it on my old laptop.
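
For a back-of-the-envelope sense of why a small model suits an 8 GB machine, the sketch below estimates the memory taken by the weights alone. The parameter count (about 0.5 billion, read off a tag like `0.5b`) and the bit widths are assumptions, and the helper `weight_memory_gb` is hypothetical. Real usage is higher, since the runtime also needs memory for the context cache and its own overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed parameter count for a "0.5b" model; not an official figure
n_params = 0.5e9
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_memory_gb(n_params, bits):.2f} GB")
```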

In [20]:
model_name = "qwen2:0.5b"
model = ChatOllama(model=model_name)

In [21]:
prompt = PromptTemplate.from_template(
            """
            <s> [INST] You are an assistant for question-answering tasks. Use the following pieces of retrieved context 
            to answer the question. If you don't know the answer, just say that you don't know. Use three sentences
             maximum and keep the answer concise. [/INST] </s> 
            [INST] Question: {question} 
            Context: {context} 
            Answer: [/INST]
            """
        )

In [22]:
# The retriever receives the question and fills {context}; the question itself passes through unchanged.
chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | model
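
In the chain above, `{context}` is filled with whatever the retriever returns, which is a list of `Document` objects, so the prompt also receives their metadata and Python repr. A common alternative is to join only the page text before it reaches the prompt. The helper below is a minimal sketch of that idea; `format_docs` is a name introduced here, and the stand-in documents are fake objects that only mimic the `page_content` attribute.

```python
from types import SimpleNamespace


def format_docs(docs) -> str:
    """Join the page_content of each retrieved document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


# Stand-ins for retrieved Document objects; only page_content is used
fake_docs = [
    SimpleNamespace(page_content="Your device does not turn on", metadata={"page": 150}),
    SimpleNamespace(page_content="Fully charge the battery.", metadata={"page": 150}),
]
context_text = format_docs(fake_docs)
print(context_text)
```

Assuming LCEL's usual coercion of plain functions into runnables, this could be wired in as `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, which keeps the prompt shorter. That matters for a small model with a limited context window.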

In [24]:
response = chain.invoke(question)
response.content

"It looks like there might be some issues with your phone that are causing it to turn off or malfunction. This is due to a number of factors including overheating, touch screen issues, connectivity problems, and battery drain. To fix these issues, you can try closing apps running on the device and reducing excessive battery usage. If these steps do not work, you may need to consult with a service center for assistance. In terms of troubleshooting, it's also important to ensure that your phone is updated to the latest software version, as this can help resolve any potential issues or prevent more serious problems from occurring."

## And it worked!!!