# RAG using Ollama and LangChain

## Installing required packages

In [None]:
%pip install langchain langchain_community langchain-text-splitters chromadb "unstructured[all-docs]"

Optional: install a CPU-only build of torch if you do not already have a version of torch installed.

In [None]:
# %pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cpu

##### Along with these packages, download and install Ollama from https://ollama.com/download

After installing Ollama, pull the required models to your local machine:

In [None]:
!ollama pull nomic-embed-text
!ollama pull llama3:latest

In [None]:
!ollama list
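Before running the rest of the notebook, it can help to confirm that the `ollama` command is reachable from the notebook's environment. The following is a minimal sketch using only the standard library; `ollama_on_path` is a hypothetical helper name, and it only checks that an executable is on `PATH`, not that the models were actually pulled.

```python
import shutil


def ollama_on_path() -> bool:
    """Return True if an `ollama` executable is found on PATH (hypothetical helper)."""
    return shutil.which("ollama") is not None


if not ollama_on_path():
    print("ollama was not found on PATH; install it from https://ollama.com/download")
```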

## Document Processing

### 1. Loading the document

Importing all the packages used for loading the document

In [1]:
import os
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_community.document_loaders import OnlinePDFLoader

Loading the PDF using the "UnstructuredPDFLoader" of "langchain_community"

In [2]:
local_path = "MotorACT.pdf"

# Local PDF file uploads
if local_path:
  loader = UnstructuredPDFLoader(file_path=local_path)
  data = loader.load()
else:
  print("Upload a PDF file")
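Note that `if local_path:` only tests that the string is non-empty; it does not check that the file exists. A stricter check could look like the sketch below. `is_readable_pdf_path` is a hypothetical helper name, not part of LangChain, and the `.pdf` suffix test is an assumption about how the file is named.

```python
import os


def is_readable_pdf_path(path: str) -> bool:
    """Hypothetical helper: True if `path` names an existing file ending in .pdf."""
    return bool(path) and os.path.isfile(path) and path.lower().endswith(".pdf")


# An empty path is rejected before any loader is created.
print(is_readable_pdf_path(""))
```

Calling this before constructing the loader lets the notebook print the "Upload a PDF file" message for a missing file as well, instead of failing inside `loader.load()`.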

In [3]:
data[0].page_content



### 3. Converting the document into "chunks"

In [4]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [5]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = text_splitter.split_documents(data)
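To see what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified sketch in plain Python. It cuts fixed-width character windows, whereas `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, so treat this as an illustration of the two parameters only. `sliding_chunks` is a hypothetical name.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters that overlap by `chunk_overlap`."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(pieces)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each window repeats the last three characters of the previous one, which is how overlap keeps a sentence that straddles a boundary visible in both chunks.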

Setting location to save the vector store

In [6]:
current_dir = os.getcwd()
persistent_directory = os.path.join(current_dir, "db", "chroma_db_for_MotorAct")

This is the function that is used to convert the chunks into embeddings

In [7]:
embedding_function = OllamaEmbeddings(model="nomic-embed-text", show_progress=True)

### 4. Creating and storing Embeddings

Here we embed the chunks and store them in a Chroma vector store (via "chromadb"). If the persistent directory already exists, the saved collection is loaded instead of being rebuilt.

In [8]:
if os.path.exists(persistent_directory):
    # Reuse the collection that was embedded and saved on a previous run
    vector_db = Chroma(
        persist_directory=persistent_directory,
        embedding_function=embedding_function,
        collection_name="local-rag"
    )
    print("Loaded existing Chroma vector store.")
else:
    # First run: embed every chunk and write the collection to disk
    vector_db = Chroma.from_documents(
        documents=chunks,
        embedding=embedding_function,
        collection_name="local-rag",
        persist_directory=persistent_directory
    )
    # Recent Chroma versions persist automatically, so this call may only emit a deprecation warning
    vector_db.persist()

Loaded existing Chroma vector store.
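Conceptually, the vector store answers a query by comparing the query's embedding with the stored chunk embeddings and returning the closest ones. The sketch below shows that idea with made-up three-dimensional vectors and cosine similarity. The chunk labels, the vectors, and the `top_k` helper are all illustrative assumptions, and the distance metric Chroma actually applies depends on how the collection is configured.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings; real ones from an embedding model have hundreds of dimensions.
toy_store = {
    "chunk about driving licences": [0.9, 0.1, 0.0],
    "chunk about vehicle registration": [0.1, 0.9, 0.1],
    "chunk about penalties": [0.0, 0.2, 0.9],
}


def top_k(query_vector: list[float], k: int = 2) -> list[str]:
    """Return the k stored chunks most similar to the query vector."""
    ranked = sorted(
        toy_store,
        key=lambda text: cosine_similarity(query_vector, toy_store[text]),
        reverse=True,
    )
    return ranked[:k]


nearest = top_k([0.8, 0.2, 0.1], k=2)
print(nearest)  # ['chunk about driving licences', 'chunk about vehicle registration']
```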


## Retrieval And Question Answering

In [9]:
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever

This example uses the "llama3" model from Meta, an openly available LLM (Large Language Model), served locally through Ollama.

In [10]:
local_model = "llama3"
llm = ChatOllama(model=local_model)

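This prompt does not tell the model how to answer. Instead, it instructs the model to rewrite the user's question into five alternative versions, which the multi-query retriever then uses to search the vector store.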

In [11]:
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI legal assistant. Your task is to generate five
    different versions of the given legal question to retrieve relevant documents from
    a vector database. By generating multiple perspectives on the user question, your
    goal is to help the user overcome some of the limitations of the distance-based
    similarity search. Provide these alternative questions separated by newlines.
    Original question: {question}""",
)
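Because the prompt asks for alternatives separated by newlines, a multi-query approach has two small jobs around the LLM call: split the model's output into individual queries, then merge the chunks retrieved for each query without duplicates. The sketch below illustrates that idea in plain Python. `parse_queries` and `unique_union` are hypothetical names, the model output and chunk IDs are made up, and this is not a copy of LangChain's internals.

```python
def parse_queries(llm_output: str) -> list[str]:
    """Split newline-separated model output into non-empty, stripped queries."""
    return [line.strip() for line in llm_output.splitlines() if line.strip()]


def unique_union(result_lists: list[list[str]]) -> list[str]:
    """Merge per-query results, keeping the first occurrence of each chunk."""
    seen = set()
    merged = []
    for results in result_lists:
        for chunk in results:
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


# Made-up model output with a blank line and stray whitespace.
fake_output = "What is X?\n\nHow is X defined?\n  Which section covers X?  \n"
queries = parse_queries(fake_output)
print(queries)  # ['What is X?', 'How is X defined?', 'Which section covers X?']

# Made-up retrieval results, one list per generated query.
merged = unique_union([["c1", "c2"], ["c2", "c3"], ["c1"]])
print(merged)  # ['c1', 'c2', 'c3']
```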

In [12]:
retriever = MultiQueryRetriever.from_llm(
    vector_db.as_retriever(), 
    llm,
    prompt=QUERY_PROMPT
)

template = """You are an AI legal assistant. Answer the question based ONLY on the following legal context:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
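The answer template has two placeholders, `{context}` and `{question}`. As a rough illustration of what filling it involves, the sketch below joins retrieved chunk texts into one string and substitutes both placeholders with plain `str.format`. `format_docs`, `demo_template`, and the sample strings are assumptions made for this example; the notebook's chain passes the retriever output to the prompt directly.

```python
def format_docs(chunk_texts: list[str]) -> str:
    """Hypothetical helper: join chunk texts into one context string."""
    return "\n\n".join(chunk_texts)


# Same shape as the notebook's template, under a different name to avoid shadowing it.
demo_template = """Answer the question based ONLY on the following legal context:
{context}
Question: {question}
"""

filled = demo_template.format(
    context=format_docs(["First retrieved chunk.", "Second retrieved chunk."]),
    question="What does the first chunk say?",
)
print(filled)
```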

Here, with the help of chaining, we retrieve the most relevant text chunks from the vector store, insert them into the prompt as context, and pass the prompt to the LLM along with the user's question, all in a single call.

In [13]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
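The `|` operator composes the steps left to right: the output of one step becomes the input of the next. The toy sketch below mimics that data flow with stand-in functions so it runs without Ollama or LangChain. The `Step` class and every stub are hypothetical and only illustrate the composition pattern, not how LangChain implements it.

```python
class Step:
    """Wraps a function so steps can be chained with `|`, left to right."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # (a | b) is a new step that runs a first, then feeds its result to b
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-ins for the retriever/passthrough dict, the prompt, the LLM, and the parser.
fan_out = Step(lambda q: {"context": f"[chunks retrieved for: {q}]", "question": q})
fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda prompt_text: {"content": f"ANSWER based on <{prompt_text}>"})
to_string = Step(lambda message: message["content"])

toy_chain = fan_out | fill_prompt | fake_llm | to_string
toy_result = toy_chain.invoke("q")
print(toy_result)
```

In the real chain, the first step plays the role of `fan_out`: the same question is sent to the retriever to build the context and passed through unchanged as the question.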

Finally, we invoke the chain with a question entered by the user and return the model's answer.

In [None]:
chain.invoke(input("Ask any question regarding the PDF you uploaded earlier: "))