<a target="_blank" href="https://colab.research.google.com/github/sergiopaniego/RAG_local_tutorial/blob/main/example_rag.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Simple RAG example with LangChain, Ollama and an open-source LLM

In this example, we first connect to a local LLM served by Ollama and send requests to it using LangChain. After that, we build our RAG application from a PDF file and extract details from that document.

<p align="center">
  <img src="https://cdn.analyticsvidhya.com/wp-content/uploads/2023/07/langchain3.png" alt="Langchain Logo" width="20%">
  <img src="https://bookface-images.s3.amazonaws.com/logos/ee60f430e8cb6ae769306860a9c03b2672e0eaf2.png" alt="Ollama Logo" width="20%">
</p>

Sources:

* https://github.com/svpino/llm
* https://github.com/AIAnytime/Gemma-7B-RAG-using-Ollama/blob/main/Ollama%20Gemma.ipynb
* https://www.youtube.com/watch?v=-MexTC18h20&ab_channel=AIAnytime
* https://www.youtube.com/watch?v=HRvyei7vFSM&ab_channel=Underfitted

 
# Requirements

* Ollama installed locally
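
A quick way to confirm this requirement before going further is to check that the Ollama command-line tool is on the `PATH`. The snippet below is a minimal sketch: the helper name `ollama_cli_available` is hypothetical, and it only checks that the executable can be found, not that the Ollama server is running.

```python
import shutil


def ollama_cli_available(executable: str = "ollama") -> bool:
    """Return True if the given executable can be found on PATH."""
    return shutil.which(executable) is not None


if ollama_cli_available():
    print("Ollama CLI found on PATH.")
else:
    print("Ollama CLI not found; install it from https://ollama.com before continuing.")
```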

# Install the requirements

If an error is raised related to docarray, refer to this solution: https://stackoverflow.com/questions/76880224/error-using-using-docarrayinmemorysearch-in-langchain-could-not-import-docarray

In [1]:
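import subprocess
import sys

# Define the packages to install
packages = [
    "langchain",
    "langchain-community",  # provides PyPDFLoader and DocArrayInMemorySearch used below
    "langchain_pinecone",
    "langchain[docarray]",
    "docarray",
    "pypdf",
    "langchain-ollama",
    "pandas",    # used to read the Excel dataset below
    "openpyxl",  # Excel engine required by pandas.read_excel for .xlsx files
]

# Install packages silently, using the pip that belongs to this notebook's interpreter
for package in packages:
    subprocess.run([sys.executable, '-m', 'pip', 'install', package], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)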

In [2]:
!ollama list

NAME                       ID              SIZE      MODIFIED    
all-minilm:l6-v2           1b226e2802db    45 MB     3 weeks ago    
llama3.1:8b                46e0c10c039e    4.9 GB    3 weeks ago    
mistral:latest             f974a74358d6    4.1 GB    3 weeks ago    
nomic-embed-text:latest    0a109f422b47    274 MB    3 weeks ago    


# Select the LLM to use

The model must be downloaded locally before it can be used. For example, to run llama3, first pull it:

```
ollama pull llama3
```

Check the list of models available for Ollama here: https://ollama.com/library
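
To check whether a model has already been pulled, the text printed by `ollama list` can be parsed. The sketch below assumes the column layout shown in the output above (model name in the first column, one header row); `parse_ollama_list` is a hypothetical helper, and the sample text is copied from that output.

```python
def parse_ollama_list(output: str) -> list[str]:
    """Extract model names (first column) from the text printed by `ollama list`."""
    names = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


sample_output = """NAME                       ID              SIZE      MODIFIED
all-minilm:l6-v2           1b226e2802db    45 MB     3 weeks ago
llama3.1:8b                46e0c10c039e    4.9 GB    3 weeks ago
mistral:latest             f974a74358d6    4.1 GB    3 weeks ago
nomic-embed-text:latest    0a109f422b47    274 MB    3 weeks ago
"""

installed = parse_ollama_list(sample_output)
print("llama3.1:8b" in installed)
```

In the notebook, the text could instead come from `subprocess.run(["ollama", "list"], capture_output=True, text=True).stdout`.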

In [3]:
#MODEL = "gpt-3.5-turbo"
#MODEL = "mixtral:8x7b"
#MODEL = "gemma:7b"
#MODEL = "llama2"
MODEL = "llama3.1:8b" # https://ollama.com/library/llama3.1

# We instantiate the LLM and the embedding model

In [4]:
from langchain_ollama import OllamaLLM, OllamaEmbeddings

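# MODEL is the Ollama model name selected above, e.g., "llama2"
model = OllamaLLM(model=MODEL)
# The same model is used for embeddings here; a dedicated embedding model
# such as "nomic-embed-text" (listed by `ollama list` above) is another option
embeddings = OllamaEmbeddings(model=MODEL)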

In [5]:
# Invoke the model
response = model.invoke("Give me an inspirational quote")
print(response)

Here's one:

"You don't have to be great to start, but you have to start to be great." - Zig Ziglar


In [6]:
model.invoke("What is 2+2?")

'2 + 2 = 4.'

## Using a parser provided by LangChain, we can transform the LLM output into something easier to read

In [7]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
response_from_model = model.invoke("Give me an inspirational quote")
parsed_response = parser.parse(response_from_model)
print(parsed_response)

Here's one:

"You don't have to be great to start, but you have to start to be great." - Zig Ziglar


# We generate the template for the conversation with the instruct-based LLM

We can create a template to structure the conversation effectively.

This template allows us to provide some general context to the Large Language Model (LLM), which will be used for every prompt. This ensures that the model has a consistent background for all interactions.

Additionally, we can include specific context relevant to the particular prompt. This helps the model understand the immediate scenario or topic before addressing the actual question. Following this specific context, we then present the actual question we want the model to answer.

By using this approach, we enhance the model's ability to generate accurate and relevant responses based on both the general and specific contexts provided.
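
The structure described above (a general instruction, then the specific context, then the question) can be sketched in plain Python with `str.format`. This is only an illustration of the idea, not LangChain's implementation; the names `TEMPLATE` and `build_prompt` are hypothetical.

```python
TEMPLATE = (
    "Answer the question based on the context below. If you can't\n"
    'answer the question, answer with "I don\'t know".\n\n'  # general instruction
    "Context: {context}\n\n"                                  # specific context
    "Question: {question}\n"                                  # actual question
)


def build_prompt(context: str, question: str) -> str:
    """Fill the placeholders of TEMPLATE with the given context and question."""
    return TEMPLATE.format(context=context, question=question)


example = build_prompt("My parents named me Sergio", "What's your name?")
print(example)
```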

In [8]:
from langchain_core.prompts import PromptTemplate

template = """
Answer the question based on the context below. If you can't 
answer the question, answer with "I don't know".

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
prompt.format(context="Here is some context", question="Here is a question")

'\nAnswer the question based on the context below. If you can\'t \nanswer the question, answer with "I don\'t know".\n\nContext: Here is some context\n\nQuestion: Here is a question\n'

The model can answer prompts based on the context:

In [9]:
formatted_prompt = prompt.format(context="My parents named me Sergio", question="What's your name?")
response_from_model = model.invoke(formatted_prompt)
parsed_response = parser.parse(response_from_model)
print(parsed_response)

My name is Sergio.


But it can't answer what is not provided as context:

In [10]:
formatted_prompt = prompt.format(context="My parents named me Sergio", question="What's my age?")
response_from_model = model.invoke(formatted_prompt)
parsed_response = parser.parse(response_from_model)
print(parsed_response)

I don't know. The context only mentions your name, but not any information about your age.


This holds even for information it already knows, such as the 2+2 question it answered earlier:

In [11]:
formatted_prompt = prompt.format(context="My parents named me Sergio", question="What is 2+2?")
response_from_model = model.invoke(formatted_prompt)
parsed_response = parser.parse(response_from_model)
print(parsed_response)

I don't know


# Load an example PDF to do Retrieval Augmented Generation (RAG)

For the example, you can select your own PDF.

In [12]:
from langchain_community.document_loaders import PyPDFLoader


loader = PyPDFLoader("reasoning_dataset_without_explanation.pdf")
pages = loader.load_and_split()
#pages = loader.load()
pages

[Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='Question 1:\nContext: Exactly six trade representatives negotiate a treaty: Klosnik, Londi, Manley, Neri, Osata,\nPoirier. There are exactly six chairs evenly spaced around a circular table. The chairs are numbered\n1 through 6, with successively numbered chairs next to each other and chair number 1 next to chair\nnumber 6. Each chair is occupied by exactly one of the representatives. The following conditions\napply: Poirier sits immediately next to Neri. Londi sits immediately next to Manley, Neri, or both.\nKlosnik does not sit immediately next to Manley. If Osata sits immediately next to Poirier, Osata\ndoes not sit immediately next to Manley.\nQuestion: Which one of the following seating arrangements of the six representatives in chairs 1\nthrough 6 would NOT violate the stated conditions?\nOptions: [ "Klosnik, Poirier, Neri, Manley, Osata, Londi", "Klosnik, Londi, Manley, Poirier, 

In [13]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
text_documents = text_splitter.split_documents(pages)

text_documents

[Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='Question 1:'),
 Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='Context: Exactly six trade representatives negotiate a treaty: Klosnik, Londi, Manley, Neri, Osata,'),
 Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='Poirier. There are exactly six chairs evenly spaced around a circular table. The chairs are'),
 Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='The chairs are numbered'),
 Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='1 through 6, with successively numbered chairs next to each other and chair number 1 next to chair'),
 Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0}, page_content='number 6. Each chair is occupied by exactly one of t
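
The effect of `chunk_size` and `chunk_overlap` can be illustrated with a plain fixed-size sliding window. This is a simplified sketch, not the algorithm used by `RecursiveCharacterTextSplitter`, which prefers to split on separators such as paragraph breaks, newlines, and spaces (which is why the chunks above end on word boundaries and are often shorter than 100 characters). The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Split text into fixed-size windows where consecutive windows share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghij" * 25  # 250 characters
chunks = chunk_text(sample)
print([len(c) for c in chunks])
```

The last 20 characters of each chunk are repeated at the start of the next one, so a sentence cut at a chunk boundary still appears whole in at least one chunk when it is shorter than the overlap.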

# Store the PDF in a vector space.

From the LangChain docs:

`DocArrayInMemorySearch is a document index provided by Docarray that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.`

The execution time of the following block depends on the complexity and length of the PDF provided. Try to keep it small and simple for the example.

In [14]:
from langchain_community.vectorstores import DocArrayInMemorySearch

vectorstore = DocArrayInMemorySearch.from_documents(text_documents, embedding=embeddings)
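
Conceptually, an in-memory vector store keeps one embedding vector per chunk and answers a query by ranking the chunks by similarity to the query vector. The sketch below illustrates that idea with cosine similarity over hand-made three-dimensional vectors; it is not DocArray's implementation, and the toy vectors and helper names (`cosine_similarity`, `top_k`) are hypothetical stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k entries in store most similar to query_vec.

    store is a list of (text, vector) pairs kept in memory.
    """
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy "embeddings", written by hand for illustration only
toy_store = [
    ("chairs around a table", [1.0, 0.0, 0.0]),
    ("trade representatives", [0.0, 1.0, 0.0]),
    ("seating arrangement", [0.9, 0.1, 0.0]),
]

print(top_k([1.0, 0.0, 0.0], toy_store, k=2))
```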



In [17]:
# Create a retriever that returns the 10 most similar chunks for a query
# To return every chunk instead, set k to len(text_documents):
# retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": len(text_documents)})
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 10})

# Retrieve 10 chunks using an empty query (with k=10 this does not return everything)
retrieved_context = retriever.invoke("")

In [None]:
import pandas as pd

# Load the Excel file
df = pd.read_excel('reasoning_dataset.xlsx', sheet_name='test')

# Create a prompt template
# A separate name is used so the context/question `prompt` defined earlier is not overwritten
qa_prompt = PromptTemplate(
    input_variables=["context", "question", "options"],
    template="""
    Question description: {context}
    Question: {question}
    Options: {options}

    Please answer the question based on the given information.
    """
)

# Compose the prompt and the model into a chain (replaces the deprecated LLMChain)
chain = qa_prompt | model

# Process each row in the DataFrame
for _, row in df.iterrows():
    context = row['context'] if not pd.isna(row['context']) else ""
    question = row['question'] if not pd.isna(row['question']) else ""
    options = row['options'] if not pd.isna(row['options']) else ""

    # Create a query for the retriever
    retriever_query = f"{context} {question} {options}"
    
    # Retrieve relevant context
    retrieved_context = retriever.invoke(retriever_query)

    # Join the text of the retrieved chunks rather than embedding the Document reprs
    retrieved_text = "\n".join(doc.page_content for doc in retrieved_context)

    # Combine the original context with the retrieved context
    combined_context = f"{context}\n\nAdditional context:\n{retrieved_text}"

    # Invoke the model
    response = chain.invoke({"context": combined_context, "question": question, "options": options})

    print(f"Question description: {context}")
    print(f"Question: {question}")
    print(f"Options: {options}")
    print(f"Answer: {response}")
    print()


Question description: Eight persons namely P, Q, R, S, T, U, V, and W are sitting around a circular table facing the centre of the table but not necessarily in the same order. P sits second to the right of U, who sits opposite to T. Two persons sit between Q and T. S sits third to the right of P. V sits second to the left of W.
Question: How many persons sit between W and R when counted from the right of R?
Options: ["Two", "Three", "Four", "One", "None of these"]

Answer: To solve this problem, let's break down the given information:

1. P sits second to the right of U.
2. U sits opposite T (i.e., they are facing each other).
3. Two persons sit between Q and T.
4. S sits third to the right of P.
5. V sits second to the left of W.

From statement 2, we know that U and T sit opposite each other. So, when counting from the right of R (i.e., moving clockwise), let's place T at the top-right position in our imaginary circular table.

Now, based on the given statements:

- P sits second to 

# Loop to ask and answer questions continuously

In [17]:
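# Build the context/question prompt from `template` so the loop works
# even if `prompt` has been reassigned in another cell
chat_prompt = PromptTemplate.from_template(template)

print("Say 'exit' or 'quit' to exit the loop")
while True:
    question = input('User question: ')
    print(f"Question: {question}")
    if question.strip().lower() in ["exit", "quit"]:
        print("Exiting the conversation. Goodbye!")
        break
    # Retrieve the chunks most relevant to this question and join their text
    docs = retriever.invoke(question)
    context = "\n".join(doc.page_content for doc in docs)
    formatted_prompt = chat_prompt.format(context=context, question=question)
    response_from_model = model.invoke(formatted_prompt)
    parsed_response = parser.parse(response_from_model)
    print(f"Answer: {parsed_response}")
    print()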

Say 'exit' or 'quit' to exit the loop
Question: exit
Exiting the conversation. Goodbye!
