<a target="_blank" href="https://colab.research.google.com/github/sergiopaniego/RAG_local_tutorial/blob/main/example_rag.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Install the requirements

If an error is raised related to docarray, refer to this solution: https://stackoverflow.com/questions/76880224/error-using-using-docarrayinmemorysearch-in-langchain-could-not-import-docarray

In [1]:
import subprocess

# Define the packages to install
packages = [
    "langchain",
    "langchain_pinecone",
    "langchain[docarray]",
    "docarray",
    "pypdf",
    "langchain-ollama"
]

# Install packages silently without output
for package in packages:
    subprocess.run(['pip3', 'install', package], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)

In [2]:
!ollama list

NAME                       ID              SIZE      MODIFIED    
all-minilm:l6-v2           1b226e2802db    45 MB     3 weeks ago    
llama3.1:8b                46e0c10c039e    4.9 GB    3 weeks ago    
mistral:latest             f974a74358d6    4.1 GB    3 weeks ago    
nomic-embed-text:latest    0a109f422b47    274 MB    3 weeks ago    


# Select the LLM model to use

The model must be downloaded locally to be used, so if you want to run llama3, you should run:

```

ollama pull llama3

```

Check the list of models available for Ollama here: https://ollama.com/library

In [3]:
#MODEL = "gpt-3.5-turbo"
#MODEL = "mixtral:8x7b"
#MODEL = "gemma:7b"
#MODEL = "llama2"
MODEL = "llama3.1:8b" # https://ollama.com/library/llama3
EMBED_MODEL = "nomic-embed-text:latest"

# We instantiate the LLM model and the Embedding model

In [4]:
from langchain_ollama import OllamaLLM, OllamaEmbeddings

# Replace MODEL with the desired model name, e.g., "llama-2"
model = OllamaLLM(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)

In [5]:
# Invoke the model
response = model.invoke("Give me an inspirational quote")
print(response)

Here's one:

"Believe you can and you're halfway there." - Theodore Roosevelt


## Using a parser provided by LangChain, we can transform the LLM output to something more suitable to be read

In [6]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()
response_from_model = model.invoke("Give me an inspirational quote")
parsed_response = parser.parse(response_from_model)
print(parsed_response)
print(type(response_from_model))
print(type(parsed_response))

Here's one:

"You don't have to be great to start, but you have to start to be great." - Zig Ziglar
<class 'str'>
<class 'str'>


# Load an example PDF to do Retrieval Augmented Generation (RAG)

For the example, you can select your own PDF.

In [7]:
from langchain_community.document_loaders import PyPDFLoader


loader = PyPDFLoader("reasoning_dataset_without_explanation.pdf")
pages = loader.load_and_split()
#pages = loader.load()
pages

[Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0, 'page_label': '1'}, page_content='Question 1:\nContext: Exactly six trade representatives negotiate a treaty: Klosnik, Londi, Manley, Neri, Osata,\nPoirier. There are exactly six chairs evenly spaced around a circular table. The chairs are numbered\n1 through 6, with successively numbered chairs next to each other and chair number 1 next to chair\nnumber 6. Each chair is occupied by exactly one of the representatives. The following conditions\napply: Poirier sits immediately next to Neri. Londi sits immediately next to Manley, Neri, or both.\nKlosnik does not sit immediately next to Manley. If Osata sits immediately next to Poirier, Osata\ndoes not sit immediately next to Manley.\nQuestion: Which one of the following seating arrangements of the six representatives in chairs 1\nthrough 6 would NOT violate the stated conditions?\nOptions: [ "Klosnik, Poirier, Neri, Manley, Osata, Londi", "Klosnik, Londi

In [8]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)
text_documents = text_splitter.split_documents(pages)

text_documents

[Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0, 'page_label': '1'}, page_content='Question 1:\nContext: Exactly six trade representatives negotiate a treaty: Klosnik, Londi, Manley, Neri, Osata,\nPoirier. There are exactly six chairs evenly spaced around a circular table. The chairs are numbered\n1 through 6, with successively numbered chairs next to each other and chair number 1 next to chair\nnumber 6. Each chair is occupied by exactly one of the representatives. The following conditions'),
 Document(metadata={'source': 'reasoning_dataset_without_explanation.pdf', 'page': 0, 'page_label': '1'}, page_content='number 6. Each chair is occupied by exactly one of the representatives. The following conditions\napply: Poirier sits immediately next to Neri. Londi sits immediately next to Manley, Neri, or both.\nKlosnik does not sit immediately next to Manley. If Osata sits immediately next to Poirier, Osata\ndoes not sit immediately next to Manley.\nQues

# Store the PDF in a vector space.

From Langchain docs:

`DocArrayInMemorySearch is a document index provided by Docarray that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.`

The execution time of the following block depends on the complexity and longitude of the PDF provided. Try to keep it small and simple for the example.

In [9]:
from langchain_community.vectorstores import DocArrayInMemorySearch

vectorstore = DocArrayInMemorySearch.from_documents(text_documents, embedding=embeddings)



In [10]:
from langchain.vectorstores import FAISS

# Save to FAISS index
faiss_vectorstore = FAISS.from_documents(text_documents, embeddings)
faiss_vectorstore.save_local("vectorstore_directory_wo_explanation")

In [11]:
# Load from FAISS index with dangerous deserialization enabled
loaded_vectorstore = FAISS.load_local(
    "vectorstore_directory_wo_explanation", 
    embeddings, 
    allow_dangerous_deserialization=True
)

In [12]:
# Create retriever with a high relevance threshold or no filtering
# retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": len(text_documents)})
retriever = vectorstore.as_retriever(search_type="similarity", search_kwargs={"k": 5})

# Retrieve all document chunks
retrieved_context = retriever.invoke("")  # Empty query to retrieve everything

In [13]:
import time
import pandas as pd
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_community.vectorstores import DocArrayInMemorySearch

# Safe invocation with retry for handling quota/rate-limit errors
def safe_chat_with_retry(chain, **kwargs):
    while True:
        try:
            # Invoke the LangChain model
            response = chain.invoke(kwargs)
            return response
        except Exception as e:
            # Handle specific quota or rate-limit errors
            if "ResourceExhausted" in str(e) or "429" in str(e):
                print("Quota exceeded. Retrying in 1 minute and 5 seconds...")
                time.sleep(65)  # Retry after delay
            else:
                # If the error is unrelated, re-raise it
                raise e

# Define your prompt template
prompt = PromptTemplate(
    input_variables=["context", "question", "options"],
    template="""
    Question description: {context}
    Question: {question}
    Options: {options}

    Please answer the question using the information provided in context to aid your reasoning.
    """
)

llm = model

# Create an LLMChain
chain = prompt | llm  # Assuming `llm` is initialized, e.g., OpenAI or Llama model

# Load the Excel dataset
df = pd.read_excel("reasoning_dataset_test.xlsx", sheet_name="test")

# Initialize an empty list to store the results
results = []

# Process each row in the dataset
for _, row in df.iterrows():
    # Extract row data
    context = row['context'] if not pd.isna(row['context']) else ""
    question = row['question'] if not pd.isna(row['question']) else ""
    options = row['options'] if not pd.isna(row['options']) else ""
    correct_answer = row['answer'] if not pd.isna(row['answer']) else ""
    label = row['label'] if not pd.isna(row['label']) else ""
    question_type = row['type'] if not pd.isna(row['type']) else ""

    # Create a query for the retriever
    retriever_query = f"{context} {question} {options}"

    # Retrieve relevant context from the vectorstore
    retrieved_contexts = vectorstore.similarity_search(query=retriever_query, k=5)
    retrieved_context = "\n".join([doc.page_content for doc in retrieved_contexts])

    # Combine original context with retrieved context
    combined_context = f"{context}\n\nAdditional context:\n{retrieved_context}"

    # Invoke the model safely with retry
    response = safe_chat_with_retry(chain, context=combined_context, question=question, options=options)

    # Store the results
    result = {
        "description": context,
        "question": question,
        "options": options,
        "retrieved_context": retrieved_context,
        "llama_RAG_wo_explanation_responses": response,
        "correct_answer": correct_answer,
        "label": label,
        "type": question_type
    }
    results.append(result)

    # Print the results for immediate feedback
    print(f"Description: {context}")
    print(f"Question: {question}")
    print(f"Options: {options}")
    print(f"Retrieved Context: {retrieved_context}")
    print(f"Llama RAG_wo_explanation Responses: {response}")
    print(f"Correct Answer: {correct_answer}")
    print(f"Label: {label}")
    print(f"Type: {question_type}")
    print()

Description: Eight persons namely P, Q, R, S, T, U, V, and W are sitting around a circular table facing the centre of the table but not necessarily in the same order. P sits second to the right of U, who sits opposite to T. Two persons sit between Q and T. S sits third to the right of P. V sits second to the left of W.
Question: How many persons sit between W and R when counted from the right of R?
Options: ["Two", "Three", "Four", "One", "None of these"]

Retrieved Context: not necessarily in the same order. Each of the persons is facing towards the centre. S7 sits second
to the right of S3. S2 sits third to the left of S4, who is the neighbour of S7. S6 is neither the
neighbour of S2 nor S7. Neither S1 is the neighbour of S2 nor S5 is the neighbour of S3.
Question: Who sits second to the right of the person who sits third to the left of S2?
Options: ["S5", "S1", "S3", "S6", "None of these"
Answer: "S5"
Question 9:
Context: There are eight persons S1, S2, S3, S4, S5, S6, S7 and S8 sit

In [14]:
print(results)

[{'description': 'Eight persons namely P, Q, R, S, T, U, V, and W are sitting around a circular table facing the centre of the table but not necessarily in the same order. P sits second to the right of U, who sits opposite to T. Two persons sit between Q and T. S sits third to the right of P. V sits second to the left of W.', 'question': 'How many persons sit between W and R when counted from the right of R?', 'options': '["Two", "Three", "Four", "One", "None of these"]\n', 'retrieved_context': 'not necessarily in the same order. Each of the persons is facing towards the centre. S7 sits second\nto the right of S3. S2 sits third to the left of S4, who is the neighbour of S7. S6 is neither the\nneighbour of S2 nor S7. Neither S1 is the neighbour of S2 nor S5 is the neighbour of S3.\nQuestion: Who sits second to the right of the person who sits third to the left of S2?\nOptions: ["S5", "S1", "S3", "S6", "None of these"\nAnswer: "S5"\nQuestion 9:\nContext: There are eight persons S1, S2, S

In [15]:
results[0]

{'description': 'Eight persons namely P, Q, R, S, T, U, V, and W are sitting around a circular table facing the centre of the table but not necessarily in the same order. P sits second to the right of U, who sits opposite to T. Two persons sit between Q and T. S sits third to the right of P. V sits second to the left of W.',
 'question': 'How many persons sit between W and R when counted from the right of R?',
 'options': '["Two", "Three", "Four", "One", "None of these"]\n',
 'retrieved_context': 'not necessarily in the same order. Each of the persons is facing towards the centre. S7 sits second\nto the right of S3. S2 sits third to the left of S4, who is the neighbour of S7. S6 is neither the\nneighbour of S2 nor S7. Neither S1 is the neighbour of S2 nor S5 is the neighbour of S3.\nQuestion: Who sits second to the right of the person who sits third to the left of S2?\nOptions: ["S5", "S1", "S3", "S6", "None of these"\nAnswer: "S5"\nQuestion 9:\nContext: There are eight persons S1, S2,

In [16]:
# Check types of each element in the results[0]
for key, value in results[0].items():
    print(f"Key: {key}, Type: {type(value)}")

Key: description, Type: <class 'str'>
Key: question, Type: <class 'str'>
Key: options, Type: <class 'str'>
Key: retrieved_context, Type: <class 'str'>
Key: llama_RAG_wo_explanation_responses, Type: <class 'str'>
Key: correct_answer, Type: <class 'str'>
Key: label, Type: <class 'float'>
Key: type, Type: <class 'str'>


In [20]:
import json
from langchain.schema import AIMessage

# Serialize the AIMessage object
def serialize_results(results):
    serialized_results = []
    for result in results:
        serialized_result = {
            'description': result['description'],
            'question': result['question'],
            'options': result['options'],
            'retrieved_context': result['retrieved_context'],  # Include retrieved context if relevant
            'llama RAG wo explanaton responses': result['llama_RAG_wo_explanation_responses'].content 
            if isinstance(result['llama_RAG_wo_explanation_responses'], AIMessage) 
            else result['llama_RAG_wo_explanation_responses'],
            'correct_answer': result['correct_answer'],
            'label': result['label'],
            'type': result['type']
        }
        serialized_results.append(serialized_result)
    return serialized_results

# Serialize and save results to a JSON file
with open("reasoning_llama_rag_wo_explanation_responses.json", "w", encoding="utf-8") as json_file:
    json.dump(serialize_results(results), json_file, indent=4, ensure_ascii=False)

print("Results have been saved to 'reasoning_llama_rag_wo_explanation_responses.json'.")


Results have been saved to 'reasoning_llama_rag_wo_explanation_responses.json'.


In [21]:
import json
import pandas as pd

# Open and load the JSON file
with open("reasoning_llama_rag_wo_explanation_responses.json", "r", encoding="utf-8") as json_file:
    data = json.load(json_file)

# Print number of rows (length of the data)
print(f"Number of rows: {len(data)}")

# Convert the data into a pandas DataFrame
df = pd.DataFrame(data)

# Display the DataFrame in a tabular format
print(df)

Number of rows: 29
                                          description  \
0   Eight persons namely P, Q, R, S, T, U, V, and ...   
1   Eight persons namely P, Q, R, S, T, U, V, and ...   
2   Eight persons namely P, Q, R, S, T, U, V, and ...   
3   Eight persons namely P, Q, R, S, T, U, V, and ...   
4   Eight persons namely P, Q, R, S, T, U, V, and ...   
5   A law firm has exactly nine partners: Fox, Gla...   
6   A law firm has exactly nine partners: Fox, Gla...   
7   A law firm has exactly nine partners: Fox, Gla...   
8   A law firm has exactly nine partners: Fox, Gla...   
9   A law firm has exactly nine partners: Fox, Gla...   
10  A law firm has exactly nine partners: Fox, Gla...   
11                                                      
12                                                      
13                                                      
14                                                      
15                                                      
16          