<a href="https://colab.research.google.com/github/reyagao/Health_Chatbot/blob/main/RAG_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a target="_blank" href="https://colab.research.google.com/github/sergiopaniego/RAG_local_tutorial/blob/main/example_rag.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Simple RAG example with LangChain, Ollama and an open-source LLM

In this example, we first connect to a locally served LLM and send requests to it through Ollama using LangChain. After that, we build our RAG application from a PDF file and extract details from that document.

<p align="center">
  <img src="https://cdn.analyticsvidhya.com/wp-content/uploads/2023/07/langchain3.png" alt="Langchain Logo" width="20%">
  <img src="https://bookface-images.s3.amazonaws.com/logos/ee60f430e8cb6ae769306860a9c03b2672e0eaf2.png" alt="Ollama Logo" width="20%">
</p>

Sources:

* https://github.com/svpino/llm
* https://github.com/AIAnytime/Gemma-7B-RAG-using-Ollama/blob/main/Ollama%20Gemma.ipynb
* https://www.youtube.com/watch?v=-MexTC18h20&ab_channel=AIAnytime
* https://www.youtube.com/watch?v=HRvyei7vFSM&ab_channel=Underfitted


# Requirements

* Ollama installed locally

# Install the requirements

If an error is raised related to docarray, refer to this solution: https://stackoverflow.com/questions/76880224/error-using-using-docarrayinmemorysearch-in-langchain-could-not-import-docarray

In [1]:
!pip3 install langchain
!pip3 install langchain_pinecone
!pip3 install langchain[docarray]
!pip3 install docarray
!pip3 install pypdf

Collecting langchain_pinecone
  Downloading langchain_pinecone-0.2.3-py3-none-any.whl.metadata (1.3 kB)
Collecting pinecone<6.0.0,>=5.4.0 (from langchain_pinecone)
  Downloading pinecone-5.4.2-py3-none-any.whl.metadata (19 kB)
Collecting aiohttp<3.11,>=3.10 (from langchain_pinecone)
  Downloading aiohttp-3.10.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (7.7 kB)
Collecting langchain-tests<1.0.0,>=0.3.7 (from langchain_pinecone)
  Downloading langchain_tests-0.3.12-py3-none-any.whl.metadata (3.2 kB)
Collecting pytest-asyncio<1,>=0.20 (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading pytest_asyncio-0.25.3-py3-none-any.whl.metadata (3.9 kB)
Collecting syrupy<5,>=4 (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading syrupy-4.8.2-py3-none-any.whl.metadata (36 kB)
Collecting pytest-socket<1,>=0.6.0 (from langchain-tests<1.0.0,>=0.3.7->langchain_pinecone)
  Downloading pytest_socket-0.7.0-py3-none-any.whl.metadata (6.7 kB)
Coll

# Select the LLM model to use

The model must be downloaded locally to be used, so if you want to run llama3, you should run:

```
ollama pull llama3
```

Check the list of models available for Ollama here: https://ollama.com/library
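
Before selecting a model, it can help to confirm that it has actually been pulled. The sketch below assumes an Ollama server on its default address (`http://localhost:11434`) and reads the model list from its `/api/tags` endpoint. The helper names (`local_model_names`, `is_pulled`, `fetch_tags`) are hypothetical, and the sample payload is hand-written so the example runs without a server.

```python
import json
from urllib.request import urlopen


def local_model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by Ollama's /api/tags."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def is_pulled(model: str, names: list[str]) -> bool:
    """Check for `model` locally; a name without a tag is treated as ':latest'."""
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in names


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Ask a running Ollama server (assumed default address) for its local models."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


# Offline example: a hand-written payload in the shape /api/tags returns.
sample = {"models": [{"name": "llama3:latest"}, {"name": "gemma:7b"}]}
names = local_model_names(sample)
print(is_pulled("llama3", names))        # True
print(is_pulled("mixtral:8x7b", names))  # False
```

With a server running, `is_pulled(MODEL, local_model_names(fetch_tags()))` would perform the same check against the real model list.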

In [2]:
#MODEL = "gpt-3.5-turbo"
#MODEL = "mixtral:8x7b"
#MODEL = "gemma:7b"
#MODEL = "llama2"
MODEL = "llama3" # https://ollama.com/library/llama3

In [2]:
!pip install -U langchain-ollama

Collecting langchain-ollama
  Downloading langchain_ollama-0.2.3-py3-none-any.whl.metadata (1.9 kB)
Collecting ollama<1,>=0.4.4 (from langchain-ollama)
  Downloading ollama-0.4.7-py3-none-any.whl.metadata (4.7 kB)
Downloading langchain_ollama-0.2.3-py3-none-any.whl (19 kB)
Downloading ollama-0.4.7-py3-none-any.whl (13 kB)
Installing collected packages: ollama, langchain-ollama
Successfully installed langchain-ollama-0.2.3 ollama-0.4.7


# We instantiate the LLM model and the embedding model

In [3]:

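from langchain_ollama import OllamaLLM, OllamaEmbeddings

# Address where Ollama is reachable (an ngrok tunnel in this run)
OLLAMA_URL = "http://ab65-217-117-226-146.ngrok-free.app"

model = OllamaLLM(base_url=OLLAMA_URL, model=MODEL)
# Embedding model, used later to build the vector store
embeddings = OllamaEmbeddings(base_url=OLLAMA_URL, model=MODEL)

response = model.invoke("Hello, please explain how to use Ollama")
print(response)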




😊 Ollama is a popular online tool for creating and sharing voice messages, and I'd be happy to introduce you to its usage method. Here's a step-by-step guide:

**What is Ollama?**
Before we dive into the usage method, let me briefly explain what Ollama is. Ollama is an online platform that allows users to create and share voice messages anonymously. It's like a digital post-it note, but instead of leaving a written message, you leave a spoken one.

**How to use Ollama:**

1. **Visit the website**: Go to [Ollama.com](http://ollama.com) in your web browser.
2. **Create an account**: Click on "Sign up" and enter your email address, password, and username (optional).
3. **Choose a theme**: Select a theme for your voice message from the available options or create your own.
4. **Record your message**: Use the built-in microphone to record your voice message. You can speak as much or as little as you like.
5. **Add audio effects (optional)**: If you want to add some flair to your message, yo

In [5]:
model.invoke("What is 2+2?")

'The answer to 2+2 is 4.'

## Using a parser provided by LangChain, we can transform the LLM output to something more suitable to be read
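
As a rough illustration of what a string output parser does, the sketch below normalizes model output to plain text. It assumes the parser meant here is LangChain's `StrOutputParser` (the source does not show its definition). `FakeMessage` and `to_text` are hypothetical names used only for this example. With `OllamaLLM`, `invoke` already returns a string, so the parser mostly matters when a chat model returns message objects.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for a chat-model message object."""
    content: str


def to_text(output) -> str:
    """Return plain text whether the model produced a string or a message-like object."""
    if isinstance(output, str):
        return output
    return str(output.content)


print(to_text("The answer is 4."))
print(to_text(FakeMessage(content="The answer is 4.")))
```

Both calls print the same text, which is the point: downstream code can treat every model response as a plain string.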

# We generate the template for the conversation with the instruct-based LLM

We can create a template to structure the conversation.

The template gives the Large Language Model (LLM) general instructions that apply to every prompt, so the model starts each interaction from the same background.

It also carries the context specific to a given prompt, followed by the actual question we want answered. The model reads that context before it reaches the question.

Combining fixed instructions with per-prompt context helps the model produce accurate, relevant answers.

In [6]:
from langchain.prompts import PromptTemplate

template = """
Answer the question based on the context below. If you can't
answer the question, answer with "I don't know".

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
prompt.format(context="Here is some context", question="Here is a question")

'\nAnswer the question based on the context below. If you can\'t \nanswer the question, answer with "I don\'t know".\n\nContext: Here is some context\n\nQuestion: Here is a question\n'

The model can answer prompts based on the context:

In [7]:
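from langchain_core.output_parsers import StrOutputParser

# StrOutputParser returns the model output as a plain string
# (assumed to be the parser this notebook intends to use)
parser = StrOutputParser()

formatted_prompt = prompt.format(context="My parents named me Sergio", question="What's your name?")
response_from_model = model.invoke(formatted_prompt)
parsed_response = parser.parse(response_from_model)
print(parsed_response)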

But it can't answer what is not provided as context:

In [None]:
formatted_prompt = prompt.format(context="My parents named me Sergio", question="What's my age?")
response_from_model = model.invoke(formatted_prompt)
parsed_response = parser.parse(response_from_model)
print(parsed_response)

I don't know! The provided context only tells me that your parents named you Sergio, but it doesn't mention anything about your age. I can't infer or guess your age based on this information.


It even declines questions it could answer from prior knowledge:

In [None]:
formatted_prompt = prompt.format(context="My parents named me Sergio", question="What is 2+2?")
response_from_model = model.invoke(formatted_prompt)
parsed_response = parser.parse(response_from_model)
print(parsed_response)

I don't know!


# Load an example PDF to do Retrieval Augmented Generation (RAG)

For the example, you can select your own PDF.

In [None]:
from langchain_community.document_loaders import PyPDFLoader


loader = PyPDFLoader("./files/teaching.pdf")
pages = loader.load_and_split()
#pages = loader.load()
pages

[Document(page_content='teaching/talks\nUniv ersity and non-univ ersity courses shown\n🎓Curr ently teaching assistant for:\nSubject Wher e When\nArti\x00cial Intelligence3rd year Robotics Softwar e Engineering\nUniv ersidad Re y Juan Carlos22-23\n21-22\n20-21\nRobotics3rd year Telematics Engineering\nUniv ersidad Re y Juan Carlos22-23\n\x00Couses:\nCourse name Wher e When\nIntroduction t o Coding ISDI (DMBA, MBA)December\n2023 -\nIntroduction t o Arti\x00cial Intelligence IES E uropa, MadridNovember\n2023\nIntroduction t o Programming with P ython Atenea F ormaciónNovember\n2023\nIntroduction t o Arti\x00cial Intelligence and\nits A pplicationsAtenea F ormación July 2023\nObject detection and segmentation with\nTensor\x00owPlatzi July 2022\n© Cop yright 2024 Ser gio P aniego. P ower ed b y Jekyll  with al-folio  theme. Hosted b y GitHub P ages .21/6/24, 17:18 teaching/talks | Sergio Paniego\nhttps://sergiopaniego.github.io/teaching_and_activities/ 1/3', metadata={'source': './files/tea

In [None]:
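from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split the pages into chunks of at most 100 characters, with 20 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=20)
# Keep only the first 5 chunks so the example stays fast
text_documents = text_splitter.split_documents(pages)[:5]

pages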

[Document(page_content='teaching/talks\nUniv ersity and non-univ ersity courses shown\n🎓Curr ently teaching assistant for:\nSubject Wher e When\nArti\x00cial Intelligence3rd year Robotics Softwar e Engineering\nUniv ersidad Re y Juan Carlos22-23\n21-22\n20-21\nRobotics3rd year Telematics Engineering\nUniv ersidad Re y Juan Carlos22-23\n\x00Couses:\nCourse name Wher e When\nIntroduction t o Coding ISDI (DMBA, MBA)December\n2023 -\nIntroduction t o Arti\x00cial Intelligence IES E uropa, MadridNovember\n2023\nIntroduction t o Programming with P ython Atenea F ormaciónNovember\n2023\nIntroduction t o Arti\x00cial Intelligence and\nits A pplicationsAtenea F ormación July 2023\nObject detection and segmentation with\nTensor\x00owPlatzi July 2022\n© Cop yright 2024 Ser gio P aniego. P ower ed b y Jekyll  with al-folio  theme. Hosted b y GitHub P ages .21/6/24, 17:18 teaching/talks | Sergio Paniego\nhttps://sergiopaniego.github.io/teaching_and_activities/ 1/3', metadata={'source': './files/tea
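
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal fixed-width sliding-window sketch. It is a simplification, not how `RecursiveCharacterTextSplitter` works internally: that splitter prefers to break on separators such as paragraphs, newlines and spaces, so its chunks are usually shorter than `chunk_size`. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Cut `text` into windows of `chunk_size` characters that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "".join(str(i % 10) for i in range(250))  # 250 characters of dummy text
chunks = chunk_text(sample)
print([len(c) for c in chunks])           # [100, 100, 90]
print(chunks[0][-20:] == chunks[1][:20])  # True: neighbouring chunks share 20 characters
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.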

# Store the PDF in a vector space.

From the LangChain docs:

`DocArrayInMemorySearch is a document index provided by Docarray that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.`

The execution time of the following block depends on the length and complexity of the PDF provided. Try to keep it small and simple for the example.

In [None]:
from langchain_community.vectorstores import DocArrayInMemorySearch

vectorstore = DocArrayInMemorySearch.from_documents(text_documents, embedding=embeddings)

# Create a retriever that returns the most similar vectors to use as context

In [None]:
retriever = vectorstore.as_retriever()
retriever.invoke("artificial intelligence")

[Document(page_content='Introduction t o Coding ISDI (DMBA, MBA)December\n2023 -', metadata={'source': './files/teaching_talks _ Sergio Paniego.pdf', 'page': 0}),
 Document(page_content='teaching/talks\nUniv ersity and non-univ ersity courses shown\n🎓Curr ently teaching assistant for:', metadata={'source': './files/teaching_talks _ Sergio Paniego.pdf', 'page': 0}),
 Document(page_content='Univ ersidad Re y Juan Carlos22-23\n21-22\n20-21\nRobotics3rd year Telematics Engineering', metadata={'source': './files/teaching_talks _ Sergio Paniego.pdf', 'page': 0}),
 Document(page_content='Subject Wher e When\nArti\x00cial Intelligence3rd year Robotics Softwar e Engineering', metadata={'source': './files/teaching_talks _ Sergio Paniego.pdf', 'page': 0})]
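
The retrieval step can be sketched without any library: embed the query, score every stored chunk by similarity, and return the best `k`. The example below uses a toy word-count embedding and cosine similarity, chosen for illustration only. A real embedding model returns dense vectors, and the similarity metric used by the vector store may differ. The function names and the sample chunks are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: list[str], k: int = 4) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vec, embed(chunk)), reverse=True)
    return ranked[:k]


# Hand-written toy chunks, not taken from the PDF loader output.
chunks = [
    "introduction to artificial intelligence",
    "robotics for telematics engineering",
    "object detection with tensorflow",
    "artificial intelligence and its applications",
]
print(top_k("artificial intelligence", chunks, k=2))
# ['introduction to artificial intelligence', 'artificial intelligence and its applications']
```

A retriever always returns its top `k` results, even when some of them score poorly, so low-relevance chunks can still end up in the context.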

# Converse with the document to extract details

In [None]:
# Retrieve context once for a fixed query; this same context is reused for every question below
retrieved_context = retriever.invoke("artificial intelligence")

In [None]:
questions = [
    "What are his research interests?",
    "Does he have teaching experience?",
    "Does he know about Tensorflow?"
]

for question in questions:
    formatted_prompt = prompt.format(context=retrieved_context, question=question)
    response_from_model = model.invoke(formatted_prompt)
    parsed_response = parser.parse(response_from_model)

    print(f"Question: {question}")
    print(f"Answer: {parsed_response}")
    print()

Question: What are his research interests?
Answer: I don't know. The provided context does not mention the specific research interests of Sergio Paniego, but it does show the courses and topics he is teaching or has taught in the past. If you're looking for information on his research interests, I recommend searching for academic articles, presentations, or online profiles where he may have shared his research areas of focus.

Question: Does he have teaching experience?
Answer: Based on the context, it appears that the individual has teaching experience. The documents mention "teaching talks" and "teaching assistant", which suggests that they have experience in this area. Therefore, my answer is:

Yes, he has teaching experience.

Question: Does he know about Tensorflow?
Answer: I don't know. The provided context only shows documents related to teaching talks and course information, but it does not mention TensorFlow specifically. Therefore, I cannot determine whether the person knows 
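
The questions above all share one context, retrieved for the fixed query "artificial intelligence", which may be one reason some answers come back as "I don't know". A common variation is to retrieve context separately for each question. The sketch below shows that pattern with stub functions so it runs without a model or vector store. `answer`, `fake_retrieve` and `fake_generate` are hypothetical names, and the two-line corpus is made up for illustration.

```python
from typing import Callable

TEMPLATE = """
Answer the question based on the context below. If you can't
answer the question, answer with "I don't know".

Context: {context}

Question: {question}
"""


def answer(question: str,
           retrieve: Callable[[str], list[str]],
           generate: Callable[[str], str]) -> str:
    """Retrieve context for this specific question, then ask the model."""
    context = "\n\n".join(retrieve(question))
    return generate(TEMPLATE.format(context=context, question=question))


def fake_retrieve(query: str) -> list[str]:
    """Stub retriever: return corpus entries sharing at least one word with the query."""
    corpus = ["Object detection with TensorFlow, July 2022", "Robotics, 3rd year"]
    query_words = set(query.lower().replace("?", "").split())
    return [c for c in corpus if query_words & set(c.lower().replace(",", "").split())]


def fake_generate(prompt: str) -> str:
    """Stub model: echo the prompt so the retrieved context is visible."""
    return prompt


print(answer("Does he know about Tensorflow?", fake_retrieve, fake_generate))
```

With the objects from this notebook, the same function could be called with `retrieve=lambda q: [d.page_content for d in retriever.invoke(q)]` and `generate=model.invoke`. Joining `page_content` also keeps document metadata out of the prompt.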

# Loop to ask and answer questions continuously

In [None]:
while True:
    print("Say 'exit' or 'quit' to exit the loop")
    question = input('User question: ')
    print(f"Question: {question}")
    if question.lower() in ["exit", "quit"]:
        print("Exiting the conversation. Goodbye!")
        break
    formatted_prompt = prompt.format(context=retrieved_context, question=question)
    response_from_model = model.invoke(formatted_prompt)
    parsed_response = parser.parse(response_from_model)
    print(f"Answer: {parsed_response}")
    print()