<a target="_blank" href="https://colab.research.google.com/github/sergiopaniego/RAG_local_tutorial/blob/main/example.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# Simple RAG example with LangChain, Ollama and an open-source model

Sources:

* https://github.com/svpino/llm
* https://github.com/AIAnytime/Gemma-7B-RAG-using-Ollama/blob/main/Ollama%20Gemma.ipynb
* https://www.youtube.com/watch?v=-MexTC18h20&ab_channel=AIAnytime
* https://www.youtube.com/watch?v=HRvyei7vFSM&ab_channel=Underfitted

 
# Requirements

* Ollama installed locally

# Install the requirements

If an error is raised related to docarray, refer to this solution: https://stackoverflow.com/questions/76880224/error-using-using-docarrayinmemorysearch-in-langchain-could-not-import-docarray

In [None]:
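!pip3 install langchain
# Provides the langchain_community imports used in the cells below
!pip3 install langchain_community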
!pip3 install langchain_pinecone
!pip3 install langchain[docarray]
!pip3 install docarray
!pip3 install pypdf

# Select the LLM model to use

The model must be downloaded locally before it can be used. For example, to run llama2, pull it first:

```
ollama pull llama2
```

Check the list of models available for Ollama here: https://ollama.com/library

In [23]:
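# Keep exactly one MODEL line uncommented; the last assignment wins.
#MODEL = "gpt-3.5-turbo"  # OpenAI model, not served by Ollama
#MODEL = "mixtral:8x7b"
#MODEL = "gemma:7b"
#MODEL = "llama2"
MODEL = "llama3"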

# Instantiate the LLM and the embedding model

In [24]:
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings

model = Ollama(model=MODEL)
embeddings = OllamaEmbeddings(model=MODEL)

model.invoke("Tell me a joke")

In [None]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = model | parser 
chain.invoke("Tell me a joke")

'What did the ocean say to the beach?\n\nNothing, it just waved.'
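
The `|` operator used above chains steps so that each step's output becomes the next step's input. As a rough mental model only (this is not LangChain's actual implementation, and `Step`, `fake_model` and `strip_parser` are hypothetical names), the composition can be sketched in plain Python:

```python
class Step:
    """Wraps a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Build a new step that runs self first, then feeds the result to other.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical stand-ins for the model and the output parser.
fake_model = Step(lambda prompt: f"  echo: {prompt}  ")
strip_parser = Step(lambda text: text.strip())

toy_chain = fake_model | strip_parser
result = toy_chain.invoke("Tell me a joke")
print(result)
```

Calling `invoke` on the composed object runs the steps left to right, which is the same reading order as `model | parser`.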

# Create the prompt template for the instruction-tuned LLM

In [7]:
from langchain.prompts import PromptTemplate

template = """
Answer the question based on the context below. If you can't 
answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""

prompt = PromptTemplate.from_template(template)
prompt.format(context="Here is some context", question="Here is a question")

'\nAnswer the question based on the context below. If you can\'t \nanswer the question, reply "I don\'t know".\n\nContext: Here is some context\n\nQuestion: Here is a question\n'
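
The template relies on named placeholders (`{context}` and `{question}`) that are filled in at call time. A minimal sketch of the same idea using only Python's built-in string formatting, assuming nothing about how `PromptTemplate` works internally (`toy_template`, `fields` and `filled` are illustrative names):

```python
import string

toy_template = """
Answer the question based on the context below. If you can't
answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""

# Discover which placeholder names the template expects.
fields = [name for _, name, _, _ in string.Formatter().parse(toy_template) if name]

# Fill every placeholder; a missing key would raise KeyError.
filled = toy_template.format(
    context="Here is some context",
    question="Here is a question",
)

print(fields)
print(filled)
```

Listing the placeholders first is a quick way to check that the dictionary passed to the chain carries every key the prompt needs.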

In [9]:
chain = prompt | model | parser

chain.invoke({"context": "My parents named me Sergio", "question": "What's your name?"})

'Sergio'

# Load an example PDF to do Retrieval Augmented Generation (RAG)

You can use any PDF of your own for this example; point the loader at its path.

In [10]:
# Example pdf downloaded from https://www.ml.school/ 
from langchain_community.document_loaders import PyPDFLoader


loader = PyPDFLoader("machine_learning.pdf")
pages = loader.load_and_split()
pages

[Document(page_content='11/4/24, 19:02 Building Machine Learning Systems That Don\'t Suck\nhttps://www.ml.school 1/10Building Machine Learning Systems That Don\'t\nSuck\nA live, interactive program that\'ll help you build production-readymachine\nlearning systems from the ground up.\nNext cohort:\xa0May6 - 23, 2024\nCheck the schedulefor more details about upcoming cohorts.\nI want to join!Sign in\nLearn how to design, build, deploy, and scale machine learning\nsystems to solve real-world problems.\nI\'ll lose my mind if I see another book or course teaching people the same basic\nideas for the hundredth time. Most people are stuck in beginner mode, and finding\nhelp to solve real-world problems is hard.\nI want to change that.\nI started writing software 30 years ago. I\'ve written pipelines and trained models\nfor some of the largest companies in the world. I want to show you how to do the\nsame.\nThis is the class I wish I had taken when I started."This is the best machine learning 
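
`load_and_split()` returns the PDF as a list of `Document` chunks rather than one long string. To illustrate what chunking with overlap means, here is a simplified fixed-size splitter. It is a sketch under stated assumptions, not the splitter LangChain uses: the function name, `chunk_size` and `overlap` values are hypothetical, and the real splitting behavior depends on the loader and the installed version.

```python
def split_with_overlap(text, chunk_size=40, overlap=10):
    """Cut text into fixed-size character windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "Learn how to design, build, deploy, and scale machine learning systems."
chunks = split_with_overlap(sample)

for chunk in chunks:
    print(repr(chunk))
```

The overlap keeps a sentence that straddles a boundary partly visible in both neighboring chunks, which tends to help retrieval later on.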

# Store the PDF in a vector store

From Langchain docs:

`DocArrayInMemorySearch is a document index provided by Docarray that stores documents in memory. It is a great starting point for small datasets, where you may not want to launch a database server.`

The execution time of the following block depends on the length and complexity of the PDF provided, because every chunk has to be embedded. Keep the PDF small and simple for this example.

In [11]:
from langchain_community.vectorstores import DocArrayInMemorySearch

vectorstore = DocArrayInMemorySearch.from_documents(pages, embedding=embeddings)



# Create a retriever that returns the most similar chunks to use as context

In [12]:
retriever = vectorstore.as_retriever()
retriever.invoke("machine learning")

[Document(page_content="11/4/24, 19:02 Building Machine Learning Systems That Don't Suck\nhttps://www.ml.school 7/10Program Syllabus\nThis program will teach you the practical skills and insights that will\nhelp you build machine learning systems.\nHere are the contents of the six live sessions of the program:\nSession 1 - How To Start (Almost) Any Project\nWhat makes production machine learning different from what you've learned.\nThe strategy to solve the right problem using the right solution.\nCritical questions to ask before starting any project.\nProblem framing, inversion, and the haystack principle for building successful\napplications.\nThe first rule of machine learning engineering and how to start building.\nData collection strategies. A technique to determine how much data you\nneed.\nThe problem of selection bias and how to deal with it.\nLabeling data. Human annotations, natural labels and weak supervision.\nActive learning using the uncertainty and diversity sampling str
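
Conceptually, an in-memory vector store embeds each chunk once, keeps the vectors in a list, and answers a query by ranking the stored vectors by similarity to the query vector. The sketch below shows that idea with a toy bag-of-words "embedding" and cosine similarity. Everything in it (`toy_embed`, `cosine`, `ToyInMemoryStore`, the sample texts) is hypothetical and stands in for the real embedding model and for `DocArrayInMemorySearch`, whose internals are not shown in this notebook.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word counts as a sparse vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyInMemoryStore:
    """Keeps (text, vector) pairs in a list and ranks them per query."""

    def __init__(self, texts):
        self.entries = [(text, toy_embed(text)) for text in texts]

    def search(self, query, k=2):
        query_vector = toy_embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:k]]


store = ToyInMemoryStore([
    "live sessions run twice a week",
    "the program uses python for all assignments",
    "machine learning systems in production",
])
top = store.search("machine learning", k=1)
print(top)
```

A real embedding model captures meaning rather than exact word overlap, so it can match a query to a chunk that shares no words with it; the ranking step is the same.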

# Build the chain that answers questions using the retrieved documents

In [19]:
from operator import itemgetter

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | parser
)
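
The dictionary at the head of the chain receives an input such as `{"question": "..."}` and produces both variables the prompt expects: `context` comes from passing the question through the retriever, while `question` is forwarded unchanged. A plain-Python sketch of that fan-out, assuming a hypothetical `fake_retriever` that returns canned text instead of searching the vector store:

```python
from operator import itemgetter


def fake_retriever(query):
    """Hypothetical retriever: returns canned text instead of searching a store."""
    return [f"chunk related to: {query}"]


def fan_out(inputs):
    """Build the prompt variables from a single input dictionary."""
    get_question = itemgetter("question")
    return {
        "context": fake_retriever(get_question(inputs)),
        "question": get_question(inputs),
    }


prompt_vars = fan_out({"question": "How much does the program cost?"})
print(prompt_vars)
```

The resulting dictionary has exactly the keys named in the template, so it can be handed to the prompt step as is.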

In [20]:
questions = [
    "What is the purpose of the course?",
    "How many hours of live sessions?",
    "How many coding assignments are there in the program?",
    "Is there a program certificate upon completion?",
    "What programming language will be used in the program?",
    "How much does the program cost?",
]

for question in questions:
    print(f"Question: {question}")
    print(f"Answer: {chain.invoke({'question': question})}")
    print()

Question: What is the purpose of the course?
Answer: The purpose of the course is to teach machine learning engineers how to build, evaluate, run, monitor, and maintain machine learning systems in real-world scenarios.

Question: How many hours of live sessions?
Answer: The live sessions are held every Monday and Thursday for 2 hours each.

Question: How many coding assignments are there in the program?
Answer: The provided text does not contain any information regarding the number of coding assignments in the program, so I cannot answer this question from the given context.

Question: Is there a program certificate upon completion?
Answer: The provided text does not contain any information regarding a program certificate upon completion, so I cannot answer this question from the given context.

Question: What programming language will be used in the program?
Answer: Python

Question: How much does the program cost?
Answer: The provided text does not contain information regarding the c

# Loop to ask and answer questions continuously

In [None]:
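while True:
    question = input('User question: ')
    # Stop on empty input so the cell can end without a kernel interrupt.
    if not question.strip():
        break
    print(f"Question: {question}")
    print(f"Answer: {chain.invoke({'question': question})}")
    print()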