<a href="https://colab.research.google.com/github/elektromusik/RAG/blob/main/RAG_Basic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG with LangChain and Mistral.

## Load Packages.

In [14]:
!pip install --quiet faiss-cpu langchain langchain_community langchain_mistralai
!pip install --quiet langchain-text-splitters sentence_transformers

# The imports use the current package locations; the older langchain.* paths
# are deprecated.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_mistralai.chat_models import ChatMistralAI
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

## Data Preprocessing.

In [15]:
# Import the Text.
with open("The_Great_Gatsby.txt", encoding="utf-8") as f:
    text = f.read()

# Chunk the Text.
# [1 page ~ 700 words. 1 chunk <= 256 words, because the embedding model
# truncates longer input (its limit is 256 word pieces). 1 word ~ 4.7
# characters, so 1 chunk <= 1000 characters stays within that limit.
# Altogether, we end up with at least 3 chunks per page at a chunk_size
# of 1000.]
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=100,
    separators=["\n\n", "\n", ".", ",", " ", ""],
)

chunks = text_splitter.split_text(text)

# Choose the Embedding Model.
# [I tried to find the best embedding model via the MTEB leaderboard at
# huggingface.co:
# 1) nvidia/NV-Embed-v2 (not found on NVIDIA website),
# 2) BAAI/bge-en-icl (runs forever),
# ...
# 10) nvidia/NV-Embed-v1 (needed packages incompatible).
# Hence, I ended up with the following standard model.]
embedding_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

# Add the Chunks to the Vector Database.
vectorstore = FAISS.from_texts(texts=chunks, embedding=embedding_model)
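
The chunk-size reasoning in the comments above can be checked with a little arithmetic. The sketch below is only an estimate: it reuses the figures from those comments (700 words per page, 4.7 characters per word, a budget of 256 words per chunk) and additionally assumes one space after every word. The function names are illustrative and not part of LangChain.

```python
import math

# Figures taken from the comments in the cell above.
WORDS_PER_PAGE = 700
CHARS_PER_WORD = 4.7
MAX_WORDS_PER_CHUNK = 256

# Assumption (not from the notebook): each word is followed by one space.
CHARS_PER_WORD_WITH_SPACE = CHARS_PER_WORD + 1


def words_in_chunk(chunk_size: int) -> float:
    """Estimate how many words fit into a chunk of `chunk_size` characters."""
    return chunk_size / CHARS_PER_WORD_WITH_SPACE


def chunks_per_page(chunk_size: int, chunk_overlap: int) -> int:
    """Estimate the number of chunks needed to cover one page.

    Each chunk after the first advances by (chunk_size - chunk_overlap)
    characters, because the overlap repeats text from the previous chunk.
    """
    page_chars = WORDS_PER_PAGE * CHARS_PER_WORD_WITH_SPACE
    if page_chars <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((page_chars - chunk_size) / step)


est_words = words_in_chunk(1000)
est_chunks = chunks_per_page(1000, 100)
print(f"~{est_words:.0f} words per chunk, ~{est_chunks} chunks per page")
```

Under these assumptions a 1000-character chunk stays below the 256-word budget. The real splitter breaks at separators such as paragraph ends, so actual chunks are often shorter than `chunk_size` and the true count per page can be higher than this estimate.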



## Main Components of RAG.

In [12]:
# Retriever.
# [Default: plain similarity search. Alternative with a score threshold:
# search_type="similarity_score_threshold",
# search_kwargs={"score_threshold": 0.05}]
retriever = vectorstore.as_retriever()

# Prompt Template.
template = """
You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Question: {question}
Context:  {context}
Answer:
"""
prompt = ChatPromptTemplate.from_template(template)

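# Set LLM.
# [The API key is read from the MISTRAL_API_KEY environment variable, so it
# is never stored in the notebook.]
import os
llm = ChatMistralAI(mistral_api_key=os.environ["MISTRAL_API_KEY"])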

# Create Pipeline (Retrieve-Augment-Generate).
RAG_chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
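
To make the three stages of the chain concrete, here is a toy version in plain Python. It is a sketch under loud assumptions, not how LangChain or FAISS work internally: the bag-of-words `embed` function stands in for the sentence-transformer model, `retrieve` does a brute-force cosine search with a score threshold (the option mentioned in the retriever comment), and `fake_llm` is a placeholder for the Mistral call. All function names, the stopword list, and the sample chunks are made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list so that filler words do not dominate.
STOPWORDS = {"a", "at", "for", "he", "the", "was", "were", "what", "with"}


def embed(text: str) -> Counter:
    """Toy embedding: counts of lower-cased non-stopword tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 4,
             score_threshold: float = 0.05) -> list[str]:
    """Return up to k chunks whose similarity to the query meets the threshold."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    kept = [(s, c) for s, c in scored if s >= score_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in kept[:k]]


def augment(question: str, context: list[str]) -> str:
    """Fill a prompt template with the question and the retrieved chunks."""
    return (
        "Use the following context to answer the question.\n"
        f"Question: {question}\n"
        "Context: " + "\n---\n".join(context) + "\n"
        "Answer:"
    )


def fake_llm(prompt: str) -> str:
    """Placeholder for the model call; it only reports the prompt size."""
    return f"[model would answer a prompt of {len(prompt)} characters]"


# Made-up sample chunks, not quotations from the novel.
sample_chunks = [
    "The buffet tables were loaded with food for the party guests.",
    "He drove the car across the bridge toward the city.",
    "The orchestra played at the party until late at night.",
]

question = "What food was served at the party?"
context = retrieve(question, sample_chunks)   # Retrieve
prompt_text = augment(question, context)      # Augment
answer = fake_llm(prompt_text)                # Generate
print(context)
print(answer)
```

In this toy run the chunk about the car shares no content words with the question, so its score is zero and the threshold drops it. The chunks about the buffet and the orchestra are passed on as context, with the buffet chunk ranked first.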

## Q&A

In [13]:
# Generate.
query = "What foods were served at the parties?"
RAG_chain.invoke(query)

"The documents mention various foods at Gatsby's parties, including oranges and lemons, spiced baked hams, salads, pastry pigs and turkeys, and hors-d'oeuvre. There's also a bar with gins, liquors, and cordials. The first supper served at a party had married couples and Jordan's escort. The specific food consumed at the dinner is not detailed."