<a href="https://colab.research.google.com/github/zhiq/llm/blob/main/Local_RAG_with_emails.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook illustrates how to build a **local** RAG (retrieval-augmented generation) pipeline over your own emails.
Download it as a Jupyter notebook and run it **locally**.

For privacy reasons, email examples are not provided. Feel free to use your own :)
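
If you just want to smoke-test the pipeline before pointing it at a real mailbox, one option is to generate a synthetic `.eml` file with Python's standard `email` package. The sketch below is only an illustration: the helper name `write_sample_email`, the addresses, and the message text are made up, and the `emails` directory is assumed to match the one read later in the notebook.

```python
from email.message import EmailMessage
from pathlib import Path


def write_sample_email(directory="emails", name="sample.eml"):
    """Write one synthetic plain-text email to <directory>/<name> and return its path."""
    msg = EmailMessage()
    # All header values below are placeholders, not real data.
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Weekly model roundup"
    msg["Date"] = "Mon, 01 Jan 2024 09:00:00 +0000"
    msg.set_content(
        "Hi Bob,\n\nThis is placeholder text for testing the pipeline.\n\nAlice\n"
    )

    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / name
    path.write_bytes(msg.as_bytes())
    return path


if __name__ == "__main__":
    print(write_sample_email())
```

A single synthetic message will not produce meaningful answers, but it should be enough to check that every step runs end to end.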


Built with:
- [Unstructured.io](https://unstructured.io/)
- [LangChain](https://www.langchain.com/)
- [Ollama](https://ollama.com/)

In [None]:
# Setup step 1: download ollama, and pull the models
# !ollama pull llama3
# !ollama pull nomic-embed-text
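
Before moving on, it can help to confirm that the Ollama server is running and that both models were pulled. The sketch below assumes Ollama's default local address (`http://localhost:11434`) and its `/api/tags` endpoint, which lists locally available models. The helper names `list_local_models` and `missing_models` are hypothetical.

```python
import json
import urllib.error
import urllib.request

REQUIRED_MODELS = ["llama3", "nomic-embed-text"]


def missing_models(available, required=REQUIRED_MODELS):
    """Return the required models absent from `available`, ignoring tags such as ':latest'."""
    base_names = {name.split(":", 1)[0] for name in available}
    return [model for model in required if model not in base_names]


def list_local_models(base_url="http://localhost:11434", timeout=3):
    """Return the model names reported by the Ollama server, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
    return [m["name"] for m in payload.get("models", [])]


if __name__ == "__main__":
    available = list_local_models()
    if available is None:
        print("Could not reach Ollama; is the server running?")
    else:
        print("Missing models:", missing_models(available))
```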

In [None]:
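# Setup step 2: install libraries
# (langchain-community provides the Ollama and FAISS integrations imported below;
#  the quotes keep the shell from interpreting the square brackets)
# !pip install -q langchain langchain-community "unstructured[all-docs]" faiss-cpu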

In [None]:
# Step 2: Preprocess emails with partition_email from Unstructured, turning them into elements.

import os
from unstructured.partition.email import partition_email

def preprocess_emails(directory):
    """Walk `directory` recursively and partition every .eml file into elements."""
    elements = []
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".eml"):
                elems = partition_email(filename=os.path.join(root, file))
                elements.extend(elems)
    return elements

email_elements = preprocess_emails("emails")

In [None]:
# Step 3: Chunk the email elements and prepare them for LangChain.
# `chunk_by_title` starts a new chunk at each title element, so chunks follow
# the documents' logical structure, which tends to give better RAG results.

from unstructured.chunking.title import chunk_by_title
from langchain_core.documents import Document

chunked_elements = chunk_by_title(email_elements)

documents = []
for element in chunked_elements:
    metadata = element.metadata.to_dict()
    documents.append(Document(page_content=element.text,
                              metadata=metadata))
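
To build some intuition for title-aware chunking, here is a toy, dependency-free sketch of the idea. It is not Unstructured's implementation: the function name `chunk_by_heading`, the `(kind, text)` tuples, and the `max_chars` parameter are all assumptions made for illustration. It starts a new chunk at every title and also when adding more text would exceed the size limit.

```python
def chunk_by_heading(elements, max_chars=500):
    """Group (kind, text) pairs into chunks.

    A new chunk starts at every "Title" element, or when appending the next
    text would push the current chunk past `max_chars`.
    """
    chunks, current = [], []

    def flush():
        # Close the chunk in progress, if there is one.
        if current:
            chunks.append("\n\n".join(current))
            current.clear()

    for kind, text in elements:
        # Length of the chunk so far, counting a "\n\n" separator per piece.
        current_len = sum(len(t) for t in current) + 2 * len(current)
        if kind == "Title" or (current and current_len + len(text) > max_chars):
            flush()
        current.append(text)

    flush()
    return chunks


if __name__ == "__main__":
    sample = [
        ("Title", "Release notes"),
        ("NarrativeText", "First paragraph."),
        ("Title", "Pricing"),
        ("NarrativeText", "Second paragraph."),
    ]
    for chunk in chunk_by_heading(sample):
        print(repr(chunk))
```

The point of the sketch is that text under one heading stays together, so a retrieved chunk is less likely to mix unrelated topics.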

In [None]:
# Step 4: Create the vector store with embeddings and prepare the retriever.

from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import OllamaEmbeddings

embeddings = OllamaEmbeddings(model="nomic-embed-text", show_progress=True)
db = FAISS.from_documents(documents, embeddings)
# Return the 4 most similar chunks for each query.
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 4})

OllamaEmbeddings: 100%|██████████| 3116/3116 [00:52<00:00, 59.47it/s]
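
Conceptually, a similarity retriever with `k=4` embeds the query and returns the four stored chunks whose vectors lie closest to it. The sketch below illustrates that idea in plain Python with Euclidean distance. It is an assumption-laden toy (the `top_k` helper and the 2-D vectors are made up), not how FAISS is implemented internally.

```python
import math


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    distances.sort()  # smallest distance first
    return [i for _, i in distances[:k]]


if __name__ == "__main__":
    # Tiny 2-D stand-ins for real embedding vectors.
    docs = [[1.0, 1.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.2]]
    print(top_k([0.0, 0.0], docs, k=2))
```

A real index avoids this brute-force scan where it can, but the ranking it approximates is the same.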


In [None]:
# Step 5: Set up the local model:

from langchain_community.chat_models import ChatOllama

local_model = "llama3"
llm = ChatOllama(model=local_model, num_predict=400,
                 stop=["<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>"])

In [None]:
# Step 6: Set up the RAG chain:

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

prompt_template = """
<|start_header_id|>user<|end_header_id|>
Answer the user's question using provided context. Stick to the facts, do not draw your own conclusions.
Question: {question}
Context: {context}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""

prompt = PromptTemplate(
    input_variables=["context", "question"],
    template=prompt_template,
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
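
To see what the first two stages of the chain hand to the model, it may help to reproduce them with plain strings. The sketch below mirrors `format_docs` and the template's `{question}` / `{context}` slots. `FakeDoc`, `TEMPLATE`, and `build_prompt` are hypothetical stand-ins, and the Llama 3 header tokens are left out for readability.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Minimal stand-in for a retrieved document."""
    page_content: str


TEMPLATE = "Question: {question}\nContext: {context}"


def format_docs(docs):
    # Same joining rule as in the chain: blank line between chunks.
    return "\n\n".join(doc.page_content for doc in docs)


def build_prompt(question, docs):
    """Fill the template the way the chain does: the question passes through
    unchanged, and the retrieved docs are flattened into one context string."""
    return TEMPLATE.format(question=question, context=format_docs(docs))


if __name__ == "__main__":
    retrieved = [FakeDoc("First chunk."), FakeDoc("Second chunk.")]
    print(build_prompt("What is in my inbox?", retrieved))
```

In the real chain the retriever supplies the documents, and the filled prompt is then passed to `llm` and `StrOutputParser()`.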

In [None]:
question = "What can you tell me about Cohere's Command+ models?"

In [None]:
rag_chain.invoke(question)

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 24.45it/s]


"According to the provided context, Cohere's Command+ models are available with open weight access, meaning you can use them without restrictions or costs for non-commercial purposes. However, these models are not commercially applicable, and their weights are available for non-commercial use only. The training data used to develop these models is not shared."