# LangChain MongoDB Integration - Implement RAG Locally

This notebook is a companion to the [LangChain Local RAG](https://www.mongodb.com/docs/atlas/ai-integrations/langchain/get-started/) tutorial. Refer to that page for setup instructions and detailed explanations.

<a target="_blank" href="https://colab.research.google.com/github/mongodb/docs-notebooks/blob/main/ai-integrations/langchain-local-rag.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Create a local Atlas deployment

Run the following command in your terminal to set up your local Atlas deployment. 

```
atlas deployments setup
```

## Set up the environment

In [None]:
pip install --quiet --upgrade pymongo langchain langchain-community langchain-mongodb langchain-huggingface gpt4all pypdf

In [None]:
# Replace <port-number> with the port of your local Atlas deployment
MONGODB_URI = "mongodb://localhost:<port-number>/?directConnection=true"
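
The `<port-number>` placeholder must be replaced before the connection string is usable. As a minimal sketch, the hypothetical helper below (`build_local_uri` is not part of PyMongo or LangChain) fills in a port and rejects values outside the valid TCP range. The example port `27017` is an assumption; use the port that your local deployment reports.

```python
def build_local_uri(port: int) -> str:
    """Return a connection string for a local deployment on the given port."""
    # Hypothetical helper for illustration; valid TCP ports are 1 through 65535
    if not isinstance(port, int) or not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port!r}")
    return f"mongodb://localhost:{port}/?directConnection=true"

# Example: 27017 is an assumed port, not one taken from the tutorial
MONGODB_URI = build_local_uri(27017)
print(MONGODB_URI)
```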

## Configure the vector store

In [None]:
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_huggingface import HuggingFaceEmbeddings

# Load the embedding model (https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1)
embedding_model = HuggingFaceEmbeddings(model_name="mixedbread-ai/mxbai-embed-large-v1")

# Instantiate vector store
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
   connection_string=MONGODB_URI,
   namespace="langchain_db.local_rag",
   embedding=embedding_model,
   index_name="vector_index"
)

In [None]:
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the PDF
loader = PyPDFLoader("https://investors.mongodb.com/node/13176/pdf")
data = loader.load()

# Split PDF into documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(data)

# Add data to the vector store
vector_store.add_documents(docs)
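
To build intuition for `chunk_size=200` and `chunk_overlap=20`, here is a simplified sketch of overlapping chunking. It uses a plain fixed-width sliding window, which is not the algorithm `RecursiveCharacterTextSplitter` uses (that splitter prefers to break on separators such as paragraphs and spaces). The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 200, chunk_overlap: int = 20) -> list[str]:
    """Split text into fixed-width chunks that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks

# Example with 500 characters of synthetic text
sample = "".join(str(i % 10) for i in range(500))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])
```

Each chunk repeats the last 20 characters of the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk.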

In [None]:
vector_store.create_vector_search_index(
   dimensions=1024,        # The dimensions of the vector embeddings to be indexed
   wait_until_complete=60  # Number of seconds to wait for the index to build (can take around a minute)
)
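
Conceptually, a vector search index lets the database rank stored embeddings by their similarity to a query embedding. The sketch below shows that ranking as a brute-force search in plain Python. It assumes cosine similarity as the metric and uses tiny 2-dimensional vectors instead of the 1024-dimensional embeddings above; the helper names are hypothetical, and a real index uses approximate search rather than scanning every vector.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query: list[float], vectors: list[list[float]], k: int = 4) -> list[int]:
    """Return the indexes of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]

# Toy data: three stored vectors and one query vector
stored = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.1], stored, k=2))
```

The dimension check mirrors why `dimensions` must match the embedding model: vectors of different lengths cannot be compared.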

## Implement RAG with a local LLM

Before running the following code, [download the local model](https://gpt4all.io/models/gguf/mistral-7b-openorca.gguf2.Q4_0.gguf) and note the path to the downloaded file.

In [None]:
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from langchain_community.llms import GPT4All

# Configure the LLM
local_path = "<path-to-model>"

# Callbacks support token-wise streaming
callbacks = [StreamingStdOutCallbackHandler()]

# verbose=True is required so that output is passed to the callback manager
llm = GPT4All(model=local_path, callbacks=callbacks, verbose=True)

In [None]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
import pprint

# Instantiate MongoDB Vector Search as a retriever
retriever = vector_store.as_retriever()

# Define prompt template
template = """
Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
"""
custom_rag_prompt = PromptTemplate.from_template(template)

def format_docs(docs):
   return "\n\n".join(doc.page_content for doc in docs)

# Create chain
rag_chain = (
   {"context": retriever | format_docs, "question": RunnablePassthrough()}
   | custom_rag_prompt
   | llm
   | StrOutputParser()
)

# Prompt the chain (the answer streams to stdout through the callback handler)
question = "What was MongoDB's latest acquisition?"
answer = rag_chain.invoke(question)

# Return source documents
documents = retriever.invoke(question)
print("\nSource documents:")
pprint.pprint(documents)
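
To see what the chain hands to the LLM, the sketch below reproduces the prompt assembly step with the standard library only. `FakeDocument` is a hypothetical stand-in for the document objects that the retriever returns, and `str.format` stands in for `PromptTemplate`; this is an illustration of the data flow, not how LangChain implements it. The sample texts are made up.

```python
from dataclasses import dataclass

@dataclass
class FakeDocument:
    """Hypothetical stand-in exposing only the page_content attribute."""
    page_content: str

template = """
Use the following pieces of context to answer the question at the end.
{context}
Question: {question}
"""

def format_docs(docs):
    # Join the retrieved chunks with blank lines, as in the chain above
    return "\n\n".join(doc.page_content for doc in docs)

def build_prompt(docs, question):
    # Fill the {context} and {question} slots of the template
    return template.format(context=format_docs(docs), question=question)

# Made-up chunks, used only to show the shape of the final prompt
sample_docs = [FakeDocument("First chunk."), FakeDocument("Second chunk.")]
prompt = build_prompt(sample_docs, "What do the chunks say?")
print(prompt)
```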