# RAG with MongoDB Atlas and VertexAI Reasoning Engine using Langchain

**Vertex AI Reasoning Engine with Langchain** is a powerful duo for building and deploying generative AI applications. Reasoning Engine is a managed Vertex AI service that provides a secure, scalable runtime environment for your workload. Langchain provides the tools to design your application logic, while Reasoning Engine provides the environment to run it. Because Langchain can connect to external data sources, we can use it to connect MongoDB Atlas to Vertex AI Reasoning Engine.

At the core lies **MongoDB Atlas Vector Search**. It excels at searching unstructured data using vector embeddings, allowing you to find similar information even if phrased differently. This empowers your AI to grasp the true meaning behind user queries. Langchain then steps in, providing a user-friendly framework to design your application logic. Here, you can leverage Langchain's flexibility to seamlessly integrate MongoDB Atlas Vector Search, enabling your AI to retrieve highly relevant data based on semantic similarity. Finally, Vertex AI Reasoning Engine provides a secure and scalable environment to run your creations. This trio simplifies development, offering a pre-built foundation and tools to focus on building innovative solutions. With MongoDB Atlas Vector Search's semantic understanding, your generative AI applications can deliver superior results and user experiences.

In this notebook we cover *how to build a RAG application and deploy it as an endpoint using Reasoning Engine, MongoDB Atlas, and Vertex AI*.

First, we install all the required dependencies and restart the kernel.

In [49]:
!pip install --upgrade --quiet \
    "google-cloud-aiplatform[langchain,reasoningengine]" \
    cloudpickle==3.0.0 \
    pydantic==2.7.4 \
    requests \
    datasets \
    pymongo \
    langchain \
    langchain-mongodb \
    langchain-google-vertexai

import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

## Ingest data
To begin the setup, we load a dataset into MongoDB Atlas. For convenience, we use an existing Hugging Face dataset from MongoDB that already includes embeddings. Import the *MongoDB/subset_arxiv_papers_with_embeddings* dataset as `ds` and load it into MongoDB Atlas before continuing.
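
The ingest cell is not included in this notebook, so the following is only a minimal sketch of what it might look like. It assumes the dataset exposes a `train` split with a text column named `abstract` and a vector column named `embedding` (check the dataset card before relying on this). It writes into the `vertexaiApp` database and `chat-vec` collection, using the `line` and `vec` keys that the retrieval function later in this notebook reads. The helper names `to_mongo_docs` and `ingest` and the `limit` parameter are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def to_mongo_docs(
    rows: Iterable[Dict[str, Any]],
    text_field: str = "abstract",     # assumed dataset column holding the text
    vector_field: str = "embedding",  # assumed dataset column holding the vector
) -> List[Dict[str, Any]]:
    """Rename the text and vector columns to the keys the retriever expects."""
    docs = []
    for row in rows:
        # Keep every other column unchanged as document metadata.
        doc = {k: v for k, v in row.items() if k not in (text_field, vector_field)}
        doc["line"] = row[text_field]
        doc["vec"] = [float(x) for x in row[vector_field]]
        docs.append(doc)
    return docs


def ingest(uri: str, limit: int = 1000) -> int:
    """Load the Hugging Face dataset and insert up to `limit` rows into Atlas."""
    # Third-party imports are local so to_mongo_docs stays usable without them.
    import certifi
    from datasets import load_dataset
    from pymongo import MongoClient

    # Assumption: the dataset has a "train" split.
    ds = load_dataset("MongoDB/subset_arxiv_papers_with_embeddings", split="train")
    rows = ds.select(range(min(limit, len(ds))))

    client = MongoClient(uri, tlsCAFile=certifi.where())
    collection = client["vertexaiApp"]["chat-vec"]
    result = collection.insert_many(to_mongo_docs(rows))
    return len(result.inserted_ids)
```

Call `ingest(...)` with your Atlas SRV connection string to run the load. Vector search only returns meaningful results when the stored vectors and the query embeddings come from the same model, so if the dataset's embeddings were not produced by "textembedding-gecko@001", re-embed the text with that model before inserting.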




## Create vector search index on the newly created MongoDB collection


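The retrieval function in this notebook expects an Atlas Vector Search index named `vector_index` on the `chat-vec` collection of the `vertexaiApp` database. The index must cover the `vec` embedding field, with `numDimensions` matching the length of the stored vectors. Create it in the Atlas UI or programmatically before running any queries.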
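
One possible way to create the index from Python is sketched below. It assumes a pymongo version recent enough for `SearchIndexModel` to accept a `type` argument (4.7 or later), cosine similarity, and 768 dimensions, which is the output size of the textembedding-gecko models. Adjust `num_dimensions` if your stored vectors have a different length. The helper names are hypothetical.

```python
from typing import Any, Dict


def vector_index_definition(
    path: str = "vec",            # field that stores the embedding
    num_dimensions: int = 768,    # assumption: must equal the stored vector length
    similarity: str = "cosine",   # assumption: cosine similarity
) -> Dict[str, Any]:
    """Build an Atlas Vector Search index definition for one vector field."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


def create_vector_index(collection, name: str = "vector_index") -> str:
    """Create the index on a pymongo collection and return the index name."""
    # Local import so the definition helper works without pymongo installed.
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(
        definition=vector_index_definition(),
        name=name,
        type="vectorSearch",
    )
    return collection.create_search_index(model=model)
```

With a connected client, `create_vector_index(client["vertexaiApp"]["chat-vec"])` would request the index. Atlas builds search indexes asynchronously, so allow some time before the first query.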

### Initialize Vertex AI SDK

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/apis/enableflow?apiid=aiplatform.googleapis.com).

In [1]:
PROJECT_ID = "gcp-pov"  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}
STAGING_BUCKET = "gs://vshanbh01"  # @param {type:"string"}

import vertexai

vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=STAGING_BUCKET)

### Import reasoning engine library

In [2]:
from vertexai.preview import reasoning_engines

### 1. Define Model

As you construct your Reasoning Engine agent from the bottom up, the first component is the generative model the agent uses. We use "gemini-1.5-pro", the latest version at the time this notebook was created. This LLM powers the RAG application.

In [3]:
model = "gemini-1.5-pro-001"


### 2. Define function to read from MongoDB Atlas using Langchain

The second component of your agent is its tools and functions, which let the generative model interact with MongoDB Atlas. We use Langchain to query the vectors stored in MongoDB Atlas. The function takes a `query` string as input, which is transformed into an embedding using Google's "textembedding-gecko@001" model. The retriever fetches the documents in MongoDB that are most similar to the query, and the function returns an answer generated from them.

In [22]:
def get_vectors_from_mongodb(
    query: str
):
    """
    Retrieves vectors from a MongoDB database and uses them to answer a question.

    Args:
        query: The question to be answered.

    Returns:
        A dictionary containing the response to the question.
    """
    # Imports are kept inside the function so that they are resolved in the
    # environment where the function runs, including after deployment.
    from langchain.chains import ConversationalRetrievalChain
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
    from langchain.memory import ConversationBufferWindowMemory
    from langchain.prompts import PromptTemplate
    from pymongo import MongoClient
    import certifi


    prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Summarise the response in 2 sentences.


    {context}


    Question: {question}
    """
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Add your connection string in srv format below in place of URI
    client = MongoClient("URI", tlsCAFile=certifi.where())
    db = client["vertexaiApp"]

    embeddings = VertexAIEmbeddings(model_name="textembedding-gecko@001")

    vs = MongoDBAtlasVectorSearch(
        collection=db["chat-vec"],
        embedding=embeddings,
        index_name="vector_index",
        embedding_key="vec",
        text_key="line",
    )

    llm = ChatVertexAI(
        model_name="gemini-pro",
        convert_system_message_to_human=True,
        max_output_tokens=1000,
    )
    retriever = vs.as_retriever(
        search_type="mmr", search_kwargs={"k": 10, "lambda_mult": 0.25}
    )
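    # Note: the memory is created on every call, so chat history does not
    # persist between separate calls to this function.
    memory = ConversationBufferWindowMemory(
        memory_key="chat_history", k=5, return_messages=True
    )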
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )
    # invoke() is the current call style; calling the chain object directly is deprecated.
    response = conversation_chain.invoke({"question": query})

    return response

In [None]:
get_vectors_from_mongodb(query="tell me about 04 Examples of collaborations with GCP")
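
The retriever above uses `search_type="mmr"` with `lambda_mult=0.25`. As a rough illustration of what maximal marginal relevance does, the pure-Python sketch below repeatedly picks the candidate that best balances similarity to the query against similarity to the results already chosen. It is a simplified sketch of the idea, not the code that Langchain or Atlas actually runs, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int,
    lambda_mult: float,
) -> List[int]:
    """Return the indices of up to k candidates chosen by MMR."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Highest similarity to anything already picked (0.0 for the first pick).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A `lambda_mult` close to 1 favors pure relevance, while a low value such as 0.25 favors diversity, so near-duplicate passages are less likely to fill the context passed to the model.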

### 3. Define agent for calling your function

The third component of your agent involves adding a reasoning layer, which helps your agent use the tools that you provided to help the end user achieve a higher-level goal.


In [44]:
agent = reasoning_engines.LangchainAgent(
    model=model,
    tools=[get_vectors_from_mongodb],
)

In [None]:
agent.query(input="tell me about 04 Examples of collaborations with GCP")

### 4. Deploy the agent

Now that you've specified a model, tools, and reasoning for your agent and tested it out, you're ready to deploy your agent as a remote service in Vertex AI!

In [None]:
remote_agent = reasoning_engines.ReasoningEngine.create(
    agent,
    requirements=[
        "google-cloud-aiplatform[langchain,reasoningengine]",
        "cloudpickle==3.0.0",
        "pydantic==2.7.4",
        "langchain-mongodb",
        "pymongo",
        "langchain-google-vertexai",
    ],
)
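
Once deployment finishes, the remote agent can be queried the same way as the local one. The small wrapper below is a sketch: the helper name `ask` is hypothetical, and it assumes the response is a dictionary with an `output` key, falling back to the raw response otherwise.

```python
from typing import Any


def ask(agent: Any, question: str) -> str:
    """Send one question to a local or deployed agent and return the answer text."""
    response = agent.query(input=question)
    # Assumption: the response is a dict with an "output" key; if not,
    # return the string form of whatever came back.
    if isinstance(response, dict) and "output" in response:
        return str(response["output"])
    return str(response)
```

For example, `ask(remote_agent, "tell me about 04 Examples of collaborations with GCP")` would send the earlier test question to the deployed endpoint.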