<img src="https://cdn-assets-cloud.frontify.com/local/frontify/eyJwYXRoIjoiXC9wdWJsaWNcL3VwbG9hZFwvc2NyZWVuc1wvMTk3OTA0XC80M2ZmNTdhYjc4OTdlZjUzY2IzMWUwNGU0MTVjZTY2NC0xNTYyMTAzMDk0LnBuZyJ9:frontify:7CTV2DtJsWvlctEUEyFK36JoXsZuVtHssMaDED6O5z0" width='150' />

# VECTOR SEARCH - RETRIEVAL AUGMENTED GENERATION

__How to use this notebook__

1. Run the code cell below and paste the following into the password widgets:
    1. Your Atlas cluster URI (with read/write permissions)
    1. Your OpenAI API key (used to create embeddings and pose questions to an LLM)
1. If desired, use a custom PDF URL and ask custom questions at the end
1. Present the notebook via the "Enter/Exit RISE Slideshow" toolbar button (looks like a bar chart)
    1. Put your browser into full-screen mode for best results
    1. To advance a cell without executing, use "Space"
    1. To execute the current cell, use "Shift-Enter"

In [11]:
import ipywidgets as widgets
import os

mongodb_uri_widget = widgets.Password(
    description='Your Atlas URI:',
    disabled=False,
    style=dict(description_width='125px')
)

openai_api_key_widget = widgets.Password(
    description='Your OpenAI API key:',
    disabled=False,
    style=dict(description_width='125px')
)

display(mongodb_uri_widget)
display(openai_api_key_widget)

Password(description='Your Atlas URI:', style=DescriptionStyle(description_width='125px'))

Password(description='Your OpenAI API key:', style=DescriptionStyle(description_width='125px'))

# Retrieval Augmented Generation
### Using MongoDB Atlas, OpenAI and LangChain

In [12]:
from IPython.display import IFrame

PDF_URI = "https://s3.amazonaws.com/info-mongodb-com/MongoDB_Architecture_Guide.pdf"
IFrame(PDF_URI, width=1280, height=500)

# Get connection to MongoDB Atlas

In [13]:
from pymongo import MongoClient
import os

mongo_db_name = 'rag_demo'
mongo_coll_name = 'content'

mongo_client = MongoClient(mongodb_uri_widget.value)
mongo_coll = mongo_client[mongo_db_name][mongo_coll_name]
mongo_db_and_coll_path = '{}.{}'.format(mongo_db_name, mongo_coll_name)

doc_count = mongo_coll.count_documents({})
'{} document count is {:,}'.format(mongo_db_and_coll_path, doc_count)

'rag_demo.content document count is 18'

In [14]:
# Delete existing documents -- run before demo
mongo_coll.delete_many({})

<pymongo.results.DeleteResult at 0x7fe16b291e80>

# Select embeddings/transformer model

In [15]:
from langchain.embeddings import OpenAIEmbeddings

embeddings_model = OpenAIEmbeddings(
    model='text-embedding-ada-002',
    openai_api_key=openai_api_key_widget.value
)

print('Embedding Model - OpenAI')

Embedding Model - OpenAI


# Split PDF into chunks

In [16]:
from langchain.document_loaders import PyPDFLoader

loader = PyPDFLoader(PDF_URI)
chunked_docs = loader.load_and_split()

'PDF has resulted in {:,} chunks'.format(len(chunked_docs))

'PDF has resulted in 18 chunks'

In [17]:
biggest_chunk_length = max(len(chunk.page_content.split()) for chunk in chunked_docs)
'The biggest chunk contains {:,} words'.format(biggest_chunk_length)

'The biggest chunk contains 415 words'
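
`load_and_split()` hides the splitting logic behind a single call. As a rough illustration only (this is not LangChain's actual splitter, and `chunk_words`, `max_words`, and `overlap` are hypothetical names), a word-based splitter with overlapping windows could look like this:

```python
def chunk_words(text, max_words=400, overlap=50):
    """Split text into windows of at most max_words words; neighbours share `overlap` words."""
    if not 0 <= overlap < max_words:
        raise ValueError('overlap must be in [0, max_words)')
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(' '.join(words[start:start + max_words]))
        # Stop once the current window has reached the end of the text
        if start + max_words >= len(words):
            break
    return chunks

# Made-up sample text of 1,000 distinct "words"
sample = ' '.join('w{}'.format(i) for i in range(1000))
chunks = chunk_words(sample, max_words=400, overlap=50)
print('{} chunks, biggest has {} words'.format(
    len(chunks), max(len(c.split()) for c in chunks)))
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some words twice.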

# Create vectors and add to MongoDB Atlas

In [18]:
from langchain.vectorstores import MongoDBAtlasVectorSearch

vector_db = MongoDBAtlasVectorSearch.from_documents(
    chunked_docs,
    embeddings_model,
    collection=mongo_coll
)

In [19]:
doc_count = mongo_coll.count_documents({})
'MongoDB document count in {} is {:,}'.format(mongo_db_and_coll_path, doc_count)

'MongoDB document count in rag_demo.content is 18'

# Create MongoDB Atlas vector search index

In [19]:
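from pymongo.errors import OperationFailure

# Vector index over the 1,536-dimension ada-002 embeddings stored in the "embedding" field
mongo_index_def = {
    'name': 'rag_demo_index',
    'type': 'vectorSearch',  # assumes PyMongo 4.7+, which accepts an index type here
    'definition': {
      "fields": [
        {
          "numDimensions": 1536,
          "path": "embedding",
          "similarity": "cosine",
          "type": "vector"
        }
      ]
    }
}

try:
    mongo_coll.create_search_index(mongo_index_def)
    print('Search index is building')
except OperationFailure as e:
    # e.details can be None, so fall back to the exception message
    print((e.details or {}).get('codeName', e))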

ServerSelectionTimeoutError: cluster0-shard-00-00.pnmkh.mongodb.net:27017: timed out,cluster0-shard-00-02.pnmkh.mongodb.net:27017: [Errno 60] Operation timed out,cluster0-shard-00-01.pnmkh.mongodb.net:27017: [Errno 60] Operation timed out, Timeout: 30s, Topology Description: <TopologyDescription id: 65e7d6686a27e14a0ad595d8, topology_type: ReplicaSetNoPrimary, servers: [<ServerDescription ('cluster0-shard-00-00.pnmkh.mongodb.net', 27017) server_type: Unknown, rtt: None, error=NetworkTimeout('cluster0-shard-00-00.pnmkh.mongodb.net:27017: timed out')>, <ServerDescription ('cluster0-shard-00-01.pnmkh.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('cluster0-shard-00-01.pnmkh.mongodb.net:27017: [Errno 60] Operation timed out')>, <ServerDescription ('cluster0-shard-00-02.pnmkh.mongodb.net', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('cluster0-shard-00-02.pnmkh.mongodb.net:27017: [Errno 60] Operation timed out')>]>
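
When index creation succeeds, the index builds asynchronously, so vector queries typically return nothing until the build finishes. A minimal polling sketch follows, assuming the documents returned by PyMongo's `list_search_indexes()` carry a boolean `queryable` field. `wait_until_queryable` and its parameters are hypothetical helper names, and the fetch is injected as a callable so the loop can be tried without a cluster:

```python
import time

def wait_until_queryable(fetch_indexes, timeout_s=120.0, poll_s=5.0, sleep=time.sleep):
    """Poll fetch_indexes() until an index reports queryable=True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if any(ix.get('queryable') for ix in fetch_indexes()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)

# Against Atlas, one might call (assumes mongo_coll from the cells above):
# wait_until_queryable(lambda: mongo_coll.list_search_indexes('rag_demo_index'))

# Offline stand-in: no index yet, then building, then ready
responses = iter([
    [],
    [{'name': 'rag_demo_index', 'queryable': False}],
    [{'name': 'rag_demo_index', 'queryable': True}],
])
ready = wait_until_queryable(lambda: next(responses), poll_s=0, sleep=lambda s: None)
print('Index queryable:', ready)
```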

# Create a LangChain handle for the vector search index

In [20]:
vector_db = MongoDBAtlasVectorSearch.from_connection_string(
    mongodb_uri_widget.value,
    mongo_db_and_coll_path,
    embeddings_model,
    index_name='rag_demo_index'
)

# Set up question function

In [21]:
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI

llm_model = ChatOpenAI(
    model_name='gpt-3.5-turbo',
    temperature=0.0,
    openai_api_key=openai_api_key_widget.value
)

pdf_qa = ConversationalRetrievalChain.from_llm(
    llm_model,
    vector_db.as_retriever(),
    return_source_documents=True
)

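def ask_question(question):
    """Ask the chain a question, then print the answer and the chunks used as context."""
    prompt = 'Answer only if this information is available in the source document - ' + question
    # Empty chat_history: every question is answered independently
    result = pdf_qa({'question': prompt, 'chat_history': []})
    print("Answer:{}\n".format(result.get('answer')))
    print('Chunks from Atlas Vector Search used for context:')

    for chunk in result.get('source_documents'):
        chunk_id = chunk.metadata['_id']  # avoid shadowing the built-in id()
        page = chunk.metadata['page']
        print('ObjectId({}) | page {:,}'.format(chunk_id, page))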
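
Conceptually, the chain retrieves the chunks most similar to the question and hands them to the LLM as context. The sketch below is a simplified stand-in for that step, not LangChain's actual prompt template; `build_rag_prompt`, the prompt wording, and the sample chunks are all hypothetical.

```python
def build_rag_prompt(question, chunks):
    """Assemble a single prompt from retrieved chunks and the user's question."""
    context = '\n\n'.join(
        '[page {}] {}'.format(chunk['page'], chunk['text']) for chunk in chunks
    )
    return (
        'Use only the context below to answer. '
        "If the answer is not in the context, say you don't know.\n\n"
        'Context:\n{}\n\nQuestion: {}\nAnswer:'.format(context, question)
    )

# Made-up sample chunks for illustration
retrieved = [
    {'page': 1, 'text': 'MongoDB stores data as flexible documents.'},
    {'page': 4, 'text': 'Documents are stored in a binary form called BSON.'},
]
prompt = build_rag_prompt('How does MongoDB store data?', retrieved)
print(prompt)
```

Keeping the page number next to each chunk makes it easier to trace an answer back to its source, much like the page listing printed by `ask_question`.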

In [22]:
# from langchain.llms import OpenAI
#
# llm_model = OpenAI(
#     model_name='text-davinci-003',
#     temperature=0.0,
#     openai_api_key=os.environ['OPENAI_API_KEY']
# )

# from langchain.chat_models import ChatOpenAI
#
# llm_model = ChatOpenAI(
#     model_name='gpt-3.5-turbo',
#     temperature=0.0,
#     openai_api_key=os.environ['OPENAI_API_KEY']
# )

# Start asking questions

In [23]:
ask_question("What is MongoDB and why is it used?")

Answer:MongoDB is a general-purpose database that combines the best aspects of relational and NoSQL databases. It replaces the rigid tables of relational databases with flexible documents that can store data as JSON. This flexibility allows developers to be more productive and build applications faster. MongoDB is used because it enables organizations to meet the demands of modern applications by providing a technology foundation that aligns with the way developers think and code. It allows for easy modification of data structures, supports various data types, and offers features like schema validation and clustered indexing for efficient data storage and querying.

Chunks from Atlas Vector Search used for context:
ObjectId(65e880ee6d99efb760ca2c04) | page 1
ObjectId(65e880ee6d99efb760ca2c05) | page 2
ObjectId(65e880ee6d99efb760ca2c03) | page 0
ObjectId(65e880ee6d99efb760ca2c07) | page 4


In [35]:
ask_question("Explain the basic principles behind MongoDB's architecture?")

Answer:Based on the provided information from the MongoDB Architecture Guide, some basic principles behind MongoDB's architecture include:

1. **Flexible Schema**: MongoDB allows for a dynamic and self-describing schema, where fields can vary from document to document. This flexibility enables developers to continuously integrate new application functionality without the need for disruptive schema migrations. Changes to the data model can be made without costly operations like "ALTER TABLE," making it easier to manage schema changes across multiple teams.

2. **Universal JSON Documents**: MongoDB stores data as JSON documents in a binary representation called BSON (Binary JSON). This format allows for a wide range of data structures, including rich objects, key-value pairs, tables, geospatial and time series data, and graph nodes and edges. The BSON encoding extends the JSON representation to include additional types like int, long, date, floating point, and decimal128, making data pro

In [36]:
ask_question("What are the requirements to open an account with Ally Bank?")

Answer:I don't have that information available in the provided document.

Chunks from Atlas Vector Search used for context:
ObjectId(65e86146fea4d903e938e73f) | page 1
ObjectId(65e86146fea4d903e938e749) | page 11
ObjectId(65e86146fea4d903e938e74d) | page 15
ObjectId(65e86146fea4d903e938e74f) | page 17


In [24]:
# Use this cell to show that the majority of time spent waiting is due to the LLM, not Atlas Vector Search

import time

search_vector = embeddings_model.embed_query("How should I optimize query performance?")

before_time = time.perf_counter()
cursor = mongo_coll.aggregate([
    {
        "$vectorSearch": {
            "index": "rag_demo_index",
            "path": "embedding",
            "queryVector": search_vector,
            "numCandidates": 100,
            "limit": 4
        }
    },
    {
        "$project": {
            "_id": 1,
            "page": 1,
        }
    }
])
vector_search_ms = int((time.perf_counter() - before_time) * 1_000)
print('Atlas Vector Search roundtrip took {} ms'.format(vector_search_ms))
list(cursor)

Atlas Vector Search roundtrip took 261 ms


[{'_id': ObjectId('65e880ee6d99efb760ca2c08'), 'page': 5},
 {'_id': ObjectId('65e880ee6d99efb760ca2c11'), 'page': 14},
 {'_id': ObjectId('65e880ee6d99efb760ca2c04'), 'page': 1},
 {'_id': ObjectId('65e880ee6d99efb760ca2c0b'), 'page': 8}]
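
To make the `$vectorSearch` stage less opaque, here is a brute-force version of the same idea: score every stored vector against the query by cosine similarity (the metric named in the index definition) and keep the best `limit` matches. This is a sketch for intuition only. Atlas performs an approximate nearest-neighbor search over `numCandidates` candidates rather than an exhaustive scan, and `cosine_similarity`, `top_k`, and the toy 3-dimensional vectors are hypothetical.

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, docs, limit=4):
    """Return (score, page) pairs for the `limit` docs most similar to `query`, best first."""
    scored = ((cosine_similarity(query, d['embedding']), d['page']) for d in docs)
    return heapq.nlargest(limit, scored)

# Toy vectors standing in for the 1,536-dimension embeddings
docs = [
    {'page': 0, 'embedding': [1.0, 0.0, 0.0]},
    {'page': 1, 'embedding': [0.9, 0.1, 0.0]},
    {'page': 2, 'embedding': [0.0, 1.0, 0.0]},
    {'page': 3, 'embedding': [-1.0, 0.0, 0.0]},
]
results = top_k([1.0, 0.0, 0.0], docs, limit=2)
for score, page in results:
    print('page {} | score {:.3f}'.format(page, score))
```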