Atlas Vector Search with RAG

The Python scripts in this repo use Atlas Vector Search in a Retrieval-Augmented Generation (RAG) architecture to build a question-answering application. The app combines the LangChain framework, OpenAI models, and Gradio with Atlas Vector Search.

Setting up the Environment

  1. Install the following packages:
pip3 install langchain pymongo bs4 openai tiktoken gradio requests lxml argparse unstructured
  2. Create an OpenAI API key. This requires a paid OpenAI account with sufficient credits; OpenAI API requests stop working once the credit balance reaches $0.

  3. Save the OpenAI API key and the MongoDB URI in the file, like this:

openai_api_key = "ENTER_OPENAI_API_KEY_HERE"
  4. Use the following two Python scripts:
    • This script loads your documents and ingests the text and vector embeddings into a MongoDB collection.
    • This script generates the user interface and lets you perform question answering against your data, using Atlas Vector Search and OpenAI.
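To make the ingestion step more concrete, here is a minimal sketch of what "load and split" can look like before any embeddings are computed. It is not the repo's code: the scripts rely on LangChain and the Unstructured package, whereas this sketch swaps in plain file reading and fixed-size character chunking from the standard library. The function names `split_text` and `load_chunks`, the chunk size, the overlap, and the record fields are all assumptions made for illustration.

```python
from pathlib import Path


def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new chunk starts after the last
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def load_chunks(directory: str, pattern: str = "*.txt") -> list[dict]:
    """Read every matching file and return one record per chunk.

    Each record is a plain dict; an ingestion script would then compute an
    embedding for record["text"] and store both in a collection.
    """
    records = []
    for path in sorted(Path(directory).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        for n, chunk in enumerate(split_text(text)):
            records.append({"source": path.name, "chunk": n, "text": chunk})
    return records
```

Calling `load_chunks("sample_files")` would return one record per chunk for each `.txt` file in that directory. The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval.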

Note: In this demo, I've used:

  • DB Name: langchain_demo
  • Collection Name: collection_of_text_blobs
  • The text files that I am using as my source data are saved in a directory named sample_files.
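One way to keep the names from this note in a single place is a small settings object. This is only a sketch: the `DemoSettings` class and its field names are hypothetical and do not come from the repo's scripts; only the three values are taken from the note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoSettings:
    """Hypothetical container for the names used in this demo."""
    db_name: str = "langchain_demo"
    collection_name: str = "collection_of_text_blobs"
    source_dir: str = "sample_files"

    @property
    def namespace(self) -> str:
        # MongoDB refers to a collection as "<database>.<collection>"
        return f"{self.db_name}.{self.collection_name}"


settings = DemoSettings()
print(settings.namespace)
```

Because the dataclass is frozen, the ingestion script and the UI script cannot accidentally end up pointing at different collections.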

Main Components

LangChain
Document loading:
- All documents from a directory
- Split and load
- Uses the Unstructured package
Retrieval and question answering:
- Retriever
- Question-answering chain
Vector store integration:
- Wrapper around Atlas Vector Search
- Easily create and store embeddings in MongoDB collections
- Perform KNN Search using Atlas Vector Search

OpenAI
Embedding Model:
- text-embedding-ada-002
- Text → Vector embeddings
- 1536 dimensions
Language model:
- gpt-3.5-turbo
- Understands and generates natural language
- Generates text, answers, translations, etc.

Atlas Vector Search
- Vector Store

Gradio
UI for LLM app:
- Open-source Python library
- Lets you quickly create user interfaces for ML models
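The retrieval half of the RAG flow can be sketched without any external services: find the stored chunks closest to the question's embedding, then place them in the prompt sent to the language model. The sketch below is an illustration under loud assumptions, not how Atlas Vector Search or LangChain implement it. It uses a brute-force cosine-similarity scan, hand-written 3-dimensional toy vectors instead of 1536-dimensional OpenAI embeddings, and hypothetical helper names (`cosine_similarity`, `knn_search`, `build_prompt`) and document fields (`text`, `embedding`).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(query_vector, documents, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(query_vector, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, retrieved):
    """Combine the retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(d["text"] for d in retrieved)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Toy data: made-up 3-dimensional vectors standing in for real embeddings.
docs = [
    {"text": "Atlas stores vectors alongside documents.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Gradio builds web UIs for ML models.", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Vector search finds nearest embeddings.", "embedding": [0.8, 0.3, 0.1]},
]
query_vector = [1.0, 0.2, 0.0]  # would come from embedding the user's question

top = knn_search(query_vector, docs, k=2)
prompt = build_prompt("What does Atlas store?", top)
print(prompt)
```

With these toy vectors the two Atlas-related chunks rank first and the Gradio chunk is left out of the prompt. In the real app the scan is replaced by a query against the vector index, and the prompt is passed to the language model by the question-answering chain.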