Vector Database Customization

Available Vector Databases

By default, the Docker Compose files for the examples deploy Milvus as the vector database with CPU-only support. You must install the NVIDIA Container Toolkit to use Milvus with GPU acceleration.

The available vector databases in the examples are shown in the following list:

  • LlamaIndex: Milvus, pgvector
  • LangChain: FAISS, Milvus, pgvector

The following customizations are common:

  • Use Milvus with GPU acceleration.
  • Use pgvector as an alternative to Milvus. pgvector uses CPU only.
  • Use your own vector database and prevent deploying a vector database with each RAG example.

Configuring Milvus with GPU Acceleration

  1. Edit the RAG/examples/local_deploy/docker-compose-vectordb.yaml file and make the following changes to the Milvus service.

    • Change the image tag to include the -gpu suffix:

        container_name: milvus-standalone
        image: milvusdb/milvus:v2.4.5-gpu
    • Add the GPU resource reservation under the deploy key:

        depends_on:
          - "etcd"
          - "minio"
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  capabilities: ["gpu"]
                  device_ids: ['${VECTORSTORE_GPU_DEVICE_ID:-0}']
        profiles: ["nemo-retriever", "milvus", ""]
  2. Stop and start the containers:

    docker compose down
    docker compose up -d --build

    Note: When you deploy Milvus together with the local-nim profile, you must also specify the milvus profile to deploy the vector store:

    docker compose --profile local-nim --profile milvus up -d --build
  3. Optional: View the chain server logs to confirm the vector database is operational.

    1. View the logs:

      docker logs -f chain-server
    2. Upload a document to the knowledge base. Refer to Use Unstructured Documents as a Knowledge Base for more information.

    3. Confirm the log output includes the vector database:

      INFO:RAG.src.chain_server.utils:Using milvus collection: nvidia_api_catalog
      INFO:RAG.src.chain_server.utils:Vector store created and saved.
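
The log check in the optional step can also be scripted. The following is a minimal sketch, not part of the examples: the helper name `missing_markers` is hypothetical, and the marker strings are copied from the sample log output above, so they may differ between releases.

```python
from typing import Iterable, List, Sequence

# Marker strings copied from the sample chain server log output above.
EXPECTED_MARKERS = ("Using milvus collection:", "Vector store created and saved.")


def missing_markers(log_lines: Iterable[str],
                    markers: Sequence[str] = EXPECTED_MARKERS) -> List[str]:
    """Return the markers that never appear in the given log lines."""
    text = "\n".join(log_lines)
    return [marker for marker in markers if marker not in text]


sample = [
    "INFO:RAG.src.chain_server.utils:Using milvus collection: nvidia_api_catalog",
    "INFO:RAG.src.chain_server.utils:Vector store created and saved.",
]
# An empty list means every expected marker was found.
print(missing_markers(sample))
```

One way to use it is to save the output of docker logs chain-server to a file and pass the file's lines to the function.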

Configuring pgvector as the Vector Database

  1. Export the following environment variables in your terminal:

    export POSTGRES_PASSWORD=password
    export POSTGRES_USER=postgres
    export POSTGRES_DB=api
  2. Edit the docker-compose.yaml file for the RAG example and set the following environment variables for the Chain Server:

      APP_VECTORSTORE_URL: "pgvector:5432"
      APP_VECTORSTORE_NAME: "pgvector"
  3. Start the containers:

    docker compose --profile pgvector up -d --build
  4. Optional: View the chain server logs to confirm the vector database is operational.

    1. View the logs:

      docker logs -f chain-server
    2. Upload a document to the knowledge base. Refer to Use Unstructured Documents as a Knowledge Base for more information.

    3. Confirm the log output includes the vector database:

      INFO:RAG.src.chain_server.utils:Using PGVector collection: nvidia_api_catalog
      INFO:RAG.src.chain_server.utils:Vector store created and saved.

To stop pgvector and the other containers, run docker compose --profile pgvector down.
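
To show how the settings from the preceding steps relate to each other, the following sketch combines them into a standard PostgreSQL connection URI. This is an illustration only: the function name `build_pgvector_url` is hypothetical, and the chain server may assemble its connection string differently.

```python
import os
from typing import Mapping
from urllib.parse import quote


def build_pgvector_url(env: Mapping[str, str] = os.environ) -> str:
    """Combine the documented settings into a PostgreSQL URI (illustrative only)."""
    host_port = env.get("APP_VECTORSTORE_URL", "pgvector:5432")
    user = env.get("POSTGRES_USER", "postgres")
    password = env.get("POSTGRES_PASSWORD", "")
    database = env.get("POSTGRES_DB", "api")
    # Percent-encode the credentials so characters such as "@" or ":" stay unambiguous.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host_port}/{database}"
    )


example_env = {
    "APP_VECTORSTORE_URL": "pgvector:5432",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "password",
    "POSTGRES_DB": "api",
}
print(build_pgvector_url(example_env))
# postgresql://postgres:password@pgvector:5432/api
```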

Configuring Support for an External Milvus or pgvector Database

  1. Edit the docker-compose.yaml file for the RAG example and make the following edits.

    • Remove or comment out the include path to the docker-compose-vectordb.yaml file:

        - path:
          # - ../../local_deploy/docker-compose-vectordb.yaml
          - ../../local_deploy/docker-compose-nim-ms.yaml
    • To use an external Milvus server, specify the connection information:

        APP_VECTORSTORE_URL: "http://<milvus-hostname-or-ipaddress>:19530"
        APP_VECTORSTORE_NAME: "milvus"
    • To use an external pgvector server, specify the connection information:

        APP_VECTORSTORE_URL: "<pgvector-hostname-or-ipaddress>:5432"
        APP_VECTORSTORE_NAME: "pgvector"

      Also export the POSTGRES_PASSWORD, POSTGRES_USER, and POSTGRES_DB environment variables in your terminal.

  2. Start the containers:

    docker compose up -d --build
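
Before you start the containers, it can help to confirm that the external server accepts TCP connections. The following sketch uses only the Python standard library. The helper names and the example hostnames are hypothetical, and a successful TCP connection does not prove that the database itself is healthy.

```python
import socket
from typing import Optional, Tuple
from urllib.parse import urlsplit


def parse_host_port(vectorstore_url: str, default_port: int) -> Tuple[Optional[str], int]:
    """Split values such as "http://host:19530" or "host:5432" into (host, port)."""
    # urlsplit only recognizes the host part when a "//" separator is present.
    candidate = vectorstore_url if "://" in vectorstore_url else f"//{vectorstore_url}"
    parts = urlsplit(candidate)
    return parts.hostname, parts.port or default_port


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example values in the same shape as APP_VECTORSTORE_URL above.
print(parse_host_port("http://milvus.example.internal:19530", 19530))
print(parse_host_port("db.example.internal:5432", 5432))
```

Call is_reachable with the parsed host and port from the machine that runs the chain server, because name resolution and firewall rules can differ between hosts.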

Adding a New Vector Store

You can extend the code to add support for any vector store.

LlamaIndex Framework

  1. Navigate to the file RAG/src/chain_server/ from the project's root directory. This file contains the utility functions used for vector store interactions.

  2. Modify the get_vector_index function to handle your new vector store. Implement the logic for creating your vector store object within this function.

    def get_vector_index():
       # existing code
       elif == "chromadb":
          import chromadb
          from llama_index.vector_stores.chroma import ChromaVectorStore
          if not collection_name:
             collection_name = os.getenv('COLLECTION_NAME', "vector_db")
          logger.info(f"Using Chroma collection: {collection_name}")
          chroma_client = chromadb.EphemeralClient()
          chroma_collection = chroma_client.create_collection(collection_name)
          vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
  3. Modify the get_docs_vectorstore_llamaindex function to retrieve the list of files stored in your new vector store.

    def get_docs_vectorstore_llamaindex():
       # existing code
       elif == "chromadb":
          ref_doc_info = index.ref_doc_info
          # Iterate over all the documents in the vector store and collect the unique filenames
          for _, ref_doc_value in ref_doc_info.items():
                metadata = ref_doc_value.metadata
                if 'filename' in metadata:
                   filename = metadata['filename']
                   decoded_filenames.append(filename)
          decoded_filenames = list(set(decoded_filenames))
  4. Update the del_docs_vectorstore_llamaindex function to handle document deletion in your new vector store.

    def del_docs_vectorstore_llamaindex(filenames: List[str]):
       # existing code
       elif == "chromadb":
          ref_doc_info = index.ref_doc_info
          # Iterate over the filenames and delete every document whose metadata has a matching filename
          for filename in filenames:
                for ref_doc_id, doc_info in ref_doc_info.items():
                   if 'filename' in doc_info.metadata and doc_info.metadata['filename'] == filename:
                      index.delete_ref_doc(ref_doc_id, delete_from_docstore=True)
                logger.info(f"Deleted documents with filenames {filename}")
  5. In your custom implementation, import the preceding functions from RAG.src.chain_server.utils. The sample in RAG/examples/basic_rag/llamaindex already imports the functions.

    from RAG.src.chain_server.utils import (
        del_docs_vectorstore_llamaindex,
        get_docs_vectorstore_llamaindex,
        get_vector_index,
    )
  6. Update RAG/src/chain_server/requirements.txt with any additional package required for the vector store.

    # existing dependency
  7. Build and start the containers.

    1. Navigate to the example directory.

      cd RAG/examples/basic_rag/llamaindex
    2. Set the APP_VECTORSTORE_NAME environment variable for the chain-server microservice in your docker-compose.yaml file. Set it to the name of your newly added vector store.

      APP_VECTORSTORE_NAME: "chromadb"
    3. Build and deploy the microservices.

      docker compose up -d --build chain-server rag-playground
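
The listing and deletion logic in the steps above can be tried without a running vector store. The following sketch uses plain Python objects as stand-ins for the entries of index.ref_doc_info. The helper names `unique_filenames` and `ref_doc_ids_for` are hypothetical, and the only assumption about the data is that each entry exposes a metadata dictionary, as in the snippets above.

```python
from types import SimpleNamespace
from typing import Iterable, List, Mapping


def unique_filenames(ref_doc_info: Mapping[str, SimpleNamespace]) -> List[str]:
    """Collect the distinct 'filename' metadata values, sorted for stable output."""
    names = {
        info.metadata["filename"]
        for info in ref_doc_info.values()
        if "filename" in info.metadata
    }
    return sorted(names)


def ref_doc_ids_for(ref_doc_info: Mapping[str, SimpleNamespace],
                    filenames: Iterable[str]) -> List[str]:
    """Return the IDs of the documents whose filename is in the given collection."""
    wanted = set(filenames)
    return [
        doc_id
        for doc_id, info in ref_doc_info.items()
        if info.metadata.get("filename") in wanted
    ]


# Stand-in data; real entries come from index.ref_doc_info.
fake_ref_doc_info = {
    "doc-1": SimpleNamespace(metadata={"filename": "a.pdf"}),
    "doc-2": SimpleNamespace(metadata={"filename": "a.pdf"}),
    "doc-3": SimpleNamespace(metadata={"filename": "b.txt"}),
    "doc-4": SimpleNamespace(metadata={}),
}
print(unique_filenames(fake_ref_doc_info))           # ['a.pdf', 'b.txt']
print(ref_doc_ids_for(fake_ref_doc_info, ["a.pdf"]))  # ['doc-1', 'doc-2']
```

Collecting the matching IDs first and deleting afterward avoids modifying the mapping while iterating over it.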

LangChain Framework

  1. Navigate to the file RAG/src/chain_server/ in the project's root directory.

  2. Modify the create_vectorstore_langchain function to handle your new vector store. Implement the logic for creating your vector store object within it.

    def create_vectorstore_langchain(document_embedder, collection_name: str = "") -> VectorStore:
       # existing code
       elif == "chromadb":
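          from langchain_chroma import Chroma
          import chromadb
          logger.info(f"Using Chroma collection: {collection_name}")
          persistent_client = chromadb.PersistentClient()
          vectorstore = Chroma(
             client=persistent_client,
             collection_name=collection_name,
             embedding_function=document_embedder,
          )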
  3. Update the get_docs_vectorstore_langchain function to retrieve a list of documents from your new vector store. Implement your retrieval logic within it.

    def get_docs_vectorstore_langchain(vectorstore: VectorStore) -> List[str]:
       # Existing code
       elif == "chromadb":
          chroma_data = vectorstore.get()
          filenames = {extract_filename(metadata) for metadata in chroma_data.get("metadatas", [])}
          return list(filenames)
  4. Update the del_docs_vectorstore_langchain function to handle document deletion in your new vector store.

    def del_docs_vectorstore_langchain(vectorstore: VectorStore, filenames: List[str]) -> bool:
       # Existing code
       elif == "chromadb":
          chroma_data = vectorstore.get()
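          for filename in filenames:
                ids_list = [chroma_data.get("ids")[idx] for idx, metadata in enumerate(chroma_data.get("metadatas", [])) if extract_filename(metadata) == filename]
                # Delete the matching entries; skip the call when nothing matched
                if ids_list:
                   vectorstore.delete(ids_list)
          return True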
  5. In your custom implementation, import the preceding functions from RAG.src.chain_server.utils. The sample in RAG/examples/basic_rag/langchain already imports the functions.

    from RAG.src.chain_server.utils import (
        create_vectorstore_langchain,
        del_docs_vectorstore_langchain,
        get_docs_vectorstore_langchain,
    )
  6. Update RAG/src/chain_server/requirements.txt with any additional package required for the vector store.

    # existing dependency
    langchain-core==0.1.40 # Update this dependency because it conflicts with the existing one
  7. Build and start the containers.

    1. Navigate to the example directory.

      cd RAG/examples/basic_rag/langchain
    2. Set the APP_VECTORSTORE_NAME environment variable for the chain-server microservice in your docker-compose.yaml file. Set it to the name of your newly added vector store.

      APP_VECTORSTORE_NAME: "chromadb"
    3. Build and deploy the microservices.

      docker compose up -d --build chain-server rag-playground
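
The ID lookup used in the deletion step can be tested in isolation. The following sketch works on a plain dictionary shaped like the result of vectorstore.get() in the snippets above, with parallel "ids" and "metadatas" lists. The function `extract_filename_sketch` is a hypothetical stand-in for the extract_filename helper, and it assumes that the file path is stored under a "source" metadata key, which might not match your data.

```python
import os
from typing import Dict, Iterable, List


def extract_filename_sketch(metadata: dict) -> str:
    """Hypothetical stand-in for extract_filename; assumes a 'source' path key."""
    return os.path.basename(metadata.get("source", ""))


def ids_by_filename(store_data: dict, filenames: Iterable[str]) -> Dict[str, List[str]]:
    """Map each requested filename to the IDs of the entries that reference it."""
    ids = store_data.get("ids", [])
    metadatas = store_data.get("metadatas", [])
    matches: Dict[str, List[str]] = {name: [] for name in filenames}
    # "ids" and "metadatas" are parallel lists, so zip pairs each ID with its metadata.
    for doc_id, metadata in zip(ids, metadatas):
        name = extract_filename_sketch(metadata or {})
        if name in matches:
            matches[name].append(doc_id)
    return matches


# Stand-in data shaped like the dictionary used in the snippets above.
fake_data = {
    "ids": ["1", "2", "3"],
    "metadatas": [
        {"source": "/tmp/a.pdf"},
        {"source": "/tmp/b.txt"},
        {"source": "/tmp/a.pdf"},
    ],
}
print(ids_by_filename(fake_data, ["a.pdf"]))  # {'a.pdf': ['1', '3']}
```

Building the mapping in a single pass avoids rescanning every metadata entry once per filename, which matters when the collection is large.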