# VectorStore: Postgres and Sentence Transformer (all-MiniLM-L6-v2) with Basic Examples

This notebook demonstrates how to use the `PostgresVectorStore` in `dapr-agents` for storing, querying, and filtering documents. We will explore:

* Initializing the `SentenceTransformerEmbedder` embedding function and `PostgresVectorStore`.
* Adding documents with text and metadata.
* Performing similarity searches.
* Filtering results based on metadata.
* Resetting the database.

## Install Required Libraries
Before starting, ensure the required libraries are installed:

In [None]:
!pip install dapr-agents python-dotenv "psycopg[binary,pool]" pgvector

## Load Environment Variables

Load configuration values, such as `POSTGRES_CONNECTION_STRING`, from your `.env` file using `dotenv`.

In [1]:
from dotenv import load_dotenv
load_dotenv()

True

## Setting Up The Database

Before initializing the `PostgresVectorStore`, set up a PostgreSQL instance with pgvector enabled. For a local setup, use Docker:

In [2]:
!docker run --name pgvector-container \
    -e POSTGRES_USER=floki \
    -e POSTGRES_PASSWORD=floki \
    -e POSTGRES_DB=floki \
    -p 5432:5432 \
    -d pgvector/pgvector:pg17

8c6e20c4e57a786821b66eafb3459fbc985f256c5761fb9fdbbb2580e99ca5c5


## Initializing SentenceTransformer Embedding Function

The default embedding function is `SentenceTransformerEmbedder`, but we will initialize it explicitly for clarity.

In [3]:
from dapr_agents.document.embedder import SentenceTransformerEmbedder

embedding_function = SentenceTransformerEmbedder(
    model="all-MiniLM-L6-v2"
)

## Initializing the PostgresVectorStore

To start, create an instance of `PostgresVectorStore` and set its `embedding_function` to a `SentenceTransformerEmbedder` instance.

In [4]:
from dapr_agents.storage.vectorstores import PostgresVectorStore
import os

# Set up connection parameters
connection_string = os.getenv("POSTGRES_CONNECTION_STRING", "postgresql://floki:floki@localhost:5432/floki")

# Initialize PostgresVectorStore
store = PostgresVectorStore(
    connection_string=connection_string,
    table_name="floki",
    embedding_function=embedding_function  # reuse the embedder initialized above
)
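
The fallback connection string uses the same user, password, and database name that were passed to the Docker container earlier. As a minimal sketch using only the Python standard library (it is not part of `dapr-agents`), the URL can be split into its parts to check what the store will connect to:

```python
from urllib.parse import urlsplit

# Default connection string used in this notebook
connection_string = "postgresql://floki:floki@localhost:5432/floki"

# Split the URL into its components (the password is deliberately not printed)
parts = urlsplit(connection_string)
print("scheme:  ", parts.scheme)
print("user:    ", parts.username)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("database:", parts.path.lstrip("/"))
```

If you set `POSTGRES_CONNECTION_STRING` in your `.env` file, the same check can be run against that value instead.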

## Adding Documents
We will use `Document` objects to add content to the collection. Each document includes text and optional metadata.

### Creating Documents

In [5]:
from dapr_agents.types.document import Document

# Example Lord of the Rings-inspired conversations
documents = [
    Document(
        text="Gandalf: A wizard is never late, Frodo Baggins. Nor is he early; he arrives precisely when he means to.",
        metadata={"topic": "wisdom", "location": "The Shire"}
    ),
    Document(
        text="Frodo: I wish the Ring had never come to me. I wish none of this had happened.",
        metadata={"topic": "destiny", "location": "Moria"}
    ),
    Document(
        text="Aragorn: You cannot wield it! None of us can. The One Ring answers to Sauron alone. It has no other master.",
        metadata={"topic": "power", "location": "Rivendell"}
    ),
    Document(
        text="Sam: I can't carry it for you, but I can carry you!",
        metadata={"topic": "friendship", "location": "Mount Doom"}
    ),
    Document(
        text="Legolas: A red sun rises. Blood has been spilled this night.",
        metadata={"topic": "war", "location": "Rohan"}
    ),
    Document(
        text="Gimli: Certainty of death. Small chance of success. What are we waiting for?",
        metadata={"topic": "bravery", "location": "Helm's Deep"}
    ),
    Document(
        text="Boromir: One does not simply walk into Mordor.",
        metadata={"topic": "impossible tasks", "location": "Rivendell"}
    ),
    Document(
        text="Galadriel: Even the smallest person can change the course of the future.",
        metadata={"topic": "hope", "location": "Lothlórien"}
    ),
    Document(
        text="Théoden: So it begins.",
        metadata={"topic": "battle", "location": "Helm's Deep"}
    ),
    Document(
        text="Elrond: The strength of the Ring-bearer is failing. In his heart, Frodo begins to understand. The quest will claim his life.",
        metadata={"topic": "sacrifice", "location": "Rivendell"}
    )
]

### Adding Documents to the Collection

In [6]:
store.add_documents(documents=documents)
print(f"Number of documents in the collection: {store.count()}")

Number of documents in the collection: 10


## Retrieving Documents

Retrieve all documents or specific ones by ID.

In [7]:
# Retrieve all documents
retrieved_docs = store.get()
print("Retrieved documents:")
for doc in retrieved_docs:
    print(f"ID: {doc['id']}, Text: {doc['document']}, Metadata: {doc['metadata']}")

Retrieved documents:
ID: bc6493c3-d036-47bd-bb16-220ddbdffb35, Text: Gandalf: A wizard is never late, Frodo Baggins. Nor is he early; he arrives precisely when he means to., Metadata: {'topic': 'wisdom', 'location': 'The Shire'}
ID: 1f298a2a-ef77-4584-9d8a-2c734b69a5b6, Text: Frodo: I wish the Ring had never come to me. I wish none of this had happened., Metadata: {'topic': 'destiny', 'location': 'Moria'}
ID: 7fd80245-cca1-4d3b-bc05-429513fe2c6e, Text: Aragorn: You cannot wield it! None of us can. The One Ring answers to Sauron alone. It has no other master., Metadata: {'topic': 'power', 'location': 'Rivendell'}
ID: e4efaef9-d200-46bc-a19c-2085d572a77b, Text: Sam: I can't carry it for you, but I can carry you!, Metadata: {'topic': 'friendship', 'location': 'Mount Doom'}
ID: 42c60794-0dcd-4ae5-8941-702a8e00bac3, Text: Legolas: A red sun rises. Blood has been spilled this night., Metadata: {'topic': 'war', 'location': 'Rohan'}
ID: 0707bd7d-d7c8-4469-9926-85dc480a20c9, Text: Gimli: Certai

In [8]:
# Retrieve a specific document by ID
doc_id = retrieved_docs[0]['id']
specific_doc = store.get(ids=[doc_id])
print(f"Specific document: {specific_doc}")

Specific document: [{'id': UUID('bc6493c3-d036-47bd-bb16-220ddbdffb35'), 'document': 'Gandalf: A wizard is never late, Frodo Baggins. Nor is he early; he arrives precisely when he means to.', 'metadata': {'topic': 'wisdom', 'location': 'The Shire'}}]
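
The output above shows the `id` field as a `UUID` object, while the earlier loop printed it in its string form. The following standard-library sketch converts between the two forms, which may help when IDs are logged or stored as text. Whether `store.get` also accepts plain strings for `ids` is not shown in this notebook, so treat that as an open question.

```python
import uuid

# ID string copied from the output above
id_text = "bc6493c3-d036-47bd-bb16-220ddbdffb35"

# Parse the string into a UUID object, then convert it back to text
doc_uuid = uuid.UUID(id_text)
print(repr(doc_uuid))
print(str(doc_uuid) == id_text)
```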


In [9]:
# Retrieve a specific document by ID, including its embedding
doc_id = retrieved_docs[0]['id']
specific_doc = store.get(ids=[doc_id], with_embedding=True)
embedding = specific_doc[0]['embedding']
print(f"Specific document Embedding (first 5 values): {embedding[:5]}")

Specific document Embedding (first 5 values): [-0.0


## Updating Documents

You can update existing documents' text or metadata using their IDs.

In [10]:
# Retrieve a document by its ID
retrieved_docs = store.get()  # Get all documents to find the ID
doc_id = retrieved_docs[0]['id']  # Select the first document's ID for this example

# Define updated text and metadata
updated_text = "Gandalf: Even the wisest cannot foresee all ends, but hope remains while the Company is true."
updated_metadata = {"topic": "hope and wisdom", "location": "Fangorn Forest"}

# Update the document's text and metadata in the store
store.update(ids=[doc_id], documents=[updated_text], metadatas=[updated_metadata])

# Verify the update
updated_doc = store.get(ids=[doc_id])
print(f"Updated document: {updated_doc}")

Updated document: [{'id': UUID('bc6493c3-d036-47bd-bb16-220ddbdffb35'), 'document': 'Gandalf: Even the wisest cannot foresee all ends, but hope remains while the Company is true.', 'metadata': {'topic': 'hope and wisdom', 'location': 'Fangorn Forest'}}]


## Deleting Documents

Delete documents by their IDs.

In [11]:
# Delete a document by ID
doc_id_to_delete = retrieved_docs[2]['id']
store.delete(ids=[doc_id_to_delete])

# Verify deletion
print(f"Number of documents after deletion: {store.count()}")

Number of documents after deletion: 9


## Similarity Search

Perform a similarity search using text queries. The embedding function automatically generates embeddings for the input query.

In [12]:
# Perform a similarity search using text queries.
query = "wise advice"
results = store.search_similar(query_texts=query, k=2)

# Display results
print("Similarity search results:")
for result in results:
    print(f"ID: {result['id']}, Document: {result['document']}, Metadata: {result['metadata']}, Similarity: {result['similarity']}")

Similarity search results:
ID: 0707bd7d-d7c8-4469-9926-85dc480a20c9, Document: Gimli: Certainty of death. Small chance of success. What are we waiting for?, Metadata: {'topic': 'bravery', 'location': "Helm's Deep"}, Similarity: 0.1567628941818613
ID: 8df14762-3dee-40d4-8ae3-4110982ca85f, Document: Boromir: One does not simply walk into Mordor., Metadata: {'topic': 'impossible tasks', 'location': 'Rivendell'}, Similarity: 0.13233356090384096
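
The `similarity` values above come from comparing the query embedding with each stored embedding. This notebook does not show which metric `PostgresVectorStore` uses. Assuming cosine similarity, a common choice with pgvector, the computation can be sketched in plain Python. The three-dimensional vectors are made up for illustration, and `cosine_similarity` is a hypothetical helper, not part of `dapr-agents`:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, not real embeddings
query_vec = [1.0, 0.0, 0.0]
doc_same_direction = [2.0, 0.0, 0.0]
doc_orthogonal = [0.0, 3.0, 0.0]

sim_same = cosine_similarity(query_vec, doc_same_direction)
sim_orthogonal = cosine_similarity(query_vec, doc_orthogonal)
print(sim_same)        # vectors pointing the same way score 1.0
print(sim_orthogonal)  # unrelated (orthogonal) vectors score 0.0
```

Under this assumption, scores near 0.15 like those above would indicate that none of the stored quotes is close to the query "wise advice".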


## Filtering Results

Filter results based on metadata.

In [13]:
# Search for documents with specific metadata filters
query = "journey"
filter_conditions = {
    "location": "Fangorn Forest",
    "topic": "hope and wisdom"
}

filtered_results = store.search_similar(query_texts=query, metadata_filter=filter_conditions, k=3)

# Display filtered results
print("Filtered search results:")
for result in filtered_results:
    print(f"ID: {result['id']}, Document: {result['document']}, Metadata: {result['metadata']}, Similarity: {result['similarity']}")

Filtered search results:
ID: bc6493c3-d036-47bd-bb16-220ddbdffb35, Document: Gandalf: Even the wisest cannot foresee all ends, but hope remains while the Company is true., Metadata: {'topic': 'hope and wisdom', 'location': 'Fangorn Forest'}, Similarity: 0.1670202911216282
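
The filter above lists two keys, and the single result matches both. The notebook does not state the exact filter semantics. Assuming every key/value pair must match exactly, the behaviour can be sketched on plain dictionaries with a hypothetical `matches_filter` helper (not part of `dapr-agents`):

```python
def matches_filter(metadata, filter_conditions):
    """Return True only if every filter key is present with an equal value."""
    return all(metadata.get(key) == value for key, value in filter_conditions.items())

# Metadata taken from documents used in this notebook
stored_metadata = [
    {"topic": "hope and wisdom", "location": "Fangorn Forest"},
    {"topic": "friendship", "location": "Mount Doom"},
    {"topic": "hope", "location": "Lothlórien"},
]

filter_conditions = {"location": "Fangorn Forest", "topic": "hope and wisdom"}

# Keep only the entries that satisfy all conditions
matching = [m for m in stored_metadata if matches_filter(m, filter_conditions)]
print(matching)
```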


## Resetting the Database

Reset the database to clear all stored data.

In [14]:
# Reset the collection
store.reset()
print("Database reset complete. Current documents:", store.get())

Database reset complete. Current documents: []
