# PGVector with async connections

>[PGVector](https://github.com/pgvector/pgvector) is an open-source vector similarity search extension for `Postgres`.

It supports:
- exact and approximate nearest neighbor search
- L2 distance, inner product, and cosine distance
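As a rough illustration of the three metrics listed above, here is a minimal pure-Python sketch. The function names are made up for this example and are not part of pgvector or LangChain. In pgvector itself these metrics correspond to the `<->` (L2 distance), `<#>` (negative inner product), and `<=>` (cosine distance) operators.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Sum of element-wise products; larger means more aligned."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means same direction."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


# Two orthogonal unit vectors, chosen only for illustration.
a = [1.0, 0.0]
b = [0.0, 1.0]
print("L2 distance:    ", l2_distance(a, b))
print("inner product:  ", inner_product(a, b))
print("cosine distance:", cosine_distance(a, b))
```

For distances, smaller values mean closer vectors; for the inner product, larger values mean closer vectors, which is why pgvector's `<#>` operator returns the negated value.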

This notebook shows how to use the Postgres vector database (`PGVector`) with async connections.

See the [installation instructions](https://github.com/pgvector/pgvector).

In [None]:
# Install the required packages
!pip install pgvector
!pip install openai
!pip install asyncpg
!pip install greenlet
!pip install python-dotenv

We want to use `OpenAIEmbeddings`, so we need an OpenAI API key.

In [None]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [None]:
# Load environment variables from a .env file, if one is present
from dotenv import load_dotenv

load_dotenv()

In [None]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores.pgvector_async import PGVectorAsync
from langchain.document_loaders import TextLoader
from langchain.docstore.document import Document

In [None]:
loader = TextLoader("../../../state_of_the_union.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

embeddings = OpenAIEmbeddings()

In [None]:
# PGVectorAsync needs the database URL to connect to the database.

DATABASE_URL = "postgresql+asyncpg://postgres:postgres@localhost:5432/postgres"

# Alternatively, you can pass an async engine to PGVectorAsync:
# from sqlalchemy.ext.asyncio import create_async_engine
# engine = create_async_engine(url=DATABASE_URL, echo=True)
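To make the pieces of the connection URL explicit, the sketch below splits it with the standard library. It only illustrates the URL layout (dialect, async driver, credentials, host, port, database name). It does not connect to anything, and the variable names are chosen for this example.

```python
from urllib.parse import urlsplit

DATABASE_URL = "postgresql+asyncpg://postgres:postgres@localhost:5432/postgres"

parts = urlsplit(DATABASE_URL)

# The scheme combines the SQL dialect and the async driver, joined by "+".
dialect, _, driver = parts.scheme.partition("+")
database = parts.path.lstrip("/")

print("dialect: ", dialect)
print("driver:  ", driver)
print("user:    ", parts.username)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("database:", database)
```

The `+asyncpg` suffix is what selects the async driver; a URL without it would typically resolve to a synchronous driver.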

## Set up your database

You only need to run this once, preferably in a migration script.

In [None]:
vectorstore = PGVectorAsync(
    embeddings=embeddings,
    db_url=DATABASE_URL,
)

# Alternatively, you can pass an async engine to PGVectorAsync
# vectorstore = PGVectorAsync(
#     embeddings=embeddings,
#     engine=engine,
# )

await vectorstore.create_schema()

## Similarity Search with Euclidean Distance (Default)

In [None]:
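COLLECTION_NAME = "state_of_the_union_test"

# Store the documents under COLLECTION_NAME so the collection can be reopened later
vectorstore = await PGVectorAsync.afrom_documents(
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    documents=docs,
    db_url=DATABASE_URL,
)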

In [None]:
query = "What did the president say about Ketanji Brown Jackson"
docs_with_score = await vectorstore.asimilarity_search_with_score(query)

In [None]:
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
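Assuming the score printed above is the raw Euclidean distance, a lower score means a closer match. The following sketch shows what an exact (brute-force) nearest neighbor search does conceptually, using a few in-memory toy vectors invented for illustration. The helper names are hypothetical and unrelated to the vectorstore's internals.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_nearest(query_vec, items, k):
    """Return the k (text, distance) pairs closest to query_vec, nearest first."""
    scored = [(text, euclidean(query_vec, vec)) for text, vec in items]
    scored.sort(key=lambda pair: pair[1])  # ascending: smallest distance first
    return scored[:k]


# Toy 2-D "embeddings", invented for illustration only.
items = [
    ("doc a", [0.0, 0.0]),
    ("doc b", [1.0, 1.0]),
    ("doc c", [3.0, 4.0]),
]

for text, distance in exact_nearest([0.0, 0.0], items, k=2):
    print(text, distance)
```

An exact search compares the query against every stored vector; approximate indexes trade some recall for speed on large collections.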

## Maximal Marginal Relevance Search (MMR)

Maximal marginal relevance optimizes for both similarity to the query and diversity among the selected documents.
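The trade-off can be sketched as a greedy loop: at each step, pick the candidate with the best balance of relevance to the query and dissimilarity to what has already been picked. This is a minimal illustration of the general MMR idea, not the vectorstore's actual implementation; the `mmr` function, its `lambda_mult` parameter, and the toy vectors are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr(query_vec, candidates, k, lambda_mult=0.5):
    """Greedy MMR: return the indices of k candidates, in selection order.

    lambda_mult=1.0 ranks purely by relevance; lower values favor diversity.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            relevance = cosine_similarity(query_vec, candidates[i])
            # Similarity to the closest already-selected candidate (0 if none yet).
            redundancy = max(
                (cosine_similarity(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


def unit(angle_deg):
    """Unit vector at the given angle, used to build toy data."""
    r = math.radians(angle_deg)
    return [math.cos(r), math.sin(r)]


query_vec = unit(0)
candidates = [unit(10), unit(12), unit(-40)]  # index 1 nearly duplicates index 0

print(mmr(query_vec, candidates, k=2, lambda_mult=0.5))  # → [0, 2]
print(mmr(query_vec, candidates, k=2, lambda_mult=1.0))  # → [0, 1]
```

With `lambda_mult=0.5`, the near-duplicate at index 1 is skipped in favor of the less similar but more distinct candidate at index 2.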

In [None]:
docs_with_score = await vectorstore.amax_marginal_relevance_search_with_score(query)

In [None]:
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)

## Working with a vectorstore

Above, we created a vectorstore from scratch. Often, however, we want to work with an existing vectorstore.
To do that, we can initialize it directly.

In [None]:
vectorstore = PGVectorAsync(
    collection_name=COLLECTION_NAME,
    embeddings=embeddings,
    db_url=DATABASE_URL,
)

### Add documents

We can add documents to the existing vectorstore.

In [None]:
await vectorstore.aadd_documents(documents=[Document(page_content="foo")])

In [None]:
docs_with_score = await vectorstore.asimilarity_search_with_score("foo", k=2)

docs_with_score

### Overriding a vectorstore

If you have an existing collection, you can override it by calling `afrom_documents` with `pre_delete_collection=True`.

In [None]:
docs = [Document(page_content="foo"), Document(page_content="bar")]
vectorstore = await PGVectorAsync.afrom_documents(
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
    db_url=DATABASE_URL,
    documents=docs,
    pre_delete_collection=True,
)

In [None]:
docs_with_score = await vectorstore.asimilarity_search_with_score("foo", k=2)

In [None]:
docs_with_score

## Using a VectorStore as a Retriever

In [None]:
retriever = vectorstore.as_retriever()

In [None]:
await retriever.aget_relevant_documents(query="foo")
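One practical benefit of async calls is that several queries can be in flight at once. The sketch below shows the general pattern with `asyncio.gather`. The `fake_retrieve` coroutine is a hypothetical stand-in that sleeps instead of querying a database, so the example runs without Postgres.

```python
import asyncio


async def fake_retrieve(query_text):
    """Hypothetical stand-in for an async retriever call; sleeps instead of querying."""
    await asyncio.sleep(0.1)  # simulated I/O wait
    return f"results for {query_text!r}"


async def retrieve_all(queries):
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_retrieve(q) for q in queries))


# In a script. Inside a notebook cell, use `await retrieve_all([...])` instead,
# because the notebook already runs an event loop.
results = asyncio.run(retrieve_all(["foo", "bar"]))
print(results)
```

Because the waits overlap, the two simulated queries take roughly as long as one, rather than the sum of both.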