[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain-pinecone/blob/main/examples/async-vectorstore.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/langchain-ai/langchain-pinecone/blob/main/examples/async-vectorstore.ipynb)

# Pinecone VectorStore Async Demo

This notebook demonstrates three ways to manage the async Pinecone client that backs `PineconeVectorStore`:

1. Wrapping operations in `async with`, so the store opens and closes the session for you (no manual close).
2. Making sequential async calls without a context manager and relying on the store to rebuild a closed session.
3. Closing the session yourself with `await store.aclose()` when you want deterministic cleanup.

Each section below walks through one of these approaches so you can run the cells and observe that the store no longer throws `RuntimeError: Session is closed` after back-to-back async calls.


## Prerequisites
- Install the project dependencies (see the repository README for a local setup, or run the next cell in Colab).
- Create or reuse a Pinecone serverless index that matches the dimensionality of your embeddings.
- Export the credentials before running: `export PINECONE_API_KEY=...` and optionally `export PINECONE_INDEX_NAME=...`.
- Provide an embedding model; this example uses `langchain-openai` but you can swap in any `Embeddings` implementation.

In [1]:
!pip install -qU \
    "langchain-pinecone==0.2.13rc1" \
    "langchain-openai==1.0.1"


## Initialize Embeddings

We begin by initializing the OpenAI embedding model, which requires `OPENAI_API_KEY` to be set.


In [2]:
import os
from getpass import getpass
from langchain_openai import OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") \
    or getpass("Enter your OpenAI API key: ")

embedding = OpenAIEmbeddings()

# Demo content to upsert and query
TEXTS = [
    "Pinecone is a vector database built for production workloads.",
    "LangChain integrates with Pinecone for semantic search use cases.",
    "Async workflows let you reuse Pinecone connections efficiently.",
]
METADATAS = [{"source": "demo", "idx": idx} for idx, _ in enumerate(TEXTS)]


Enter your OpenAI API key: ··········


Note: the snippets in `TEXTS` are stand-ins for your own documents. Feel free to replace them with any list of strings and matching metadata before running the demos.


## Initialize the Index

A vector index is a data structure that stores vector embeddings and allows for efficient similarity search. Before you can store or query embeddings, you need to create an index with the appropriate configuration (such as dimension and metric).

The following cells connect to Pinecone using your API key (you can get a [free API key here](https://app.pinecone.io)), check whether an index with the specified name exists, and create it if necessary. The index must exist before you can store or query vectors.
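Because the index dimension has to match the size of the vectors your embedding model produces, a small guard can catch a mismatch before any upsert. The sketch below is illustrative only: `check_dimension` is a hypothetical helper (not part of Pinecone or LangChain), and the probe vector is a placeholder list rather than a real embedding.

```python
from typing import Sequence


def check_dimension(vector: Sequence[float], index_dimension: int) -> int:
    """Return the vector length, or raise if it differs from the index dimension.

    Hypothetical helper for illustration; not a library function.
    """
    actual = len(vector)
    if actual != index_dimension:
        raise ValueError(
            f"Embedding has {actual} dimensions but the index expects {index_dimension}."
        )
    return actual


# Placeholder vector standing in for a real embedding of size 1536.
probe_vector = [0.0] * 1536
checked = check_dimension(probe_vector, index_dimension=1536)

# A mismatching vector is rejected with a descriptive error.
try:
    check_dimension([0.0] * 768, index_dimension=1536)
except ValueError as err:
    mismatch_message = str(err)
```

In this notebook, the probe vector could instead come from `embedding.embed_query("probe")`, which calls the OpenAI API, so it is left out of the sketch.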

In [3]:
import os
from getpass import getpass
from pinecone import ServerlessSpec, Pinecone

os.environ["PINECONE_API_KEY"] = os.getenv("PINECONE_API_KEY") \
    or getpass("Enter your Pinecone API key here: ")

index_name = "langchain-async-vectorstore"

# Initialize Pinecone client
pc = Pinecone()

# Define serverless deployment specification (cloud provider and region)
spec = ServerlessSpec(
    cloud="aws",
    region="us-west-2",  # You can change region as needed
)

Enter your Pinecone API key here: ··········


In [4]:
# List all existing indexes in your Pinecone project
existing_indexes = [index_info["name"] for index_info in pc.list_indexes()]

# Check if the index already exists; if not, create it
if index_name not in existing_indexes:
    # Create a new index with specified dimension and metric
    pc.create_index(
        index_name,
        dimension=1536,  # Must match embedding model size
        metric="dotproduct",  # Similarity metric
        spec=spec,
    )

# Connect to the index
index = pc.Index(index_name)
# View index statistics to confirm connection
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'metric': 'dotproduct',
 'namespaces': {},
 'total_vector_count': 0,
 'vector_type': 'dense'}

## Scenario 1 – Use `async with` (no explicit close)
The context manager keeps a single client session open for the duration of the block and then closes it automatically when the block exits.
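To make the lifecycle concrete, here is a minimal sketch of how an async context manager opens a resource on entry and closes it on exit. `ToySession` and `ToyStore` are hypothetical stand-ins written for this illustration; they are not the actual `PineconeVectorStore` internals, which may differ.

```python
import asyncio


class ToySession:
    """Hypothetical stand-in for an HTTP session (not a real library class)."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class ToyStore:
    """Hypothetical store that holds one session for the duration of `async with`."""

    def __init__(self) -> None:
        self.session = None
        self.events = []

    async def __aenter__(self) -> "ToyStore":
        # Called when the `async with` block is entered.
        self.session = ToySession()
        self.events.append("open")
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Called when the block exits, even if an exception was raised inside it.
        await self.session.close()
        self.events.append("close")
        return False  # do not swallow exceptions

    async def query(self, text: str) -> str:
        # Every call inside the block shares the same session object.
        self.events.append(f"query:{text}")
        return text.upper()


async def main() -> list:
    store = ToyStore()
    async with store:
        await store.query("a")
        await store.query("b")
    return store.events


# In a notebook, use `events = await main()` instead of asyncio.run().
events = asyncio.run(main())
```

The recorded events show a single open, two queries on the shared session, and a single close.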


In [5]:
from langchain_pinecone import PineconeVectorStore

async def async_with_demo() -> None:
    """Demonstrate the async context manager to reuse a single Pinecone session."""
    vectorstore = PineconeVectorStore(
        index_name=index_name,
        embedding=embedding,
        namespace="async-demo",
    )

    print("Entering 'async with' block — a single aiohttp session backs all operations.")
    async with vectorstore:
        ids = await vectorstore.aadd_texts(TEXTS, metadatas=METADATAS)
        print(f"Upserted {len(ids)} vectors")

        print("First similarity search inside the context.")
        results = await vectorstore.asimilarity_search("pinecone async", k=2)
        for doc in results:
            print(f"  {doc.id}: {doc.page_content} -> {doc.metadata}")

        print("Second similarity search reuses the same session (no reconnect).")
        extra = await vectorstore.asimilarity_search("LangChain", k=1)
        for doc in extra:
            print(f"  {doc.id}: {doc.page_content} -> {doc.metadata}")

        await vectorstore.adelete(ids=ids)
        print("Cleaned up vectors while still inside the context.")

    print("Context exited — session closed automatically.")

In [6]:
# Run the async-context-manager demo
await async_with_demo()

Entering 'async with' block — a single aiohttp session backs all operations.
Upserted 3 vectors
First similarity search inside the context.
  0aa160bd-134a-4d4a-a0fd-8f0066e48caf: Pinecone is a vector database built for production workloads. -> {'idx': 0.0, 'source': 'demo'}
  4a84582b-17e2-4bf6-9395-43f15822306f: Async workflows let you reuse Pinecone connections efficiently. -> {'idx': 2.0, 'source': 'demo'}
Second similarity search reuses the same session (no reconnect).
  dedd230d-93a8-4199-9ae8-12a4a15e6013: LangChain integrates with Pinecone for semantic search use cases. -> {'idx': 1.0, 'source': 'demo'}
Cleaned up vectors while still inside the context.
Context exited — session closed automatically.


## Scenario 2 – Sequential calls without a context manager
Even without `async with`, the vector store now rebuilds the async client whenever a previous session has been closed. The loop below runs multiple add/search/delete cycles back-to-back without calling `aclose()` manually.
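The general pattern behind this behaviour is "check the session before each use and rebuild it if it is closed". The sketch below shows that pattern with hypothetical `ToySession` and `LazyStore` classes. It is not the library's implementation, and closing the session after every call is an assumption made here only to force the rebuild path on each iteration.

```python
import asyncio


class ToySession:
    """Hypothetical stand-in for an HTTP session (not a real library class)."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


class LazyStore:
    """Hypothetical store that rebuilds its session whenever it finds it closed."""

    def __init__(self) -> None:
        self._session = None
        self.sessions_created = 0

    def _ensure_session(self) -> ToySession:
        # Build a session on first use, or replace one that has been closed.
        if self._session is None or self._session.closed:
            self._session = ToySession()
            self.sessions_created += 1
        return self._session

    async def call(self, payload: str) -> str:
        session = self._ensure_session()
        try:
            return f"ok:{payload}"
        finally:
            # Assumption for illustration: close after every call to force a rebuild.
            await session.close()


async def main() -> tuple:
    store = LazyStore()
    results = []
    for run in range(2):
        results.append(await store.call(f"run-{run}"))
    return results, store.sessions_created


# In a notebook, use `await main()` instead of asyncio.run().
results, sessions_created = asyncio.run(main())
```

Both calls succeed even though the session from the first call was already closed, because the second call builds a new one before doing any work.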


In [7]:
async def automatic_refresh_demo() -> None:
    """Run back-to-back add/search/delete cycles without a context manager."""
    vectorstore = PineconeVectorStore(
        index_name=index_name,
        embedding=embedding,
        namespace="auto-refresh-demo",
    )

    for run in range(2):
        print(f"Run {run + 1}: adding texts without an outer context manager.")
        payload = [f"Pinecone auto refresh demo {run}"]
        metadata = [{"source": "auto", "run": run}]
        ids = await vectorstore.aadd_texts(payload, metadatas=metadata)
        print(f"  Upserted ids: {ids}")

        results = await vectorstore.asimilarity_search("pinecone", k=1)
        for doc in results:
            print(f"  Search hit: {doc.page_content} -> {doc.metadata}")

        await vectorstore.adelete(ids=ids)
        print("  Deleted vectors; the next loop iteration will reopen a fresh session automatically.")

    print("Finished sequential runs without ever calling aclose().")

In [8]:
# Run the sequential demo without an explicit context manager
await automatic_refresh_demo()

Run 1: adding texts without an outer context manager.
  Upserted ids: ['ff18e03f-1e9a-4d70-a08c-1417ced2e78b']
  Search hit: Pinecone auto refresh demo 0 -> {'run': 0.0, 'source': 'auto'}
  Deleted vectors; the next loop iteration will reopen a fresh session automatically.
Run 2: adding texts without an outer context manager.
  Upserted ids: ['414385eb-97cb-416b-b46d-cb65c890a4ac']
  Search hit: Pinecone auto refresh demo 1 -> {'run': 1.0, 'source': 'auto'}
  Deleted vectors; the next loop iteration will reopen a fresh session automatically.
Finished sequential runs without ever calling aclose().


## Scenario 3 – Explicitly close the async session
Call `await store.aclose()` when you want deterministic cleanup after a set of operations (for example before handing the store to another task). The store can still be reused afterwards because it will lazily build a new session on the next async call.


In [9]:
async def manual_close_demo() -> None:
    """Show explicit lifecycle management with aclose()."""
    vectorstore = PineconeVectorStore(
        index_name=index_name,
        embedding=embedding,
        namespace="manual-demo",
    )
    ids = await vectorstore.aadd_texts(TEXTS[:1], metadatas=METADATAS[:1])
    print(f"Added initial ids: {ids}")
    try:
        results = await vectorstore.asimilarity_search("pinecone", k=1)
        for doc in results:
            print(f"  Search hit: {doc.page_content} -> {doc.metadata}")
    finally:
        await vectorstore.adelete(ids=ids)
        print("Deleted initial vectors; calling aclose() to release the session.")
        await vectorstore.aclose()

    print("Session closed. The next call recreates the client lazily.")
    follow_up_metadata = [{"source": "manual", "stage": "follow-up"}]
    follow_up_ids = await vectorstore.aadd_texts(TEXTS[1:2], metadatas=follow_up_metadata)
    print(f"Added follow-up ids: {follow_up_ids}")
    try:
        follow_up_results = await vectorstore.asimilarity_search("langchain", k=1)
        for doc in follow_up_results:
            print(f"  Follow-up hit: {doc.page_content} -> {doc.metadata}")
    finally:
        await vectorstore.adelete(ids=follow_up_ids)
        await vectorstore.aclose()
        print("Explicit close called again to tidy up.")

In [10]:
# Run the explicit close demo
await manual_close_demo()

Added initial ids: ['5ad0dc35-63c9-410d-a8b5-9fea283d0c92']
  Search hit: Pinecone is a vector database built for production workloads. -> {'idx': 0.0, 'source': 'demo'}
Deleted initial vectors; calling aclose() to release the session.
Session closed. The next call recreates the client lazily.
Added follow-up ids: ['24a3d951-6e7a-4367-aa88-70847bd878c6']
  Follow-up hit: LangChain integrates with Pinecone for semantic search use cases. -> {'source': 'manual', 'stage': 'follow-up'}
Explicit close called again to tidy up.


## Notes
- Replace the demo texts and metadata with your own dataset to mirror production behaviour.
- `async with PineconeVectorStore(...)` keeps one HTTP session open across the block and closes it automatically on exit.
- Without an `async with` block, each call now reinitialises the session whenever the previous one has been closed, so you can run back-to-back async operations safely.
- `await store.aclose()` is still useful when you want deterministic cleanup between batches or before handing the store to another component.
- If you run these cells multiple times, consider changing the namespace or cleaning up vectors to avoid duplicate data.
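
One way to follow the last note is to give each run its own namespace. The helper below is a hypothetical sketch (`unique_namespace` is not part of any library); it appends a random suffix to a prefix of your choice.

```python
import uuid


def unique_namespace(prefix: str) -> str:
    """Return the prefix plus a random 12-character hex suffix.

    Hypothetical helper for illustration; not a library function.
    """
    return f"{prefix}-{uuid.uuid4().hex[:12]}"


# Each call yields a different namespace, so repeated runs do not collide.
first_namespace = unique_namespace("async-demo")
second_namespace = unique_namespace("async-demo")
```

The result can be passed as the `namespace` argument when constructing the store, as in the demos above. Remember that vectors left in old namespaces still count toward index usage until you delete them.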
