# CloudflareVectorizeVectorStore

This notebook covers how to get started with the CloudflareVectorize vector store.

## Setup

This Python package is a wrapper around Cloudflare's REST API.  To interact with the API, you need to provide an API token with the appropriate privileges.

You can create and manage API tokens here:

https://dash.cloudflare.com/YOUR-ACCT-NUMBER/api-tokens

### Credentials

CloudflareVectorize depends on WorkersAI (if you want to use it for Embeddings), and D1 (if you are using it to store and retrieve raw values).

While you can create a single `api_token` with Edit privileges to all needed resources (WorkersAI, Vectorize & D1), you may want to follow the principle of "least privilege access" and create separate API tokens for each service.

**Note:** These service-specific tokens (if provided) take precedence over a global token.  You can provide them instead of a global token.
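
The precedence rule in the note can be pictured with a small sketch. The helper `resolve_token` is hypothetical (it is not part of `langchain_cloudflare`) and the token strings are placeholders; it only illustrates that a service-specific token, when present, wins over the global one.

```python
from typing import Optional


def resolve_token(
    service_token: Optional[str], global_token: Optional[str]
) -> Optional[str]:
    """Hypothetical helper: prefer the service-specific token, else the global one."""
    return service_token or global_token


# Placeholder values, not real credentials.
global_token = "global-token"
effective = {
    "vectorize": resolve_token("vectorize-token", global_token),  # service token wins
    "d1": resolve_token(None, global_token),  # falls back to the global token
    "workers_ai": resolve_token(None, None),  # nothing available
}
print(effective)
```

A service that ends up with no token at all is the situation the error cases later in this notebook describe.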


In [1]:
from dotenv import load_dotenv
import os

load_dotenv(".env")

cf_acct_id = os.getenv("cf_acct_id")

# single token with WorkersAI, Vectorize & D1
api_token = os.getenv("cf_ai_token")

# OR, separate tokens with access to each service
cf_vectorize_token = os.getenv("cf_vectorize_token")
cf_d1_token = os.getenv("cf_d1_token")

## Initialization

In [2]:
import asyncio
import json
import uuid
import warnings

from langchain_community.document_loaders import WikipediaLoader
from langchain_cloudflare.embeddings import (
    CloudflareWorkersAIEmbeddings,
)
from langchain_cloudflare.vectorstores import (
    CloudflareVectorize,
)
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

warnings.filterwarnings("ignore")

In [54]:
# name your vectorize index
vectorize_index_name = f"test-langchain-{uuid.uuid4().hex}"

### Embeddings

To store embeddings and run semantic search and retrieval, you must first embed your raw values.  Specify an embedding model that is available on WorkersAI:

[https://developers.cloudflare.com/workers-ai/models/](https://developers.cloudflare.com/workers-ai/models/)

In [4]:
MODEL_WORKERSAI = "@cf/baai/bge-large-en-v1.5"

In [5]:
cf_ai_token = os.getenv(
    "cf_ai_token"
)  # needed if you want to use workersAI for embeddings

embedder = CloudflareWorkersAIEmbeddings(
    account_id=cf_acct_id, api_token=cf_ai_token, model_name=MODEL_WORKERSAI
)

### Raw Values with D1

Vectorize only stores embeddings, metadata and namespaces. If you want to store and retrieve raw values, you must leverage Cloudflare's SQL Database D1.

You can create a database here and retrieve its id:

https://dash.cloudflare.com/YOUR-ACCT-NUMBER/workers/d1

In [6]:
# provide the id of your D1 Database
d1_database_id = os.getenv("d1_database_id")

### CloudflareVectorize Class

Now we can create the CloudflareVectorize instance.  Here we pass:

* The `embedding` instance from earlier
* The account ID
* Service-specific API tokens for Vectorize and D1 (a single global `api_token` for all services can be passed instead)
* The D1 database ID

In [7]:
cfVect = CloudflareVectorize(
    embedding=embedder,
    account_id=cf_acct_id,
    d1_api_token=cf_d1_token,  # (Optional if using global token)
    vectorize_api_token=cf_vectorize_token,  # (Optional if using global token)
    d1_database_id=d1_database_id,  # (Optional if not using D1)
)

### Cleanup
Before we get started, let's delete any `test-langchain*` indexes left over from earlier runs of this walkthrough.

In [55]:
# depending on your notebook environment you might need to include:
# import nest_asyncio
# nest_asyncio.apply()

arr_indexes = cfVect.list_indexes()
arr_indexes = [x for x in arr_indexes if "test-langchain" in x.get("name")]
arr_async_requests = [
    cfVect.adelete_index(index_name=x.get("name")) for x in arr_indexes
]
await asyncio.gather(*arr_async_requests);

### Gotchas

A few "gotchas" are shown below for various missing token/parameter combinations.

D1 Database ID provided but no "global" `api_token` and no `d1_api_token`

In [56]:
try:
    cfVect = CloudflareVectorize(
        embedding=embedder,
        account_id=cf_acct_id,
        # api_token=api_token, # (Optional if using service-specific token)
        ai_api_token=cf_ai_token,  # (Optional if using global token)
        # d1_api_token=cf_d1_token,  # (Optional if using global token)
        vectorize_api_token=cf_vectorize_token,  # (Optional if using global token)
        d1_database_id=d1_database_id,  # (Optional if not using D1)
    )
except Exception as e:
    print(str(e))

`d1_database_id` provided, but no global `api_token` provided and no `d1_api_token` provided.


No "global" `api_token` provided and either missing `ai_api_token` or `vectorize_api_token`

In [57]:
try:
    cfVect = CloudflareVectorize(
        embedding=embedder,
        account_id=cf_acct_id,
        # api_token=api_token, # (Optional if using service-specific token)
        # ai_api_token=cf_ai_token,  # (Optional if using global token)
        d1_api_token=cf_d1_token,  # (Optional if using global token)
        vectorize_api_token=cf_vectorize_token,  # (Optional if using global token)
        d1_database_id=d1_database_id,  # (Optional if not using D1)
    )
except Exception as e:
    print(str(e))
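
The two cases above can be summarized as a validation sketch. This is not the library's code: `check_tokens` is a hypothetical function whose messages paraphrase the cases as they are described in this section, and the real constructor may check more, fewer, or different conditions.

```python
from typing import List, Optional


def check_tokens(
    api_token: Optional[str] = None,
    ai_api_token: Optional[str] = None,
    vectorize_api_token: Optional[str] = None,
    d1_api_token: Optional[str] = None,
    d1_database_id: Optional[str] = None,
) -> List[str]:
    """Hypothetical check: list the problems with a token combination."""
    problems: List[str] = []
    if api_token:
        return problems  # a global token covers every service
    if d1_database_id and not d1_api_token:
        problems.append("d1_database_id provided, but no api_token or d1_api_token")
    if not ai_api_token or not vectorize_api_token:
        problems.append("no api_token, and ai_api_token or vectorize_api_token is missing")
    return problems


# Placeholder strings stand in for real tokens and IDs.
first_case = check_tokens(ai_api_token="ai", vectorize_api_token="vec", d1_database_id="db")
second_case = check_tokens(d1_api_token="d1", vectorize_api_token="vec", d1_database_id="db")
global_only = check_tokens(api_token="global", d1_database_id="db")
print(first_case, second_case, global_only, sep="\n")
```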

## Manage Vector Store

### Creating an Index

Let's start off this example by creating an index (and first deleting it if it exists).  If the index doesn't exist, we will get an error from Cloudflare telling us so.

In [58]:
%%capture

try:
    cfVect.delete_index(index_name=vectorize_index_name, wait=True)
except Exception as e:
    print(e)

In [59]:
r = cfVect.create_index(
    index_name=vectorize_index_name,
    description="A Test Vectorize Index",
    wait=True
)
print(r)

{'created_on': '2025-05-01T20:34:41.33658Z', 'modified_on': '2025-05-01T20:34:41.33658Z', 'name': 'test-langchain-cf108f7ca1b745458aa2fe717be9cc86', 'description': 'A Test Vectorize Index', 'config': {'dimensions': 1024, 'metric': 'cosine'}}


### Listing Indexes

Now we can list the indexes on our account.

In [60]:
indexes = cfVect.list_indexes()
indexes = [x for x in indexes if "test-langchain" in x.get("name")]
print(indexes)

[{'created_on': '2025-05-01T20:34:41.33658Z', 'modified_on': '2025-05-01T20:34:41.33658Z', 'name': 'test-langchain-cf108f7ca1b745458aa2fe717be9cc86', 'description': 'A Test Vectorize Index', 'config': {'dimensions': 1024, 'metric': 'cosine'}}]


### Get Index Info
We can also retrieve more granular information about a specific index.

This call returns a `processedUpToMutation` which can be used to track the status of operations such as creating indexes, adding or deleting records.

In [61]:
r = cfVect.get_index_info(index_name=vectorize_index_name)
print(r)

{'dimensions': 1024, 'vectorCount': 0}
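
One way to use `processedUpToMutation` is to poll until it equals the `mutationId` returned by a write. The sketch below assumes that reading of the field; `wait_for_mutation` is a hypothetical helper, and `fake_get_index_info` is a stand-in for a call such as `cfVect.get_index_info` so the example runs without network access.

```python
import time
from typing import Callable, Dict


def wait_for_mutation(
    get_info: Callable[[], Dict],
    mutation_id: str,
    poll_interval: float = 0.01,
    timeout: float = 1.0,
) -> bool:
    """Hypothetical helper: poll until the index reports the mutation as processed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_info().get("processedUpToMutation") == mutation_id:
            return True
        time.sleep(poll_interval)
    return False  # gave up before the mutation showed up


# Stand-in for an index-info call: reports the mutation on the third poll.
responses = iter(
    [
        {"processedUpToMutation": "older-id"},
        {"processedUpToMutation": "older-id"},
        {"processedUpToMutation": "34dafd37"},
    ]
)
last = {"processedUpToMutation": "34dafd37"}


def fake_get_index_info() -> Dict:
    return next(responses, last)


done = wait_for_mutation(fake_get_index_info, "34dafd37")
print(done)
```

The `wait=True` arguments used in this notebook are described as doing this kind of waiting for you, so a manual loop like this is only relevant when you pass `wait=False`.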


### Adding Metadata Indexes

It is common to assist retrieval by supplying metadata filters in queries.  In Vectorize, this is accomplished by first creating a "metadata index" on your Vectorize Index.  We will do so for our example by creating one on the `section` field in our documents.

**Reference:** [https://developers.cloudflare.com/vectorize/reference/metadata-filtering/](https://developers.cloudflare.com/vectorize/reference/metadata-filtering/)


In [62]:
r = cfVect.create_metadata_index(
    property_name="section",
    index_type="string",
    index_name=vectorize_index_name,
    wait=True,
)
print(r)

{'mutationId': '34dafd37-dd96-4826-b940-c33645a8a1e5'}


### Listing Metadata Indexes

In [63]:
r = cfVect.list_metadata_indexes(index_name=vectorize_index_name)
print(r)

[{'propertyName': 'section', 'indexType': 'String'}]


### Adding Documents
For this example, we will use LangChain's Wikipedia loader to pull an article about Cloudflare.  We will store this in Vectorize and query its contents later.

In [18]:
docs = WikipediaLoader(query="Cloudflare", load_max_docs=2).load()

We will then create some simple chunks with metadata based on the chunk sections.

In [19]:
text_splitter = RecursiveCharacterTextSplitter(
    # Set a really small chunk size, just to show.
    chunk_size=100,
    chunk_overlap=20,
    length_function=len,
    is_separator_regex=False,
)
texts = text_splitter.create_documents([docs[0].page_content])

running_section = ""
for text in texts:
    if text.page_content.startswith("="):
        # Heading chunk (e.g. "== History =="): remember the section name.
        running_section = text.page_content.replace("=", "").strip()
    else:
        # Chunks before the first heading belong to the introduction.
        text.metadata = {"section": running_section or "Introduction"}

In [51]:
print(len(texts))
print(texts[0], "\n\n", texts[-1])

55
page_content='Cloudflare, Inc., is an American company that provides content delivery network services,' metadata={'section': 'Introduction'} 

 page_content='attacks, Cloudflare ended up being attacked as well; Google and other companies eventually' metadata={'section': 'DDoS mitigation'}
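
To see the section-tagging logic in isolation, here is a minimal sketch that applies the same rule to plain strings instead of `Document` objects. The function name `tag_sections` and the sample chunks are illustrative only.

```python
from typing import Dict, List, Tuple


def tag_sections(chunks: List[str]) -> List[Tuple[str, Dict[str, str]]]:
    """Attach the most recent '== Heading ==' to each chunk that follows it."""
    running_section = ""
    tagged = []
    for chunk in chunks:
        if chunk.startswith("="):
            # Heading chunk: remember it, but leave its own metadata empty.
            running_section = chunk.replace("=", "").strip()
            tagged.append((chunk, {}))
        else:
            tagged.append((chunk, {"section": running_section or "Introduction"}))
    return tagged


sample = [
    "Cloudflare, Inc., is an American company",
    "== History ==",
    "Since at least 2017, Cloudflare has been using a wall of lava lamps",
    "=== Artificial intelligence ===",
    "In 2023, Cloudflare launched Workers AI",
]
tagged = tag_sections(sample)
for text, metadata in tagged:
    print(metadata, text)
```

Note that heading chunks themselves keep empty metadata, which is why some later search results show `metadata={}`.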


Now we will add documents to our Vectorize Index.

**Note:**
Adding embeddings to Vectorize happens asynchronously, meaning there will be a small delay between adding the embeddings and being able to query them.  By default `add_documents` has a `wait=True` parameter which waits for this operation to complete before returning a response.  If you do not want the program to wait for embeddings availability, you can set this to `wait=False`.


In [21]:
r = cfVect.add_documents(index_name=vectorize_index_name, documents=texts, wait=True)

In [22]:
print(json.dumps(r)[:300])

["fe0d8edb-2f3d-4a97-bffb-c6e90aa1a478", "ea69ca78-0a81-4515-80b0-13526cd37802", "e489b28b-d33c-417d-839f-853667e4fdee", "1e63124b-5f76-40f1-b280-b9a94d247913", "2daa6de1-f1de-4c2b-8fc3-0a9ae67a1922", "543c77b8-bcd8-467c-8ac0-bc140c0ca477", "3c7ff459-1e4a-45b1-aace-06711f0a3fd4", "02d113a4-0366-4582


## Query vector store

We will do some searches on our embeddings.  We can specify our search `query` and the top number of results we want with `k`.


In [23]:
query_documents = cfVect.similarity_search(
    index_name=vectorize_index_name, query="Workers AI", k=100, return_metadata="none"
)

print(f"{len(query_documents)} results:\n{query_documents[:3]}")

55 results:
[Document(id='38f7ed21-17d3-4919-aafe-a62ab0ba3104', metadata={}, page_content="In 2023, Cloudflare launched Workers AI, a framework allowing for use of Nvidia GPU's within"), Document(id='f579a739-c460-4240-b4d0-9ba107c24f66', metadata={}, page_content='based on queries by leveraging Workers AI.Cloudflare announced plans in September 2024 to launch a'), Document(id='bf15454f-5b31-475d-aa67-ea6a9b9d5928', metadata={}, page_content='=== Artificial intelligence ===')]


### Output

If you want to return metadata, you can pass `return_metadata="all"` or `return_metadata="indexed"`.  The default is `"all"`.

If you want to return the embeddings values, you can pass `return_values=True`.  The default is `False`.
Embeddings will be returned in the `metadata` field under the special `_values` field.

**Note:** `return_metadata="none"` and `return_values=True` will return only the `_values` field in `metadata`.

**Note:**
If you return metadata or values, the results will be limited to the top 20.

[https://developers.cloudflare.com/vectorize/platform/limits/](https://developers.cloudflare.com/vectorize/platform/limits/)
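
The effect of this limit on the number of results can be sketched as a simple clamp. `effective_k` is a hypothetical helper, the cap of 20 comes from the note above, and the cap of 100 without metadata or values is an assumption; check the limits page for the current numbers.

```python
MAX_K_WITH_PAYLOAD = 20  # limit stated in the note above
MAX_K_WITHOUT_PAYLOAD = 100  # assumption; see the limits page for the current value


def effective_k(k: int, return_metadata: str = "all", return_values: bool = False) -> int:
    """Hypothetical helper: the largest number of results to expect for a requested k."""
    # Following the note's wording, any setting other than "none" counts as
    # returning metadata.
    has_payload = return_values or return_metadata != "none"
    cap = MAX_K_WITH_PAYLOAD if has_payload else MAX_K_WITHOUT_PAYLOAD
    return min(k, cap)


print(effective_k(100, return_metadata="none"))
print(effective_k(100, return_metadata="all"))
print(effective_k(5, return_metadata="all", return_values=True))
```

This is only an upper bound: the earlier `k=100` query returned 55 results because the index held 55 chunks.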

In [24]:
query_documents = cfVect.similarity_search(
    index_name=vectorize_index_name,
    query="Workers AI",
    return_values=True,
    return_metadata="all",
    k=100,
)
print(f"{len(query_documents)} results:\n{str(query_documents[0])[:500]}")

20 results:
page_content='In 2023, Cloudflare launched Workers AI, a framework allowing for use of Nvidia GPU's within' metadata={'section': 'Artificial intelligence', '_values': [0.014350891, 0.0053482056, -0.022354126, 0.002948761, 0.010406494, -0.016067505, -0.002029419, -0.023513794, 0.020141602, 0.023742676, 0.01361084, 0.003019333, 0.02748108, -0.023162842, 0.008979797, -0.029373169, -0.03643799, -0.03842163, -0.004463196, 0.021255493, 0.02192688, -0.005947113, -0.060272217, -0.055389404, -0.031188965


If you'd like the similarity scores to be returned, you can use `similarity_search_with_score`.


In [25]:
query_documents = cfVect.similarity_search_with_score(
    index_name=vectorize_index_name,
    query="Workers AI",
    k=100,
    return_metadata="all",
)
print(f"{len(query_documents)} results:\n{str(query_documents[0])[:500]}")

20 results:
(Document(id='38f7ed21-17d3-4919-aafe-a62ab0ba3104', metadata={'section': 'Artificial intelligence'}, page_content="In 2023, Cloudflare launched Workers AI, a framework allowing for use of Nvidia GPU's within"), 0.7851709)
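
The index created earlier reports `'metric': 'cosine'`, so the scores are presumably cosine similarities between the query embedding and each stored embedding. As a minimal sketch of that metric on toy two-dimensional vectors (not real embeddings):

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


query = [1.0, 0.0]
print(cosine_similarity(query, [2.0, 0.0]))  # same direction
print(cosine_similarity(query, [0.0, 3.0]))  # orthogonal
print(cosine_similarity(query, [1.0, 1.0]))  # 45 degrees apart
```

Vector length does not affect the score, only direction does, so values closer to 1 indicate a closer match.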


### Including D1 for "Raw Values"
All of the `add` and `search` methods on CloudflareVectorize support an `include_d1` parameter (default=True).

This configures whether you want to store/retrieve raw values.

If you do not want to use D1 for this, you can set `include_d1=False`.  Searches will then return documents with an empty `page_content` field.

**Note:** Your D1 table name MUST MATCH your Vectorize index name!  If you run `create_index` with `include_d1=True`, or instantiate `CloudflareVectorize` with a `d1_database_id`, this D1 table will be created along with your Vectorize index.

In [26]:
query_documents = cfVect.similarity_search_with_score(
    index_name=vectorize_index_name,
    query="california",
    k=100,
    return_metadata="all",
    include_d1=False,
)
print(f"{len(query_documents)} results:\n{str(query_documents[0])[:500]}")

20 results:
(Document(id='1e63124b-5f76-40f1-b280-b9a94d247913', metadata={'section': 'Introduction'}, page_content=''), 0.60426825)
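
Conceptually, the vector index returns IDs and scores, and the raw text is looked up by ID in a table named after the index. The sketch below models that with Python's built-in `sqlite3` standing in for D1; the table schema, the `hydrate` helper, and the sample rows are all hypothetical and are not the schema the library creates.

```python
import sqlite3
from typing import List, Tuple

index_name = "test-langchain-demo"  # hypothetical index name, reused as the table name

conn = sqlite3.connect(":memory:")
# The name contains hyphens, so it is quoted as an SQL identifier.
conn.execute(f'CREATE TABLE "{index_name}" (id TEXT PRIMARY KEY, text TEXT)')
conn.executemany(
    f'INSERT INTO "{index_name}" (id, text) VALUES (?, ?)',
    [("a", "first raw chunk"), ("b", "second raw chunk")],
)


def hydrate(
    matches: List[Tuple[str, float]], include_d1: bool = True
) -> List[Tuple[str, str, float]]:
    """Hypothetical helper: attach raw text to (id, score) matches."""
    results = []
    for doc_id, score in matches:
        text = ""
        if include_d1:
            row = conn.execute(
                f'SELECT text FROM "{index_name}" WHERE id = ?', (doc_id,)
            ).fetchone()
            text = row[0] if row else ""
        results.append((doc_id, text, score))
    return results


matches = [("b", 0.79), ("a", 0.61)]
print(hydrate(matches))
print(hydrate(matches, include_d1=False))  # text stays empty, as in the output above
```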


### Query by turning into retriever

You can also transform the vector store into a retriever for easier usage in your chains. 

In [27]:
retriever = cfVect.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1, "index_name": vectorize_index_name},
)
r = retriever.invoke("california")

### Searching with Metadata Filtering

As mentioned before, Vectorize supports filtered search on indexed metadata fields.  Here is an example where we search for `Introduction` values within the indexed `section` metadata field.

More info on searching on Metadata fields is here: [https://developers.cloudflare.com/vectorize/reference/metadata-filtering/](https://developers.cloudflare.com/vectorize/reference/metadata-filtering/)


In [28]:
query_documents = cfVect.similarity_search_with_score(
    index_name=vectorize_index_name,
    query="California",
    k=100,
    md_filter={"section": "Introduction"},
    return_metadata="all",
)
print(f"{len(query_documents)} results:\n - {str(query_documents[:3])}")

6 results:
 - [(Document(id='1e63124b-5f76-40f1-b280-b9a94d247913', metadata={'section': 'Introduction'}, page_content="and other services. Cloudflare's headquarters are in San Francisco, California. According to"), 0.60426825), (Document(id='ea69ca78-0a81-4515-80b0-13526cd37802', metadata={'section': 'Introduction'}, page_content='network services, cybersecurity, DDoS mitigation, wide area network services, reverse proxies,'), 0.52082914), (Document(id='fe0d8edb-2f3d-4a97-bffb-c6e90aa1a478', metadata={'section': 'Introduction'}, page_content='Cloudflare, Inc., is an American company that provides content delivery network services,'), 0.50490546)]


You can do more sophisticated filtering as well:

https://developers.cloudflare.com/vectorize/reference/metadata-filtering/#valid-filter-examples

In [29]:
query_documents = cfVect.similarity_search_with_score(
    index_name=vectorize_index_name,
    query="California",
    k=100,
    md_filter={"section": {"$ne": "Introduction"}},
    return_metadata="all",
)
print(f"{len(query_documents)} results:\n - {str(query_documents[:3])}")

20 results:
 - [(Document(id='57f499e6-0c6f-4cba-b434-5264f9c4e7fe', metadata={}, page_content='== Products =='), 0.56540567), (Document(id='8dde0e7c-a335-47a3-9754-b92d676e0fc8', metadata={'section': 'History'}, page_content='Since at least 2017, Cloudflare has been using a wall of lava lamps in their San Francisco'), 0.5604333), (Document(id='770d990a-fa17-4c75-ac4e-43336fa77868', metadata={'section': 'History'}, page_content='their San Francisco headquarters as a source of randomness for encryption keys, alongside double'), 0.55573463)]


In [30]:
query_documents = cfVect.similarity_search_with_score(
    index_name=vectorize_index_name,
    query="DNS",
    k=100,
    md_filter={"section": {"$in": ["Products", "History"]}},
    return_metadata="all",
)
print(f"{len(query_documents)} results:\n - {str(query_documents)}")

20 results:
 - [(Document(id='81db288d-636a-4920-be5c-0f1e36a62821', metadata={'section': 'Products'}, page_content='protocols such as DNS over HTTPS, SMTP, and HTTP/2 with support for HTTP/2 Server Push. As of 2023,'), 0.7205538), (Document(id='f2fd94a6-b280-43a1-8347-33d52b88b9e4', metadata={'section': 'Products'}, page_content='utilizing edge computing, reverse proxies for web traffic, data center interconnects, and a content'), 0.58178145), (Document(id='791561ae-d8f7-49af-8f6a-4f5545fb1e4e', metadata={'section': 'Products'}, page_content='and a content distribution network to serve content across its network of servers. It supports'), 0.5797795), (Document(id='fc2108e0-4745-46b7-8a42-98b417b3091e', metadata={'section': 'History'}, page_content='the New York Stock Exchange under the stock ticker NET. It opened for public trading on September'), 0.5678468), (Document(id='07f3de90-b329-4007-8674-6359a2b7dc23', metadata={'section': 'Products'}, page_content='Cloudflare provides networ
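
To make the filter semantics used above concrete, here is a small local evaluator for a subset of the syntax (`$eq`, `$ne`, `$in`, `$nin`). It is a sketch for intuition only: `matches_filter` is hypothetical, the filtering itself happens inside Vectorize, and treating a missing field as "not equal" is inferred from the `$ne` output above, where a chunk with `metadata={}` was returned.

```python
from typing import Any, Dict


def matches_filter(metadata: Dict[str, Any], md_filter: Dict[str, Any]) -> bool:
    """Hypothetical local evaluator for a subset of the filter syntax."""
    for field, condition in md_filter.items():
        value = metadata.get(field)
        if not isinstance(condition, dict):
            condition = {"$eq": condition}  # {"section": "x"} is shorthand for equality
        for op, operand in condition.items():
            if op == "$eq":
                ok = value == operand
            elif op == "$ne":
                ok = value != operand  # a missing field also counts as "not equal"
            elif op == "$in":
                ok = value in operand
            elif op == "$nin":
                ok = value not in operand
            else:
                raise ValueError(f"unsupported operator: {op}")
            if not ok:
                return False
    return True


sample_metadata = [
    {"section": "Introduction"},
    {"section": "History"},
    {"section": "Products"},
    {},  # heading chunk without a section
]
equal = [m for m in sample_metadata if matches_filter(m, {"section": "Introduction"})]
not_equal = [m for m in sample_metadata if matches_filter(m, {"section": {"$ne": "Introduction"}})]
one_of = [m for m in sample_metadata if matches_filter(m, {"section": {"$in": ["Products", "History"]}})]
print(equal, not_equal, one_of, sep="\n")
```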

### Search by Namespace
We can also search for vectors by `namespace`.  To do so, pass a `namespaces` list (one entry per document) when adding documents to the vector store.

https://developers.cloudflare.com/vectorize/reference/metadata-filtering/#namespace-versus-metadata-filtering

In [31]:
namespace_name = f"test-namespace-{uuid.uuid4().hex[:8]}"

new_documents = [
    Document(
        page_content="This is a new namespace specific document!",
        metadata={"section": "Namespace Test1"},
    ),
    Document(
        page_content="This is another namespace specific document!",
        metadata={"section": "Namespace Test2"},
    ),
]

r = cfVect.add_documents(
    index_name=vectorize_index_name,
    documents=new_documents,
    namespaces=[namespace_name] * len(new_documents),
    wait=True,
)

In [32]:
query_documents = cfVect.similarity_search(
    index_name=vectorize_index_name,
    query="California",
    namespace=namespace_name,
)

print(f"{len(query_documents)} results:\n - {str(query_documents)}")

2 results:
 - [Document(id='e6e59c03-666d-42b9-9fd7-604992c04acf', metadata={'section': 'Namespace Test2', '_namespace': 'test-namespace-d6172d71'}, page_content='This is another namespace specific document!'), Document(id='c3c78470-0a7b-4eaf-965c-8e71ce71fa5a', metadata={'section': 'Namespace Test1', '_namespace': 'test-namespace-d6172d71'}, page_content='This is a new namespace specific document!')]


### Search by IDs
We can also retrieve specific records by ID.  To do so, we first need to set the `index_name` attribute on the CloudflareVectorize instance to the Vectorize index name.

This will return both `_namespace` and `_values` as well as other `metadata`.


In [33]:
sample_ids = [x.id for x in query_documents]

In [34]:
cfVect.index_name = vectorize_index_name

In [35]:
query_documents = cfVect.get_by_ids(
    sample_ids,
)
print(str(query_documents[:3])[:500])

[Document(id='e6e59c03-666d-42b9-9fd7-604992c04acf', metadata={'section': 'Namespace Test2', '_namespace': 'test-namespace-d6172d71', '_values': [-0.0005841255, 0.014480591, 0.040771484, 0.005218506, 0.015579224, 0.0007543564, -0.005138397, -0.022720337, 0.021835327, 0.038970947, 0.017456055, 0.022705078, 0.013450623, -0.015686035, -0.019119263, -0.01512146, -0.017471313, -0.007183075, -0.054382324, -0.01914978, 0.0005302429, 0.018600464, -0.083740234, -0.006462097, 0.0005598068, 0.024230957, -0


The namespace will be included in the `_namespace` field in `metadata` along with your other metadata (if you requested it in `return_metadata`).

**Note:** You cannot set the `_namespace` or `_values` fields in `metadata` as they are reserved.  They will be stripped out during the insert process.

### Upserts

Vectorize supports Upserts which you can perform by setting `upsert=True`.



In [36]:
query_documents[0].page_content = "Updated: " + query_documents[0].page_content
print(query_documents[0].page_content)

Updated: This is another namespace specific document!


In [37]:
new_document_id = "12345678910"
new_document = Document(
    id=new_document_id,
    page_content="This is a new document!",
    metadata={"section": "Introduction"},
)

In [38]:
r = cfVect.add_documents(
    index_name=vectorize_index_name,
    documents=[new_document, query_documents[0]],
    upsert=True,
    wait=True,
)

In [39]:
query_documents_updated = cfVect.get_by_ids([new_document_id, query_documents[0].id])

In [40]:
print(str(query_documents_updated[0])[:500])
print(query_documents_updated[0].page_content)
print(query_documents_updated[1].page_content)

page_content='This is a new document!' metadata={'section': 'Introduction', '_namespace': None, '_values': [-0.007522583, 0.0023021698, 0.009963989, 0.031051636, -0.021316528, 0.0048103333, 0.026046753, 0.01348114, 0.026306152, 0.040374756, 0.03225708, 0.007423401, 0.031021118, -0.007347107, -0.034179688, 0.002111435, -0.027191162, -0.020950317, -0.021636963, -0.0030593872, -0.04977417, 0.018859863, -0.08062744, -0.027679443, 0.012512207, 0.0053634644, 0.008079529, -0.010528564, 0.07312012, 0.02
This is a new document!
Updated: This is another namespace specific document!
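
The difference between a plain insert and an upsert can be modeled with a dictionary keyed by record ID. This sketch assumes a plain insert leaves an existing ID untouched, while an upsert overwrites it; `write_records` is a hypothetical in-memory model, not the library's implementation.

```python
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (id, page_content)


def write_records(
    store: Dict[str, str], records: List[Record], upsert: bool = False
) -> List[str]:
    """Hypothetical in-memory model: return the IDs that were actually written."""
    written = []
    for record_id, content in records:
        if record_id in store and not upsert:
            continue  # plain insert: keep the existing record
        store[record_id] = content
        written.append(record_id)
    return written


store = {"doc-1": "original"}
insert_written = write_records(store, [("doc-1", "changed"), ("doc-2", "new")])
after_insert = dict(store)  # snapshot before the upsert
upsert_written = write_records(store, [("doc-1", "changed")], upsert=True)
print(insert_written, after_insert, upsert_written, store, sep="\n")
```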


### Deleting Records
We can delete records by their IDs as well.


In [41]:
r = cfVect.delete(index_name=vectorize_index_name, ids=sample_ids, wait=True)
print(r)

True


And to confirm deletion:

In [42]:
query_documents = cfVect.get_by_ids(sample_ids)
assert len(query_documents) == 0

### Creating from Documents
LangChain stipulates that all vectorstores must have a `from_documents` method to instantiate a new Vectorstore from documents.  This is a more streamlined method than the individual `create, add` steps shown above.

You can do that as shown here:

In [43]:
vectorize_index_name = "test-langchain-from-docs"

In [44]:
cfVect = CloudflareVectorize.from_documents(
    account_id=cf_acct_id,
    index_name=vectorize_index_name,
    documents=texts,
    embedding=embedder,
    d1_database_id=d1_database_id,
    d1_api_token=cf_d1_token,
    vectorize_api_token=cf_vectorize_token,
    wait=True,
)

In [45]:
# query for documents
query_documents = cfVect.similarity_search(
    index_name=vectorize_index_name,
    query="Edge Computing",
)

print(f"{len(query_documents)} results:\n{str(query_documents[0])[:300]}")

20 results:
page_content='utilizing edge computing, reverse proxies for web traffic, data center interconnects, and a content' metadata={'section': 'Products'}


## Async Examples
This section shows some async examples.


### Creating Indexes

In [52]:
vectorize_index_name1 = f"test-langchain-{uuid.uuid4().hex}"
vectorize_index_name2 = f"test-langchain-{uuid.uuid4().hex}"
vectorize_index_name3 = f"test-langchain-{uuid.uuid4().hex}"

In [53]:
# depending on your notebook environment you might need to include these:
# import nest_asyncio
# nest_asyncio.apply()

async_requests = [
    cfVect.acreate_index(index_name=vectorize_index_name1),
    cfVect.acreate_index(index_name=vectorize_index_name2),
    cfVect.acreate_index(index_name=vectorize_index_name3),
]

res = await asyncio.gather(*async_requests);
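
The cells in this section rely on the notebook's support for top-level `await`. In a plain Python script the same pattern needs an entry point such as `asyncio.run`. The sketch below shows the shape of that; `fake_create_index` is a stand-in for a call such as `cfVect.acreate_index` so the example runs without network access, and the index names are placeholders.

```python
import asyncio
from typing import List


async def fake_create_index(index_name: str) -> dict:
    """Stand-in for an async index-creation call; performs no network access."""
    await asyncio.sleep(0)
    return {"name": index_name}


async def create_all(index_names: List[str]) -> List[dict]:
    # gather runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_create_index(name) for name in index_names))


names = ["test-langchain-a", "test-langchain-b", "test-langchain-c"]
results = asyncio.run(create_all(names))
print(results)
```

Inside a notebook, keep using `await asyncio.gather(...)` directly, since `asyncio.run` cannot be called while an event loop is already running.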

### Creating Metadata Indexes

In [None]:
async_requests = [
    cfVect.acreate_metadata_index(
        property_name="section",
        index_type="string",
        index_name=vectorize_index_name1,
        wait=True,
    ),
    cfVect.acreate_metadata_index(
        property_name="section",
        index_type="string",
        index_name=vectorize_index_name2,
        wait=True,
    ),
    cfVect.acreate_metadata_index(
        property_name="section",
        index_type="string",
        index_name=vectorize_index_name3,
        wait=True,
    ),
]

await asyncio.gather(*async_requests);

### Adding Documents

In [None]:
async_requests = [
    cfVect.aadd_documents(index_name=vectorize_index_name1, documents=texts, wait=True),
    cfVect.aadd_documents(index_name=vectorize_index_name2, documents=texts, wait=True),
    cfVect.aadd_documents(index_name=vectorize_index_name3, documents=texts, wait=True),
]

await asyncio.gather(*async_requests);

### Querying/Search

In [None]:
async_requests = [
    cfVect.asimilarity_search(index_name=vectorize_index_name1, query="Workers AI"),
    cfVect.asimilarity_search(index_name=vectorize_index_name2, query="Edge Computing"),
    cfVect.asimilarity_search(index_name=vectorize_index_name3, query="SASE"),
]

async_results = await asyncio.gather(*async_requests);

In [None]:
print(f"{len(async_results[0])} results:\n{str(async_results[0][0])[:300]}")
print(f"{len(async_results[1])} results:\n{str(async_results[1][0])[:300]}")
print(f"{len(async_results[2])} results:\n{str(async_results[2][0])[:300]}")

### Returning Metadata/Values

In [None]:
async_requests = [
    cfVect.asimilarity_search(
        index_name=vectorize_index_name1,
        query="California",
        return_values=True,
        return_metadata="all",
    ),
    cfVect.asimilarity_search(
        index_name=vectorize_index_name2,
        query="California",
        return_values=True,
        return_metadata="all",
    ),
    cfVect.asimilarity_search(
        index_name=vectorize_index_name3,
        query="California",
        return_values=True,
        return_metadata="all",
    ),
]

async_results = await asyncio.gather(*async_requests);

In [None]:
print(f"{len(async_results[0])} results:\n{str(async_results[0][0])[:300]}")
print(f"{len(async_results[1])} results:\n{str(async_results[1][0])[:300]}")
print(f"{len(async_results[2])} results:\n{str(async_results[2][0])[:300]}")

### Searching with Metadata Filtering

In [None]:
async_requests = [
    cfVect.asimilarity_search(
        index_name=vectorize_index_name1,
        query="Cloudflare services",
        k=2,
        md_filter={"section": "Products"},
        return_metadata="all",
        # return_values=True
    ),
    cfVect.asimilarity_search(
        index_name=vectorize_index_name2,
        query="Cloudflare services",
        k=2,
        md_filter={"section": "Products"},
        return_metadata="all",
        # return_values=True
    ),
    cfVect.asimilarity_search(
        index_name=vectorize_index_name3,
        query="Cloudflare services",
        k=2,
        md_filter={"section": "Products"},
        return_metadata="all",
        # return_values=True
    ),
]

async_results = await asyncio.gather(*async_requests);

In [None]:
[doc.metadata["section"] == "Products" for doc in async_results[0]]

In [None]:
print(f"{len(async_results[0])} results:\n{str(async_results[0][-1])[:300]}")
print(f"{len(async_results[1])} results:\n{str(async_results[1][0])[:300]}")
print(f"{len(async_results[2])} results:\n{str(async_results[2][0])[:300]}")

## Cleanup
Let's finish by deleting all of the indexes we created in this notebook.

In [None]:
arr_indexes = cfVect.list_indexes()
arr_indexes = [x for x in arr_indexes if "test-langchain" in x.get("name")]

In [None]:
arr_async_requests = [
    cfVect.adelete_index(index_name=x.get("name")) for x in arr_indexes
]
await asyncio.gather(*arr_async_requests);

## API Reference


https://developers.cloudflare.com/api/resources/vectorize/

https://developers.cloudflare.com/vectorize/