# Working with Voyage AI in Pixeltable

Pixeltable's Voyage AI integration lets you call Voyage AI's embedding and reranker models through the Voyage AI API.

### Prerequisites

- A Voyage AI account with an API key (https://www.voyageai.com/)

### Important notes

- Voyage AI usage may incur costs based on your Voyage AI plan.
- Text and images that you embed or rerank are sent to Voyage AI's servers, so be mindful of sensitive data and consider security measures when integrating with external services.

First, install the required libraries and enter your Voyage AI API key.

In [None]:
%pip install -qU pixeltable voyageai

In [39]:
import os
import getpass
if 'VOYAGE_API_KEY' not in os.environ:
    os.environ['VOYAGE_API_KEY'] = getpass.getpass('Enter your Voyage AI API key:')

Now let's create a Pixeltable directory to hold the tables for our demo.

In [40]:
import pixeltable as pxt

# Remove the 'voyageai_demo' directory and its contents, if it exists
pxt.drop_dir('voyageai_demo', force=True)
pxt.create_dir('voyageai_demo')

Created directory 'voyageai_demo'.


<pixeltable.catalog.dir.Dir at 0x106daf2d0>

## Text embeddings

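Voyage AI provides text embedding models suited to semantic search and retrieval-augmented generation (RAG). Here we use `voyage-3.5` with `input_type='document'`, since the rows are documents to be retrieved rather than queries.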

In [41]:
from pixeltable.functions import voyageai

# Create a table for document embeddings
docs_t = pxt.create_table('voyageai_demo.documents', {'text': pxt.String})

# Add computed column with Voyage embeddings
docs_t.add_computed_column(
    embedding=voyageai.embeddings(
        docs_t.text,
        model='voyage-3.5',
        input_type='document'
    )
)

Created table 'documents'.
Added 0 column values with 0 errors.


No rows affected.

In [42]:
# Insert some sample documents
documents = [
    "The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases.",
    "Photosynthesis in plants converts light energy into glucose and produces essential oxygen.",
    "20th-century innovations, from radios to smartphones, centered on electronic advancements.",
    "Rivers provide water, irrigation, and habitat for aquatic species, vital for ecosystems.",
    "Apple's conference call to discuss fourth fiscal quarter results is scheduled for Thursday, November 2, 2023.",
    "Shakespeare's works, like 'Hamlet' and 'A Midsummer Night's Dream,' endure in literature."
]

docs_t.insert({'text': doc} for doc in documents)

Inserting rows into `documents`: 6 rows [00:00, 2561.67 rows/s]
Inserted 6 rows with 0 errors.


6 rows inserted, 12 values computed.

In [43]:
# View the embeddings
docs_t.select(docs_t.text, docs_t.embedding).head(3)

text,embedding
"The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases.",[ 0.048 0.016 0.002 0.026 0.038 0.013 ... -0.015 -0.034 -0.016 0.007 0.046 -0.011]
Photosynthesis in plants converts light energy into glucose and produces essential oxygen.,[ 0.013 0.023 -0.004 0.052 0.037 0.022 ... -0.013 -0.042 0.001 0.008 -0.02 -0.016]
"20th-century innovations, from radios to smartphones, centered on electronic advancements.",[ 4.373e-03 4.474e-02 -6.796e-05 2.745e-02 4.904e-02 6.148e-03 ... -1.833e-02 -4.274e-02 -4.713e-03 -1.739e-02 -1.540e-03 -2.306e-02]


## Embedding index for similarity search

You can use Voyage AI embeddings with Pixeltable's embedding index for efficient similarity search.

In [44]:
# Create a table with an embedding index
search_t = pxt.create_table('voyageai_demo.search', {'text': pxt.String})

# Add embedding index for similarity search
embed_fn = voyageai.embeddings.using(model='voyage-3.5', input_type='document')
search_t.add_embedding_index('text', string_embed=embed_fn)

Created table 'search'.


In [45]:
# Insert documents
search_t.insert({'text': doc} for doc in documents)

Inserting rows into `search`: 6 rows [00:00, 973.68 rows/s]
Inserted 6 rows with 0 errors.


6 rows inserted, 12 values computed.

In [46]:
# Perform similarity search
sim = search_t.text.similarity(string="What are the health benefits of Mediterranean food?")
search_t.order_by(sim, asc=False).limit(3).select(search_t.text, score=sim).collect()

text,score
"The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases.",0.863
Photosynthesis in plants converts light energy into glucose and produces essential oxygen.,0.649
"Shakespeare's works, like 'Hamlet' and 'A Midsummer Night's Dream,' endure in literature.",0.621

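The `score` column above is a vector similarity between the query embedding and each document embedding. As a rough illustration of the idea, and not Pixeltable's actual implementation, the sketch below ranks toy vectors by cosine similarity, which this example assumes is the metric the index uses. The helper names `cosine_similarity` and `top_k` and the 3-dimensional vectors are made up for illustration; real embeddings have far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors standing in for document embeddings (hypothetical values)
toy_docs = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
]
# Toy vector standing in for the query embedding (hypothetical values)
toy_query = [0.8, 0.6, 0.0]

ranking = top_k(toy_query, toy_docs, k=2)
for index, score in ranking:
    print(f"doc {index}: {score:.3f}")
```

The second toy vector ranks first because its direction is closest to the query's. The same ordering logic puts the Mediterranean diet sentence at the top of the real results.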

## Reranking

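Voyage AI's rerankers score each candidate document against a query, which lets you reorder search results by relevance. The example below uses `rerank-2.5` and keeps the three highest-scoring documents (`top_k=3`).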

In [47]:
# Create a table for reranking
rerank_t = pxt.create_table(
    'voyageai_demo.rerank',
    {'query': pxt.String, 'documents': pxt.Json}
)

# Add computed column with reranking results
rerank_t.add_computed_column(
    reranked=voyageai.rerank(
        rerank_t.query,
        rerank_t.documents,
        model='rerank-2.5',
        top_k=3
    )
)

Created table 'rerank'.
Added 0 column values with 0 errors.


No rows affected.

In [48]:
# Insert query and documents to rerank
rerank_t.insert([{
    'query': "When is Apple's conference call scheduled?",
    'documents': documents
}])

Inserting rows into `rerank`: 1 rows [00:00, 343.65 rows/s]
Inserted 1 row with 0 errors.


1 row inserted, 2 values computed.

In [49]:
# Add computed column to extract top results using JSON path
rerank_t.add_computed_column(top_results=rerank_t.reranked['results'])

Added 1 column value with 0 errors.


1 row updated, 1 value computed.

In [50]:
# Extract the top result's document and score
rerank_t.select(
    rerank_t.query,
    top_document=rerank_t.top_results[0]['document'],
    top_score=rerank_t.top_results[0]['relevance_score']
).collect()


query,top_document,top_score
When is Apple's conference call scheduled?,"Apple's conference call to discuss fourth fiscal quarter results is scheduled for Thursday, November 2, 2023.",0.93


In [51]:
# View reranking results
rerank_t.select(rerank_t.query, rerank_t.top_results).collect()

query,top_results
When is Apple's conference call scheduled?,"[{""index"": 4, ""document"": ""Apple's conference call to discuss fourth fiscal quarter results is scheduled for Thursday, November 2, 2023."", ""relevance_score"": 0.93}, {""index"": 2, ""document"": ""20th-century innovations, from radios to smartphones, centered on electronic advancements."", ""relevance_score"": 0.283}, {""index"": 0, ""document"": ""The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases."", ""relevance_score"": 0.264}]"
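
Because each entry in `top_results` is a plain dictionary, further filtering can be done in ordinary Python once the rows are collected. The sketch below copies the three results from the output above and keeps only those at or above a cutoff. The `filter_by_score` helper and the 0.5 threshold are hypothetical choices for illustration, not part of Pixeltable or Voyage AI.

```python
# Reranked results copied from the `top_results` output above
top_results = [
    {
        "index": 4,
        "document": "Apple's conference call to discuss fourth fiscal quarter results is scheduled for Thursday, November 2, 2023.",
        "relevance_score": 0.93,
    },
    {
        "index": 2,
        "document": "20th-century innovations, from radios to smartphones, centered on electronic advancements.",
        "relevance_score": 0.283,
    },
    {
        "index": 0,
        "document": "The Mediterranean diet emphasizes fish, olive oil, and vegetables, believed to reduce chronic diseases.",
        "relevance_score": 0.264,
    },
]

def filter_by_score(results, min_score):
    """Keep results whose relevance_score is at least min_score, preserving order."""
    return [r for r in results if r["relevance_score"] >= min_score]

MIN_SCORE = 0.5  # hypothetical cutoff; tune it for your own data
relevant = filter_by_score(top_results, MIN_SCORE)
kept_indices = [r["index"] for r in relevant]

for r in relevant:
    print(f"{r['relevance_score']:.3f}  {r['document']}")
```

In this output, each `index` value matches the document's position in the original `documents` list, so index 4 is the fifth sentence inserted earlier.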


## Multimodal embeddings

Voyage AI's multimodal model (`voyage-multimodal-3`) can embed both images and text into the same vector space, enabling cross-modal similarity search.

In [52]:
# Create a table for multimodal embeddings
mm_t = pxt.create_table('voyageai_demo.multimodal', {'image': pxt.Image, 'caption': pxt.String}, if_exists='replace')

# Add computed columns for image and text embeddings
# multimodal_embed can embed either images or text independently
mm_t.add_computed_column(
    image_embedding=voyageai.multimodal_embed(mm_t.image, input_type='document')
)
mm_t.add_computed_column(
    text_embedding=voyageai.multimodal_embed(mm_t.caption, input_type='document')
)

Created table 'multimodal'.
Added 0 column values with 0 errors.
Added 0 column values with 0 errors.


No rows affected.

In [53]:
# Insert a sample image with caption
mm_t.insert([{
    'image': 'https://raw.githubusercontent.com/pixeltable/pixeltable/main/docs/resources/images/000000000139.jpg',
    'caption': 'A person standing next to an elephant'
}])

Inserting rows into `multimodal`: 1 rows [00:00, 520.00 rows/s]
Inserted 1 row with 0 errors.


1 row inserted, 5 values computed.

In [54]:
# View the multimodal embeddings
mm_t.select(mm_t.image, mm_t.caption, mm_t.image_embedding, mm_t.text_embedding).head()

image,caption,image_embedding,text_embedding
,A person standing next to an elephant,[-0.017 -0.001 0.015 0.008 -0.025 -0.012 ... 0.01 0.058 -0.031 0.017 0.011 -0.055],[ 0.012 -0.002 -0.024 0.012 -0.036 -0.003 ... -0.041 0.007 0.019 -0.013 0.027 0.001]


### Learn more

To learn more about RAG operations in Pixeltable, check out the [RAG Operations in Pixeltable](https://docs.pixeltable.com/howto/use-cases/rag-operations) tutorial.

For more information about Voyage AI models and features, visit:

- [Voyage AI Documentation](https://docs.voyageai.com/)
- [Text Embeddings](https://docs.voyageai.com/docs/embeddings)
- [Multimodal Embeddings](https://docs.voyageai.com/docs/multimodal-embeddings)
- [Rerankers](https://docs.voyageai.com/docs/reranker)

If you have any questions, don't hesitate to reach out.