abhirockzz/integrated-embeddings-sample

Integrated Embeddings sample app

End-to-end sample for Integrated Embeddings in Azure Cosmos DB for NoSQL. Companion to the announcement blog post.

The app creates a container with an embeddingSource policy, inserts outdoor-product items, lets Azure Cosmos DB generate the embeddings, and runs a vector search.

Prerequisites

  • Python 3.10+
  • An Azure Cosmos DB for NoSQL account with vector search and change feed mode enabled
  • A Microsoft Foundry embedding model deployment (text-embedding-3-small, text-embedding-3-large, or text-embedding-ada-002)
  • (RAG agent only) A Microsoft Foundry chat model deployment (e.g., gpt-4o, gpt-4o-mini)
  • The Cosmos DB account's managed identity granted Cognitive Services OpenAI User on the Foundry resource
  • Your signed-in principal granted Cosmos DB Operator (Azure RBAC) and Cosmos DB Built-in Data Contributor (Cosmos DB RBAC) on the Cosmos DB account

See the blog post and the Integrated Embeddings prerequisites for details.

Setup

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

cp .env.example .env
# edit .env with your account details and Foundry API key

az login
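
For orientation, the kind of loading that config.py is described as doing can be sketched with the standard library alone. This is an illustrative sketch, not the repository's code: the helper names parse_env_text and load_env_file are hypothetical, the repository may well use a library such as python-dotenv instead, and FOUNDRY_API_KEY is the only variable name taken from this README (the full set of keys is in .env.example).

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines, # comments, and lines without '='."""
    settings: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read a .env file if it exists; variables already set in the process win."""
    env_path = Path(path)
    settings = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    for key in settings:
        if key in os.environ:
            settings[key] = os.environ[key]
    return settings
```

A script could then call load_env_file() once at startup and fail early with a clear message if a required key such as FOUNDRY_API_KEY is missing.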

Run

# Step 1: create the database and container with embeddingSource policy
python create_db_and_container.py

# Step 2: insert 100 sample items (embeddings generate in the background)
python insert_sample_data.py

# Step 3: in the Azure portal Data Explorer, run this query until count reaches 100
#   SELECT VALUE COUNT(1) FROM c WHERE IS_DEFINED(c.embedding)

# Step 4: run a vector search
python vector_search.py "I need to stay warm on a cold ski trip"
python vector_search.py "lightweight cookware for backpacking" --top-k 3

# Optional: chat with a RAG agent over the catalog
python rag_agent.py
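
Step 3 can also be scripted instead of re-running the query by hand in Data Explorer. The sketch below is an assumption-based helper, not part of the repository: wait_for_embeddings and its parameters are hypothetical names, and the count lookup is passed in as a callable so the polling logic stays independent of any SDK. The query text is the one from Step 3.

```python
import time
from typing import Callable

# Same query as Step 3: counts items whose embedding has been generated.
COUNT_QUERY = "SELECT VALUE COUNT(1) FROM c WHERE IS_DEFINED(c.embedding)"


def wait_for_embeddings(
    count_fn: Callable[[], int],
    expected: int = 100,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll count_fn until it reports at least `expected` items or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        count = count_fn()
        if count >= expected:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError(f"only {count}/{expected} items have embeddings")
        sleep(interval_s)


# Possible wiring with the azure-cosmos SDK (assumes `container` is an
# existing ContainerProxy; not taken from the repository):
# count_fn = lambda: next(iter(container.query_items(
#     query=COUNT_QUERY, enable_cross_partition_query=True)))
```

Injecting count_fn and sleep keeps the helper easy to test and lets the interval and timeout be tuned to how quickly embeddings appear in a given account.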

Files

File                        Purpose
config.py                   Loads .env and exposes settings to the other scripts.
create_db_and_container.py  Creates the database and container with the vector embedding policy.
insert_sample_data.py       Upserts items from items.json.
vector_search.py            Embeds a query and runs VectorDistance().
rag_agent.py                LangChain RAG agent over the catalog.
items.json                  100 sample outdoor-product items.
.env.example                Template for the environment variables. Copy to .env.
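
As a rough illustration of what "runs VectorDistance()" can look like, the sketch below builds a parameterized top-k query. It is not the code in vector_search.py: build_vector_query is a hypothetical helper, the projected fields and default top_k are assumptions, and the property name c.embedding is taken from the Step 3 query above.

```python
def build_vector_query(query_vector: list[float], top_k: int = 5) -> dict:
    """Return the query text and parameters for a top-k vector similarity search."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return {
        "query": (
            "SELECT TOP @top_k c.id, "
            "VectorDistance(c.embedding, @query_vector) AS score "
            "FROM c "
            "ORDER BY VectorDistance(c.embedding, @query_vector)"
        ),
        "parameters": [
            {"name": "@top_k", "value": top_k},
            {"name": "@query_vector", "value": query_vector},
        ],
    }


# Possible use with the azure-cosmos SDK (assumes `container` is an existing
# ContainerProxy and `vec` is the query embedding returned by Foundry):
# q = build_vector_query(vec, top_k=3)
# results = container.query_items(
#     query=q["query"], parameters=q["parameters"],
#     enable_cross_partition_query=True)
```

Passing the vector and the limit as parameters rather than formatting them into the query string keeps the query text constant across searches.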

Auth

  • Azure Cosmos DB calls use Microsoft Entra ID via DefaultAzureCredential.
  • The Cosmos DB-to-Foundry call (Integrated Embeddings itself) uses the Cosmos DB account's managed identity (authType: "Entra" in the policy).
  • At query time, vector_search.py and rag_agent.py call Foundry with an API key (FOUNDRY_API_KEY) to generate the query embedding (and, for the RAG agent, the chat responses).
