# Azure AI Search Simulator - Embedding Skill Demo

This notebook demonstrates how to use the **Azure OpenAI Embedding Skill** with the Azure AI Search Simulator to generate vector embeddings for semantic search.

## What This Notebook Covers

1. **Vector Index Creation** - Create an index with vector fields for embeddings
2. **Skillset with Embedding Skill** - Configure Azure OpenAI Embedding skill
3. **Indexer with Enrichment** - Process documents and generate embeddings
4. **Vector Search** - Perform similarity search using embeddings
5. **Hybrid Search** - Combine keyword and vector search with Reciprocal Rank Fusion (RRF)

## Prerequisites

1. **Configure your `.env` file** in the workspace root:

   ```bash
   cp .env.example .env
   ```
   Then fill in the Azure OpenAI variables:
   ```dotenv
   AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
   AZURE_OPENAI_API_KEY=your-api-key
   AZURE_OPENAI_DEPLOYMENT=text-embedding-3-small
   ```

2. **Start the Azure AI Search Simulator with HTTPS**:

   ```bash
   cd src/AzureAISearchSimulator.Api
   dotnet run --urls "https://localhost:7250"
   ```

3. **Sample data files** are located in `../IndexerTestNotebook/data`

> ⚠️ **Note**: The Azure SDK requires HTTPS, so the simulator must run on `https://localhost:7250`.

## 1. Import Required Libraries

In [None]:
# Install required packages (uncomment if needed)
# !pip install azure-search-documents requests pandas numpy python-dotenv

import os
import json
import time
import urllib3
import numpy as np
from pathlib import Path
from dotenv import load_dotenv

# Azure AI Search SDK imports
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient, SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    SearchFieldDataType,
    SimpleField,
    SearchableField,
    SearchIndexer,
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection,
    SearchIndexerSkillset,
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
    FieldMapping,
    IndexingParameters,
    IndexingParametersConfiguration,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
)
from azure.search.documents.models import VectorizedQuery

# For displaying results
import pandas as pd
from IPython.display import display, HTML

# Suppress SSL warnings for local development
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Load environment variables from workspace root .env file
env_path = Path("../../.env")
if env_path.exists():
    load_dotenv(dotenv_path=env_path)
    print(f"✅ Loaded .env from {env_path.resolve()}")
else:
    print(f"⚠️ No .env file found at {env_path.resolve()}")
    print("   Copy .env.example to .env in the workspace root and fill in your values.")

print("✅ Libraries imported successfully!")

## 2. Initialize Azure AI Search Clients

Configure connection to the local Azure AI Search Simulator.

In [None]:
# Configuration for Azure AI Search Simulator
# Values are loaded from the workspace root .env file (see .env.example)
SEARCH_ENDPOINT = os.getenv("BASE_URL", "https://localhost:7250")
ADMIN_API_KEY = os.getenv("ADMIN_KEY", "admin-key-12345")

# Azure OpenAI Configuration (from .env)
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT", "")
AZURE_OPENAI_DEPLOYMENT = os.getenv("AZURE_OPENAI_DEPLOYMENT", "text-embedding-3-small")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY", "")

if not AZURE_OPENAI_ENDPOINT or not AZURE_OPENAI_API_KEY:
    print("⚠️ WARNING: AZURE_OPENAI_ENDPOINT and/or AZURE_OPENAI_API_KEY are not set!")
    print("   Copy .env.example to .env in the workspace root and fill in your Azure OpenAI values.")
    print("   Required variables: AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY")

# Resource names for this demo
INDEX_NAME = "embedding-demo-docs"
DATA_SOURCE_NAME = "embedding-demo-files"
SKILLSET_NAME = "embedding-skillset"
INDEXER_NAME = "embedding-indexer"

# Path to sample data (from IndexerTestNotebook)
DATA_PATH = Path("../IndexerTestNotebook/data").resolve()

# Embedding dimensions (text-embedding-3-small uses 1536)
EMBEDDING_DIMENSIONS = 1536

# Create credentials
admin_credential = AzureKeyCredential(ADMIN_API_KEY)

# Configure HTTP transport to skip SSL certificate validation for local development
import requests as req_lib
from azure.core.pipeline.transport import RequestsTransport

session = req_lib.Session()
session.verify = False
transport = RequestsTransport(session=session, connection_verify=False)

# Create clients
index_client = SearchIndexClient(
    endpoint=SEARCH_ENDPOINT,
    credential=admin_credential,
    transport=transport,
    connection_verify=False
)

indexer_client = SearchIndexerClient(
    endpoint=SEARCH_ENDPOINT,
    credential=admin_credential,
    transport=transport,
    connection_verify=False
)

print(f"✅ Connected to Azure AI Search Simulator at {SEARCH_ENDPOINT}")
print(f"🔑 Azure OpenAI endpoint: {AZURE_OPENAI_ENDPOINT}")
print(f"🧠 Embedding deployment: {AZURE_OPENAI_DEPLOYMENT}")
print(f"📁 Data path: {DATA_PATH}")

# List sample data files
if DATA_PATH.exists():
    json_files = list(DATA_PATH.glob("*.json"))
    txt_files = list(DATA_PATH.glob("*.txt"))
    print(f"📄 Found {len(json_files)} JSON metadata files")
    print(f"📄 Found {len(txt_files)} TXT content files")
else:
    print("⚠️ Data path not found. Make sure IndexerTestNotebook/data exists.")

## 3. Review Sample Data

Let's look at the sample documents we'll be indexing with embeddings.

In [None]:
# Load and display sample data
sample_docs = []

for json_file in sorted(DATA_PATH.glob("*.json")):
    with open(json_file, 'r', encoding='utf-8') as f:
        metadata = json.load(f)
    
    # Read associated content file
    content_file = DATA_PATH / metadata.get('contentFile', '')
    content = ""
    if content_file.exists():
        with open(content_file, 'r', encoding='utf-8') as f:
            content = f.read()
    
    sample_docs.append({
        'id': metadata['id'],
        'title': metadata['title'],
        'author': metadata['author'],
        'category': metadata['category'],
        'content': content,
        'content_preview': content[:150] + "..." if len(content) > 150 else content
    })

# Display as DataFrame
df = pd.DataFrame(sample_docs)
print(f"📚 Sample Documents to Index ({len(sample_docs)} total):\n")
display(df[['id', 'title', 'author', 'category', 'content_preview']])

## 4. Create Search Index with Vector Field

Define an index schema that includes a vector field for embeddings. We'll use HNSW algorithm for efficient vector search.
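
HNSW is an approximate nearest-neighbor index, so it helps to know what it approximates. The sketch below is an exact brute-force search under the cosine metric, written in plain Python for illustration. It is not how the simulator or Azure AI Search implements vector search, and the helper names `cosine_similarity` and `exact_knn` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def exact_knn(query, vectors, k):
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 2-dimensional vectors (illustrative data, not from the sample files)
toy_vectors = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
top_two = exact_knn([1.0, 0.1], toy_vectors, k=2)
print(top_two)
```

Brute force costs one comparison per stored vector. HNSW trades a small amount of recall for far fewer comparisons, which is what `m`, `efConstruction`, and `efSearch` tune.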

In [None]:
# Define vector search configuration
vector_search = VectorSearch(
    algorithms=[
        HnswAlgorithmConfiguration(
            name="hnsw-config",
            parameters={
                "m": 4,  # Number of bi-directional links
                "efConstruction": 400,  # Size of dynamic candidate list during indexing
                "efSearch": 500,  # Size of dynamic candidate list during search
                "metric": "cosine"  # Distance metric
            }
        )
    ],
    profiles=[
        VectorSearchProfile(
            name="vector-profile",
            algorithm_configuration_name="hnsw-config"
        )
    ]
)

# Define the index schema with vector field
index = SearchIndex(
    name=INDEX_NAME,
    fields=[
        # Key field (required)
        SimpleField(name="id", type=SearchFieldDataType.String, key=True),
        
        # Searchable text fields
        SearchableField(name="title", type=SearchFieldDataType.String,
                       sortable=True, filterable=True),
        SearchableField(name="author", type=SearchFieldDataType.String,
                       filterable=True, facetable=True),
        SearchableField(name="content", type=SearchFieldDataType.String),
        
        # Filterable/Facetable fields
        SimpleField(name="category", type=SearchFieldDataType.String,
                   filterable=True, facetable=True, sortable=True),
        SimpleField(name="language", type=SearchFieldDataType.String,
                   filterable=True, facetable=True),
        
        # Date field
        SimpleField(name="createdDate", type=SearchFieldDataType.DateTimeOffset,
                   filterable=True, sortable=True),
        
        # Collection field for tags
        SearchField(name="tags", type=SearchFieldDataType.Collection(SearchFieldDataType.String),
                   searchable=True, filterable=True, facetable=True),
        
        # Vector field for embeddings
        SearchField(
            name="contentVector",
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            vector_search_dimensions=EMBEDDING_DIMENSIONS,
            vector_search_profile_name="vector-profile"
        ),
    ],
    vector_search=vector_search
)

# Create or update the index
try:
    result = index_client.create_or_update_index(index)
    print(f"✅ Index '{result.name}' created/updated successfully!")
    print(f"   Fields: {len(result.fields)}")
    for field in result.fields:
        vector_info = f", dims={field.vector_search_dimensions}" if field.vector_search_dimensions else ""
        print(f"   - {field.name}: {field.type} (key={field.key}, searchable={field.searchable}{vector_info})")
except Exception as e:
    print(f"❌ Error creating index: {e}")

## 5. Create Data Source Connection

Configure a data source pointing to the sample JSON documents.

In [None]:
# Create a data source connection pointing to local files
data_source = SearchIndexerDataSourceConnection(
    name=DATA_SOURCE_NAME,
    type="filesystem",  # Simulator-specific type for local files
    connection_string=str(DATA_PATH),
    container=SearchIndexerDataContainer(name=".", query="*.json")
)

try:
    result = indexer_client.create_or_update_data_source_connection(data_source)
    print(f"✅ Data source '{result.name}' created/updated successfully!")
    print(f"   Type: {result.type}")
    print(f"   Path: {result.connection_string}")
except Exception as e:
    print(f"❌ Error creating data source: {e}")

## 6. Create Skillset with Azure OpenAI Embedding Skill

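Configure a skillset that calls Azure OpenAI to generate an embedding for each document. The skill defined below embeds the `title` field (`/document/title`) and writes the result to `/document/contentEmbedding`.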

> **Note**: The Azure OpenAI credentials are loaded from the workspace root `.env` file. Make sure `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `AZURE_OPENAI_DEPLOYMENT` are set.
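
A skill writes each output under its `context` using the output's `targetName`, and that path is what the indexer's output field mapping refers to later. The sketch below illustrates the path logic on a plain dictionary. It is a simplified model, not the simulator's implementation: `output_path` and `resolve` are hypothetical helpers, and wildcard (`*`) paths are not handled.

```python
def output_path(context, target_name):
    """Join a skill context and an output targetName into an enrichment path."""
    return context.rstrip("/") + "/" + target_name

def resolve(enriched_doc, path):
    """Walk a '/document/...' path through a nested dict (no '*' wildcards)."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "document":
        raise ValueError("path must start with /document")
    node = enriched_doc
    for part in parts[1:]:
        node = node[part]
    return node

# Context and targetName as used by the skill in this notebook
path = output_path("/document", "contentEmbedding")

# Illustrative enriched document; the vector is shortened to 3 values
enriched = {"title": "Intro to search", "contentEmbedding": [0.1, 0.2, 0.3]}
print(path, resolve(enriched, path))
```

With `context` set to `/document` and `targetName` set to `contentEmbedding`, the embedding is addressed as `/document/contentEmbedding`.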

In [None]:
# Create skillset with Azure OpenAI Embedding skill using REST API
# The skillset is sent as raw JSON so the payload matches the REST contract exactly
# and does not depend on which skill models the installed SDK version exposes

if not AZURE_OPENAI_API_KEY or not AZURE_OPENAI_ENDPOINT:
    print("⚠️ WARNING: Azure OpenAI credentials are not configured in .env!")
    print("   Set AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY in the workspace root .env file.")

skillset_payload = {
    "name": SKILLSET_NAME,
    "description": "Skillset to generate embeddings using Azure OpenAI",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
            "name": "embedding-skill",
            "description": "Generate embeddings for document title",
            "context": "/document",
            "resourceUri": AZURE_OPENAI_ENDPOINT,
            "deploymentId": AZURE_OPENAI_DEPLOYMENT,
            "apiKey": AZURE_OPENAI_API_KEY,  # API key loaded from .env
            "modelName": "text-embedding-3-small",
            "inputs": [
                {
                    "name": "text",
                    "source": "/document/title"
                }
            ],
            "outputs": [
                {
                    "name": "embedding",
                    "targetName": "contentEmbedding"
                }
            ]
        }
    ]
}

# Create skillset using REST API
headers = {
    "Content-Type": "application/json",
    "api-key": ADMIN_API_KEY
}

url = f"{SEARCH_ENDPOINT}/skillsets/{SKILLSET_NAME}?api-version=2024-07-01"

try:
    response = session.put(url, headers=headers, json=skillset_payload)
    response.raise_for_status()
    result = response.json()
    print(f"✅ Skillset '{result['name']}' created/updated successfully!")
    print(f"   Skills: {len(result['skills'])}")
    for skill in result['skills']:
        print(f"   - {skill['name']}: {skill['@odata.type']}")
except Exception as e:
    print(f"❌ Error creating skillset: {e}")
    if hasattr(e, 'response') and e.response is not None:
        print(f"   Response: {e.response.text}")

## 7. Create and Run Indexer with Skillset

Create an indexer that processes documents, generates embeddings using the skillset, and indexes everything.

In [None]:
# Create indexer with skillset using REST API
indexer_payload = {
    "name": INDEXER_NAME,
    "dataSourceName": DATA_SOURCE_NAME,
    "targetIndexName": INDEX_NAME,
    "skillsetName": SKILLSET_NAME,
    "parameters": {
        "configuration": {
            "parsingMode": "json"
        }
    },
    "fieldMappings": [
        {"sourceFieldName": "id", "targetFieldName": "id"},
        {"sourceFieldName": "title", "targetFieldName": "title"},
        {"sourceFieldName": "author", "targetFieldName": "author"},
        {"sourceFieldName": "category", "targetFieldName": "category"},
        {"sourceFieldName": "tags", "targetFieldName": "tags"},
        {"sourceFieldName": "createdDate", "targetFieldName": "createdDate"},
        {"sourceFieldName": "language", "targetFieldName": "language"}
    ],
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/contentEmbedding",
            "targetFieldName": "contentVector"
        }
    ]
}

url = f"{SEARCH_ENDPOINT}/indexers/{INDEXER_NAME}?api-version=2024-07-01"

try:
    response = session.put(url, headers=headers, json=indexer_payload)
    response.raise_for_status()
    result = response.json()
    print(f"✅ Indexer '{result['name']}' created/updated successfully!")
    print(f"   Data Source: {result['dataSourceName']}")
    print(f"   Target Index: {result['targetIndexName']}")
    print(f"   Skillset: {result['skillsetName']}")
except Exception as e:
    print(f"❌ Error creating indexer: {e}")
    if hasattr(e, 'response') and e.response is not None:
        print(f"   Response: {e.response.text}")

In [None]:
# Reset and run the indexer to reprocess all documents
print("🔄 Resetting indexer to reprocess all documents...")

try:
    reset_url = f"{SEARCH_ENDPOINT}/indexers/{INDEXER_NAME}/reset?api-version=2024-07-01"
    response = session.post(reset_url, headers=headers)
    response.raise_for_status()
    print("✅ Indexer reset!")
except Exception as e:
    print(f"⚠️ Could not reset indexer (may not exist yet): {e}")

print("\n🚀 Running indexer...")

try:
    run_url = f"{SEARCH_ENDPOINT}/indexers/{INDEXER_NAME}/run?api-version=2024-07-01"
    response = session.post(run_url, headers=headers)
    response.raise_for_status()
    print("✅ Indexer run triggered!")
except Exception as e:
    print(f"❌ Error running indexer: {e}")

# Wait for indexer to complete
print("\n⏳ Waiting for indexer to complete (this may take a while for embedding generation)...")
max_wait = 90  # seconds - increased for embedding generation
wait_interval = 3

for i in range(0, max_wait, wait_interval):
    time.sleep(wait_interval)
    try:
        status_url = f"{SEARCH_ENDPOINT}/indexers/{INDEXER_NAME}/status?api-version=2024-07-01"
        response = session.get(status_url, headers=headers)
        status = response.json()
        
        last_result = status.get('lastResult')
        if last_result:
            status_val = last_result.get('status', 'unknown')
            items = last_result.get('itemsProcessed', 0)
            print(f"   Status: {status_val} (items: {items})")
            if status_val in ["success", "transientFailure", "reset"]:
                break
    except Exception as e:
        print(f"   Checking status... ({e})")

# Get final status
try:
    response = session.get(status_url, headers=headers)
    status = response.json()
    last_result = status.get('lastResult', {})
    
    print(f"\n📊 Indexer Execution Results:")
    print(f"   Status: {last_result.get('status', 'unknown')}")
    print(f"   Items Processed: {last_result.get('itemsProcessed', 0)}")
    print(f"   Items Failed: {last_result.get('itemsFailed', 0)}")
    
    errors = last_result.get('errors', [])
    if errors:
        print(f"   Errors:")
        for error in errors[:5]:  # Show first 5 errors
            print(f"      - {error.get('errorMessage', 'Unknown error')}")
    
    warnings = last_result.get('warnings', [])
    if warnings:
        print(f"   Warnings:")
        for warning in warnings[:3]:
            print(f"      - {warning.get('message', 'Unknown warning')}")
except Exception as e:
    print(f"❌ Error getting status: {e}")

## 8. Verify Indexed Documents with Embeddings

Check that documents were indexed with their vector embeddings.
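
Beyond checking that a vector is present, it can be worth confirming that it has the expected length and contains only finite numbers. The helper below is a minimal sketch of such a check. `is_valid_embedding` is a hypothetical name and is not part of the Azure SDK or the simulator.

```python
import math

def is_valid_embedding(vector, expected_dims):
    """Return True when vector is a sequence of expected_dims finite numbers."""
    if not isinstance(vector, (list, tuple)) or len(vector) != expected_dims:
        return False
    return all(isinstance(v, (int, float)) and math.isfinite(v) for v in vector)

# Short illustrative vectors; real embeddings in this notebook have 1536 values
print(is_valid_embedding([0.1, 0.2, 0.3], 3))
print(is_valid_embedding([0.1, float("nan"), 0.3], 3))
print(is_valid_embedding(None, 3))
```

In this notebook the check would be called as `is_valid_embedding(doc.get('contentVector'), EMBEDDING_DIMENSIONS)`.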

In [None]:
# Create search client to query the index
search_client = SearchClient(
    endpoint=SEARCH_ENDPOINT,
    index_name=INDEX_NAME,
    credential=admin_credential,
    transport=transport,
    connection_verify=False
)

# Get all documents
results = search_client.search(search_text="*", include_total_count=True, select=["id", "title", "category", "contentVector"])
results_list = list(results)

print(f"📊 Document Count Verification:")
print(f"   Expected: 5 documents")
print(f"   Actual:   {len(results_list)} documents")

# Check for embeddings
docs_with_vectors = 0
for doc in results_list:
    vector = doc.get('contentVector')
    has_vector = vector is not None and len(vector) > 0
    if has_vector:
        docs_with_vectors += 1
        vector_preview = str(vector[:5]) + "..." if len(vector) > 5 else str(vector)
        print(f"   ✅ {doc['id']}: {doc['title']} - Vector dims: {len(vector)}, preview: {vector_preview}")
    else:
        print(f"   ⚠️ {doc['id']}: {doc['title']} - No vector")

print(f"\n📊 Documents with embeddings: {docs_with_vectors}/{len(results_list)}")

## 9. Vector Search (Semantic Search)

Use vector search to find semantically similar documents. To keep the demo self-contained, the next cell reuses the stored vector of the first indexed document as the query vector, so that document should normally rank first.

> **Note**: In production, you would generate the query embedding from the user's query text with the same model used at indexing time.
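
A real query embedding could be requested from the Azure OpenAI embeddings REST endpoint roughly as sketched below. The helper only builds the request, so it runs without network access. `build_embedding_request` is a hypothetical name, and the `api_version` default of `2023-05-15` is an assumption that should be replaced with whatever version your resource supports.

```python
def build_embedding_request(endpoint, deployment, api_key, text, api_version="2023-05-15"):
    """Build the URL, headers, and JSON body for an Azure OpenAI embeddings call."""
    url = (
        f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
        f"/embeddings?api-version={api_version}"
    )
    headers = {"Content-Type": "application/json", "api-key": api_key}
    body = {"input": text}
    return url, headers, body

# Placeholder values for illustration only
url, headers, body = build_embedding_request(
    "https://your-resource.openai.azure.com",
    "text-embedding-3-small",
    "your-api-key",
    "machine learning AI",
)
print(url)

# Sending the request (needs network access and valid credentials):
#   import requests
#   response = requests.post(url, headers=headers, json=body)
#   response.raise_for_status()
#   query_vector = response.json()["data"][0]["embedding"]
```

The resulting `query_vector` would replace the document vector used in the next cell and must have the same number of dimensions as the `contentVector` field.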

In [None]:
# For demo purposes, we'll use one of the document vectors as a query vector
# In production, you would call Azure OpenAI to generate the query embedding

# Get a document vector to use as query
sample_doc = results_list[0] if results_list else None
if sample_doc and sample_doc.get('contentVector'):
    query_vector = sample_doc['contentVector']
    print(f"🔍 Using vector from document '{sample_doc['id']}' as query vector")
    print(f"   Vector dimensions: {len(query_vector)}")
    print(f"   First 5 values: {query_vector[:5]}")
else:
    # Create a random vector as fallback (for demo without Azure OpenAI)
    query_vector = np.random.rand(EMBEDDING_DIMENSIONS).astype(float).tolist()
    print(f"🔍 Using random vector as query (no documents with vectors found)")
    print(f"   Vector dimensions: {len(query_vector)}")

In [None]:
# Perform vector search
print("🔍 Vector Search Results:\n")

try:
    vector_query = VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=5,
        fields="contentVector"
    )
    
    results = search_client.search(
        search_text=None,  # Pure vector search
        vector_queries=[vector_query],
        select=["id", "title", "category", "author"]
    )
    
    results_list = list(results)
    
    if results_list:
        data = []
        for i, doc in enumerate(results_list, 1):
            score = doc.get('@search.score', 0)
            data.append({
                'Rank': i,
                'ID': doc['id'],
                'Title': doc['title'],
                'Category': doc.get('category', 'N/A'),
                'Score': f"{score:.4f}"
            })
        
        display(pd.DataFrame(data))
    else:
        print("No results found. Make sure documents have vectors indexed.")
        
except Exception as e:
    print(f"❌ Error performing vector search: {e}")

## 10. Hybrid Search (Keyword + Vector)

Combine traditional keyword search with vector search for best results. The simulator uses Reciprocal Rank Fusion (RRF) to combine results.
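
Reciprocal Rank Fusion scores each document by summing `1 / (k + rank)` over every ranked list it appears in, then sorts by the total. The sketch below shows the idea in plain Python. It is not the simulator's code, `rrf_fuse` is a hypothetical name, and `k=60` is the commonly used constant, assumed here because the notebook does not state which value the simulator applies.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))

# Illustrative rankings, best match first (not real query results)
keyword_ranking = ["doc2", "doc1", "doc3"]
vector_ranking = ["doc1", "doc4", "doc2"]

fused = rrf_fuse([keyword_ranking, vector_ranking])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Because RRF uses only rank positions, it can combine BM25 scores and vector similarity scores even though the two are on different scales.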

In [None]:
# Perform hybrid search
print("🔍 Hybrid Search Results (keyword + vector):\n")
print("Query: 'Azure search' (text) + vector similarity\n")

try:
    vector_query = VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=5,
        fields="contentVector"
    )
    
    results = search_client.search(
        search_text="Azure search",  # Keyword search
        vector_queries=[vector_query],  # Plus vector search
        select=["id", "title", "category", "author"],
        top=5
    )
    
    results_list = list(results)
    
    if results_list:
        data = []
        for i, doc in enumerate(results_list, 1):
            score = doc.get('@search.score', 0)
            data.append({
                'Rank': i,
                'ID': doc['id'],
                'Title': doc['title'],
                'Category': doc.get('category', 'N/A'),
                'Hybrid Score': f"{score:.4f}"
            })
        
        display(pd.DataFrame(data))
        print("\n💡 Hybrid search combines keyword relevance with semantic similarity using RRF.")
    else:
        print("No results found.")
        
except Exception as e:
    print(f"❌ Error performing hybrid search: {e}")

## 11. Compare Search Methods

Compare results from keyword-only, vector-only, and hybrid search.
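
One simple way to quantify how much two search methods agree is the Jaccard overlap of their top-k document IDs. The helper below is a minimal sketch. `jaccard_overlap` is a hypothetical name and the ID lists are illustrative, not output from the simulator.

```python
def jaccard_overlap(ids_a, ids_b):
    """Share of distinct IDs that appear in both result lists (0.0 to 1.0)."""
    set_a, set_b = set(ids_a), set(ids_b)
    if not set_a and not set_b:
        return 1.0  # two empty result sets are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)

# Illustrative top-3 ID lists for two search methods
keyword_ids = ["1", "2", "3"]
vector_ids = ["2", "3", "4"]
print(jaccard_overlap(keyword_ids, vector_ids))
```

A value near 1.0 means the methods return largely the same documents. A low value suggests that hybrid search has more distinct candidates to fuse.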

In [None]:
# Compare different search methods
search_query = "machine learning AI"

print(f"🔍 Comparing search methods for: '{search_query}'\n")

# 1. Keyword Search
print("1️⃣ Keyword Search (BM25):")
try:
    results = search_client.search(
        search_text=search_query,
        select=["id", "title"],
        top=3
    )
    for doc in results:
        print(f"   - [{doc['id']}] {doc['title']} (score: {doc.get('@search.score', 0):.4f})")
except Exception as e:
    print(f"   Error: {e}")

print()

# 2. Vector Search
print("2️⃣ Vector Search (Semantic):")
try:
    vector_query = VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=3,
        fields="contentVector"
    )
    results = search_client.search(
        search_text=None,
        vector_queries=[vector_query],
        select=["id", "title"],
        top=3
    )
    for doc in results:
        print(f"   - [{doc['id']}] {doc['title']} (score: {doc.get('@search.score', 0):.4f})")
except Exception as e:
    print(f"   Error: {e}")

print()

# 3. Hybrid Search
print("3️⃣ Hybrid Search (RRF):")
try:
    vector_query = VectorizedQuery(
        vector=query_vector,
        k_nearest_neighbors=3,
        fields="contentVector"
    )
    results = search_client.search(
        search_text=search_query,
        vector_queries=[vector_query],
        select=["id", "title"],
        top=3
    )
    for doc in results:
        print(f"   - [{doc['id']}] {doc['title']} (score: {doc.get('@search.score', 0):.4f})")
except Exception as e:
    print(f"   Error: {e}")

## 12. Cleanup (Optional)

Delete all resources created during this demo.

In [None]:
# Set cleanup = True and run this cell to clean up all resources
# WARNING: This will delete the index, indexer, skillset, and data source!

cleanup = False  # Set to True to enable cleanup

if cleanup:
    print("🧹 Cleaning up resources...")
    
    # Delete indexer first
    try:
        url = f"{SEARCH_ENDPOINT}/indexers/{INDEXER_NAME}?api-version=2024-07-01"
        # raise_for_status() so a failed delete is reported instead of printed as success
        session.delete(url, headers=headers).raise_for_status()
        print(f"   ✅ Deleted indexer: {INDEXER_NAME}")
    except Exception as e:
        print(f"   ⚠️ Could not delete indexer: {e}")
    
    # Delete skillset
    try:
        url = f"{SEARCH_ENDPOINT}/skillsets/{SKILLSET_NAME}?api-version=2024-07-01"
        session.delete(url, headers=headers).raise_for_status()
        print(f"   ✅ Deleted skillset: {SKILLSET_NAME}")
    except Exception as e:
        print(f"   ⚠️ Could not delete skillset: {e}")
    
    # Delete data source
    try:
        indexer_client.delete_data_source_connection(DATA_SOURCE_NAME)
        print(f"   ✅ Deleted data source: {DATA_SOURCE_NAME}")
    except Exception as e:
        print(f"   ⚠️ Could not delete data source: {e}")
    
    # Delete index
    try:
        index_client.delete_index(INDEX_NAME)
        print(f"   ✅ Deleted index: {INDEX_NAME}")
    except Exception as e:
        print(f"   ⚠️ Could not delete index: {e}")
    
    print("\n✅ Cleanup complete!")
else:
    print("ℹ️ Cleanup skipped. Set cleanup = True to delete resources.")

## Summary

This notebook demonstrated:

| Feature | Status | Notes |
|---------|--------|-------|
| Vector Index | ✅ | Created index with `Collection(Edm.Single)` vector field |
| HNSW Configuration | ✅ | Configured HNSW algorithm for vector search |
| Embedding Skillset | ✅ | Azure OpenAI Embedding skill for generating vectors |
| Indexer with Skills | ✅ | Processed documents and generated embeddings |
| Vector Search | ✅ | Semantic similarity search using embeddings |
| Hybrid Search | ✅ | Combined keyword + vector with RRF |

### Key Learnings

1. **Vector Fields**: Use `Collection(Edm.Single)` with `vector_search_dimensions` for embeddings
2. **HNSW Algorithm**: Configure `m`, `efConstruction`, and `efSearch` for performance tuning
3. **Embedding Skill**: Azure OpenAI generates 1536-dimensional embeddings (text-embedding-3-small)
4. **Hybrid Search**: RRF combines keyword relevance with semantic similarity
5. **Output Field Mappings**: Map skillset outputs (embeddings) to index vector fields

The Azure AI Search Simulator successfully replicates the vector search and embedding functionality of Azure AI Search!