# Lab 4: Indexing for Performance (HNSW)

**Estimated Time:** 5 minutes

---

## Step 1: Check the Baseline (No Index)

In [None]:
with connection.cursor() as cursor:
    cursor.execute("""
        EXPLAIN PLAN FOR
        SELECT c.chunk_id
        FROM city_knowledge_chunks c
        ORDER BY VECTOR_DISTANCE(c.embedding,
            VECTOR_EMBEDDING(doc_model USING 'bridge inspection results'),
            COSINE)
        FETCH APPROXIMATE FIRST 5 ROWS ONLY
    """)

    cursor.execute("""
        SELECT plan_table_output
        FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', NULL, 'BASIC'))
    """)
    print("=== QUERY PLAN (NO INDEX) ===\n")
    for row in cursor.fetchall():
        print(row[0])

## Step 2: Create an HNSW Vector Index

In [None]:
with connection.cursor() as cursor:
    cursor.execute("""
        CREATE VECTOR INDEX knowledge_chunks_hnsw_idx
        ON city_knowledge_chunks (embedding)
        ORGANIZATION INMEMORY NEIGHBOR GRAPH
        DISTANCE COSINE
        WITH TARGET ACCURACY 95
    """)

print("HNSW index created successfully.")

In [None]:
with connection.cursor() as cursor:
    cursor.execute("""
        EXPLAIN PLAN FOR
        SELECT c.chunk_id
        FROM city_knowledge_chunks c
        ORDER BY VECTOR_DISTANCE(c.embedding,
            VECTOR_EMBEDDING(doc_model USING 'bridge inspection results'),
            COSINE)
        FETCH APPROXIMATE FIRST 5 ROWS ONLY
    """)

    cursor.execute("""
        SELECT plan_table_output
        FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', NULL, 'BASIC'))
    """)
    print("=== QUERY PLAN (WITH HNSW INDEX) ===\n")
    for row in cursor.fetchall():
        print(row[0])
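
To confirm programmatically that the optimizer picked up the new index, one option is to scan the plan text for the index name. The sketch below is a minimal helper under stated assumptions: `plan_uses_index` is a hypothetical name, it expects the plan rows as a list of strings, and it matches only on the index name because the exact operation label shown in the plan may differ between database versions.

```python
def plan_uses_index(plan_lines, index_name):
    """Return True if any plan line mentions index_name (case-insensitive)."""
    needle = index_name.upper()
    # Plan rows can be None or empty, so normalise before comparing.
    return any(needle in (line or "").upper() for line in plan_lines)


# Usage inside the lab (assumes the DBMS_XPLAN.DISPLAY query was just executed
# on `cursor` and its rows have not been fetched yet):
#   lines = [row[0] for row in cursor.fetchall()]
#   print(plan_uses_index(lines, "knowledge_chunks_hnsw_idx"))
```

If the helper returns `False` after the index has been created, the plan is likely still a full table scan, and the index definition or the `FETCH APPROXIMATE` clause is worth rechecking.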

In [None]:
print("=== SEARCH WITH HNSW INDEX ===\n")
run_query("""
    SELECT c.chunk_id,
           SUBSTR(kb.title, 1, 55) AS doc_title,
           ROUND(VECTOR_DISTANCE(c.embedding,
               VECTOR_EMBEDDING(doc_model USING 'bridge shaking'),
               COSINE), 4) AS distance
    FROM city_knowledge_chunks c
    JOIN city_knowledge_base kb ON c.doc_id = kb.doc_id
    ORDER BY distance
    FETCH APPROXIMATE FIRST 5 ROWS ONLY
""")

## Step 3: Understanding Key Parameters

In [None]:
print("=== HNSW INDEX DETAILS ===\n")
run_query("""
    SELECT index_name,
           index_type,
           status
    FROM user_indexes
    WHERE index_name = 'KNOWLEDGE_CHUNKS_HNSW_IDX'
""")

In [None]:
print("""
=== HNSW KEY PARAMETERS ===

BUILD-TIME:
  NEIGHBORS (M)     - Max connections per node. Higher = better recall, more memory. (typical: 16-64)
  EFCONSTRUCTION    - Build-time search effort. Higher = better quality, slower build. (typical: 100-300)

QUERY-TIME:
  TARGET ACCURACY   - Speed vs. recall trade-off. 95 = on average, about 95% of the
                      true top-k neighbors are returned. The value set at index
                      creation is the default and can be overridden per query.

HNSW vs. IVF:
  HNSW  - Better for low-latency queries, < 10M vectors. The graph is held in memory.
  IVF   - Better for very large datasets (10M+), lower memory.
""")

Your vector search is indexed and production-ready. **Proceed to the next lab.**