<div id="singlestore-header" style="display: flex; background-color: rgba(209, 153, 255, 0.25); padding: 5px;">
    <div id="icon-image" style="width: 90px; height: 90px;">
        <img width="100%" height="100%" src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/header-icons/vector-circle.png" />
    </div>
    <div id="text" style="padding: 5px; margin-left: 10px;">
        <div id="badge" style="display: inline-block; background-color: rgba(0, 0, 0, 0.15); border-radius: 4px; padding: 4px 8px; align-items: center; margin-top: 6px; margin-bottom: -2px; font-size: 80%">SingleStore Notebooks</div>
        <h1 style="font-weight: 500; margin: 8px 0 0 4px;">Similarity Search on Vector Data</h1>
    </div>
</div>

<div class="alert alert-block alert-warning">
    <b class="fa fa-solid fa-exclamation-circle"></b>
    <div>
        <p><b>Note</b></p>
        <p>This notebook can be run on a Free Starter Workspace. To create a Free Starter Workspace, navigate to <tt>Start</tt> using the left nav. You can also use an existing Standard or Premium workspace with this notebook.</p>
    </div>
</div>

## What's in this notebook:

1. Create and use a database.
2. Create a table to hold vector data and load data.
3. Search based on vector similarity.
4. Search using metadata filtering.
5. Create and use a vector index.
6. Check that your query is using a vector index.
7. Clean up.

## Questions?

Reach out to us through our [forum](https://www.singlestore.com/forum).

## 1. Create and use a database.

To use this notebook, you need an active workspace and a selected database. Select a database using the dropdown above.

## 2. Create a table to hold vector data and load data.

The SQL below creates a table to hold comments as one might find on a restaurant review site. The table contains the comment itself stored as a <code>TEXT</code> column and a vector embedding of that comment stored as a <code>VECTOR</code> ([Vector Type](https://docs.singlestore.com/cloud/vectors/vector-type)) column. [Working with Vector Data](https://docs.singlestore.com/cloud/vectors/working-with-vector-data/) provides more details on this example and information about similarity search over vectors.

In [1]:
%%sql
CREATE TABLE comments(id INT NOT NULL PRIMARY KEY,
   comment TEXT,
   comment_embedding VECTOR(4) NOT NULL,
   category VARCHAR(256));

In [2]:
%%sql
INSERT INTO comments VALUES
      (1, "The cafeteria in building 35 has a great salad bar",
       '[0.2, 0.11, 0.37, 0.05]',
       "Food"),
      (2, "I love the taco bar in the B16 cafeteria.",
       '[0,0.800000012,0.150000006,0]',
       "Food"),
      (3, "The B24 restaurant salad bar is quite good.",
       '[0.1, 0.15, 0.37, 0.05]',
       "Food");

### Verify the data was loaded

Use the following SQL to view the data in the <code>comments</code> table.

In [3]:
%%sql
SELECT * FROM comments;
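
The `comment_embedding` values inserted above are strings containing JSON arrays, and the column is declared as <code>VECTOR(4)</code>, so each array is expected to hold exactly four numbers. The Python sketch below shows one way such literals could be built and checked on the client before they are placed in an `INSERT`. The helper name `to_vector_literal` is hypothetical and is not part of any SingleStore client library.

```python
import json

DIMENSIONS = 4  # matches VECTOR(4) in the CREATE TABLE statement


def to_vector_literal(values, dimensions=DIMENSIONS):
    """Return a JSON array string for `values`, checking the element count.

    Hypothetical helper for illustration only.
    """
    values = [float(v) for v in values]
    if len(values) != dimensions:
        raise ValueError(f"expected {dimensions} elements, got {len(values)}")
    return json.dumps(values)


print(to_vector_literal([0.2, 0.11, 0.37, 0.05]))
# prints: [0.2, 0.11, 0.37, 0.05]
```

Passing a list of the wrong length, for example `to_vector_literal([1, 2, 3])`, raises `ValueError` in this sketch instead of deferring the problem to the database.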

## 3. Search based on vector similarity.

To find the vectors most similar to a query vector, use an <code>ORDER BY… LIMIT…</code> query. The <code>ORDER BY</code> clause sorts the rows by a similarity score produced by a vector similarity function, with the closest matches at the top.

The SQL below sets up a query vector, then uses the <code>DOT_PRODUCT</code> infix operator (<code><\*></code>) to find the two vectors that are most similar to the query vector.

In [4]:
%%sql
SET @query_vec = ('[0.09, 0.14, 0.5, 0.05]'):>VECTOR(4):>BLOB;

SELECT id, comment, category,
         comment_embedding <*> @query_vec AS score
    FROM comments
    ORDER BY score DESC
    LIMIT 2;
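
To make the scoring concrete, the Python sketch below reproduces the same ranking outside the database: it computes the dot product of each sample embedding with the query vector, sorts by score in descending order, and keeps the top two. The function names `dot_product` and `top_k` are illustrative only. The sketch uses Python's 64-bit floats, so the trailing digits of the scores may differ slightly from those returned by SingleStore.

```python
# Sample rows copied from the INSERT statement above: (id, embedding).
comments = [
    (1, [0.2, 0.11, 0.37, 0.05]),
    (2, [0.0, 0.8, 0.15, 0.0]),
    (3, [0.1, 0.15, 0.37, 0.05]),
]
query_vec = [0.09, 0.14, 0.5, 0.05]


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def top_k(rows, query, k):
    """Score every row, sort by score descending, keep the first k."""
    scored = [(row_id, dot_product(vec, query)) for row_id, vec in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


for row_id, score in top_k(comments, query_vec, k=2):
    print(row_id, round(score, 4))
# prints:
# 1 0.2209
# 3 0.2175
```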

## 4. Search using metadata filtering.

When building vector search applications, you may wish to filter on the fields of a record, with simple filters or via joins, in addition to applying vector similarity operations.

The following query combines an <code>ORDER BY ... LIMIT</code> query with a metadata filter on <code>category</code>. It keeps only the comments in the category <code>"Food"</code>, calculates a score for each of them, and returns the top three in descending order of score.

In [5]:
%%sql
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]'):>VECTOR(4):>BLOB;

SELECT id, comment, category,
         comment_embedding <*> @query_vec AS score
    FROM comments
    WHERE category = "Food"
    ORDER BY score DESC
    LIMIT 3;
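
The filter-then-rank logic of this query can be sketched in plain Python as follows. This is an illustration of the semantics only, not of how SingleStore executes the query. Row 4 is a hypothetical row that is not in the notebook's data; it is added so that the filter has something to exclude. The function name `filtered_search` is likewise illustrative.

```python
rows = [
    {"id": 1, "category": "Food", "embedding": [0.2, 0.11, 0.37, 0.05]},
    {"id": 2, "category": "Food", "embedding": [0.0, 0.8, 0.15, 0.0]},
    {"id": 3, "category": "Food", "embedding": [0.1, 0.15, 0.37, 0.05]},
    # Hypothetical row, NOT in the notebook's table; shows the filter's effect.
    {"id": 4, "category": "Parking", "embedding": [0.5, 0.5, 0.5, 0.5]},
]
query_vec = [0.44, 0.554, 0.34, 0.62]


def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def filtered_search(rows, query, category, k):
    """Keep rows in `category`, score them, return the top k (id, score)."""
    candidates = [r for r in rows if r["category"] == category]
    scored = [(r["id"], dot_product(r["embedding"], query)) for r in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


print([row_id for row_id, _ in filtered_search(rows, query_vec, "Food", k=3)])
# prints: [2, 1, 3]
```

Without the filter, the hypothetical row 4 would have the highest dot product with this query vector; because the category test runs first, it never reaches the ranking step.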

## 5. Create and use a vector index.

The command below creates a vector index named <code>ivf</code>, of type <code>IVF_FLAT</code>, on the <code>comment_embedding</code> column of the <code>comments</code> table.

In [6]:
%%sql
ALTER TABLE comments ADD VECTOR INDEX ivf(comment_embedding)
INDEX_OPTIONS '{"index_type":"IVF_FLAT"}';

Optionally optimize the table for best performance.

In [7]:
%%sql
OPTIMIZE TABLE comments FULL;

The following query will use the vector index. Vector indexes can be used to improve performance of queries over large vector data sets. Refer to [Vector Indexing](https://docs.singlestore.com/cloud/vectors/vector-indexing/) for information on creating and using vector indexes.

In [8]:
%%sql
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]'):>VECTOR(4):>BLOB;

SELECT id, comment, category,
         comment_embedding <*> @query_vec AS score
    FROM comments
    ORDER BY score DESC
    LIMIT 2;
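
The `IVF_FLAT` index type refers to an inverted-file index. The toy Python sketch below illustrates the general idea only and is not SingleStore's implementation. It assumes two hand-picked centroids (a real index would learn them from the data) and, as a simplification, uses the dot product both to assign vectors to centroids and to score candidates. The parameter name `nprobe` is borrowed from common IVF terminology. Vectors are grouped under their best-matching centroid at build time, and a query scans only the `nprobe` best-matching groups rather than every row.

```python
def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def nearest_centroids(vec, centroids, n):
    """Indexes of the n centroids with the highest dot product against vec."""
    order = sorted(
        range(len(centroids)),
        key=lambda i: dot_product(vec, centroids[i]),
        reverse=True,
    )
    return order[:n]


def build_index(rows, centroids):
    """Place each (id, vector) row in the list of its best-matching centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for row_id, vec in rows:
        best = nearest_centroids(vec, centroids, 1)[0]
        lists[best].append((row_id, vec))
    return lists


def search(lists, centroids, query, k, nprobe=1):
    """Scan only the nprobe best-matching lists, then rank those rows."""
    candidates = []
    for i in nearest_centroids(query, centroids, nprobe):
        candidates.extend(lists[i])
    scored = [(row_id, dot_product(vec, query)) for row_id, vec in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


rows = [
    (1, [0.2, 0.11, 0.37, 0.05]),
    (2, [0.0, 0.8, 0.15, 0.0]),
    (3, [0.1, 0.15, 0.37, 0.05]),
]
# Hand-picked centroids for illustration; a real index would learn them.
centroids = [[0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
query_vec = [0.44, 0.554, 0.34, 0.62]

index = build_index(rows, centroids)
print([i for i, _ in search(index, centroids, query_vec, k=2, nprobe=1)])
# prints: [2]
print([i for i, _ in search(index, centroids, query_vec, k=2, nprobe=2)])
# prints: [2, 1]
```

With `nprobe=1` the toy search scans a single list and misses row 1, while `nprobe=2` scans both lists and matches an exhaustive search. This shows the trade-off that makes index-based search approximate: scanning fewer lists is cheaper but can skip relevant rows.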

## 6. Check that your query is using a vector index.

Use the <code>EXPLAIN</code> command to view the query plan and verify that the vector index is being used. In the plan for the example below, <code>INTERNAL_VECTOR_SEARCH</code> appears in the <code>ColumnStoreFilter</code> row, which indicates that the vector index is being used.

In [9]:
%%sql
SET @query_vec = ('[0.09, 0.14, 0.5, 0.05]'):>VECTOR(4):>BLOB;

EXPLAIN
SELECT id, comment, category,
         comment_embedding <*> @query_vec AS score
    FROM comments
    ORDER BY score DESC
    LIMIT 2;

## 7. Clean up.

The command below will drop the table created as part of this notebook. Dropping this table will allow you to rerun the notebook from the beginning.

In [10]:
%%sql
DROP TABLE comments;

<div id="singlestore-footer" style="background-color: rgba(194, 193, 199, 0.25); height:2px; margin-bottom:10px"></div>
<div><img src="https://raw.githubusercontent.com/singlestore-labs/spaces-notebooks/master/common/images/singlestore-logo-grey.png" style="padding: 0px; margin: 0px; height: 24px"/></div>