# Vector Similarity
Vector Similarity Search (VSS) finds the vectors in a vector database that are most similar to a given query vector. Common uses include recommendation systems, image and video search, document retrieval, and question answering.

## Index Creation
Before doing vector search, define the schema and create an index.

In [1]:
import redis

from redis.commands.search.indexDefinition import (
    IndexDefinition,
    IndexType
)
from redis.commands.search.query import Query
from redis.commands.search.field import (
    TagField,
    VectorField
)

r = redis.Redis(host="localhost", port=6379)

INDEX_NAME = "index"
VECTOR_DIMENSIONS = 1536
DOC_PREFIX = "doc:"

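# Schema: a tag field for filtering plus a vector field for similarity search
schema = (
    TagField("tag"),
    VectorField("vector",
        "FLAT", {                       # FLAT is the brute-force index type
            "TYPE": "FLOAT32",
            "DIM": VECTOR_DIMENSIONS,   # must match the length of every stored vector
            "DISTANCE_METRIC": "COSINE",
        }
    ),
)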

# Index Definition
definition = IndexDefinition(prefix=[DOC_PREFIX], index_type=IndexType.HASH)

# Create Index
r.ft(INDEX_NAME).create_index(fields=schema, definition=definition)

b'OK'

## Adding Vectors to Redis

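Next, add some vectors (random dummy data) to Redis using `hset`. Each vector is written as the raw bytes of a FLOAT32 array whose length matches the `DIM` value in the schema. The search index listens to keyspace notifications and automatically indexes any HASH written under a key that starts with `DOC_PREFIX`.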

In [2]:
#pip install numpy
import numpy as np

In [None]:
r.hset(f"{DOC_PREFIX}a", mapping={
    "vector": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes(),
    "tag": "foo"
})
r.hset(f"{DOC_PREFIX}b", mapping={
    "vector": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes(),
    "tag": "foo"
})
r.hset(f"{DOC_PREFIX}c", mapping={
    "vector": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes(),
    "tag": "bar"
})
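To make the byte layout concrete, here is a minimal sketch of what the `vector` field holds: four bytes per component, so `VECTOR_DIMENSIONS * 4` bytes in total. It uses the standard-library `struct` module instead of NumPy so that it runs without extra dependencies. The helper names `to_float32_blob` and `from_float32_blob` are hypothetical, and little-endian byte order is an assumption that matches what NumPy produces on common platforms.

```python
import struct

VECTOR_DIMENSIONS = 1536  # same value as in the index schema


def to_float32_blob(values):
    """Pack a sequence of floats into little-endian FLOAT32 bytes (assumed layout)."""
    return struct.pack(f"<{len(values)}f", *values)


def from_float32_blob(blob):
    """Unpack FLOAT32 bytes back into a list of Python floats."""
    count = len(blob) // 4  # 4 bytes per FLOAT32 component
    return list(struct.unpack(f"<{count}f", blob))


vector = [0.5] * VECTOR_DIMENSIONS  # 0.5 is exactly representable in FLOAT32
blob = to_float32_blob(vector)
print(len(blob))  # 6144
```

A blob whose length differs from `DIM * 4` bytes does not match the schema, so checking `len(blob)` before calling `hset` is a cheap sanity test.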

## Searching
You can run VSS queries with the `.ft(...).search(...)` query command. A VSS query requires query dialect 2 or greater, which you select with `.dialect(2)`.

There are two supported types of vector queries in Redis: `KNN` and `Range`. `Hybrid` queries can work in both settings and combine elements of traditional search and VSS.
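The query strings used in the rest of this section follow two patterns, and a hybrid query is the KNN pattern with a filter in place of `*`. The sketch below only assembles those strings; the helpers `knn_query_string` and `range_query_string` are hypothetical and not part of redis-py. The field, parameter, and alias names default to the ones used in this notebook.

```python
def knn_query_string(k, prefilter="*", field="vector", param="vec", alias="score"):
    """Build '<prefilter>=>[KNN k @field $param as alias]'.

    prefilter="*" matches every document (plain KNN); any other
    filter expression turns the query into a hybrid query.
    """
    return f"{prefilter}=>[KNN {k} @{field} ${param} as {alias}]"


def range_query_string(field="vector", radius_param="radius", param="vec", alias="score"):
    """Build '@field:[VECTOR_RANGE $radius $param]=>{$YIELD_DISTANCE_AS: alias}'."""
    return (
        f"@{field}:[VECTOR_RANGE ${radius_param} ${param}]"
        f"=>{{$YIELD_DISTANCE_AS: {alias}}}"
    )


print(knn_query_string(2))
print(knn_query_string(2, prefilter="(@tag:{ foo })"))
print(range_query_string())
```

The `$vec` and `$radius` placeholders are filled in at query time through the `query_params` argument of `search`.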

### KNN Queries
KNN queries find the K vectors most similar to a given query vector (the top K).

In [4]:
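query = (
    # Match all documents (*) and keep the 2 nearest neighbors of $vec;
    # the distance to the query vector is exposed as "score"
    Query("*=>[KNN 2 @vector $vec as score]")
     .sort_by("score")              # ascending: smallest distance first
     .return_fields("id", "score")
     .paging(0, 2)
     .dialect(2)
)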

query_params = {
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()
}
r.ft(INDEX_NAME).search(query, query_params).docs

[Document {'id': 'doc:c', 'payload': None, 'score': '0.244824767113'},
 Document {'id': 'doc:b', 'payload': None, 'score': '0.25022560358'}]
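To show what the `score` values mean, here is a brute-force sketch of a KNN search in plain Python. It assumes the `COSINE` metric reports cosine distance, that is, 1 minus the cosine similarity, so smaller scores mean more similar vectors. The functions `cosine_distance` and `knn` and the two-dimensional toy vectors are hypothetical illustrations; this is not how Redis implements the search internally.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn(docs, query_vec, k):
    """Return the k (doc_id, distance) pairs closest to query_vec."""
    scored = [(doc_id, cosine_distance(vec, query_vec)) for doc_id, vec in docs.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Toy 2-dimensional vectors standing in for the 1536-dimensional ones
docs = {
    "doc:a": [1.0, 0.0],
    "doc:b": [0.0, 1.0],
    "doc:c": [1.0, 1.0],
}
top2 = knn(docs, [1.0, 0.0], k=2)
print(top2)
```

With the query vector `[1.0, 0.0]`, `doc:a` points in the same direction (distance 0) and `doc:c` is the next closest, so those two are returned in that order.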

### Range Queries
Range queries provide a way to filter results by the distance between a vector field in Redis and a query vector based on some pre-defined threshold (radius).

In [5]:
query = (
    Query("@vector:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
     .sort_by("score")
     .return_fields("id", "score")
     .paging(0, 3)
     .dialect(2)
)

# Find vectors within a cosine distance of 0.8 from the query vector
query_params = {
    "radius": 0.8,
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()
}
r.ft(INDEX_NAME).search(query, query_params).docs

[Document {'id': 'doc:a', 'payload': None, 'score': '0.245782673359'},
 Document {'id': 'doc:b', 'payload': None, 'score': '0.25076341629'},
 Document {'id': 'doc:c', 'payload': None, 'score': '0.2565549016'}]

See additional Range Query examples in [this Jupyter notebook](https://github.com/RediSearch/RediSearch/blob/master/docs/docs/vecsim-range_queries_examples.ipynb).
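As a rough illustration of the difference from KNN, the following sketch filters by distance instead of by count: every vector within the radius is returned, however many there are. The function names and toy vectors are hypothetical, the distance is assumed to be cosine distance (1 minus cosine similarity), and treating the radius as inclusive is also an assumption.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def vector_range(docs, query_vec, radius):
    """Return every (doc_id, distance) pair with distance <= radius, nearest first."""
    hits = []
    for doc_id, vec in docs.items():
        distance = cosine_distance(vec, query_vec)
        if distance <= radius:  # inclusive bound is an assumption
            hits.append((doc_id, distance))
    hits.sort(key=lambda pair: pair[1])
    return hits


docs = {
    "doc:a": [1.0, 0.0],
    "doc:b": [0.0, 1.0],
    "doc:c": [1.0, 1.0],
}
within_half = vector_range(docs, [1.0, 0.0], radius=0.5)
print(within_half)
```

Here `doc:b` is orthogonal to the query (distance 1.0) and falls outside the radius of 0.5, so only `doc:a` and `doc:c` are returned.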

### Hybrid Queries
Hybrid queries combine traditional filters (numeric, tag, text) with VSS in a single Redis command.

In [6]:
query = (
    Query("(@tag:{ foo })=>[KNN 2 @vector $vec as score]")
     .sort_by("score")
     .return_fields("id", "tag", "score")
     .paging(0, 2)
     .dialect(2)
)

query_params = {
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()
}
r.ft(INDEX_NAME).search(query, query_params).docs

[Document {'id': 'doc:b', 'payload': None, 'score': '0.24173271656', 'tag': 'foo'},
 Document {'id': 'doc:a', 'payload': None, 'score': '0.25843077898', 'tag': 'foo'}]

See additional Hybrid Query examples in [this Jupyter notebook](https://github.com/RediSearch/RediSearch/blob/master/docs/docs/vecsim-hybrid_queries_examples.ipynb).
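The logical result of a hybrid query can be sketched as "filter first, then rank by distance". The example below applies a tag filter and then a brute-force KNN over the remaining documents. All names and toy vectors are hypothetical, cosine distance is assumed, and Redis may choose a different execution strategy internally; only the outcome is meant to match.

```python
import math


def cosine_distance(a, b):
    """Cosine distance: 1 - (a . b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def hybrid_knn(docs, tag, query_vec, k):
    """Keep documents whose tag matches, then return the k nearest of those."""
    candidates = {doc_id: doc for doc_id, doc in docs.items() if doc["tag"] == tag}
    scored = [
        (doc_id, cosine_distance(doc["vector"], query_vec))
        for doc_id, doc in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


docs = {
    "doc:a": {"tag": "foo", "vector": [1.0, 0.0]},
    "doc:b": {"tag": "foo", "vector": [0.0, 1.0]},
    "doc:c": {"tag": "bar", "vector": [1.0, 1.0]},
}
result = hybrid_knn(docs, "foo", [1.0, 0.5], k=2)
print(result)
```

In this toy data `doc:c` is the closest vector to the query overall, but it carries the tag `bar`, so the filter removes it and the result contains `doc:a` followed by `doc:b`.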

Find several example apps, tutorials, and projects [in this GitHub organization](https://github.com/RedisVentures).