## Storing and querying for embeddings with Redis

Here’s the rough architecture:
![Architecture](images/redsis.png)

The diagram shows the following steps:

- A console app retrieves blog post URLs from an RSS feed and reads the posts one by one.
- For each post, it creates an embedding with OpenAI, which yields a 1536-dimensional vector to store in Redis.
- It stores each embedding in a Redis search index, which we create with the Redis Python client.
- It performs a vector search, finding the post vectors closest to the query vector using the HNSW algorithm.
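
The last step can be pictured as a nearest-neighbour lookup. The sketch below is a brute-force version written with the standard library only; the three-dimensional vectors and the `post:*` keys are made up for illustration, and the helper names are hypothetical. An HNSW index approximates this ranking without comparing the query against every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k=2):
    """Return the k keys whose vectors are most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda key: cosine_similarity(query, vectors[key]),
        reverse=True,
    )
    return ranked[:k]

# Toy data: real embeddings have 1536 dimensions, these have 3.
posts = {
    "post:0": [1.0, 0.0, 0.0],
    "post:1": [0.9, 0.1, 0.0],
    "post:2": [0.0, 1.0, 0.0],
}

nearest = top_k([1.0, 0.05, 0.0], posts, k=2)
print(nearest)
```

Brute force costs one comparison per stored vector, which is fine for a handful of posts but is the reason an approximate index is used at scale.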

To get started, run Redis Stack in Docker with the following command in your terminal:

`docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest`

In [None]:
%pip install --quiet -U openai redis feedparser numpy

### Storing post data in Redis hashes

We will create one Redis hash per post. Hashes are records structured as collections of field-value pairs. Each hash we store has the following fields:

- url: the URL of the blog post
- embedding: the embedding of the blog post (a vector), created with the OpenAI embeddings API and the text-embedding-ada-002 model

We need the URL to retrieve the entire post after the closest match has been found. In Pinecone, the URL would be metadata attached to the vector. In Redis, it’s just a field in a hash, like the vector itself.

In [18]:
import feedparser
import numpy as np
from openai import AzureOpenAI
import redis
from redis.commands.search.field import VectorField, TextField
from redis.commands.search.query import Query
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
import os
from dotenv import load_dotenv

load_dotenv()

# Redis connection details (the port falls back to 6379 if REDIS_PORT is unset)
redis_host = os.getenv('REDIS_HOST')
redis_port = int(os.getenv('REDIS_PORT', '6379'))
redis_password = os.getenv('REDIS_PASSWORD')

# Connect to the Redis server
conn = redis.Redis(host=redis_host,
                   port=redis_port,
                   password=redis_password)
client = AzureOpenAI(
    api_key = os.getenv("AZURE_OPENAI_API_KEY"),  
    api_version=os.getenv("OPENAI_API_VERSION"),
    azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT")
)

SCHEMA = [
    TextField("url"),
    VectorField("embedding", "HNSW", {"TYPE": "FLOAT32", "DIM": 1536, "DISTANCE_METRIC": "COSINE"}),
]

# Create the index (this raises if the index already exists)
try:
    conn.ft("posts").create_index(fields=SCHEMA, definition=IndexDefinition(prefix=["post:"], index_type=IndexType.HASH))
except Exception as e:
    print("Index not created (it may already exist):", e)

# URL of the RSS feed to parse
url = 'https://devblogs.microsoft.com/landingpage/'

# Parse the RSS feed with feedparser
feed = feedparser.parse(url)

# get number of entries in feed
entries = len(feed.entries)
print("Number of entries: ", entries)

p = conn.pipeline(transaction=False)
for i, entry in enumerate(feed.entries[:50]):
    # report progress
    print("Create embedding and save for entry ", i, " of ", entries)
    
    article = entry.description

    embedding = client.embeddings.create(
        input=article,
        model="text-embedding-ada-002"
    )

    # extract the embedding vector (length = 1536)
    vector = embedding.data[0].embedding

    # convert to float32 bytes, the format Redis expects for vector fields
    vector = np.array(vector).astype(np.float32).tobytes()

    # Create a new hash with the URL and embedding
    post_hash = {
        "url": entry.link,
        "embedding": vector
    }
    
    # queue the HSET on the pipeline so p.execute() sends the writes
    # (add_document() is deprecated)
    p.hset(name=f"post:{i}", mapping=post_hash)

p.execute()

print("Vector upload complete.")

Number of entries:  10
Create embedding and save for entry  0  of  10
Create embedding and save for entry  1  of  10
Create embedding and save for entry  2  of  10
Create embedding and save for entry  3  of  10
Create embedding and save for entry  4  of  10
Create embedding and save for entry  5  of  10
Create embedding and save for entry  6  of  10
Create embedding and save for entry  7  of  10
Create embedding and save for entry  8  of  10
Create embedding and save for entry  9  of  10
Vector upload complete.
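
Redis stores a vector field as a raw byte string, which is why the cell above calls `.tobytes()` on a float32 array. The following is a minimal standard-library sketch of that encoding and its inverse. The helper names `to_float32_bytes` and `from_float32_bytes` are hypothetical, and the sample values are chosen to be exactly representable in float32 so the round trip is lossless.

```python
from array import array

def to_float32_bytes(vector):
    """Pack a list of Python floats as float32 values in native byte order."""
    return array("f", vector).tobytes()

def from_float32_bytes(blob):
    """Unpack float32 bytes back into a list of Python floats."""
    values = array("f")
    values.frombytes(blob)
    return values.tolist()

# 1536 values, matching the dimension of the embeddings used in this notebook.
vector = [0.5, -0.25, 1.0] * 512
blob = to_float32_bytes(vector)

print(len(blob))  # 4 bytes per dimension
restored = from_float32_bytes(blob)
print(restored == vector)
```

Arbitrary embedding values lose some precision when narrowed from Python's 64-bit floats to float32, so an exact round trip should not be expected for real model output.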


### Redis vector queries

With the hashes and the index created, we can now perform a similarity search. We start from a natural-language query string, vectorize it the same way as the posts, and return the most similar posts ranked by score.


In [19]:
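def search_vectors(query_vector, client, top_k=5):
    # client is the Redis connection; top_k is the number of nearest neighbours to return
    base_query = f"*=>[KNN {top_k} @embedding $vector AS vector_score]"
    query = Query(base_query).return_fields("url", "vector_score").sort_by("vector_score").paging(0, top_k).dialect(2)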

    try:
        results = client.ft("posts").search(query, query_params={"vector": query_vector})
    except Exception as e:
        print("Error calling Redis search: ", e)
        return None

    return results


if conn.ping():
    print("Connected to Redis")

# Enter a query
query = "Microsoft"

# Vectorize the query using OpenAI's text-embedding-ada-002 model
print("Vectorizing query...")
embedding = client.embeddings.create(input=query, model="text-embedding-ada-002")
query_vector = embedding.data[0].embedding

# Convert the vector to float32 bytes, matching the stored embeddings
query_vector = np.array(query_vector).astype(np.float32).tobytes()

# Perform the similarity search
print("Searching for similar posts...")
results = search_vectors(query_vector, conn)

if results:
    print(f"Found {results.total} results:")
    for i, post in enumerate(results.docs):
        score = 1 - float(post.vector_score)
        print(f"\t{i}. {post.url} (Score: {round(score, 3)})")
else:
    print("No results found")

Connected to Redis
Vectorizing query...
Searching for similar posts...
Found 5 results:
	0. https://devblogs.microsoft.com/identity/eng-connect-jun-24 (Score: 0.825)
	1. https://devblogs.microsoft.com/visualstudio/automatically-install-visual-studio-security-updates-through-microsoft-update (Score: 0.816)
	2. https://devblogs.microsoft.com/directx/step-forward-for-gaming-on-arm-devices-2024 (Score: 0.81)
	3. https://devblogs.microsoft.com/qsharp/evaluating-cat-qubits-for-fault-tolerant-quantum-computing-using-azure-quantum-resource-estimator (Score: 0.796)
	4. https://devblogs.microsoft.com/ise/empowering-collaboration-with-tech-savvy-customer (Score: 0.785)
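
Two details of the query cell are easy to miss: the number of neighbours is part of the query string, and `vector_score` is a distance rather than a similarity. With the `COSINE` metric, Redis reports `1 - cosine similarity`, which is why the cell prints `1 - vector_score`. The sketch below isolates both pieces without needing a Redis connection; `build_knn_query` and `distance_to_similarity` are hypothetical helper names.

```python
def build_knn_query(top_k, vector_field="embedding", score_alias="vector_score"):
    """Compose a KNN query string; $vector is bound later through query_params."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return f"*=>[KNN {top_k} @{vector_field} $vector AS {score_alias}]"

def distance_to_similarity(distance):
    """Convert a cosine distance (returned as a string) to a similarity."""
    return 1.0 - float(distance)

knn_query = build_knn_query(3)
print(knn_query)

similarity = round(distance_to_similarity("0.175"), 3)
print(similarity)
```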


## Storing and querying for embeddings with Redis Vector Library (RedisVL)
RedisVL is a dedicated Python client library for using Redis as a [Vector Database](https://redis.com/solutions/use-cases/vector-database). It combines the speed and reliability of Redis with vector-based semantic search.

In [2]:
%pip install -q -U redisvl

Note: you may need to restart the kernel to use updated packages.


### Define a schema

Consider a dataset composed of 10k SEC filings PDFs, each broken down into manageable text chunks. Each record in this dataset includes:

- Id: A unique identifier for each PDF chunk.
- Content: The actual text extracted from the PDF.
- Content Embedding: A vector representation of the section’s text.
- Company: The name of the associated company.
- Timestamp: A numeric value representing the last update time.

First, define a schema that models this data’s structure in an index named `sec-filings`. Use a YAML file for convenience:

```yaml
index:
  name: sec-filings
  prefix: chunk
 
fields:
  - name: id
    type: tag
    attrs:
      sortable: true
  - name: content
    type: text
    attrs:
      sortable: true
  - name: company
    type: tag
    attrs:
      sortable: true
  - name: timestamp
    type: numeric
    attrs:
      sortable: true
  - name: content_embedding
    type: vector
    attrs:
      dims: 1536
      algorithm: hnsw
      datatype: float32
      distance_metric: cosine
```
Now, load and validate this schema:

In [23]:
from redisvl.schema import IndexSchema
import yaml

yaml_schema = """
index:
  name: sec-filings
  prefix: chunk
 
fields:
  - name: id
    type: tag
    attrs:
      sortable: true
  - name: content
    type: text
    attrs:
      sortable: true
  - name: company
    type: tag
    attrs:
      sortable: true
  - name: timestamp
    type: numeric
    attrs:
      sortable: true
  - name: content_embedding
    type: vector
    attrs:
      dims: 1536
      algorithm: hnsw
      datatype: float32
      distance_metric: cosine
"""

# Parse the YAML into a plain dict, then build and validate the schema
schema_dict = yaml.safe_load(yaml_schema)
schema = IndexSchema.from_dict(schema_dict)
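
Because `yaml.safe_load` returns a plain dictionary, the schema can be inspected before it is handed to `IndexSchema.from_dict`. The sketch below writes the equivalent dictionary out by hand (so it runs without PyYAML or RedisVL) and uses a hypothetical helper, `vector_field_dims`, to check that the declared vector size matches the 1536 dimensions produced by text-embedding-ada-002.

```python
EMBEDDING_DIMS = 1536  # output size of text-embedding-ada-002

# Hand-written equivalent of the YAML schema above.
schema_as_dict = {
    "index": {"name": "sec-filings", "prefix": "chunk"},
    "fields": [
        {"name": "id", "type": "tag", "attrs": {"sortable": True}},
        {"name": "content", "type": "text", "attrs": {"sortable": True}},
        {"name": "company", "type": "tag", "attrs": {"sortable": True}},
        {"name": "timestamp", "type": "numeric", "attrs": {"sortable": True}},
        {
            "name": "content_embedding",
            "type": "vector",
            "attrs": {
                "dims": 1536,
                "algorithm": "hnsw",
                "datatype": "float32",
                "distance_metric": "cosine",
            },
        },
    ],
}

def vector_field_dims(schema):
    """Map each vector field name to its declared dimensionality."""
    return {
        field["name"]: field["attrs"]["dims"]
        for field in schema["fields"]
        if field["type"] == "vector"
    }

dims = vector_field_dims(schema_as_dict)
mismatched = [name for name, size in dims.items() if size != EMBEDDING_DIMS]
print(dims, mismatched)
```

A mismatch between `dims` and the embedding model's output size is a common reason for records failing to index, so a check like this can be worth running once before creating the index.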

### Create an index
Now we’ll create the index for our dataset by passing a Redis Python client connection to a `SearchIndex`:

In [24]:
import redis
from redisvl.index import SearchIndex
import os
from dotenv import load_dotenv

load_dotenv()

# Establish a connection with Redis
conn = redis.Redis(host=os.getenv('REDIS_HOST'), 
                   port=os.getenv('REDIS_PORT'), 
                   password=os.getenv('REDIS_PASSWORD'))
 
# Link the schema with our Redis client to create the search index
index = SearchIndex(schema, conn)
 
# Create the index in Redis
index.create()

### Simplify embedding generation

The [vectorizer](https://www.redisvl.com/user_guide/vectorizers_04.html) module provides access to popular embedding providers. Below is an example using the Azure OpenAI vectorizer:


In [25]:
from redisvl.utils.vectorize import AzureOpenAITextVectorizer

# create a vectorizer
aoai = AzureOpenAITextVectorizer(
    model="text-embedding-ada-002", # Must be your CUSTOM deployment name
    api_config={
        "api_key": os.getenv("AZURE_OPENAI_API_KEY"),
        "api_version": os.getenv("OPENAI_API_VERSION"),
        "azure_endpoint":  os.getenv("AZURE_OPENAI_ENDPOINT")
    },
)
 
# Generate an embedding for a single query
embedding = aoai.embed(
    "How much debt is the company in?", input_type="search_query"
)
 
# Generate embeddings for multiple queries
embeddings = aoai.embed_many([
    "How much debt is the company in?",
    "What do revenue projections look like?"
], input_type="search_query")

### Load data

Before querying, use the vectorizer to create text embeddings and populate the index with your data. If your dataset is a collection of dictionary objects, the `.load()` method simplifies insertion. It batches upsert operations, efficiently storing your data in Redis and returning the keys for each record:


In [26]:
import uuid

# Example dataset as a list of dictionaries
data = [
    {
        "id": str(uuid.uuid4()),
        "content": "Material Cybersecurity Incidents",
        "company": "Microsoft",
        "timestamp": 20240127,
        "content_embedding": aoai.embed(
            "As disclosed in the Original Filing, the Company detected that beginning in late November 2023, a nation-state threat actor had gained access to and exfiltrated information from a very small percentage of employee email accounts including members of our senior leadership team and employees in our cybersecurity, legal, and other functions.",  input_type="search_document", as_buffer=True
        )
    },
    {
        "id": str(uuid.uuid4()),
        "content": "Cash Flows",
        "company": "Microsoft",
        "timestamp": 20240130,
        "content_embedding": aoai.embed(
            "Cash from operations increased $15.1 billion to $49.4 billion for the six months ended December 31, 2023, mainly due to an increase in cash received from customers and a decrease in cash paid to suppliers. Cash from financing increased $26.8 billion to $4.6 billion for the six months ended December 31, 2023, mainly due to a $25.4 billion increase in proceeds from issuance of debt, net of repayments. Cash used in investing increased $61.1 billion to $71.4 billion for the six months ended December 31, 2023, mainly due to a $65.2 billion increase in cash used for acquisitions of companies, net of cash acquired, and purchases of intangible and other assets.",  input_type="search_document", as_buffer=True
        )
    },    
    # More records...
]
 
# Insert data into the index
keys = index.load(data)
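
Since each record is an ordinary dictionary, it can be sanity-checked before loading. The sketch below is one possible pre-flight check, not part of RedisVL: the helper `record_problems` is hypothetical, and it assumes embeddings were created with `as_buffer=True`, so each one should be 1536 float32 values, or 6144 bytes.

```python
REQUIRED_FIELDS = {"id", "content", "company", "timestamp", "content_embedding"}
EXPECTED_BYTES = 1536 * 4  # 1536 float32 values at 4 bytes each

def record_problems(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    embedding = record.get("content_embedding")
    if isinstance(embedding, bytes) and len(embedding) != EXPECTED_BYTES:
        problems.append(
            f"embedding has {len(embedding)} bytes, expected {EXPECTED_BYTES}"
        )
    return problems

# Placeholder records: zero-filled bytes stand in for real embeddings.
good = {
    "id": "1",
    "content": "Material Cybersecurity Incidents",
    "company": "Microsoft",
    "timestamp": 20240127,
    "content_embedding": bytes(EXPECTED_BYTES),
}
bad = {"id": "2", "content": "Cash Flows", "content_embedding": bytes(8)}

good_problems = record_problems(good)
bad_problems = record_problems(bad)
print(good_problems)
print(bad_problems)
```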

### Run queries

The VectorQuery is a simple abstraction for performing KNN/ANN style vector searches with optional filters. 

Imagine you want to find the 5 PDF chunks most semantically related to a user’s query, such as "What is the cash flow of this company?". First, convert the query into a vector using a text embedding model (see the vectorizer section above). Next, define and execute the query:

In [27]:
from redisvl.query import VectorQuery
 
query = "What is the cash flow of this company?"
 
query_vector = aoai.embed(query, input_type="search_query", as_buffer=True)
 
query = VectorQuery(
    vector=query_vector, 
    vector_field_name="content_embedding",
    num_results=5
)
 
results = index.query(query)

To further refine the search results, you can apply various metadata filters. For example, if you’re interested in documents specifically related to "Microsoft", use a Tag filter on the company field:



In [28]:
from redisvl.query.filter import Tag
 
# Apply a filter for the company name
query.set_filter(Tag("company") == "Microsoft")
 
# Execute the filtered query
results = index.query(query)

Filters allow you to combine searches over structured data (metadata) with vector similarity to improve retrieval precision.
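
To make the idea concrete, the sketch below imitates a filtered vector search over a small in-memory list: it first keeps only records whose `company` matches, then ranks the survivors by cosine distance. The records, the two-dimensional vectors, the "Contoso" company, and the function `filtered_knn` are all made up for illustration; this is not how Redis executes the query internally.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity, so smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def filtered_knn(query, records, company, k=5):
    """Keep records for one company, then return the ids of the k closest."""
    candidates = [r for r in records if r["company"] == company]
    candidates.sort(key=lambda r: cosine_distance(query, r["vector"]))
    return [r["id"] for r in candidates[:k]]

records = [
    {"id": "a", "company": "Microsoft", "vector": [1.0, 0.0]},
    {"id": "b", "company": "Contoso", "vector": [1.0, 0.0]},
    {"id": "c", "company": "Microsoft", "vector": [0.0, 1.0]},
]

hits = filtered_knn([0.9, 0.1], records, company="Microsoft", k=2)
print(hits)
```

Record `b` is as close to the query as `a`, but the metadata filter removes it before ranking, which is the precision gain the filter provides.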

The VectorQuery is just the starting point. For those looking to explore more advanced querying techniques and data types (text, tag, numeric, vector, geo), [this dedicated user guide](https://www.redisvl.com/user_guide/hybrid_queries_02.html) will get you started.

### Boost performance with semantic caching

`redisvl` goes beyond facilitating vector search and query operations in Redis; it aims to showcase practical use cases and common LLM design patterns.

[Semantic Caching](https://www.redisvl.com/user_guide/llmcache_03.html) is designed to boost the efficiency of applications interacting with LLMs by caching responses based on semantic similarity. For example, when similar user queries are presented to the app, previously cached responses can be used instead of processing the query through the model again, significantly reducing response times and API costs.

To do this, use the SemanticCache interface. You can store user queries and response pairs in the semantic cache as follows:


In [29]:
from redisvl.extensions.llmcache import SemanticCache

# Set up the LLM cache
llmcache = SemanticCache(
    name="llmcache",        # underlying search index name
    vectorizer = aoai,      # vectorizer object for embeddings
    redis_client = conn,    # Redis client
    distance_threshold=0.2  # semantic cache distance threshold
)
 
# Cache the question, answer, and arbitrary metadata
llmcache.store(
    prompt="What is the capital city of France?",
    response="Paris",
    metadata={"city": "Paris", "country": "france"}
)

'llmcache:942def08d0b47f03895ce7849f25d1d67a5e4311b9c3dcfdf1b0c878acc34f65'

When a new query is received, its embedding is compared against those in the semantic cache. If a sufficiently similar embedding is found, the corresponding cached response is served, bypassing the need for another expensive LLM computation.

In [30]:
# Check for a semantically similar result
question = "What actually is the capital of France?"
 
llmcache.check(prompt=question)[0]['response']

'Paris'
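
The role of `distance_threshold` can be seen in a toy version of the cache. The sketch below is an in-memory stand-in, not RedisVL's implementation: `ToySemanticCache` is hypothetical, and `toy_embed` replaces a real embedding model with simple word counts so the example runs without any API calls.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm

class ToySemanticCache:
    def __init__(self, distance_threshold=0.2):
        self.distance_threshold = distance_threshold
        self.entries = []  # (embedding, response) pairs

    def store(self, prompt, response):
        self.entries.append((toy_embed(prompt), response))

    def check(self, prompt):
        """Return the closest cached response within the threshold, else None."""
        query = toy_embed(prompt)
        scored = [(cosine_distance(query, emb), resp) for emb, resp in self.entries]
        hits = [item for item in scored if item[0] <= self.distance_threshold]
        return min(hits)[1] if hits else None

cache = ToySemanticCache(distance_threshold=0.2)
cache.store("What is the capital city of France?", "Paris")

hit = cache.check("What actually is the capital of France?")
miss = cache.check("How tall is Mount Everest?")
print(hit, miss)
```

The two France questions share six of their seven words, giving a distance of about 0.14, which falls under the 0.2 threshold. A lower threshold makes the cache stricter and reduces the risk of serving an answer to a question that only looks similar.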

### Inspect with the `rvl` CLI
Use the rvl CLI to inspect the created index and its fields:

In [33]:
!rvl index listall

11:29:43 [RedisVL] INFO   Indices:
11:29:43 [RedisVL] INFO   1. llmcache
11:29:43 [RedisVL] INFO   2. sec-filings


In [34]:
!rvl index info -i sec-filings



Index Information:
╭──────────────┬────────────────┬────────────┬─────────────────┬────────────╮
│ Index Name   │ Storage Type   │ Prefixes   │ Index Options   │   Indexing │
├──────────────┼────────────────┼────────────┼─────────────────┼────────────┤
│ sec-filings  │ HASH           │ ['chunk']  │ []              │          0 │
╰──────────────┴────────────────┴────────────┴─────────────────┴────────────╯
Index Fields:
╭───────────────────┬───────────────────┬─────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────┬────────────────┬────────────────┬─────────────────┬────────────────╮
│ Name              │ Attribute         │ Type    │ Field Option   │ Option Value   │ Field Option   │ Option Value   │ Field Option   │   Option Value │ Field Option    │ Option Value   │ Field Option   │   Option Value │ Field Option    │   Option Value │
├───────────────────┼───────────────────┼─────────┼────────

In [35]:
!rvl stats -i sec-filings


Statistics:
╭─────────────────────────────┬─────────────╮
│ Stat Key                    │ Value       │
├─────────────────────────────┼─────────────┤
│ num_docs                    │ 2           │
│ num_terms                   │ 9           │
│ max_doc_id                  │ 2           │
│ num_records                 │ 17          │
│ percent_indexed             │ 1           │
│ hash_indexing_failures      │ 0           │
│ number_of_uses              │ 4           │
│ bytes_per_record_avg        │ 75.5882     │
│ doc_table_size_mb           │ 0.000202179 │
│ inverted_sz_mb              │ 0.00122547  │
│ key_table_size_mb           │ 8.29697e-05 │
│ offset_bits_per_record_avg  │ 8           │
│ offset_vectors_sz_mb        │ 8.58307e-06 │
│ offsets_per_term_avg        │ 0.529412    │
│ records_per_doc_avg         │ 8.5         │
│ sortable_values_size_mb     │ 0.00030899  │
│ total_indexing_time         │ 1.123       │
│ total_inverted_index_blocks │ 22          │
│ vector_index_sz_mb 