<a href="https://colab.research.google.com/github/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/Search.ipynb" target="_newt">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

<div style="display:flex;width:100%;">
<img src="https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120" alt="Redis" width="90"/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</div>

# Redis Learning Session - Search

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/banner.png?raw=true" alt="Redis Data Types"/>

[Try an online search demo application](https://ecommerce.redisventures.com//)

In this notebook, we will explore the different types of Search provided by Redis.

## Installing the Pre-Reqs

In [None]:
!pip install -q folium
!pip install -q pandas
!pip install -q redis
!pip install -q unzip

## Installing Redis Stack Locally
If you are not using Redis Cloud as a database, uncomment and run the code below to install Redis locally. Then set your connection to 127.0.0.1

In [None]:
# %%sh
# curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg 
# echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list 
# sudo apt-get update  > /dev/null 2>&1
# sudo apt-get install redis-stack-server  > /dev/null 2>&1
# redis-stack-server --daemonize yes

## Copying and Unzipping Lab Files

In [None]:
import os

In [None]:
if not os.path.exists("files.zip"):
  !wget https://github.com/denisabrantesredis/denisd-redis-learning-sessions/raw/refs/heads/main/_assets/files/files.zip
  !unzip files.zip

## Connecting to Redis

In [None]:
import redis
from google.colab import userdata

#### Setup the Connection String

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_secrets.png?raw=true" alt="Callout - Use Google Colab secrets instead"/>

In [None]:
# Fall back to local defaults when a Colab secret is missing or not accessible
try:
  REDIS_HOST = userdata.get('REDIS_HOST')
except Exception:
  REDIS_HOST="127.0.0.1"

try:
  REDIS_PORT = userdata.get('REDIS_PORT')
except Exception:
  REDIS_PORT=6379

try:
  REDIS_PASSWORD = userdata.get('REDIS_PASSWORD')
except Exception:
  REDIS_PASSWORD=""

REDIS_URL = f"redis://default:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}"
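
A password that contains characters such as `@`, `:` or `/` will break a URL built by plain string interpolation. The sketch below percent-encodes the credentials first. The helper name `build_redis_url` and the sample password are made up for illustration, and it assumes the client decodes percent-encoded credentials when it parses the URL.

```python
from urllib.parse import quote, unquote, urlsplit

def build_redis_url(host, port, password="", username="default"):
    """Build a redis:// URL with percent-encoded credentials (hypothetical helper)."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"redis://{user}:{pwd}@{host}:{port}"

# Sample password chosen only to show the encoding
url = build_redis_url("127.0.0.1", 6379, password="p@ss:word/1")
print(url)

# Round trip: the original password can be recovered from the URL
print(unquote(urlsplit(url).password))
```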

#### Testing the Connection to Redis

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_connection.png?raw=true" alt="Callout - Make sure connection works"/>

In [None]:
r = redis.from_url(REDIS_URL)

if r.ping():
    print("Connection successful!")
else:
    print("Connection issue!")

## Geospatial Search

Geospatial data is supported in Redis as a native data type, and as part of our native JSON support. In this lab, we will explore both options.

Redis uses coordinate points to represent geospatial locations. You can store individual points but you can also use a set of points to define a polygon shape (the shape of a town, for example). You can query several types of interactions between points and shapes, such as whether a point lies within a shape or whether two shapes overlap.
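
To make the "point lies within a shape" idea concrete, here is a minimal ray-casting sketch in plain Python. It treats coordinates as flat x/y values, which is only a rough approximation for small areas, and it is not how Redis evaluates these queries internally. The function name `point_in_polygon` and the sample square are assumptions for illustration.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test. `ring` is a list of (lon, lat) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray starting at the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon(2.0, 2.0, square))  # True: the point is inside the square
print(point_in_polygon(5.0, 2.0, square))  # False: the point is to the right of it
```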

### Part 1 - Native Geo data type

In [None]:
import pandas as pd
import json
import folium

#### Load Data

For this lab, we will use a dataset containing Airbnb listings and metrics in New York City for January 2024. Each listing contains coordinates for the location, which is what we are interested in for this lab.

In [None]:
dataset = pd.read_csv('./new_york_listings_2024.csv')
print(len(dataset))
dataset.head()

#### Understand the distribution of the Room_Type attribute

In [None]:
dataset['room_type'].hist()

#### Create a new Dataframe only with Hotel Room records

In [None]:
hotel_rooms = dataset[dataset['room_type'] == 'Hotel room']
print(len(hotel_rooms))
hotel_rooms.head()

#### Save data to Redis

We will use a pipeline to write data to Redis. The pipeline will gather all commands, and then send the list of commands to the server, where they are executed in order. The pipeline then returns a list containing the response for each command.

In [None]:
pipe = r.pipeline(transaction=False)
keyname = "geo:nyc:hotel_rooms"

for index, row in hotel_rooms.iterrows():
      lat = row['latitude']
      lon = row['longitude']
      id = row['id']
      pipe.geoadd(keyname, [lon, lat, id])
pipe.execute()
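
For larger datasets, it can help to flush the pipeline every few hundred commands so the client does not buffer everything at once. The sketch below shows one way to do that. The helper name `geoadd_in_batches` and the default `batch_size` are assumptions, and the function expects a client object like `r` to be passed in.

```python
def geoadd_in_batches(client, key, rows, batch_size=500):
    """Send GEOADD commands in batches (hypothetical helper).

    `rows` is an iterable of (longitude, latitude, member) tuples.
    Returns the concatenated replies from every batch.
    """
    pipe = client.pipeline(transaction=False)
    replies = []
    pending = 0
    for lon, lat, member in rows:
        pipe.geoadd(key, [lon, lat, member])
        pending += 1
        if pending >= batch_size:
            # execute() sends the buffered commands and empties the buffer
            replies.extend(pipe.execute())
            pending = 0
    if pending:
        replies.extend(pipe.execute())
    return replies
```

With the DataFrame from this lab, a call could look like `geoadd_in_batches(r, keyname, zip(hotel_rooms['longitude'], hotel_rooms['latitude'], hotel_rooms['id']))`.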

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

Open Redis Insight and confirm that the geo key was generated. There should be only one key, called `geo:nyc:hotel_rooms`, which contains all the different locations from the DataFrame.

&nbsp;

&nbsp;

#### Search for Hotel Rooms near the Empire State Building 

Next, we will search for Hotel Rooms within 850 meters of the Empire State Building, which is located at approximately 40.7485, -73.9857 (latitude, longitude). You can change the search radius in the cell below.

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_georadius.png?raw=true" alt="Callout - Change Search Radius"/>

In [None]:
esb_lat = 40.748534150023396
esb_lon = -73.98568519949094
radius = 850

#### Run the search

In [None]:
rooms = r.georadius(keyname, esb_lon, esb_lat, radius, 'm', withdist=True, withcoord=True)
len(rooms)
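
To get a feel for what "within 850 meters" means, the sketch below computes a great-circle distance with the haversine formula in plain Python. The Earth radius constant is an assumption of this sketch, so results may differ slightly from the distances Redis reports.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (assumption of this sketch)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

esb = (40.748534150023396, -73.98568519949094)
candidate = (40.7545612, -73.9792654)  # an arbitrary nearby point
d = haversine_m(*esb, *candidate)
print(round(d), "meters; inside an 850 m radius:", d <= 850)
```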

#### Render a map to see results

In [None]:
map1 = folium.Map(location=[esb_lat, esb_lon], zoom_start=15, tiles="Cartodb Positron")
folium.Marker([esb_lat, esb_lon], popup="Empire State Building", icon=folium.Icon(color='blue')).add_to(map1)
folium.Circle(
    location=[esb_lat, esb_lon], radius=radius, color="cornflowerblue", weight=1, fill_opacity=0.3, opacity=1, stroke=False,
    fill=True, popup="{} meters".format(radius), tooltip="Search Radius"
).add_to(map1)

for room in rooms:
    folium.Marker([room[2][1], room[2][0]], popup=f"{room[0]} {round(room[1],2)} meters", icon=folium.Icon(color='red')).add_to(map1)

In [None]:
map1

### Part 2 - Geospatial data with JSON

In [None]:
from redis.commands.search.field import GeoShapeField
from redis.commands.search.field import TextField
from redis.commands.search.field import TagField
from redis.commands.search.field import NumericField
from redis.commands.search.query import Query
from redis.commands.search.index_definition import IndexDefinition, IndexType

#### Save JSON data to Redis

In this step, we will save the same data as before, but in JSON format, which allows us to capture more attributes than just the coordinates. Notice how the `location` attribute is saved as a `POINT()` record, with longitude first and latitude second. We need this to run our polygon search.

In [None]:
for index, row in hotel_rooms.iterrows():
      id = row['id']
      keyname = f"geo:nyc:hotel_rooms:{id}"
      value = {
            'id': id,
            'location': f"POINT ({row['longitude']} {row['latitude']})",
            'latitude': row['latitude'],
            'longitude': row['longitude'],
            'host_name' : row['host_name'],
            'neighbourhood': row['neighbourhood'],
            'price': row['price'],
            'reviews': row['number_of_reviews'],
            'rating': row['rating'],
            'bedrooms': row['bedrooms']
      }
      pipe.json().set(keyname, "$", value)
result = pipe.execute()
len(result)

#### Create Search Index

JSON data needs to be indexed for search. This command will create the search index for the location attribute (which includes the latitude and longitude information).

In [None]:
geo_schema = (GeoShapeField("$.location", as_name="location"))

try:
    r.ft("idx:rooms").drop_index()
except:
    print("Index does not exist")

try:
  geo_index_create_result = r.ft("idx:rooms").create_index(
    geo_schema,
    definition=IndexDefinition(
        prefix=["geo:nyc:hotel_rooms:"], index_type=IndexType.JSON
    )
  )
  print(geo_index_create_result)

except Exception as e:
  print(e)


#### Check the index

Make sure indexing is 100% complete and the total number of documents is greater than 0.

In [None]:
# Check index
info = r.ft('idx:rooms').info()
# percent_indexed is a fraction between 0 and 1, so parse it as a float before scaling
print(f" Percent Indexed: {float(info['percent_indexed'])*100}")
print(f" Total Documents: {info['num_docs']}")

#### Run Search

In [None]:
shape = "POLYGON ((-73.9792654 40.7545612, -73.9928266 40.7549838, -73.9988777 40.7489044, -73.9946291 40.7390201, -73.9805099 40.7385324, -73.9746305 40.7464335, -73.9792654 40.7545612))"
params_dict = {"esb": shape}

q = Query("@location:[WITHIN $esb]").dialect(3)
res = r.ft("idx:rooms").search(q, query_params=params_dict).docs
print(len(res))

#### Equivalent Redis Insight command

```
    FT.SEARCH idx:rooms "(@location:[WITHIN $qshape])"
        PARAMS 2 qshape "POLYGON ((-73.9792654 40.7545612, -73.9928266 40.7549838, -73.9988777 40.7489044, -73.9946291 40.7390201, -73.9805099 40.7385324, -73.9746305 40.7464335, -73.9792654 40.7545612))"
        DIALECT 3
```

&nbsp;

#### Render a new map with the polygon and the results

In [None]:
map2 = folium.Map(location=[esb_lat, esb_lon], zoom_start=15, tiles="Cartodb Positron")
folium.Marker([esb_lat, esb_lon], popup="Empire State Building", icon=folium.Icon(color='blue')).add_to(map2)

locations = [[40.7545612, -73.9792654], [40.7549838, -73.9928266], [40.7489044, -73.9988777], [40.7390201, -73.9946291], [40.7385324, -73.9805099], [40.7464335, -73.9746305], [40.7545612, -73.9792654]]

folium.Polygon(locations=locations, color="cornflowerblue", weight=1, fill_opacity=0.3, opacity=1, stroke=False, fill_color="maroon", fill=True, popup="Polygon", tooltip="Click me!",).add_to(map2)
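
The polygon appears twice in this lab: as a WKT string in longitude-latitude order for the query, and as latitude-longitude pairs for folium. A small helper can derive the WKT form from the folium-style list so the two stay in sync. The names `to_wkt_point` and `to_wkt_polygon` are made up for this sketch.

```python
def to_wkt_point(lat, lon):
    """WKT puts longitude first, then latitude."""
    return f"POINT ({lon} {lat})"

def to_wkt_polygon(latlon_pairs):
    """Build a WKT polygon from (lat, lon) pairs, closing the ring if needed."""
    ring = list(latlon_pairs)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # the first and last vertex must be the same
    body = ", ".join(f"{lon} {lat}" for lat, lon in ring)
    return f"POLYGON (({body}))"

corners = [(40.7545612, -73.9792654), (40.7549838, -73.9928266), (40.7489044, -73.9988777)]
print(to_wkt_point(*corners[0]))
print(to_wkt_polygon(corners))
```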

In [None]:
for room in res:
    room_json = json.loads(room.json)
    lat = room_json[0]['latitude']
    lon = room_json[0]['longitude']
    host_name = room_json[0]['host_name']
    price = room_json[0]['price']
    # print(room_json[0])
    folium.Marker([lat, lon], popup=f"{host_name} ${price}", icon=folium.Icon(color='red')).add_to(map2)

In [None]:
map2

## Streams

A Redis stream is a data structure that acts like an append-only log but also implements several operations to overcome some of the limits of a typical append-only log. These include random access in O(1) time and complex consumption strategies, such as consumer groups. You can use streams to record and simultaneously syndicate events in real time. Examples of Redis stream use cases include:

- Event sourcing (e.g., tracking user actions, clicks, etc.)
- Sensor monitoring (e.g., readings from devices in the field)
- Notifications (e.g., storing a record of each user's notifications in a separate stream)


In this example, we will load IoT data from temperature monitoring devices as streams, and look for data within a certain time range. 

In [None]:
from datetime import datetime, timedelta
import pytz

#### Load data from the CSV file

In [None]:
iot_ds = pd.read_csv('iot.csv')
print(len(iot_ds))
iot_ds.head()

### Using a pipeline to save data to Redis

All messages will be under a single stream (key), named `streams:iot`. The pipeline will gather all commands and execute them once against Redis.

In [None]:
keyname = "streams:iot"
for index, row in iot_ds.iterrows():
      value = {
            'id': row['id'],
            'room': row['room'],
            'date': row['date'],
            'temp' : row['temp'],
            'location': row['location'],
            'timestamp': row['timestamp']
      }
      pipe.xadd(keyname, id=row['timestamp'], fields=value)
result = pipe.execute()
len(result)

#### Getting the first and last timestamps - format DAY/MONTH/YEAR

In [None]:
print(iot_ds.iloc[0]['date'])
print(iot_ds.iloc[len(iot_ds)-1]['date'])

#### Define the time range you want to search on

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_streams.png?raw=true" alt="Callout - Change Time Range"/>

In [None]:
start_date = "01-04-2024 12:00"
end_date = "01-04-2024 12:30"

In [None]:
datetime_format = "%d-%m-%Y %H:%M"
local_tz = pytz.timezone('America/Chicago')

# Interpret the date strings as Chicago local time, then convert to epoch seconds
local_start = datetime.strptime(start_date, datetime_format)
aware_start = local_tz.localize(local_start)
first_ts = aware_start.timestamp()

local_end = datetime.strptime(end_date, datetime_format)
aware_end = local_tz.localize(local_end)
last_ts = aware_end.timestamp()
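
The time zone matters here: the same wall-clock string maps to different epoch values depending on where it is interpreted. The sketch below uses the standard-library `zoneinfo` module instead of `pytz` to show the difference. The helper name `to_epoch_seconds` is made up for this example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

def to_epoch_seconds(text, tz_name, fmt="%d-%m-%Y %H:%M"):
    """Parse a naive date string, attach a time zone, and return epoch seconds."""
    naive = datetime.strptime(text, fmt)
    aware = naive.replace(tzinfo=ZoneInfo(tz_name))
    return int(aware.timestamp())

print(to_epoch_seconds("01-04-2024 12:00", "UTC"))              # 1711972800
print(to_epoch_seconds("01-04-2024 12:00", "America/Chicago"))  # 1711990800
```

The two results are five hours apart, so a range query built from one interpretation will not match data recorded under the other.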

#### Running the Search

In [None]:
messages = r.xrange("streams:iot", int(first_ts), int(last_ts))
for message in messages:
  print(message)

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

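You can use Redis Insight to check on the stream data. There should be a key called `streams:iot` with all 97,546 messages in it. The UI will show some of the messages, and you can run a similar search in the Workbench. The example below uses the epoch values for 12:00 to 12:30 UTC on April 1, 2024; to reproduce the notebook search exactly, substitute the `first_ts` and `last_ts` values computed above: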

&nbsp;

```
    XRANGE streams:iot 1711972800 1711974600
```

&nbsp;

## Time Series

The Redis time series data type lets you store real-valued data points along with the time they were collected. You can combine the values from a selection of time series and query them by time or value range. You can also compute aggregate functions of the data over periods of time and create new time series from the results. When you create a time series, you can specify a maximum retention period for the data, relative to the last reported timestamp, to prevent the time series from growing indefinitely.

In this lab, we will load stock data from a few different companies and search for data within a specific time range. The stock data contains the Open and Close values with one data point per day.
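
To illustrate what "aggregate functions over periods of time" means, the sketch below averages samples into fixed-size buckets in plain Python. It only mirrors the idea and is not the Redis implementation. The function name and sample values are made up. In redis-py, the `range` method of the time series client accepts `aggregation_type` and `bucket_size_msec` arguments for the server-side equivalent.

```python
from collections import defaultdict

def bucket_average(samples, bucket_size):
    """Average (timestamp, value) samples into buckets aligned to multiples of bucket_size."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts - ts % bucket_size].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]

samples = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0), (120, 7.0)]
print(bucket_average(samples, 60))
# [(0, 15.0), (60, 40.0), (120, 7.0)]
```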

### Importing Required Packages

In [None]:
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

#### Load data from CSV file

In [None]:
ts_df = pd.read_csv('timeseries.csv')
ts_df = ts_df.sort_values(by='Ticker')
print(len(ts_df))
ts_df.head()

#### Time Series data is accessed through the `r.ts()` object. It has its own pipeline.

In [None]:
redis_ts = r.ts()

datetime_format = "%Y-%m-%d %H:%M:%S%z"
local_tz = pytz.timezone('America/New_York')

keyprefix = "timeseries:stock"
pipets = redis_ts.pipeline(transaction=False)

Get a list of all stock tickers in the file:

In [None]:
ticker_list = ts_df['Ticker'].unique()
ticker_list

Each time series (key) will hold the values of one metric over time. The CSV file contains 2 metrics (Open and Close values) for 5 different stock tickers, which means that we will need to create 10 keys, 2 for each ticker.

The loop will first create 2 keys for each ticker in a try statement (because the keys can only be created once), then it will add the Open and Close values to those keys. We will use labels to filter the keys during the search:

- Ticker will allow us to filter the search by company ("give me all Apple stock data", etc.)
- Type will allow us to select the type of metric ("give me all Open values for Dec 1st, 2015")

In [None]:
for ticker in ticker_list:
    df_loop = ts_df[ts_df['Ticker'] == ticker]
    print(f"Creating Keys for Ticker {ticker} : {len(df_loop)} total rows")

    label_open = {'TICKER' : ticker, 'TYPE': 'open'}
    label_close = {'TICKER' : ticker, 'TYPE': 'close'}
    keyopen = f"{keyprefix}:{ticker}:open"
    keyclose = f"{keyprefix}:{ticker}:close"

    try:
      redis_ts.create(keyopen, labels=label_open, duplicate_policy='LAST')
      redis_ts.create(keyclose, labels=label_close, duplicate_policy='LAST')
    except:
      pass

    values = df_loop.values.tolist()
    counter = 0
    for value in values:
      m_date = value[0]
      m_open = value[1]
      m_close = value[2]
      local_start = datetime.strptime(m_date, datetime_format)
      timestamp = int(local_start.timestamp())
      
      _ = pipets.add(keyopen, timestamp, m_open) 
      _ = pipets.add(keyclose, timestamp, m_close)
      counter += 1
      # print(f"Counter: {counter}")
result = pipets.execute()
print(len(result))
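
The `TICKER` and `TYPE` labels attached above are what a multi-key query filters on. The sketch below builds filter expressions in the `label=(value1,value2)` form used later in this lab. The helper name `build_filters` is an assumption, and the resulting list would be passed as the `filters` argument of `mrange`.

```python
def build_filters(tickers, metric=None):
    """Build time series label filters (hypothetical helper).

    tickers: list of ticker symbols, e.g. ["AAPL", "MSFT"]
    metric:  optional metric type, "open" or "close"
    """
    if not tickers:
        raise ValueError("at least one ticker is required")
    filters = [f"TICKER=({','.join(tickers)})"]
    if metric is not None:
        filters.append(f"TYPE={metric}")
    return filters

print(build_filters(["AAPL"]))
print(build_filters(["AAPL", "MSFT"], metric="open"))
```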

#### List the first and last timestamp - date format YEAR-MONTH-DAY

In [None]:
print(ts_df.iloc[0]['Date'])
print(ts_df.iloc[len(ts_df)-1]['Date'])

#### Set the time range and stock ticker for our search

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_timeseries.png?raw=true" alt="Callout - Change Time Range and Stock Ticker"/>

In [None]:
start_time = "2016-05-01 12:00:00-04:00"
end_time = "2016-05-11 12:00:00-04:00"
ticker = "AAPL"

#### Retrieve open and close data from the selected stock ticker

In [None]:
filters = [f"TICKER=({ticker})"]
start_ts = int(datetime.strptime(start_time, datetime_format).timestamp())
end_ts = int(datetime.strptime(end_time, datetime_format).timestamp())
results = redis_ts.mrange(from_time=start_ts, to_time=end_ts, filters=filters)
# Report the number of data points per matching key instead of hard-coding one key name
for result in results:
  for key, data in result.items():
    print(key, len(data[1]))

#### Adding the search results to a new DataFrame

In [None]:
open_data, close_data = [], []
for result in results:
  keyname = list(result.keys())[0]
  keydata = result[keyname]
  if 'open' in keyname:
    for item in keydata[1]:
      open_data.append((datetime.fromtimestamp(item[0]).strftime('%Y-%m-%d'), item[1]))
  else:
    for item in keydata[1]:
      close_data.append((datetime.fromtimestamp(item[0]).strftime('%Y-%m-%d'), item[1]))

df_open = pd.DataFrame(open_data, columns=['Date', 'Open'])
df_close = pd.DataFrame(close_data, columns=['Date', 'Close'])
df_open.head()

#### Visualizing the search results in a timeline chart

In [None]:
plt.figure(figsize=(15, 5)) 
plt.plot(df_open['Date'], df_open['Open'], label='Open',)
plt.plot(df_close['Date'], df_close['Close'], label='Close')
plt.title(f"{ticker} - Open vs. Close")
plt.xlabel("Date")
plt.ylabel("Value")
plt.legend()
plt.show()

&nbsp;

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

Open Redis Insight and confirm that all documents were generated. Notice how each document contains the vector that was automatically generated by the Langchain package. You may also notice that the vectors are not presented as a list; this is due to the fact that they are stored as binary strings, which is more efficient for retrieval and storage.

You can also go to the **Workbench** and get a list of indexes using the command:

```
FT._list
```

Finally, you can get more details about the index that was automatically generated by Langchain with this command:
```
FT.info "idx:web"
```
&nbsp;

&nbsp;

## Part 4: Running a Vector Search

<img src="https://github.com/denisabrantesredis/denisd-GenAI-Workshop/blob/main/_assets/images/callout_question.png?raw=true" alt="Callout - Change Question"/>

In [None]:
query = "How does Redis Insight make RDI simpler?"

### Running a Semantic Search

The Langchain integration greatly simplifies the process of running a semantic search. A single function call is enough. Notice how we do not need to generate a vector for our question manually; this is handled automatically by the function, based on the embedding model we've selected before.

For more details on the different ways to run vector searches, check the [Langchain documentation page](https://python.langchain.com/docs/integrations/vectorstores/redis/#query-vector-store).

&nbsp;


In [None]:
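import time  # needed by the timing code in this and later cells

# `vector_store` is assumed to have been created in an earlier step that is not shown in this notebook
timer_start = time.perf_counter()
results = vector_store.similarity_search_with_score(query)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")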

### Visualizing the search results with the score for each result

In [None]:
print(f"Search results for '{query}':")
for doc in results:
    print("----")
    print(f"Score: {doc[1]} - {doc[0].page_content} (Source: {doc[0].metadata['url']})")

&nbsp;

## Part 5: Using an LLM

In this lab, we will use the Gemini 1.5 Pro model from Google to generate a response to the user, based on the documents retrieved from Redis. The GCP API Key that we set before is required to allow access to the model.

### Step 1: Load the Model

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_google_genai import ChatGoogleGenerativeAI

In [None]:
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro",
    temperature=0.5,
    top_p=0.95,
    top_k=64,
    max_output_tokens=8192
    )

### Step 2: Prepare a list with the text from the documents retrieved by the vector search

We will ask the model to respond to the user's questions. To help with the answer, we want to provide the text from the documents that were retrieved by the semantic search.

In [None]:
text_list = []
distance_list = []

for node in results:
    text_list.append(node[0].page_content)
    distance = node[1]
    distance_list.append(distance)

Print the list that will be sent to the model.

In [None]:
text_list

### Step 3: Prepare the Prompt

Since this is just a lab, we will keep the prompt very simple, with just basic instructions for the model to answer based on the documents from the semantic search, and to stick to the documents for the response. Production applications will benefit from more sophisticated prompts, as well as other controls such as guardrails.

In [None]:
def get_system_template(text_list, query):
  system_template = """
  Your task is to answer questions by using a given context.

  Don't invent anything that is outside of the context.

  %CONTEXT%
  {context}

  """
  messages = [
      SystemMessage(content=system_template.format(context=text_list)),
      HumanMessage(content=query)
  ]

  return messages

In [None]:
messages = get_system_template(text_list, query)

### Step 4: Invoke the Model

Since we are not using Redis as a cache, the model will be called every time, even if the same question (or a similar one) is asked multiple times.

In [None]:
timer_start = time.perf_counter()
llm_response = llm.invoke(messages)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Visualizing the model response:

In [None]:
llm_response.content

&nbsp;

## Part 6: Leveraging Redis for Basic Cache

Redis can be used not only as the Vector Database, but also as a cache to store responses from the Large Language Model, which can significantly improve user experience, by retrieving responses in milliseconds instead of seconds.

In [None]:
from langchain_redis import RedisCache
from langchain.globals import set_llm_cache

To use Redis as a cache, we only need 2 lines of code:

In [None]:
redis_cache = RedisCache(redis_url=REDIS_URL)
set_llm_cache(redis_cache)

We will repeat the same question from before. Since we have just enabled the cache, it will be empty, which means that this next question will require a vector search and will need to go through the Large Language Model again.

In [None]:
query = "How does Redis Insight make RDI simpler?"

In [None]:
timer_start = time.perf_counter()
result_nodes = vector_store.similarity_search_with_score(query)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Prepare the list of texts to send to the LLM:

In [None]:
text_list = []
distance_list = []

for node in result_nodes:
    text_list.append(node[0].page_content)
    distance = node[1]
    distance_list.append(distance)

Display the search results:

In [None]:
print(f"--> Total Documents Found: {len(result_nodes)}")
for node in result_nodes:
  print(f"--> {node[1]} | {node[0].page_content}")

Prepare the prompt:

In [None]:
messages = get_system_template(text_list, query)

Call the model (the `invoke` function will check and populate the cache automatically):

In [None]:
timer_start = time.perf_counter()
llm_response = llm.invoke(messages)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Print the LLM response:

In [None]:
llm_response.content

&nbsp;

<img src="https://github.com/denisabrantesredis/denisd-GenAI-Workshop/blob/main/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

A new document of type JSON should appear in Redis. This is the cached response from the LLM.
Notice that the key is a long string; this is a hash of the question.

Because this is a basic cache, each incoming question is hashed and compared against the stored keys, so a question must match exactly for the cached response to be used.
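
The exact-match behavior can be pictured with a small stand-in. The sketch below uses a Python dict in place of Redis and SHA-256 as the hash. Both are assumptions chosen for illustration; the key layout and hash function used by the LangChain cache may differ.

```python
import hashlib

def cache_key(prompt, prefix="cache"):
    """Derive a lookup key from the exact prompt text (illustrative only)."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"

store = {}  # stand-in for Redis
q1 = "How does Redis Insight make RDI simpler?"
q2 = "What does Redis Insight do to make RDI simpler?"

store[cache_key(q1)] = "cached answer"
print(cache_key(q1) in store)  # True: same text, same key
print(cache_key(q2) in store)  # False: different wording, different key
```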

&nbsp;


#### Repeating the question to fetch results from the cache

When we ask exactly the same question as before, it should trigger a cache hit, meaning we will receive the answer from the Redis cache, much faster than calling the model.

In [None]:
timer_start = time.perf_counter()
llm_response = llm.invoke(messages)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Print the cached response:

In [None]:
llm_response.content

&nbsp;

#### Asking the same question (worded differently) will cause a cache miss

If the question is not an exact match, it will cause a cache miss. This can be an issue in most RAG use cases, which is why we will explore Semantic Cache next.

In [None]:
# original query = "How does Redis Insight make RDI simpler?"
query = "What does Redis Insight do to make RDI simpler?"

Prepare the prompt with the new query (PS: we're skipping the vector search on purpose)

In [None]:
messages = get_system_template(text_list, query)

Call the model:

In [None]:
timer_start = time.perf_counter()
llm_response = llm.invoke(messages)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Print the response:

In [None]:
llm_response.content

&nbsp;

## Part 7 - Leveraging Redis for Semantic Cache

The Semantic Cache will generate vectors for each prompt, and store the response from the LLM. That way, new prompts are converted into vectors automatically and a semantic search is executed on Redis, looking for similar questions.

It is possible to set the threshold for the semantic search; for this lab, we are using 20%. In your project, you can run multiple tests with different thresholds to determine what works best for your use case.
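
To see how a distance threshold separates hits from misses, here is a toy example with two-dimensional vectors. It assumes cosine distance (1 minus cosine similarity) and an inclusive comparison, which may not match the cache's internals exactly. The vectors are made up; real embeddings have hundreds of dimensions.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def is_cache_hit(distance, threshold=0.2):
    """Treat the cached entry as a match when the distance is within the threshold."""
    return distance <= threshold

cached = [1.0, 0.0]      # vector of the cached question (toy values)
similar = [0.9, 0.1]     # a rephrased question
different = [0.0, 1.0]   # an unrelated question

print(is_cache_hit(cosine_distance(cached, similar)))    # True
print(is_cache_hit(cosine_distance(cached, different)))  # False
```

A lower threshold makes the cache stricter, and a higher one returns cached answers for looser matches.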

In [None]:
from langchain_redis import RedisSemanticCache

<img src="https://github.com/denisabrantesredis/denisd-GenAI-Workshop/blob/main/_assets/images/callout_threshold.png?raw=true" alt="Callout - Semantic Threshold"/>

In [None]:
redis_cache = RedisSemanticCache(redis_url=REDIS_URL, embeddings=embeddings, distance_threshold=0.2)
set_llm_cache(redis_cache)

Since the Semantic Cache is new, it will be empty. We will ask the original question first, to generate the cache entry:

In [None]:
query = "How does Redis Insight make RDI simpler?"

Prepare the prompt:

In [None]:
messages = get_system_template(text_list, query)

Invoke the model (it will cause a cache miss):

In [None]:
timer_start = time.perf_counter()
llm_response = llm.invoke(messages)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Print the response:

In [None]:
llm_response.content

&nbsp;

<img src="https://github.com/denisabrantesredis/denisd-GenAI-Workshop/blob/main/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

A new Hash document will appear in Redis, with a key prefix of `llmcache`. This is the cached prompt, which includes the question and the answer. The `invoke` function will run a semantic search for these documents, to look for similar questions.

&nbsp;

#### Ask a similar question to trigger a cache hit

In [None]:
query = "What does Redis Insight do to make RDI simpler?"

Prepare the prompt:

In [None]:
messages = get_system_template(text_list, query)

Invoke the model:

In [None]:
timer_start = time.perf_counter()
llm_response = llm.invoke(messages)
timer_end = time.perf_counter()
total_time = round(timer_end - timer_start, 4)
print(f"Total Time: {total_time}s")

Print the response:

In [None]:
llm_response.content

In [None]:
# Search examples

# Time rollup with distinct users by hour
# FT.AGGREGATE myIndex "*" 
#   APPLY "@timestamp - (@timestamp % 3600)" AS hour 
#   GROUPBY 1 @hour 
#   REDUCE COUNT_DISTINCT 1 @user_id AS num_users 
#   SORTBY 2 @hour ASC 
#   APPLY timefmt(@hour) AS hour
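
The rollup above truncates each timestamp to the start of its hour, groups by that value, and counts distinct users. The same logic in plain Python, with made-up events, looks like this sketch:

```python
from collections import defaultdict

def distinct_users_per_hour(events):
    """events: iterable of (timestamp_seconds, user_id) pairs."""
    users_by_hour = defaultdict(set)
    for ts, user_id in events:
        # Same expression as the APPLY step: timestamp - (timestamp % 3600)
        users_by_hour[ts - ts % 3600].add(user_id)
    return sorted((hour, len(users)) for hour, users in users_by_hour.items())

events = [(3600, "a"), (3700, "b"), (3800, "a"), (7200, "c")]
print(distinct_users_per_hour(events))
# [(3600, 2), (7200, 1)]
```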

In [None]:
# Multiple reducers under one GROUPBY
# FT.AGGREGATE products "*" 
#   GROUPBY 1 @category 
#   REDUCE COUNT 0 AS product_count 
#   REDUCE SUM 1 @price AS total_price 
#   REDUCE AVG 1 @rating AS avg_rating 
#   SORTBY 2 @total_price DESC 
#   LIMIT 0 10

In [None]:
# Derived fields, then aggregate revenue by category
# FT.AGGREGATE products "*" 
#   LOAD 3 @price @discount @quantity 
#   APPLY "@price - @discount" AS final_price 
#   APPLY "@final_price * @quantity" AS total_revenue 
#   GROUPBY 1 @category 
#   REDUCE SUM 1 @total_revenue AS total_category_revenue 
#   SORTBY 2 @total_category_revenue DESC 
#   LIMIT 0 10

In [None]:
# Post-aggregation filtering on a reduced value
# FT.AGGREGATE idx "*" 
#   GROUPBY 1 @foo 
#   REDUCE COUNT 0 AS num 
#   FILTER "@num < 100"

In [None]:
# Geo distance compute and sort
# FT.AGGREGATE libraries-idx "@location:[-73.982254 40.753181 10 km]" 
#   LOAD 1 @location 
#   APPLY "geodistance(@location, -73.982254, 40.753181)" AS dist 
#   SORTBY 2 @dist ASC 
#   LIMIT 0 1

In [None]:
# Parameterized filters with DIALECT 2
# FT.AGGREGATE products "*" 
#   LOAD 3 @price @rating @quantity 
#   FILTER "@price >= $minp" 
#   FILTER "@rating >= $minr" 
#   APPLY "@price * @quantity" AS total_value 
#   SORTBY 2 @total_value DESC 
#   LIMIT 0 10 
#   DIALECT 2 
#   PARAMS 4 minp 500 minr 4.0

In [None]:
# Faceted counts (split TAG-like field to facets)
# Example flow: APPLY split(@category) AS ctg -> GROUPBY @ctg -> REDUCE COUNT AS num_per_ctg -> SORTBY @num_per_ctg DESC

In [None]:
# Portfolio analytics (business calc)
# FT.AGGREGATE idx_trading_security_lot "@accountNo:(ACC10001) @date:[0 1725186610]" 
#   GROUPBY 1 @ticker 
#   REDUCE SUM 1 @lotValue AS totalLotValue 
#   REDUCE SUM 1 @quantity AS totalQuantity 
#   APPLY "(@totalLotValue/(@totalQuantity*100))" AS avgPrice

In [None]:
# Hybrid vector + text with ADDSCORES and derived scores
# FT.AGGREGATE idx:rpv "(@currency:{USD} ~@brand:(nike) ~@retailer:((walmart)) 
#                       ~@title:(red | shoes) ~@description:(red | shoes))=>[KNN 2400 @text_vector32 $vector AS vector_distance]" ADDSCORES 
#   LOAD 5 title brand price currency retailer_display_name 
#   APPLY "(2 - @vector_distance)/2" AS cosine_similarity 
#   APPLY @__score AS bm25_score 
#   APPLY "0.3*@bm25_score + 0.7*@cosine_similarity" AS hybrid_score 
#   SORTBY 2 @hybrid_score DESC 
#   LIMIT 0 100 
#   PARAMS 2 vector "<BLOB>" 
#   DIALECT 2
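
The score arithmetic in the hybrid example can be checked in plain Python. The sketch below follows the two APPLY expressions: it maps a cosine distance in the range 0 to 2 onto a similarity in the range 0 to 1, then blends it with the text score. The function names are made up, and the 0.3/0.7 weights come from the example above.

```python
def cosine_similarity_from_distance(vector_distance):
    """Map a cosine distance in [0, 2] to a similarity in [0, 1]."""
    return (2 - vector_distance) / 2

def hybrid_score(bm25_score, vector_distance, text_weight=0.3, vector_weight=0.7):
    """Weighted blend of the text score and the vector similarity."""
    similarity = cosine_similarity_from_distance(vector_distance)
    return text_weight * bm25_score + vector_weight * similarity

print(hybrid_score(bm25_score=1.0, vector_distance=0.0))
print(hybrid_score(bm25_score=2.5, vector_distance=0.4))
```

Text scores are not limited to the 0 to 1 range, so in practice the weights may need tuning, or the text score may need normalizing, before the two values are combined.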

In [None]:
# Ordering and placement rules (quick references)
# • LOAD must be before the first GROUPBY; loading after aggregation will error. 
# • Typical order: FILTER → LOAD → APPLY → GROUPBY → REDUCE → SORTBY → LIMIT; DIALECT at end.

&nbsp;


&nbsp;



# Congrats, this is the end of the lab!!