<a href="https://colab.research.google.com/github/denisabrantesredis/denisd-redis-learning-sessions/blob/main/ClientSideCaching/ClientSideCaching.ipynb" target="_new">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

<div style="display:flex;width:100%;">
<img src="https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120" alt="Redis" width="90"/>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
</div>

# Redis Learning Session - Client-Side Caching

<img src="https://redis.io/docs/latest/images/csc/CSCNoCache.drawio.svg" alt="CSC - No Cache"/>
<br/><br/><br/><br/>
<img src="https://redis.io/docs/latest/images/csc/CSCWithCache.drawio.svg" alt="CSC - Cache"/>

[Try a sample Java application](https://github.com/Redislabs-Solution-Architects/redis-client-side-caching-csc-jedis-demo)

In this notebook, we will explore the Client-Side Caching capabilities provided by Redis.

## Before We Start

<a href="https://redis.io/try-free/" target="_new">
<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/ClientSideCaching/_assets/images/callout_rediscloud.png?raw=true" alt="Callout - Create free Redis Cloud Account"/>
</a>

<b>Create a new free Redis Cloud account: <a href="https://redis.io/try-free/" target="_new">Click Here</a></b>

&nbsp;
&nbsp;
<a href="https://colab.research.google.com/github/denisabrantesredis/denisd-redis-learning-sessions/blob/main/ClientSideCaching/ClientSideCaching_Monitor.ipynb" target="_new">
<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/ClientSideCaching/_assets/images/callout_monitor.png?raw=true" alt="Callout - Open Monitor Notebook"/>
</a>

Open the Monitor Notebook in a new tab to keep track of the commands executed by the Redis Server.

***IMPORTANT:*** Don't use the Monitor notebook if you are going to run Redis locally.

## Installing the Pre-Reqs

In [None]:
!pip install -q redis
!pip install -q ipython-autotime

In [None]:
%load_ext autotime

## Installing Redis Locally
If you are not using Redis Cloud as your database, uncomment and run the code below to install Redis locally. Then set your connection host to 127.0.0.1.

In [None]:
# %%sh
# curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg 
# echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list 
# sudo apt-get update  > /dev/null 2>&1
# sudo apt-get install redis-stack-server  > /dev/null 2>&1
# redis-stack-server --daemonize yes

## Copying and Unzipping Lab Files

In [None]:
import os

In [None]:
if not os.path.exists("./files"):
  !mkdir files
  !wget https://github.com/denisabrantesredis/denisd-redis-learning-sessions/raw/refs/heads/main/Search/_assets/files/lab_assets.zip
  !mv lab_assets.zip ./files
  !unzip ./files/lab_assets.zip -d ./files

## Connecting to Redis

In [None]:
import redis
from redis.cache import CacheConfig, EvictionPolicy

from google.colab import userdata

#### Set Up the Connection String

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Streams/_assets/images/callout_secrets.png?raw=true" alt="Callout - Use Google Colab secrets instead"/>

In [None]:
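# Read each setting from the Colab secrets; fall back to a local default if it is missing
try:
  REDIS_HOST = userdata.get('REDIS_HOST')
except Exception:
  REDIS_HOST = "127.0.0.1"

try:
  REDIS_PORT = userdata.get('REDIS_PORT')
except Exception:
  REDIS_PORT = 6379

try:
  REDIS_PASSWORD = userdata.get('REDIS_PASSWORD')
except Exception:
  REDIS_PASSWORD = ""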

REDIS_URL = f"redis://default:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}"
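
One caveat when building the URL by hand: a password that contains characters such as `@`, `:` or `/` would break it. A minimal sketch of a safer builder follows, assuming the same `default` user as above. The `build_redis_url` name is hypothetical and not part of redis-py.

```python
from urllib.parse import quote


def build_redis_url(host, port, password="", username="default"):
    """Build a redis:// URL, percent-encoding the credentials."""
    # safe="" makes quote() encode "/" as well as "@" and ":"
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"redis://{user}:{pwd}@{host}:{port}"


# Example with a made-up password containing reserved characters
print(build_redis_url("127.0.0.1", 6379, "p@ss/word"))
# redis://default:p%40ss%2Fword@127.0.0.1:6379
```

The encoded URL can be passed to `redis.from_url` in place of `REDIS_URL`; redis-py is expected to decode the percent-escapes when it parses the credentials.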

#### Testing the Connection to Redis

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Streams/_assets/images/callout_connection.png?raw=true" alt="Callout - Make sure connection works"/>

In [None]:
r = redis.from_url(
    REDIS_URL,
    protocol=3,
    cache_config=CacheConfig(max_size=10, eviction_policy=EvictionPolicy.LRU),
    decode_responses=True)

if r.ping():
    print("Connection successful!")
else:
    print("Connection issue!")

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Streams/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

Make sure you can connect to the database using Redis Insight. The database should be empty for now.

&nbsp;

&nbsp;

## Caching String Data

First, let's test Client-Side Caching for Strings. We'll start by setting a simple string:

In [None]:
r.set("myname", "My Name")

Next, read the value of the key:

In [None]:
r.get("myname")

Reading it for a second time should reduce the execution time, because the key will be retrieved from the local cache:

In [None]:
r.get("myname")

Check the MONITOR output in the other notebook; only 1 GET command should be listed there.

## Caching Streams Data

Let's store the Streams dataset we used in the Search lab:

In [None]:
from datetime import datetime, timedelta
import pandas as pd
import pytz

Load data from CSV:

In [None]:
iot_ds = pd.read_csv('./files/iot.csv')
print(len(iot_ds))
iot_ds.head()

Next, use a pipeline to save the data to Redis. All messages go into a single stream (key) named `streams:iot`. The pipeline buffers the commands on the client and sends them to Redis in one batch when `execute()` is called.

In [None]:
pipe = r.pipeline(transaction=False)
keyname = "streams:iot"
for index, row in iot_ds.iterrows():
      value = {
            'id': row['id'],
            'room': row['room'],
            'date': row['date'],
            'temp' : row['temp'],
            'location': row['location'],
            'timestamp': row['timestamp']
      }
      pipe.xadd(keyname, id=row['timestamp'], fields=value)
result = pipe.execute()
len(result)
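
The cell above buffers every `XADD` in one pipeline. For larger datasets, a common variation is to flush the pipeline in fixed-size batches so the client never holds all commands in memory at once. The sketch below illustrates that idea; `batched` and `xadd_in_batches` are hypothetical helpers, and the batch size of 5000 is an arbitrary assumption.

```python
from itertools import islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def xadd_in_batches(client, key, entries, batch_size=5000):
    """Send (entry_id, fields) pairs to a stream, one pipeline per batch.

    `client` is expected to behave like the `r` object used in this notebook.
    Returns the total number of replies received.
    """
    total = 0
    for chunk in batched(entries, batch_size):
        pipe = client.pipeline(transaction=False)
        for entry_id, fields in chunk:
            pipe.xadd(key, fields, id=entry_id)
        total += len(pipe.execute())
    return total


# The chunking logic can be checked without a Redis connection
print([len(chunk) for chunk in batched(range(12), 5)])  # [5, 5, 2]
```

With the notebook's data, `entries` could be a generator that yields `(row['timestamp'], value)` for each row of `iot_ds`.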

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Streams/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

Open Redis Insight and confirm that the Stream key was generated. The key should be called `streams:iot`, and it contains about 97,000 messages.

&nbsp;

&nbsp;

Let's check the date ranges available in this dataset:

In [None]:
print(iot_ds.iloc[0]['date'])
print(iot_ds.iloc[len(iot_ds)-1]['date'])

Define the date range for our Streams search:

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Search/_assets/images/callout_streams.png?raw=true" alt="Callout - Change Time Range"/>

In [None]:
start_date = "01-04-2024 12:00"
end_date = "01-04-2024 12:30"

Transform these strings into timestamps:

In [None]:
datetime_format = "%d-%m-%Y %H:%M"
local_tz = pytz.timezone('America/Chicago')

# Parse the naive strings, attach the Chicago time zone, then convert to Unix epoch seconds
local_start = datetime.strptime(start_date, datetime_format)
aware_start = local_tz.localize(local_start)
first_ts = aware_start.timestamp()

local_end = datetime.strptime(end_date, datetime_format)
aware_end = local_tz.localize(local_end)
last_ts = aware_end.timestamp()
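
As a sanity check, the same conversion can be reproduced with only the standard library. This sketch assumes that Chicago is on Central Daylight Time (UTC-5) on 1 April 2024, which is why a fixed offset is enough here. For arbitrary dates a real time zone database, such as the one `pytz` provides, is still needed. The `to_epoch` helper is hypothetical.

```python
from datetime import datetime, timedelta, timezone

CDT = timezone(timedelta(hours=-5))  # assumption: fixed UTC-5 offset for this date
DATETIME_FORMAT = "%d-%m-%Y %H:%M"   # same day-month-year format as the notebook


def to_epoch(text, tz=CDT):
    """Parse a naive date string and return Unix epoch seconds in `tz`."""
    naive = datetime.strptime(text, DATETIME_FORMAT)
    return int(naive.replace(tzinfo=tz).timestamp())


first = to_epoch("01-04-2024 12:00")
last = to_epoch("01-04-2024 12:30")
print(first, last)  # 1711990800 1711992600

# Converting back to UTC shows the 5-hour offset
print(datetime.fromtimestamp(first, tz=timezone.utc))  # 2024-04-01 17:00:00+00:00
```

The two values are 1800 seconds apart, matching the 30-minute window defined above.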

Find all Streams messages within this time range:

In [None]:
messages = r.xrange("streams:iot", int(first_ts), int(last_ts))
for message in messages:
  print(message)

Check the MONITOR notebook for the XRANGE output; now run the command again:

In [None]:
messages = r.xrange("streams:iot", int(first_ts), int(last_ts))
for message in messages:
  print(message)

These results are served from the cache.

Keep in mind that invalidation is tracked per key, not per range: any write to the stream invalidates every cached result for that key, even when the new entry falls outside the range that was cached. Let's add a new message to this stream:

In [None]:
r.xadd("streams:iot", id="*",
       fields={'id': 'temp_log_51679_4928200b', 'room': 'Room Admin', 'date': '01-02-2026 12:30:00 ', 'temp': '45', 'location': 'Out', 'timestamp': '1711992601'})

Next, let's get the same range again.

In [None]:
messages = r.xrange("streams:iot", int(first_ts), int(last_ts))
for message in messages:
  print(message)

Check the MONITOR notebook; the XRANGE command should be in the output now.
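
The behavior seen in this section can be pictured with a small toy model: responses are stored per command (including its arguments), but they are dropped per key. This is only an illustration of the idea, not the redis-py implementation; `ToyTrackingCache` and the range values are made up for the example.

```python
class ToyTrackingCache:
    """Toy model: cache entries are per command, invalidation is per key."""

    def __init__(self):
        self.entries = {}      # (command, key, *args) -> cached response
        self.server_calls = 0  # number of reads that had to go to the "server"

    def read(self, command, key, *args, fetch):
        cache_key = (command, key, *args)
        if cache_key not in self.entries:
            self.server_calls += 1
            self.entries[cache_key] = fetch()
        return self.entries[cache_key]

    def invalidate(self, key):
        # Models the server reporting that `key` was modified
        self.entries = {k: v for k, v in self.entries.items() if k[1] != key}


def fetch():
    return ["...stream entries..."]  # stand-in for a real server reply


cache = ToyTrackingCache()
cache.read("XRANGE", "streams:iot", 1711990800, 1711992600, fetch=fetch)  # miss
cache.read("XRANGE", "streams:iot", 1711990800, 1711992600, fetch=fetch)  # hit
cache.read("XRANGE", "streams:iot", 1711990800, 1711991400, fetch=fetch)  # miss: different range
print(cache.server_calls)  # 2

cache.invalidate("streams:iot")  # e.g. after any XADD to the stream
cache.read("XRANGE", "streams:iot", 1711990800, 1711992600, fetch=fetch)  # miss again
print(cache.server_calls)  # 3
```

In this model, a single write removes both cached ranges, which matches the extra `XRANGE` seen in the MONITOR output after the `XADD`.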

## Caching Hash Data

Let's see how hash keys work with client-side caching and TTL.

First, let's create a simple hash key:

In [None]:
hash_fields = {"first_name": "John", "last_name": "Doe"}
r.hset("myhash", mapping=hash_fields)

Check Insight; the new key should be there. The MONITOR notebook will also show the HSET command being executed.

Next, read the same key twice; only one HGETALL command should show up in the MONITOR notebook:

In [None]:
r.hgetall("myhash")

In [None]:
r.hgetall("myhash")

Note the execution time difference between them.

Now, how does Client-Side Caching behave with TTL attributes? Let's test this by adding a new field, `email`, to the hash and setting it to expire in 60 seconds:

In [None]:
r.hsetex("myhash", ex=60, mapping={"email": "john.doe@gmail.com"})

# If you are using local Redis, the hsetex command is not supported in this version. Use this instead:
# hash_fields = {"first_name": "John", "last_name": "Doe", "email": "john.doe@gmail.com"}
# r.hset("myhash", mapping=hash_fields)

Quickly run another HGETALL to retrieve the updated hash; the `email` field should be there now:

In [None]:
r.hgetall("myhash")

Wait 60 seconds for the `email` field to expire (you can use Insight to check) and run another HGETALL:

In [None]:
r.hgetall("myhash")

This new HGETALL command should show up in the MONITOR notebook.

## Caching JSON Data

First, we need to import the dataset into Redis:

In [None]:
import os
import json
import numpy as np

from redis.commands.search.field import TagField
from redis.commands.search.field import TextField
from redis.commands.search.field import NumericField
from redis.commands.search.query import NumericFilter, Query
from redis.commands.search.index_definition import IndexDefinition, IndexType

In [None]:
prod_df = pd.read_pickle('./files/prodjson.pkl')
print(len(prod_df))
prod_df.head()

In [None]:
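# Create a fresh pipeline so this cell does not depend on the Streams section
pipe = r.pipeline(transaction=False)
for index, product in prod_df.iterrows():
  keyname = f"jsonprod:{product['id']}"
  pipe.json().set(keyname, "$", product.to_dict())
results = pipe.execute()
len(results)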

&nbsp;

<img src="https://github.com/denisabrantesredis/denisd-redis-learning-sessions/blob/main/Streams/_assets/images/callout_insight.png?raw=true" alt="Callout - Check Redis Insight"/>

Use Insight to make sure the new JSON keys are there. The MONITOR notebook will also show all 1048 keys inserted.

&nbsp;
&nbsp;

Next, let's create a search index for the product keys:

In [None]:
schema = (
    NumericField("$.id", as_name="id", sortable=True),
    NumericField("$.price", as_name="price"),
    NumericField("$.discountedPrice", as_name="discountedPrice"),
    TextField("$.articleNumber", as_name="articleNumber"),
    TextField("$.productDisplayName", as_name="productDisplayName"),
    TextField("$.productDescription", as_name="productDescription", index_missing=True, index_empty=True),
    TextField("$.variantName", as_name="variantName"),
    NumericField("$.catalogAddDate", as_name="catalogAddDate"),
    TagField("$.brandName", as_name="brandName"),
    TagField("$.ageGroup", as_name="ageGroup"),
    TagField("$.gender", as_name="gender"),
    TagField("$.baseColour", as_name="baseColour"),
    TagField("$.fashionType", as_name="fashionType"),
    TagField("$.season", as_name="season"),
    TagField("$.year", as_name="year"),
    NumericField("$.rating", as_name="rating"),
    TagField("$.displayCategories", as_name="displayCategories"),
    TagField("$.masterCategory", as_name="masterCategory"),
    TagField("$.subCategory", as_name="subCategory"),
    TextField("$.articleType", as_name="articleType"),
    NumericField("$.discount_pct", as_name="discount_pct"),
    NumericField("$.inventoryCount", as_name="inventoryCount", index_missing=True)
)
# Drop any previous version of the index so this cell can be re-run
try:
    r.ft("idx:jsonprod").dropindex()
except Exception:
    print("--> JSONProd index doesn't exist; creating it")

try:
    definition = IndexDefinition(prefix=["jsonprod:"], index_type=IndexType.JSON)
    result = r.ft("idx:jsonprod").create_index(fields=schema, definition=definition)
except Exception as ex:
    result = f"FAILED to create index: {ex}"
print(result)

Make sure the index is populated:

In [None]:
info = r.ft('idx:jsonprod').info()
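# percent_indexed is a fraction between 0 and 1; convert with float() so partial progress is not truncated to 0
print(f" Percent Indexed: {float(info['percent_indexed']) * 100:.0f}")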
print(f" Total Documents: {info['num_docs']}")
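
Index population runs in the background, so `percent_indexed` may still be below 1 right after the index is created. A minimal sketch of a polling helper follows. `wait_until_indexed` is a hypothetical name, and it accepts any zero-argument callable that returns the info dictionary, for example `r.ft('idx:jsonprod').info`. The fake callable below only stands in for Redis so the sketch runs on its own.

```python
import time


def wait_until_indexed(get_info, timeout=30.0, interval=0.5):
    """Poll `get_info()` until percent_indexed reaches 1, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        info = get_info()
        # float() accepts both numeric and string replies
        if float(info["percent_indexed"]) >= 1.0:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError("index was not fully populated in time")
        time.sleep(interval)


# Stand-in for r.ft("idx:jsonprod").info, reporting made-up progress values
_fake_progress = iter(["0.25", "0.8", "1"])


def fake_info():
    return {"percent_indexed": next(_fake_progress), "num_docs": 1048}


info = wait_until_indexed(fake_info, interval=0.01)
print(info["num_docs"])  # 1048
```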

We know that Client-Side Caching won't work with search commands (but we're going to test this anyway).

Let's run the standard GET command for JSON keys:

In [None]:
r.json().get("jsonprod:10110", "$")

That's a lot of output. Let's say I only need the product display name:

In [None]:
r.json().get("jsonprod:10110", "$.productDisplayName")

Note that both commands show up in the MONITOR notebook: the cache stores the response to each distinct command, so a partial retrieval does not reuse the cached copy of the full document. However, running both commands again:

In [None]:
r.json().get("jsonprod:10110", "$")

In [None]:
r.json().get("jsonprod:10110", "$.productDisplayName")

We can see that neither of them shows up in the MONITOR output; both commands used the local cache.

Next, let's add a discount to this product (currently set at 0.0):

In [None]:
r.json().set("jsonprod:10110", "$.discount_pct", 0.15)

Check Insight to see the new value in the key.

Re-run both commands, and they should show up in the MONITOR notebook now, because the local cache entry was invalidated.

In [None]:
r.json().get("jsonprod:10110", "$")

In [None]:
r.json().get("jsonprod:10110", "$.productDisplayName")

Check the MONITOR notebook; both commands should appear there.

Finally, let's test the search command:

In [None]:
query = Query('@productDisplayName: (Reebok Women Instant)'
                ).sort_by('id', asc=True
                ).return_fields('productDisplayName', 'price', 'id'
                ).paging(0,10)

results = r.ft("idx:jsonprod").search(query)
for result in results["results"]:
  print(result["extra_attributes"])

Check the MONITOR notebook for the query.

Now, repeat the search:

In [None]:
query = Query('@productDisplayName: (Reebok Women Instant)'
                ).sort_by('id', asc=True
                ).return_fields('productDisplayName', 'price', 'id'
                ).paging(0,10)

results = r.ft("idx:jsonprod").search(query)
for result in results["results"]:
  print(result["extra_attributes"])

As we can see, search commands are not cached by the client.

&nbsp;
&nbsp;

## Client Cache Max Size

In this lab, we set the max size of the cache to **10**. This means only 10 keys (or, to be more accurate, 10 read command outputs) will be stored at a time. We can test this with a loop.

First, let's make a list with 10 key names:

In [None]:
prod_keys = ["jsonprod:10110", "jsonprod:10111", "jsonprod:10112", "jsonprod:10164", "jsonprod:10184", "jsonprod:10193",
             "jsonprod:10214", "jsonprod:10215", "jsonprod:10220", "jsonprod:10229"]

Then, let's run a loop where we GET each key 3 times:

In [None]:
for i in range(3):
  for key in prod_keys:
    prod_data = r.json().get(key, "$")[0]
    print(f"--> Loop {i+1} ID: {prod_data['id']} - Product: {prod_data['productDisplayName']}")

Check the MONITOR notebook; each key should have been requested from the server only once.

Next, let's extend our list of keys to 12:

In [None]:
prod_keys = ["jsonprod:10110", "jsonprod:10111", "jsonprod:10112", "jsonprod:10164", "jsonprod:10184", "jsonprod:10193",
             "jsonprod:10214", "jsonprod:10215", "jsonprod:10220", "jsonprod:10229", "jsonprod:10230", "jsonprod:10246"]

Run the loop again:

In [None]:
for i in range(3):
  for key in prod_keys:
    prod_data = r.json().get(key, "$")[0]
    print(f"--> Loop {i+1} ID: {prod_data['id']} - Product: {prod_data['productDisplayName']}")

Check the MONITOR notebook to see how keys are now evicted from the local cache and have to be retrieved from the server again.
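
The effect of the size limit can be reproduced without Redis. The sketch below simulates a plain LRU cache with `OrderedDict` and counts how many reads would have to go to the server. It assumes a cold cache and textbook LRU behavior; `count_misses` and the key names are placeholders for illustration.

```python
from collections import OrderedDict


def count_misses(keys, rounds, max_size):
    """Read every key `rounds` times through an LRU cache and count the misses."""
    cache = OrderedDict()
    misses = 0
    for _ in range(rounds):
        for key in keys:
            if key in cache:
                cache.move_to_end(key)         # mark as most recently used
            else:
                misses += 1                    # would be a round trip to the server
                cache[key] = True
                if len(cache) > max_size:
                    cache.popitem(last=False)  # evict the least recently used entry
    return misses


ten_keys = [f"jsonprod:{i}" for i in range(10)]
twelve_keys = [f"jsonprod:{i}" for i in range(12)]

print(count_misses(ten_keys, rounds=3, max_size=10))     # 10
print(count_misses(twelve_keys, rounds=3, max_size=10))  # 36
```

Cycling through more keys than the cache can hold is the worst case for LRU: each key is evicted just before it is needed again, so every read becomes a miss. In the notebook the cache also holds entries from earlier cells, so the exact MONITOR output may differ from these counts.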

&nbsp;


&nbsp;



# Congrats, this is the end of the lab!!