<a href="https://colab.research.google.com/github/Redislabs-Solution-Architects/Redis-Workshops/blob/main/01-RedisJSON_Search/01-RedisJSON_Search.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RedisJSON and RediSearch

![Redis](https://redis.io/wp-content/uploads/2024/04/Logotype.svg?auto=webp&quality=85,75&width=120)

This notebook is an adapted and simplified version of the RedisInsight QuickGuide "Working with JSON".

For the full experience, we recommend installing Redis Insight and going through the tutorial there.

https://redis.com/redis-enterprise/redis-insight/

In [1]:
# Install the requirements
!pip install -q redis


### Install Redis Stack locally

This installs the Redis Stack database for local usage if you cannot or do not want to create a Redis Cloud database, and it installs the `redis-cli` that will be used for checking database connectivity, etc.

In [2]:
%%sh
curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list
sudo apt-get update  > /dev/null 2>&1
sudo apt-get install redis-stack-server  > /dev/null 2>&1
redis-stack-server --daemonize yes


deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb jammy main
Starting redis-stack-server, database path /var/lib/redis-stack


### Setup Redis connection
***Note - Replace Redis connection values if using a Redis Cloud instance!***

In [3]:
import os

REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = os.getenv("REDIS_PORT", "6379")
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD", "")
# Replace the values above with your own if using a Redis Cloud instance
#REDIS_HOST="redis-18374.c253.us-central1-1.gce.cloud.redislabs.com"
#REDIS_PORT=18374
#REDIS_PASSWORD="1TNxTEdYRDgIDKM2gDfasupCADXXXX"

# Shortcut for redis-cli $REDIS_CONN command
# If SSL is enabled on the endpoint add --tls
if REDIS_PASSWORD!="":
  os.environ["REDIS_CONN"]=f"-h {REDIS_HOST} -p {REDIS_PORT} -a {REDIS_PASSWORD} --no-auth-warning"
else:
  os.environ["REDIS_CONN"]=f"-h {REDIS_HOST} -p {REDIS_PORT}"

# If SSL is enabled on the endpoint, use rediss:// as the URL prefix
REDIS_URL = f"redis://:{REDIS_PASSWORD}@{REDIS_HOST}:{REDIS_PORT}"
INDEX_NAME = "qna:idx"

# Test Redis connection
!redis-cli $REDIS_CONN PING

PONG
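The comments in the cell above mention switching to `rediss://` when SSL is enabled on the endpoint. A minimal sketch of building the URL for both cases is shown here. `build_redis_url` and its `tls` flag are hypothetical helpers, not part of redis-py, and percent-encoding the password is an assumption made so that characters such as `@` or `/` do not break the URL.

```python
from urllib.parse import quote


def build_redis_url(host, port, password="", tls=False):
    """Build a Redis connection URL (hypothetical helper, not a redis-py API)."""
    # rediss:// (double "s") selects a TLS connection, redis:// a plain one.
    scheme = "rediss" if tls else "redis"
    # Omit the auth part entirely when there is no password.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"{scheme}://{auth}{host}:{port}"


print(build_redis_url("localhost", 6379))
print(build_redis_url("example.host", 6380, password="p@ss", tls=True))
```

The resulting string could then be passed to `redis.Redis.from_url(...)` instead of supplying host, port, and password separately.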


In [4]:
import redis

r = redis.Redis(
  host=REDIS_HOST,
  port=REDIS_PORT,
  password=REDIS_PASSWORD
)

r.ping()

True

## RedisJSON

[RedisJSON](https://redis.io/docs/latest/develop/data-types/json/) adds the JSON data type to Redis so you can work with JSON data natively in Redis, without treating the entire JSON as one big string and constantly serializing/deserializing JSON on the client.

With a client library like `redis-py` (Python) you can use commands like `redis.json().get()` and `redis.json().set()`, and in the Redis CLI `JSON.GET`, `JSON.SET`, and others.

See the full list of RedisJSON commands here: https://redis.io/commands/?group=json

Python documentation: https://redis-py.readthedocs.io/en/stable/redismodules.html#redisjson-commands


In [5]:
# Create sample data of schools
schools = [
    {"name":"Hall School","description":"Spanning 10 states, this school award-winning curriculum includes a comprehensive reading system (from letter recognition and phonics to reading full-length books), as well as math, science, social studies, and even  philosophy.","class":"independent","type":["traditional"],"address":{"city":"London","street":"Manor Street"},"students":342,"location":"51.445417, -0.258352","status_log":["new","operating"]},
    {"name":"Garden School","description":"Garden School is a new and innovative outdoor teaching and learning experience, offering rich and varied activities in a natural environment to children and families.","class":"state","type":["forest","montessori"],"address":{"city":"London","street":"Gordon Street"},"students":1452,"location":"51.402926, -0.321523","status_log":["new","operating"]},
    {"name":"Gillford School","description":"Gillford School is an inclusive learning centre welcoming people from all walks of life, here invited to step into their role as regenerative agents, creating new pathways into the future and inciting an international movement of cultural, land, and social transformation.","class":"private","type":["democratic","waldorf"],"address":{"city":"Goudhurst","street":"Goudhurst"},"students":721,"location":"51.112685, 0.451076","status_log":["new","operating","closed"]},
    {"name":"Forest School","description":"The philosophy behind Forest School is based upon the desire to provide young children with an education that encourages appreciation of the wide world in nature while achieving independence, confidence and high self-esteem. ","class":"independent","type":["forest","montessori","democratic"],"address":{"city":"Oxford","street":"Trident Street"},"students":1200,"location":"51.781756, -1.123196","status_log":["new","operating"]}

]

# Load data into Redis as JSON, one key per school
for idx, school in enumerate(schools):
    r.json().set(f"school_json:{idx}", '.', school)

In [6]:
# Verify that keys were inserted into database
!redis-cli $REDIS_CONN JSON.GET school_json:1 $
!redis-cli $REDIS_CONN keys 'school_json:*'

"[{\"name\":\"Garden School\",\"description\":\"Garden School is a new and innovative outdoor teaching and learning experience, offering rich and varied activities in a natural environment to children and families.\",\"class\":\"state\",\"type\":[\"forest\",\"montessori\"],\"address\":{\"city\":\"London\",\"street\":\"Gordon Street\"},\"students\":1452,\"location\":\"51.402926, -0.321523\",\"status_log\":[\"new\",\"operating\"]}]"
1) "school_json:3"
2) "school_json:2"
3) "school_json:1"
4) "school_json:0"


In [7]:
# Retrieve entire JSON doc
res=r.json().get('school_json:0','$')
print(res)

[{'name': 'Hall School', 'description': 'Spanning 10 states, this school award-winning curriculum includes a comprehensive reading system (from letter recognition and phonics to reading full-length books), as well as math, science, social studies, and even  philosophy.', 'class': 'independent', 'type': ['traditional'], 'address': {'city': 'London', 'street': 'Manor Street'}, 'students': 342, 'location': '51.445417, -0.258352', 'status_log': ['new', 'operating']}]


In [8]:
# Retrieve single property
res=r.json().get('school_json:0','$.name')
print(res)

# TODO: Try modifying the path above to retrieve:
# - Embedded object ($.address)
# - Element of the array ($.status_log[0])

['Hall School']
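Note that the result above is a list even though a single property was requested: a `$`-prefixed JSONPath can match several values, so the matches come back wrapped in a list. A small sketch of unwrapping the first match follows. `json_get_first` is a hypothetical helper, and it assumes `client` is a redis-py client and that `path` uses the `$` syntax.

```python
from typing import Any


def json_get_first(client: Any, key: str, path: str, default: Any = None) -> Any:
    """Return the first JSONPath match for `path` in `key`, or `default`.

    Hypothetical convenience wrapper around client.json().get().
    """
    matches = client.json().get(key, path)
    # Treat both "nothing returned" and "empty list of matches" as no result.
    if not matches:
        return default
    return matches[0]


# Usage with the connection created earlier in this notebook:
# json_get_first(r, 'school_json:0', '$.name')
# json_get_first(r, 'school_json:0', '$.status_log[0]')
```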


In [9]:
# Read number of students
students=r.json().get('school_json:0','$.students')
print(students)

# Set new number
r.json().set('school_json:0','$.students',350)
r.json().get('school_json:0','$.students')

[342]


[350]

In [10]:
# Atomic increment for number of students
r.json().numincrby('school_json:0','$.students',1)

[351]

## RediSearch

[RediSearch](https://redis.io/docs/latest/develop/interact/search-and-query/) adds the ability to query data stored in HASH or JSON data structures, essentially turning Redis into a document database.

With RediSearch you declare an index once. From then on, every key that matches the prefix and data type defined in the index is added to it automatically, in real time, using the fields declared in the schema.

For the full list of RediSearch commands see: https://redis.io/commands/?group=search

Python documentation: https://redis-py.readthedocs.io/en/stable/redismodules.html#redisearch-commands
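To make the prefix rule concrete, here is a toy model in plain Python of how a prefix decides which keys an index covers. It is only an illustration of the matching rule, not how RediSearch is implemented, and it ignores the data-type check. The function name and the keys `school:0` and `teacher_json:7` are made up for the example.

```python
def keys_covered_by_index(prefixes, keys):
    """Return the keys that start with at least one of the index prefixes."""
    return [key for key in keys if any(key.startswith(p) for p in prefixes)]


# Hypothetical key names; only the "school_json:" ones match the prefix.
keys = ["school_json:0", "school_json:1", "school:0", "teacher_json:7"]
covered = keys_covered_by_index(["school_json:"], keys)
print(covered)  # ['school_json:0', 'school_json:1']
```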

In [11]:
from redis.commands.search.field import (
    NumericField,
    TagField,
    TextField,
    VectorField,
)
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

schema = (
    TextField("$.name", as_name="name"),
    TextField("$.description", as_name="description"),
    TagField("$.address.city", as_name="city"),
    NumericField("$.students", as_name="students")
)

r.ft("idx:schools_json").create_index(
    schema,
    definition=IndexDefinition(
        prefix=["school_json:"],
        index_type=IndexType.JSON,
    ),
)

b'OK'

In [12]:
import pandas as pd

# Return the entire document for all keys matching the search term
res=r.ft("idx:schools_json").search("nature")
res_df = pd.DataFrame([t.__dict__ for t in res.docs ])
res_df

| | id | payload | json |
|---|---|---|---|
| 0 | school_json:3 | | {"name":"Forest School","description":"The phi... |
| 1 | school_json:1 | | {"name":"Garden School","description":"Garden ... |
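In the result above, each matching document arrives as a raw JSON string in the `json` attribute. A short sketch of decoding those strings into dictionaries, so that every property becomes its own DataFrame column, could look like this. `docs_to_records` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the result documents so the example runs without a database.

```python
import json
from types import SimpleNamespace


def docs_to_records(docs):
    """Decode the `json` string of each search result into a dict, keeping the key as `id`."""
    records = []
    for doc in docs:
        record = json.loads(doc.json)
        record["id"] = doc.id
        records.append(record)
    return records


# Stand-in for res.docs, shaped like the output shown above (id, payload, json).
sample_docs = [
    SimpleNamespace(
        id="school_json:3",
        payload=None,
        json='{"name": "Forest School", "students": 1200}',
    )
]
records = docs_to_records(sample_docs)
print(records)
```

With a live connection, `pd.DataFrame(docs_to_records(res.docs))` would give one column per JSON property.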


In [13]:
# Return selected fields only
query=Query("nature") \
   .return_field("$.address.city", as_field="city") \
   .return_field("$.name", as_field="name")

res=r.ft("idx:schools_json").search(query)
res_df = pd.DataFrame([t.__dict__ for t in res.docs ])
res_df

| | id | payload | city | name |
|---|---|---|---|---|
| 0 | school_json:3 | | Oxford | Forest School |
| 1 | school_json:1 | | London | Garden School |


In [14]:
# Multi-field query
query=Query('@city:{London} @students:[0, 10000]') \
   .return_field("$.address.city", as_field="city") \
   .return_field("$.name", as_field="name") \
   .return_field("$.students", as_field="students")

res=r.ft("idx:schools_json").search(query)
#print(res)
res_df = pd.DataFrame([t.__dict__ for t in res.docs ])
res_df

| | id | payload | city | name | students |
|---|---|---|---|---|---|
| 0 | school_json:1 | | London | Garden School | 1452 |
| 1 | school_json:0 | | London | Hall School | 351 |
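The multi-field query above combines a tag filter (`@city:{...}`) with a numeric range (`@students:[min max]`). When the city or the bounds come from user input, it can help to build the query string in one place. The sketch below uses two hypothetical helpers, `escape_tag` and `build_school_query`. It assumes that spaces and punctuation inside a tag value need a backslash escape, and it writes the range bounds separated by a space.

```python
import re


def escape_tag(value: str) -> str:
    """Backslash-escape every character that is not a letter, digit, or underscore."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", value)


def build_school_query(city: str, min_students: int, max_students: int) -> str:
    """Build a query string filtering on the `city` tag and `students` range."""
    return f"@city:{{{escape_tag(city)}}} @students:[{min_students} {max_students}]"


print(build_school_query("London", 0, 10000))
# @city:{London} @students:[0 10000]
```

The returned string can be passed to `Query(...)` in the same way as the literal query in the cell above.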


#### Cleanup
(Optional) Delete all keys and indexes from the database.

In [15]:
###-- FLUSHDB will wipe out the entire database!!! Use with caution --###
# !redis-cli $REDIS_CONN flushdb