<a href="https://colab.research.google.com/github/michaelskyuan/Redis-Workshops/blob/main/01-RedisJSON_Search/01-RedisJSON_Search.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RedisJSON and RediSearch

![Redis](https://redis.com/wp-content/themes/wpx/assets/images/logo-redis.svg?auto=webp&quality=85,75&width=120)

This notebook is an adapted and simplified version of the RedisInsight QuickGuide "Working with JSON".

For the full experience, we recommend installing RedisInsight and going through the tutorial there.

https://redis.com/redis-enterprise/redis-insight/

In [1]:
# Install the requirements
!pip install -q redis


In [2]:
%%sh
curl -fsSL https://packages.redis.io/gpg | sudo gpg --dearmor -o /usr/share/keyrings/redis-archive-keyring.gpg 
echo "deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/redis.list 
sudo apt-get update  > /dev/null 2>&1
sudo apt-get install redis-stack-server  > /dev/null 2>&1
redis-stack-server --daemonize yes 


deb [signed-by=/usr/share/keyrings/redis-archive-keyring.gpg] https://packages.redis.io/deb focal main
Starting redis-stack-server, database path /var/lib/redis-stack


In [3]:
import redis
import os


In [4]:
REDIS_HOST = os.getenv("REDIS_HOST", "localhost")
REDIS_PORT = os.getenv("REDIS_PORT", "6379")
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD", "")
#Replace values above with your own if using Redis Cloud instance
#REDIS_HOST="redis-12110.c82.us-east-1-2.ec2.cloud.redislabs.com"
#REDIS_PORT=12110
#REDIS_PASSWORD="pobhBJP7Psicp2gV0iqa2ZOc1XXXXXX"

#shortcut for redis-cli $REDIS_CONN command
if REDIS_PASSWORD!="":
  os.environ["REDIS_CONN"]=f"-h {REDIS_HOST} -p {REDIS_PORT} -a {REDIS_PASSWORD} --no-auth-warning"
else:
  os.environ["REDIS_CONN"]=f"-h {REDIS_HOST} -p {REDIS_PORT}"

In [5]:
r = redis.Redis(
  host=REDIS_HOST,
  port=REDIS_PORT,
  password=REDIS_PASSWORD)
r.ping()

True

## RedisJSON

RedisJSON adds a JSON data type to Redis, so you can work with JSON data natively instead of treating the whole document as one big string and constantly serializing and deserializing it on the client.

With a client library such as redis-py for Python you can use methods like `redis.json().get()` and `redis.json().set()`; in the Redis CLI the equivalents are `JSON.GET`, `JSON.SET` and others.

See the full list of RedisJSON commands here: https://redis.io/commands/?group=json

Python documentation: https://redis-py.readthedocs.io/en/stable/redismodules.html#redisjson-commands
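To make the contrast with string storage concrete, here is a minimal sketch of the same update done both ways. The helper names `bump_students_string` and `bump_students_json` are hypothetical, `r` is assumed to be a connected `redis.Redis` client, and the string variant assumes the document was stored with a plain `SET`.

```python
import json


def bump_students_string(r, key, delta):
    """Without RedisJSON: fetch the whole string, edit it client-side, write it back."""
    doc = json.loads(r.get(key))   # deserialize the entire document
    doc["students"] += delta       # change a single field
    r.set(key, json.dumps(doc))    # serialize and overwrite the whole value
    return doc["students"]


def bump_students_json(r, key, delta):
    """With RedisJSON: update one path on the server, no client-side (de)serialization."""
    # With a JSONPath ('$...') argument the reply is a list, one value per match
    return r.json().numincrby(key, "$.students", delta)
```

The string variant is also a read-modify-write cycle, so two clients running it concurrently could overwrite each other's change unless it is wrapped in a transaction; the JSON variant is a single server-side command.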


In [6]:
schools = [
    {"name":"Hall School","description":"Spanning 10 states, this school award-winning curriculum includes a comprehensive reading system (from letter recognition and phonics to reading full-length books), as well as math, science, social studies, and even  philosophy.","class":"independent","type":["traditional"],"address":{"city":"London","street":"Manor Street"},"students":342,"location":"51.445417, -0.258352","status_log":["new","operating"]},
    {"name":"Garden School","description":"Garden School is a new and innovative outdoor teaching and learning experience, offering rich and varied activities in a natural environment to children and families.","class":"state","type":["forest","montessori"],"address":{"city":"London","street":"Gordon Street"},"students":1452,"location":"51.402926, -0.321523","status_log":["new","operating"]},
    {"name":"Gillford School","description":"Gillford School is an inclusive learning centre welcoming people from all walks of life, here invited to step into their role as regenerative agents, creating new pathways into the future and inciting an international movement of cultural, land, and social transformation.","class":"private","type":["democratic","waldorf"],"address":{"city":"Goudhurst","street":"Goudhurst"},"students":721,"location":"51.112685, 0.451076","status_log":["new","operating","closed"]},
    {"name":"Forest School","description":"The philosophy behind Forest School is based upon the desire to provide young children with an education that encourages appreciation of the wide world in nature while achieving independence, confidence and high self-esteem. ","class":"independent","type":["forest","montessori","democratic"],"address":{"city":"Oxford","street":"Trident Street"},"students":1200,"location":"51.781756, -1.123196","status_log":["new","operating"]}
    ]
# Load the data into Redis as JSON documents, one key per school
for i, school in enumerate(schools):
    # "$" is the JSONPath root, so the whole document is stored under the key
    r.json().set(f"school_json:{i}", "$", school)

In [7]:
!redis-cli $REDIS_CONN JSON.GET school_json:1 $
!redis-cli $REDIS_CONN keys 'school_json:*'

"[{\"name\":\"Garden School\",\"description\":\"Garden School is a new and innovative outdoor teaching and learning experience, offering rich and varied activities in a natural environment to children and families.\",\"class\":\"state\",\"type\":[\"forest\",\"montessori\"],\"address\":{\"city\":\"London\",\"street\":\"Gordon Street\"},\"students\":1452,\"location\":\"51.402926, -0.321523\",\"status_log\":[\"new\",\"operating\"]}]"
1) "school_json:3"
2) "school_json:2"
3) "school_json:1"
4) "school_json:0"


In [8]:
#retrieve entire JSON
res=r.json().get('school_json:0','$')
print(res)


[{'name': 'Hall School', 'description': 'Spanning 10 states, this school award-winning curriculum includes a comprehensive reading system (from letter recognition and phonics to reading full-length books), as well as math, science, social studies, and even  philosophy.', 'class': 'independent', 'type': ['traditional'], 'address': {'city': 'London', 'street': 'Manor Street'}, 'students': 342, 'location': '51.445417, -0.258352', 'status_log': ['new', 'operating']}]


In [9]:
#retrieve single property
res=r.json().get('school_json:0','$.name')
print(res)

#TODO: Try modifying this line to retrieve:
# - Embedded object ($.address)
# - Element of the array "$.status_log[0]":

['Hall School']
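One possible way to approach the exercise above is sketched here. The helper name `read_school_parts` is hypothetical, and `r` is assumed to be the connected client created earlier in this notebook.

```python
def read_school_parts(r, key="school_json:0"):
    """Fetch an embedded object and a single array element by JSONPath."""
    address = r.json().get(key, "$.address")             # embedded object
    first_status = r.json().get(key, "$.status_log[0]")  # first array element
    # A '$' path yields a list of matches, so unwrap the single match of each
    return address[0], first_status[0]
```

Calling `read_school_parts(r)` against the data loaded earlier should return the address object of Hall School together with the first entry of its `status_log`.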


In [10]:
# Read number of students
students=r.json().get('school_json:0','$.students')
print(students)
#set new number
r.json().set('school_json:0','$.students',350)
r.json().get('school_json:0','$.students')

[342]


[350]

In [11]:
#atomic increment for number of students
r.json().numincrby('school_json:0','$.students',1)

[351]

## RediSearch

RediSearch adds the ability to query data stored in HASH or JSON data structures, essentially turning Redis into a document database.

With RediSearch you declare an index once; from then on, every key that matches the prefix defined in the index is added to the index automatically and in real time.

For the full list of RediSearch commands see: https://redis.io/commands/?group=search 

Python documentation: https://redis-py.readthedocs.io/en/stable/redismodules.html#redisearch-commands 

In [12]:
from redis.commands.search.field import (
    NumericField,
    TagField,
    TextField,
    VectorField,
)
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query
schema = (
    TextField("$.name", as_name="name"),
    TextField("$.description", as_name="description"),
    TagField("$.address.city", as_name="city"), 
    NumericField("$.students", as_name="students")
    )
r.ft("idx:schools_json").create_index(schema, 
                    definition=IndexDefinition(prefix=["school_json:"], 
                    index_type=IndexType.JSON)
                    )

b'OK'
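To check that the keys matching the prefix were actually picked up, one option is to ask the index how many documents it holds. This is a minimal sketch: the helper name `indexed_doc_count` is hypothetical, `r` is assumed to be the connected client from above, and the `num_docs` key is assumed to appear in the dictionary that redis-py builds from the `FT.INFO` reply.

```python
def indexed_doc_count(r, index_name="idx:schools_json"):
    """Return how many documents the index currently holds, based on FT.INFO."""
    info = r.ft(index_name).info()
    # The value may arrive as a string, so convert it explicitly (assumption)
    return int(info["num_docs"])
```

With the four schools loaded earlier, `indexed_doc_count(r)` would be expected to report four documents once the initial scan of existing keys has finished, and the count should grow whenever a new `school_json:` key is written.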

In [13]:
import pandas as pd
#return the entire document
res=r.ft("idx:schools_json").search("nature")
res_df = pd.DataFrame([t.__dict__ for t in res.docs ])
res_df

| | id | payload | json |
|---|---|---|---|
| 0 | school_json:3 | | {"name":"Forest School","description":"The phi... |
| 1 | school_json:1 | | {"name":"Garden School","description":"Garden ... |
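When the whole document is returned, each result carries it as a raw string in its `json` attribute, which is why the table shows a single `json` column. The sketch below, with the hypothetical helper name `docs_to_records`, shows one way to expand that string into separate fields before building a DataFrame.

```python
import json


def docs_to_records(docs):
    """Turn search results that carry a raw `json` string into flat dicts."""
    records = []
    for doc in docs:
        record = {"id": doc.id}
        record.update(json.loads(doc.json))  # parse the stored document
        records.append(record)
    return records
```

For example, `pd.DataFrame(docs_to_records(res.docs))` would give one column per top-level property (`name`, `description`, `class` and so on) instead of a single string column.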


In [14]:
#return selected fields only
query=Query("nature") \
   .return_field("$.address.city", as_field="city") \
   .return_field("$.name", as_field="name")
res=r.ft("idx:schools_json").search(query)
res_df = pd.DataFrame([t.__dict__ for t in res.docs ])
res_df

| | id | payload | city | name |
|---|---|---|---|---|
| 0 | school_json:3 | | Oxford | Forest School |
| 1 | school_json:1 | | London | Garden School |


In [28]:
#Multi-field query
query=Query('@city:{London} @students:[0, 10000]') \
   .return_field("$.address.city", as_field="city") \
   .return_field("$.name", as_field="name") \
   .return_field("$.students", as_field="students")
res=r.ft("idx:schools_json").search(query)
#print(res)
res_df = pd.DataFrame([t.__dict__ for t in res.docs ])
res_df


| | id | payload | city | name | students |
|---|---|---|---|---|---|
| 0 | school_json:0 | | London | Hall School | 351 |
| 1 | school_json:1 | | London | Garden School | 1452 |
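The multi-field query string combines two filter types: `@field:{value}` for a TAG field and `@field:[min max]` for a NUMERIC range, with a space between clauses meaning AND. A small sketch of building such a string from variables follows; the helper name `build_school_query` is hypothetical, and it assumes the tag value contains no spaces or punctuation that would need escaping.

```python
def build_school_query(city, min_students, max_students):
    """Combine a TAG filter and a NUMERIC range filter into one query string."""
    # @city:{...} matches the tag exactly; @students:[min max] is an inclusive range.
    # Clauses separated by a space are ANDed together.
    return f"@city:{{{city}}} @students:[{min_students} {max_students}]"


print(build_school_query("London", 0, 10000))
# prints: @city:{London} @students:[0 10000]
```

The resulting string can be passed to `Query(...)` in the same way as the literal used in the cell above.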
