<img style="float: right; margin: 30px; padding: 10px; background-color: silver; width: 300px" src="https://docs.redislabs.com/latest/images/icon_logo/redis-logo.svg">

# Redis for Recommendations
Martin Forstner<br>
<i>Solution Architect</i><br>


## About Redis

<img style="float: right; margin: 30px; padding : 10px; background-color: silver" src="https://redis.io/images/redis-white.png">

Redis is popular enough that most of you probably know it already, but a short introduction is still worthwhile. The following ranking of database systems gives an impression of how popular Redis is:

* https://db-engines.com/en/ranking

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence.

### Redis Enterprise

Redis Labs is the home of OSS Redis and the provider of the multi-model in-memory database system 'Redis Enterprise'. Redis Enterprise is based on open source Redis and provides the following additional features:

* **Easier Operability**: Admin Web UI, Admin REST Service, several CLI tools, Redis Enterprise Cloud (Hosted or VPC), Enterprise Support
* **Enhanced High Availability**: Node based quorum, rack-zone awareness, disaster recovery, periodic backup, faster failover times and different watchdog profiles, ...
* **Improved Scalability and Consistent Performance**: Multiple Redis shards behind a single endpoint, different shard placement policies, built-in resource management for better resource isolation, multi-tenancy, tunable frontend thread management, ...
* **Active-Active Geo-replicated Databases**: By leveraging Conflict-free Replicated Data Types (resettable PN-Counters, OR-Sets, LWW Registers, causal consistency via vector clocks)
* **Redis on Flash**: Uses Flash drives as RAM extension in order to store more data at lower costs.

### Modules

In addition, Redis Labs maintains the following modules:

* **RediSearch**: Search engine over Redis
* **Redis-ML**: Machine Learning Model Server
* **Redis Graph**: Graph database with an Open Cypher-based query language
* **ReJSON**: A JSON data type for Redis
* **ReBloom**: Scalable Bloom filters

The source code of all of these modules is available on GitHub.

## Setup

The following has to be prepared for this course:

* Install Python 3 https://www.python.org/downloads
* Install Docker https://docs.docker.com/install
* Install Jupyter `python3 -m pip install jupyter`
* Install redis-py `python3 -m pip install redis`

To start the course:

* Clone repo `git clone https://github.com/martinez099/rl-recsys`
* Start Redis `rl-recsys/docker/start_redis.bash`
* Run notebook `jupyter notebook rl-recsys/notebooks/Redis_for_Recommendations.ipynb`

In [1]:
import redis

# Vanilla Redis
r = redis.StrictRedis(decode_responses=True, host='redis-17293.mf.demo.redislabs.com', port=17293)

## Content Based Filtering

The idea is to look at what a specific user is interested in and then to recommend items that are similar to (i.e. in the same category as) the items the user already likes.

* Data structures: **Sets**
* Operations: Members/Scans, Union
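
The same set algebra can be sketched without a Redis server. The snippet below is only an illustration using plain Python sets. The dictionaries `user_items`, `item_category` and `category_items` and the function `recommend_by_content` are hypothetical stand-ins for the Redis keys used in the cells that follow.

```python
# Plain-Python sketch of content-based filtering (no Redis required)
user_items = {"david": {"Valerian", "Batman"}}
item_category = {"Valerian": "scifi", "Batman": "super-heros"}
category_items = {
    "scifi": {"Valerian", "Fantastic Four"},
    "super-heros": {"Batman", "Spiderman", "Wonder Woman"},
}

def recommend_by_content(user):
    owned = user_items[user]
    # Categories of everything the user already owns
    categories = {item_category[item] for item in owned}
    # Union of all items in those categories (SUNION in Redis)
    candidates = set().union(*(category_items[c] for c in categories))
    # Remove what the user already has
    return candidates - owned

print(sorted(recommend_by_content("david")))
# ['Fantastic Four', 'Spiderman', 'Wonder Woman']
```

In Redis the three steps map to `SMEMBERS` (or `SSCAN`), `SUNION` and a set difference.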

In [None]:
# Clean
r.flushall()
print("Database cleaned.")

In [None]:
# Helpers

## Format any string to a Redis key
def to_key(val):
    return val.lower().replace(' ', '_')

In [None]:
# Demo data

## Each user has comics
r.sadd('usr:david:items', 'Valerian', 'Batman')
r.sadd('usr:pieter:items', 'Fantastic Four')
r.sadd('usr:martin:items', 'Avatar')

## Each comic belongs to a category
r.set('itm:valerian:category', 'scifi')
r.set('itm:fantastic_four:category', 'scifi')
r.set('itm:batman:category', 'super-heros')
r.set('itm:spiderman:category', 'super-heros')
r.set('itm:wonder_woman:category', 'super-heros')
r.set('itm:avatar:category', 'fantasy')
r.set('itm:dragon_age:category', 'fantasy')

## Each category has comics
r.sadd('ctg:scifi:items','Valerian', 'Fantastic Four')
r.sadd('ctg:super-heros:items', 'Batman', 'Spiderman', 'Wonder Woman')
r.sadd('ctg:fantasy:items', 'Avatar', 'Dragon Age')

print('Demo data created.')

In [None]:
# On request we get the categories of each user's comics
## N.B: SSCAN is better for large sets
## Set comprehensions avoid passing the same category key twice
categories_david = {r.get('itm:{}:category'.format(to_key(item))) for item in r.smembers('usr:david:items')}
categories_pieter = {r.get('itm:{}:category'.format(to_key(item))) for item in r.smembers('usr:pieter:items')}
categories_martin = {r.get('itm:{}:category'.format(to_key(item))) for item in r.smembers('usr:martin:items')}

# Then we get the recommended set of comics
## N.B: SUNIONSTORE for materializing large result sets
recommendation_david = r.sunion(["ctg:" + ctg + ":items" for ctg in categories_david])
recommendation_pieter = r.sunion(["ctg:" + ctg + ":items" for ctg in categories_pieter])
recommendation_martin = r.sunion(["ctg:" + ctg + ":items" for ctg in categories_martin])

# Lastly we subtract the comics each user already has
recommendation_david -= r.smembers('usr:david:items')
recommendation_pieter -= r.smembers('usr:pieter:items')
recommendation_martin -= r.smembers('usr:martin:items')

# Print result
print("David could also be interested in: {0}".format(recommendation_david))
print("Pieter could also be interested in: {0}".format(recommendation_pieter))
print("Martin could also be interested in: {0}".format(recommendation_martin))

## Collaborative Filtering

This approach requires data about many other users. The underlying idea is that if person A likes the same things as person B, then person B might also like the other items that person A likes.

* Data structures: **Sets**
* Operations: Members/Scans, Union, Diff
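
As a rough, server-free illustration of the idea, the sketch below uses plain Python sets. The dictionary `user_items` and the function `recommend_by_peers` are hypothetical names chosen for this example, and treating every user who shares at least one item as a peer is a simplifying assumption.

```python
# Plain-Python sketch of collaborative filtering (no Redis required)
user_items = {
    "david": {"Spiderman", "Batman"},
    "pieter": {"Wonder Woman", "Batman"},
    "martin": {"Spiderman"},
}

def recommend_by_peers(user):
    owned = user_items[user]
    recommendations = set()
    for other, items in user_items.items():
        # A peer is any other user sharing at least one item
        if other != user and owned & items:
            # What the peer has that the user lacks (SDIFF in Redis)
            recommendations |= items - owned
    return recommendations

print(recommend_by_peers("pieter"))  # {'Spiderman'}
```

Pieter shares Batman with David, so David's remaining item is recommended. Martin shares nothing with Pieter and is ignored.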

In [None]:
# Clean
r.flushall()
print("Database cleaned.")

In [None]:
# Demo data

## Each user owns a set of items, i.e. comic series
r.sadd('usr:david:items','Spiderman', 'Batman')
r.sadd('usr:pieter:items', 'Wonder Woman', 'Batman')
r.sadd('usr:martin:items', 'Spiderman')

## The following is the reverse mapping per item
r.sadd('itm:spiderman:users', 'david', 'martin')
r.sadd('itm:batman:users', 'david', 'pieter')
r.sadd('itm:wonder_woman:users', 'pieter')

print("Demo data created.")

In [None]:
# These are all the users interested in the same items as Pieter
items = r.smembers('usr:pieter:items')
users = r.sunion(["itm:" + to_key(item) + ":users" for item in items])
print("Users interested in the same items as Pieter: {0}".format(users))

# Pieter shares items with these users, so we recommend what they have and Pieter lacks
pieter_key = 'usr:pieter:items'
print("Pieter is interested in: {0}".format(r.smembers(pieter_key)))
for usr in users:
    usr_key = "usr:" + usr + ":items"
    if usr_key != pieter_key:
        print("Pieter could also be interested in: {0}".format(r.sdiff(usr_key, pieter_key)))

## Ratings based Collaborative Filtering

Same as collaborative filtering, but we are now interested in how much a user likes an item, which allows us to find out whether two or more users like similar things. Items that user B rated highly but user A has not rated yet could also be interesting for user A.

* Structures: **Sorted Sets**
* Operations: Intersections, Unions, Members/Scans, Ranges, Weights & Aggregations

In [None]:
# Clean
r.flushall()
print("Database cleaned.")

In [None]:
# Demo data

## Ratings by user
r.zadd('usr:david:ratings', {'spiderman': 3.0})
r.zadd('usr:david:ratings', {'batman': 1.0})
r.zadd('usr:david:ratings', {'superman': 3.0})
r.zadd('usr:pieter:ratings', {'batman': 2.0})
r.zadd('usr:pieter:ratings', {'wonder_woman': 1.0})
r.zadd('usr:pieter:ratings', {'aqua_man': 5.0})
r.zadd('usr:pieter:ratings', {'superman': 4.0})
r.zadd('usr:martin:ratings', {'aqua_man': 3.0})
r.zadd('usr:martin:ratings', {'batman': 5.0})

## Ratings by item
r.zadd('itm:spiderman:ratings', {'david': 3.0})
r.zadd('itm:batman:ratings', {'david': 1.0, 'pieter': 2.0, 'martin': 5.0})
r.zadd('itm:wonder_woman:ratings', {'pieter': 1.0})
r.zadd('itm:superman:ratings', {'david': 3.0, 'pieter': 4.0})
r.zadd('itm:aqua_man:ratings', {'pieter': 5.0, 'martin': 3.0})

print("Demo data created.")

### Sorted Set Intersections with Aggregations and Weights

* By default, the resulting score of an element is the sum of its scores in the sorted sets where it exists.
* Weights are multipliers for scores
* The weights (1, -1) mean that we subtract the score in the second set from the score in the first
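
The effect of weights and aggregation can be tried out without a server. The following sketch emulates a weighted sorted set intersection over plain dictionaries of member to score. The function `zinter` is a hypothetical helper for illustration, not part of redis-py.

```python
def zinter(sets, weights, aggregate=sum):
    """Emulate a weighted sorted set intersection over dicts of member -> score."""
    # Only members present in every input set survive
    common = set(sets[0]).intersection(*sets[1:])
    return {
        member: aggregate(w * s[member] for s, w in zip(sets, weights))
        for member in common
    }

david = {"spiderman": 3.0, "batman": 1.0, "superman": 3.0}
pieter = {"batman": 2.0, "wonder_woman": 1.0, "aqua_man": 5.0, "superman": 4.0}

# Weights (1, -1) with the default SUM: David's rating minus Pieter's rating
distances = zinter([david, pieter], [1, -1])
print(distances["batman"], distances["superman"])  # -1.0 -1.0
```

Passing `aggregate=min` or `aggregate=max` mimics the `AGGREGATE MIN` and `AGGREGATE MAX` options.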

### Root Mean Square

* The RMS value of a set of values is the square root of the arithmetic mean of the squares of the values.
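
A short worked example may help here. The function `rms` below is a hypothetical standalone helper that takes plain numbers; the input values are illustrative rating differences.

```python
import math

def rms(values):
    # Square each value, average the squares, then take the square root
    return math.sqrt(sum(v ** 2 for v in values) / len(values))

# Two users whose ratings differ by one point on both shared items
print(rms([-1.0, -1.0]))  # 1.0

# Larger differences give a larger distance: sqrt((9 + 16) / 2), roughly 3.54
print(rms([3.0, 4.0]))
```

Because the values are squared, the sign of a difference does not matter. Only its magnitude contributes to the distance.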

In [None]:
import math
from functools import reduce

# Some helpers

## Root Mean Square
def calc_rms(values):
    sq_sum = reduce(lambda x, y: x + y[1] ** 2, values, 0)
    return math.sqrt(sq_sum / len(values))

## NB: redis-py's zinterstore takes weights only as a dict of key -> weight; this helper passes them as a list via the raw command
def zinterstore(target, keys, weights, agg=None):
    if not agg:
        return r.execute_command('ZINTERSTORE', target, len(keys), *keys, 'WEIGHTS', *weights)
    else:
        return r.execute_command('ZINTERSTORE', target, len(keys), *keys, 'WEIGHTS', *weights, 'AGGREGATE', agg)

## NB: redis-py's zunionstore takes weights only as a dict of key -> weight; this helper passes them as a list via the raw command
def zunionstore(target, keys, weights, agg=None):
    if not agg:
        return r.execute_command('ZUNIONSTORE', target, len(keys), *keys, 'WEIGHTS', *weights)
    else:
        return r.execute_command('ZUNIONSTORE', target, len(keys), *keys, 'WEIGHTS', *weights, 'AGGREGATE', agg)    

In [None]:
# Items rated by David
david_key = 'usr:david:ratings'
ratings_david = r.zrange(david_key, 0, -1)
r.zunionstore('usr:david:ratings:same', ["itm:" + rt + ":ratings" for rt in ratings_david])
users = r.zrange('usr:david:ratings:same', 0, -1)
print("The following users rated the same items as David: {}".format(users))

# Calculate similarities for David
for usr in users:
    usr_key = "usr:" + usr + ':ratings'

    if usr_key != david_key:
        
        # Set of user keys to examine
        usr_keys = [ david_key, usr_key ]

        # Weights multiply the scores: David's rating minus the other user's rating
        zinterstore("dist:david:" + usr, usr_keys, [1, -1])
        dists = r.zrange("dist:david:" + usr, 0, -1, desc=True, withscores=True)
        print("The rating distance to {0} is {1}".format(usr, dists))
        
        # Calculate root mean square
        rms = calc_rms(dists)
        print("The average distance (RMS) to {0} is {1}".format(usr, rms))

        # The user is similar enough to David, so add the items of this user
        # to the recommendation list
        if rms <= 1:

            # Items rated by David get a negative score, so only items
            # David has not rated keep the other user's rating
            zunionstore("rec:david", usr_keys, [-1, 1], "MIN")

            # Keep only items with a score between 4 and 5
            recommendation = r.zrangebyscore('rec:david', 4, 5, withscores=True)
            print("The following is highly recommended for David: {}".format(recommendation))

## Social Collaborative Filtering

The previous examples used Sets and Sorted Sets. We are now exploring how to use Graphs. Our example is taking a social ('friend of') aspect into account.


* Data structures: **Graph**
* Operations: Traversals, Aggregations
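
Before touching the graph module, the traversal idea can be sketched with plain dictionaries: walk from a person to their friends, collect what the friends like, keep one category and count the likes. The names `friends`, `likes`, `category` and `friend_recommendations` are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Adjacency lists standing in for the graph edges
friends = {"David": ["Pieter", "Vassilis", "Katrin"]}
likes = {
    "Pieter": ["Batman", "Wonder Woman"],
    "Vassilis": ["Wonder Woman", "Superman"],
}
category = {
    "Batman": "Super Heros",
    "Wonder Woman": "Super Heros",
    "Superman": "Super Heros",
}

def friend_recommendations(person, wanted_category, limit=10):
    # person -> friend -> liked comic, filtered by category, counted per comic
    relevance = Counter(
        comic
        for friend in friends.get(person, [])
        for comic in likes.get(friend, [])
        if category.get(comic) == wanted_category
    )
    return relevance.most_common(limit)

top = friend_recommendations("David", "Super Heros")
print(top[0])  # ('Wonder Woman', 2)
```

A graph database expresses the same walk declaratively as a pattern match, which scales better than nested loops once the number of hops grows.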


In [3]:
import json

# RedisGraph
r_g = redis.StrictRedis('redis-17371.mf.demo.redislabs.com', '17371', decode_responses=True)

# Clean
r_g.flushall()
print("Graph database cleaned.")

Graph database cleaned.


In [4]:
# Cypher's property format is slightly different than JSON
# {"hello": "world" } is {hello: 'world'} 
def format_query_props(query):
    query = query.replace(': "', ": '")
    query = query.replace('",', "',")
    query = query.replace('"}', "'}")
    query = query.replace('"', "")
    return query

'''
CREATE (:person { name: 'A', age: B})
'''
def create_vertex(graph, label, props):
    # Some query formatting
    query = 'CREATE ( :{0} {1} )'.format(label, json.dumps(props))
    query = format_query_props(query)
    return r_g.execute_command('GRAPH.QUERY', graph, query)

'''
MATCH (a:Person),(b:Person)
WHERE a.name = 'A' AND b.name = 'B'
CREATE (a)-[r:RELTYPE]->(b)
RETURN type(r)
'''    
def create_edge(graph, slabel, tlabel, source, target, elabel):
    query = "MATCH (a:{0}),(b:{1}) WHERE a.name = '{2}' AND b.name = '{3}' CREATE (a)-[r:{4}]->(b) RETURN type(r)"
    query = query.format(slabel, tlabel, source, target, elabel)
    #DEBUG print(query)
    return r_g.execute_command('GRAPH.QUERY', graph, query)
    
'''
MATCH (a:Person)-[r:RELTYPE]->(b:Person)
WHERE a.name = 'A'
RETURN b.name
'''
def get_neighbours(graph, slabel, tlabel, elabel, source):
    query = "MATCH (a:{0})-[r:{1}]->(b:{2}) WHERE a.name = '{3}' RETURN b.name".format(slabel, elabel, tlabel, source)
    #DEBUG: print(query)
    return r_g.execute_command('GRAPH.QUERY', graph, query)

In [5]:
# Constants

## GRAPH NAME
GRAPH = 'Comics'

## VERTICES
T_PERSON = 'Person'
T_COMIC = 'Comic'
T_CATEGORY = 'Category'

## EDGES
R_FRIEND = 'IS_FRIEND_OF'
R_LIKES = 'LIKES'
R_TYPE = 'TYPE_OF'

In [6]:
# Create some vertices

## Persons
david={"name": "David", "age": 38, "gender": "male"}
pieter={"name": "Pieter", "age": 35, "gender": "male"}
itamar={"name": "Itamar", "age": 40, "gender": "male"}
vassilis={"name": "Vassilis", "age": 39, "gender": "male"}
katrin={"name": "Katrin", "age": 38, "gender": "female"}
romy={"name": "Romy", "age": 35, "gender": "female"}

## Comics
spiderman={"name": "Spiderman"}
batman={"name": "Batman"}
wonder_woman={"name": "Wonder Woman"}
superman={"name": "Superman"}
aquaman={"name": "Aquaman"}
valerian={"name" : "Valerian"}
fantastic_four={"name" : "Fantastic Four"}

## Categories
super_heros = { "name" : "Super Heros" }
scifi = { "name" : "SciFi" }

## Load
v_persons = [ david, pieter, itamar, vassilis, katrin, romy ]
v_comics = [ spiderman, batman, wonder_woman, superman, aquaman, valerian, fantastic_four ]
v_categories = [ super_heros, scifi ]
for v in v_persons:
    create_vertex(GRAPH, T_PERSON, v)
    
for v in v_comics:
    create_vertex(GRAPH, T_COMIC, v)

for v in v_categories:
    create_vertex(GRAPH, T_CATEGORY, v)
    
print("Vertices created.")

Vertices created.


In [7]:
# Create some edges

## Person has Friends
create_edge(GRAPH, T_PERSON, T_PERSON, 'David', 'Pieter', R_FRIEND)
create_edge(GRAPH, T_PERSON, T_PERSON, 'David', 'Vassilis', R_FRIEND)
create_edge(GRAPH, T_PERSON, T_PERSON, 'David', 'Katrin', R_FRIEND)

## Person likes Comics
create_edge(GRAPH, T_PERSON, T_COMIC, 'David', 'Spiderman', R_LIKES)
create_edge(GRAPH, T_PERSON, T_COMIC, 'David', 'Batman', R_LIKES)
create_edge(GRAPH, T_PERSON, T_COMIC, 'Pieter', 'Batman', R_LIKES)
create_edge(GRAPH, T_PERSON, T_COMIC, 'Pieter', 'Wonder Woman', R_LIKES)
create_edge(GRAPH, T_PERSON, T_COMIC, 'Vassilis', 'Wonder Woman', R_LIKES)
create_edge(GRAPH, T_PERSON, T_COMIC, 'Vassilis', 'Superman', R_LIKES)

## Comic is type of
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'Super Heros', 'Spiderman', R_TYPE)
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'Super Heros', 'Batman', R_TYPE)
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'Super Heros', 'Wonder Woman', R_TYPE)
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'Super Heros', 'Superman', R_TYPE)
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'Super Heros', 'Aquaman', R_TYPE)
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'SciFi', 'Valerian', R_TYPE)
create_edge(GRAPH, T_CATEGORY, T_COMIC, 'SciFi', 'Fantastic Four', R_TYPE)

print("Edges created.")

Edges created.


In [8]:
## Basic test of the Graph
print("David has the following friends: {0}".format(get_neighbours(GRAPH, T_PERSON, T_PERSON, R_FRIEND, 'David')))
print("David likes {0}".format(get_neighbours(GRAPH, T_PERSON, T_COMIC, R_LIKES, 'David')))

## Super hero comics that David's friends like
'''
MATCH (person:Person)-[:IS_FRIEND_OF]->(friend)-[likes:LIKES]->(comic)<-[:TYPE_OF]-(type) 
WHERE person.name = 'David' 
AND type.name = 'Super Heros' 
RETURN comic.name, count(likes) AS relevance 
ORDER BY relevance DESC
LIMIT 10
'''
query = "MATCH (person:Person)-[:IS_FRIEND_OF]->(friend)-[likes:LIKES]->(comic)<-[:TYPE_OF]-(type) WHERE person.name = 'David' AND type.name = 'Super Heros' RETURN comic.name, count(likes) AS relevance ORDER BY relevance DESC LIMIT 10"
result = r_g.execute_command('GRAPH.QUERY', GRAPH, query)
# Skip the header row, then print each (comic, relevance) pair.
# A separate loop variable keeps the Redis connection `r` intact.
for row in result[0][1:]:
    comic = row[0]
    relevance = row[1]
    print("Comic {0} with relevance {1}".format(comic, relevance))

David has the following friends: [[['b.name'], ['Pieter'], ['Vassilis'], ['Katrin']], ['Query internal execution time: 0.242109 milliseconds']]
David likes [[['b.name'], ['Spiderman'], ['Batman']], ['Query internal execution time: 0.120604 milliseconds']]
Comic Wonder Woman with relevance 2
Comic Batman with relevance 1
Comic Superman with relevance 1


## Content Relevance via Full Text Search

RediSearch has multiple built-in scoring functions. The default one is TF-IDF (term frequency, inverse document frequency). It works as follows:

1. Term Frequency: How often does a specific term appear in a document?
2. Inverse Document Frequency: A factor that diminishes the weight of terms that occur very frequently across the document set (e.g. the word 'the') and increases the weight of terms that occur rarely.
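
The two factors can be illustrated with a small textbook-style calculation. This is a sketch of the general TF-IDF idea and is not claimed to match RediSearch's exact formula or normalization. The corpus `docs` and the function `tf_idf` are hypothetical.

```python
import math
from collections import Counter

# Tiny hypothetical corpus, already lower-cased and stripped of punctuation
docs = {
    "spiderman": "spiderman is a fictional super hero created by stan lee",
    "batman": "batman is a fictional character created by bob kane",
}

def tf_idf(term, doc_id):
    words = docs[doc_id].split()
    # Term frequency: share of the document's words that are the term
    tf = Counter(words)[term] / len(words)
    # Number of documents containing the term at least once
    containing = sum(1 for text in docs.values() if term in text.split())
    # A term found in every document gets idf = log(1) = 0
    idf = math.log(len(docs) / containing) if containing else 0.0
    return tf * idf

print(tf_idf("hero", "spiderman") > tf_idf("hero", "batman"))  # True
print(tf_idf("is", "spiderman"))  # 0.0
```

The word 'is' appears in both documents and therefore carries no weight, while 'hero' distinguishes one document from the other.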

Further details about scoring can be found here:

* https://oss.redislabs.com/redisearch/Scoring/

As before, here are the characteristics of this example:

* Data structures: **Inverted Index**
* Operations: Text search, Scoring


In [None]:
# RediSearch
r_s = redis.StrictRedis('localhost', '9999', decode_responses=True)

# Clean database
r_s.flushall()
print("Full text search database cleaned.")

In [None]:
# Converts a dict into a flat list of alternating field names and values
def format_doc_fields(fields):
    result=[]
    for f in fields:
        result.append(f)
        result.append(fields[f])
    return result

# Add a document to the index
def add_doc(index, doc_id, score, fields):
    fields = format_doc_fields(fields)
    r_s.execute_command("FT.ADD", index, doc_id, score, "FIELDS", *fields)

In [None]:
# Create the index
COMIC_IDX = 'comic_idx'

## The schema of the index
schema = { "name" : "TEXT", "type" : "TEXT", "edition" : "NUMERIC", "released" : "NUMERIC", "desc" : "TEXT"}
schema = format_doc_fields(schema)

r_s.execute_command("FT.CREATE", COMIC_IDX, "SCHEMA", *schema)    

# Add some search docs
spiderman={"name": "Spiderman", "type" : "hero", "edition" : 1, "released" : 1962, "desc" : "Spiderman is a fictional super hero created by writer-editor Stan Lee and writer-artist Steve Ditko."}
batman={"name": "Batman", "type" : "dark", "edition" : 1, "released" : 1939, "desc" : "Batman is a fictional hero appearing in American comic books published by DC Comics. The character was created by artist Bob Kane and writer Bill Finger."}

# We are not weighting the docs themselves;
# a document weight other than 1 would affect the score
add_doc(COMIC_IDX, 'spiderman', 1, spiderman)
add_doc(COMIC_IDX, 'batman', 1, batman)    

print("Demo data created.")

In [None]:
# Search with scorer
# Spiderman is more likely to belong to the category 'hero' than Batman,
# as it has the type 'hero' and the word 'hero' also appears
# in the description
CATEGORY = 'hero'
r_s.execute_command("FT.SEARCH", COMIC_IDX, CATEGORY, 'WITHSCORES', 'SCORER', 'TFIDF.DOCNORM', 'RETURN', 1, 'name')

## Probabilistic Data Structures


Probabilistic data structures have the following characteristics:

* They use hash functions for randomization
* They return an approximate result
* The error stays below a specific threshold
* They are much more space efficient than deterministic approaches
* They provide a constant query time

You would use them because sometimes …

* Speed is more important than correctness
* Compactness is more important than correctness
* You only need certain data guarantees

It's possible to combine them with deterministic approaches (e.g. HLL plus a deterministic counter for discovering counter manipulations).

We will take a look at the following 2 structures:

* **HyperLogLog**: Cardinality estimation of a set, e.g. unique visits
* **Bloom Filter**: Check if an item is contained in a set, whereby false positives are possible
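
To make the Bloom filter idea more concrete, here is a deliberately tiny sketch in plain Python. It is not how the Redis module is implemented. The class `TinyBloomFilter`, its default sizes and the way hash positions are derived from SHA-256 are assumptions made for this example.

```python
import hashlib

class TinyBloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive several bit positions by hashing the item with different seeds
        for seed in range(self.num_hashes):
            data = "{0}:{1}".format(seed, item).encode()
            digest = hashlib.sha256(data).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly in the set", False means "definitely not"
        return all(self.bits & (1 << pos) for pos in self._positions(item))

bf = TinyBloomFilter()
for user in ("david", "pieter", "vassilis"):
    bf.add(user)

print(bf.might_contain("david"))   # True: added items are always found
print(bf.might_contain("katrin"))  # usually False, but a false positive is possible
```

The filter stores only bits, never the items themselves, which is where the space savings come from. The price is that an item can never be removed and that a positive answer is only probable.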

In [None]:
# ReBloom
r_b = redis.StrictRedis('localhost', '5555', decode_responses=True)

# Clean
r_b.flushall()
print("Bloom filter database cleaned.")

In [None]:
# Track the number of unique users who wanted a specific comic edition
r_b.pfadd("wanted:spiderman:1", "david", "pieter", "katrin", "romy")

# Add David again to show that it is about unique users
r_b.pfadd("wanted:spiderman:1", "david")


print("Initial HLL size: {0} bytes".format(r_b.execute_command("DEBUG OBJECT", "wanted:spiderman:1")["serializedlength"]))
print("Approx. count: {0}".format(r_b.pfcount("wanted:spiderman:1")))

print("Please wait ...")
for i in range(0, 100000):
    r_b.pfadd("wanted:spiderman:1", "user:{0}".format(i))

print("Final HLL size: {0} bytes".format(r_b.execute_command("DEBUG OBJECT", "wanted:spiderman:1")["serializedlength"]))
print("Approx. count: {0}".format(r_b.pfcount("wanted:spiderman:1")))

In [None]:
# Check if a user is interested in a specific comic category w/o actually storing the set of users per category
r_b.execute_command("BF.ADD", "ctg:super-heros", "david")
r_b.execute_command("BF.ADD", "ctg:super-heros", "pieter")
r_b.execute_command("BF.ADD", "ctg:super-heros", "vassilis")
r_b.execute_command("BF.ADD", "ctg:fantasy", "katrin")

print("BF size: {0} bytes".format(r_b.execute_command("DEBUG OBJECT", "ctg:super-heros")["serializedlength"]))
print("BF size: {0} bytes".format(r_b.execute_command("DEBUG OBJECT", "ctg:fantasy")["serializedlength"]))

print("Is Katrin interested in Fantasy?: {0}".format(r_b.execute_command("BF.EXISTS", "ctg:fantasy", "katrin")))
print("Is Katrin interested in Super Heros?: {0}".format(r_b.execute_command("BF.EXISTS", "ctg:super-heros", "katrin")))
print("Is David interested in Super Heros?: {0}".format(r_b.execute_command("BF.EXISTS", "ctg:super-heros", "david")))

## Machine Learning for Classifications and Predictions

The idea here is to train a model and then to use this model to classify a user. Here are two examples of such models:

* Decision tree ensembles (random forests): The idea is to construct a forest of decision trees at training time. RedisML can be used for model serving by leveraging these decision trees, e.g. for classification purposes. The class which appears most often among the trees' results wins.

* Neural networks: Train the weighted connections between neurons by using a learning algorithm (e.g. backpropagation). Such networks are function approximators, meaning that an input vector is mapped to an output vector. The output vector can tell you, for instance, how likely it is that a given input belongs to a specific class or category.
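
The majority vote of a tree ensemble can be sketched in a few lines of plain Python. This is only an illustration of the voting principle and not of how any Redis module evaluates a forest. The three one-split trees, the feature `anime_hours` and the function `classify_by_vote` are hypothetical.

```python
from collections import Counter

# Each "tree" is a single split that returns class 1 (likes Manga) or 0
trees = [
    lambda user: 1 if user["age"] <= 20 else 0,
    lambda user: 1 if user["numcomics"] <= 1000 else 0,
    lambda user: 1 if user["anime_hours"] >= 5 else 0,
]

def classify_by_vote(user):
    # Every tree votes, and the class with the most votes wins
    votes = Counter(tree(user) for tree in trees)
    return votes.most_common(1)[0][0]

robin = {"age": 14, "numcomics": 5000, "anime_hours": 2}
philip = {"age": 11, "numcomics": 50, "anime_hours": 8}

print(classify_by_vote(robin))   # 0: two of three trees vote against
print(classify_by_vote(philip))  # 1: all three trees vote in favour
```

An odd number of trees is used here so that a binary vote can never end in a tie.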

### Redis-ML and Tree Ensembles

The idea is to build the model from decision trees that are derived from known users (e.g. Robin is 14 years old and has about 5000 comics, but he doesn't like Manga comics) and then to derive the class to which a user belongs. The class 'likes Manga comics' is binary: either you like them (1) or you don't (0).

In [None]:
# RedisML
r_m = redis.StrictRedis('localhost', '3333', decode_responses=True)

# Clean DB
r_m.flushall()
print("ML database cleaned.")

In [None]:
# Add a very small binary tree that splits on a single criterion (name and threshold value)
def add_to_tree(forest, tree, cname, cvalue ):
    return r_m.execute_command("ML.FOREST.ADD", forest, tree, ".", "NUMERIC", cname, cvalue, ".l", "LEAF", 1, ".r", "LEAF", 0)
    
# Basic formatting of feature vectors. They need to be in the format a:b,c:d
def format_features(features):
    v = ""
    for f in features:
        e = "{0}:{1}".format(f, features[f])
        v += e + ","
    return v.strip(',')

# Classify based on the provided user features
def classify(forest, features):
    v = format_features(features)
    return r_m.execute_command("ML.FOREST.RUN", forest, v, "CLASSIFICATION")

In [None]:
# Two very small decision trees:
## Persons aged 20 or younger like Manga comics
### First tree
add_to_tree("manga", 0, "age", 20 )

## Persons with more than 1000 comics do not like Manga comics
### Second tree
add_to_tree("manga", 1, "numcomics", 1000)

print("Does David like Manga comics?: {0}".format(classify("manga", { "age" : 38 })))
print("Does Philip like Manga comics?: {0}".format(classify("manga", { "age" : 11 })))
print("Does Robin like Manga comics?: {0}".format(classify("manga", { "age" : 14, "numcomics" : 5000 })))

### Other Deep Learning Modules

* **RedisAI**: A Redis module for serving tensors and executing deep learning graphs. The source code is available here: https://github.com/RedisAI/RedisAI . The idea is to train your models (e.g. neural networks) in TensorFlow and then to load the model into Redis in order to execute it there.

<center>
<img style="margin: 10px; height: 200px; text-align: center" src="https://www.tensorflow.org/images/tensors_flowing.gif">
</center>

* **Neural Redis**: Is a Redis module that implements feed forward neural networks as a native data type for Redis. The project goal is to provide Redis users with an extremely simple to use machine learning experience.



## Questions?

Follow me on Twitter!

@martinez099