# Storing User Data in Hashes

## Hashes

Redis provides a hash table (or dictionary) data structure that maps keys to values with constant-time access on average. Redis calls this data structure a Hash. Because Redis already uses the term key for the name that references a data structure, the keys used to look up items inside a Hash are called fields, and the items retrieved are called values.

A Redis Hash must contain at least one field (Redis deletes the key when its last field is removed), and both fields and values are stored as strings.

For this example, we will use the Redis Hash data structure to store user session records.

### User Records

Languages vary in the patterns they use to represent database records. Some languages map database records into classes, others map them into structs, and some map them into dictionaries. For our application, we are going to use the class `User` to represent the user objects we are working with in our database.

Our user object has five properties: `id`, `username`, `fname` (given name), `lname` (last name or surname) and `email` (email address). It also has a method, `get_key()`, that generates the appropriate key for an object. For our database, we will use the convention that user records are stored under the key `user:{id}`.

In [None]:
class User(object):

    def __init__(self, **kwargs):
        self.id = None
        self.username = None
        self.fname = None
        self.lname = None
        self.email = None

        for key in kwargs:
            setattr(self, key, kwargs[key])

    def __str__(self):
        return str(self.__dict__)

    def get_key(self):
        if self.id is not None:
            return "user:" + str(self.id)
        else:
            return None
        
# with no arguments, we create an empty user
empty = User()
print ("empty user: {}".format(empty))
print ("empty user key: {}".format(empty.get_key()))

# we can create a dictionary of properties and unpack it into the constructor
user_info = {
    'id': 147,
    'username': 'ruser',
    'fname': 'Redis',
    'lname': 'User',
    'email': 'ruser@somedomain.net' }
ruser = User(**user_info)
print ("ruser: {}".format(ruser))
print ("ruser key: {}".format(ruser.get_key()))


### Adding data to the database

We can store our user objects in the database using the Redis commands for working with hashes. Two primary commands, [`HSET`](https://redis.io/commands/hset) and [`HMSET`](https://redis.io/commands/hmset), add hash data to Redis. The `HSET` command, as used here, stores a single field of a hash in the database.

In the code below, we create a version of our Redis User object to work with and store only the user's email in the database:

In [None]:
import redis

# example connection parameters 
config = {
    "host": "redis",
    "port": 6379
}

r = redis.StrictRedis(**config)
r_info = {
    'id': 147,
    'username': 'ruser',
    'fname': 'Redis',
    'lname': 'User',
    'email': 'ruser@somedomain.net' }
r_user = User(**r_info)

# store only our user's email in the database
res = r.hset(r_user.get_key(), 'email', r_user.email)

print ("Result: {}".format(res))


The result of the `hset()` call is the number of fields that were added: 1 if the field is new to the hash, or 0 if the field already existed and only its value was updated.

We can also add multiple fields to a Redis Hash in a single call to the database using the `HMSET` command. In the code below, we create a new version of our Redis User object and then create a new Hash in the Redis database by passing the key and a dictionary of all the fields in the record.

You can use RedisInsight to inspect the hashes stored so far, such as `user:147`.

In [None]:

# create our Redis user
r_info = {
    'id': 281,
    'username': 'ruser',
    'fname': 'Redis',
    'lname': 'User',
    'email': 'ruser@somedomain.net' }
r_user = User(**r_info)
print (r_user.get_key())

# store our entire user record in the database
res = r.hmset(r_user.get_key(), r_user.__dict__)
print ("Result: {}".format(res))


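The result of the `hmset()` call is simply `True` when the operation succeeds. Note that `HMSET` has been deprecated since Redis 4.0, because `HSET` now accepts multiple field-value pairs, and recent versions of redis-py emit a `DeprecationWarning` when `hmset()` is called.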
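
In redis-py 3.5 and later, `hset()` also accepts a `mapping` keyword that stores several fields in one call. The sketch below uses it and, as an additional assumption, skips attributes that are still `None`, because redis-py refuses to store `None` values. The helper names `user_to_mapping` and `save_user` are hypothetical and not part of redis-py.

```python
def user_to_mapping(user):
    """Return the object's attributes as a dict, skipping unset (None) ones."""
    return {field: value for field, value in vars(user).items()
            if value is not None}


def save_user(r, user):
    """Store every set attribute of `user` with a single HSET call.

    `r` is a redis-py client; `user` needs a get_key() method like the
    User class defined earlier in this chapter.
    """
    key = user.get_key()
    if key is None:
        raise ValueError("cannot store a user without an id")
    # hset(..., mapping=...) requires redis-py 3.5 or later
    return r.hset(key, mapping=user_to_mapping(user))
```

With the client and user object from the cells above, `save_user(r, r_user)` would write the whole record and return the number of fields that were newly created.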

### Getting data from the database

We can fetch our user records from Redis using three basic patterns: fetch an individual field, fetch multiple fields, and fetch all the fields of a hash. These patterns are supported by the [`HGET`](https://redis.io/commands/hget), [`HMGET`](https://redis.io/commands/hmget), and [`HGETALL`](https://redis.io/commands/hgetall) commands respectively. To set up the database for this section, please execute the code below:

In [None]:
# setup example data
r_info = {
    'id': 300,
    'username': 'ruser',
    'fname': 'Redis',
    'lname': 'User',
    'email': 'ruser@somedomain.net' }
r_user = User(**r_info)
r.hmset(r_user.get_key(), r_user.__dict__)

# initialize key with our example user's key
key = r_user.get_key()


In the first example, we can get just the email address of a user from the user key using the `HGET` call.

In [None]:
print ("Email: {}".format(r.hget(key, 'email').decode('utf-8')))


The result from this call should be the email ("ruser@somedomain.net") of the sample user we created in the first part of this section.
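
`HGET` returns nothing when the key or the field does not exist, and redis-py reports that as `None`, so calling `.decode()` directly on the result would fail for a missing field. A minimal sketch of a more defensive read follows; `get_text_field` is a hypothetical helper name, and the UTF-8 default is an assumption about how the data was written.

```python
def get_text_field(r, key, field, default=None, encoding="utf-8"):
    """HGET a single field and return it as str, or `default` if absent."""
    raw = r.hget(key, field)
    if raw is None:
        # the key does not exist, or the hash has no such field
        return default
    # redis-py returns bytes unless the client uses decode_responses=True
    return raw.decode(encoding) if isinstance(raw, bytes) else raw
```

For example, `get_text_field(r, key, 'email', default='unknown')` can be printed safely whether or not the field is present.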

In the next example, we get both the first name and the last name of our sample user. We pass the hash key and the list of fields we are interested in to the `HMGET` call.

In [None]:
print (r.hmget(key, ['fname', 'lname']))


The result from this call is an ordered list of values which matches the order of fields we passed as a list to the `hmget()` call.
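
Because the values come back in the same order as the requested fields, the two lists can be zipped into a dictionary. The following is a small sketch of that idea; `fetch_fields` is a hypothetical helper name.

```python
def fetch_fields(r, key, fields):
    """Fetch several fields with one HMGET and pair each with its value.

    Fields that do not exist in the hash map to None.
    """
    values = r.hmget(key, fields)
    return dict(zip(fields, values))
```

Calling `fetch_fields(r, key, ['fname', 'lname'])` keeps each value attached to its field name, which is easier to work with than a positional list.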

Finally, we can get all the fields of our database record using the `HGETALL` command.

In [None]:
print (r.hgetall(key))


The result of the `hgetall()` call is a dictionary mapping the hash fields to the field values.
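
With a default redis-py client, both the fields and the values in that dictionary are `bytes`. The sketch below converts them to `str` so the result can be passed back to a constructor such as `User(**...)`; `decode_hash` is a hypothetical helper name, and UTF-8 is assumed.

```python
def decode_hash(raw, encoding="utf-8"):
    """Convert the bytes fields and values of an HGETALL result to str."""
    def to_text(item):
        return item.decode(encoding) if isinstance(item, bytes) else item

    return {to_text(field): to_text(value) for field, value in raw.items()}
```

Note that Redis stores every value as a string, so after `User(**decode_hash(r.hgetall(key)))` the `id` attribute is the string `'300'` rather than the integer we originally stored. Creating the client with `decode_responses=True` is an alternative that makes redis-py do the decoding itself.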

### Flexible record structures

Redis, unlike some databases, does not require schemas or predefined structures, so as our application evolves, our user records do not always have to contain the same fields.

In the example below, we create two users, one with a `verified` field and one without. We can store both of these users into the database without problems:

In [None]:
r1_info = {
    'id': 281,
    'username': 'ruser',
    'fname': 'Redis', 
    'lname': 'User',
    'email': 'ruser@somedomain.net' }
r1_user = User(**r1_info)

r2_info = {
    'id': 282,
    'verified': 'True',
    'username': 'ruser',
    'fname': 'Redis',
    'lname': 'User',
    'email': 'ruser@somedomain.net' }
r2_user = User(**r2_info)

r.hmset(r1_user.get_key(), r1_user.__dict__)
r.hmset(r2_user.get_key(), r2_user.__dict__)

Using `HGETALL`, we can see that Redis stores each record with its own set of fields. Fetching `verified` with `HGET` returns `None` for the first user, whose hash has no such field:

In [None]:
print ("r1: {}".format(r.hgetall(r1_user.get_key())))
print ("r2: {}".format(r.hgetall(r2_user.get_key())))

print ("Verified (r1): {}".format(r.hget(r1_user.get_key(), 'verified')))
print ("Verified (r2): {}".format(r.hget(r2_user.get_key(), 'verified')))
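
When records do not all share the same fields, the application has to decide what a missing field means. One possible approach, sketched here under the assumption that a missing field should fall back to an application-level default, is to merge the stored fields over a dictionary of defaults. `load_record` and the `defaults` argument are hypothetical names.

```python
def load_record(r, key, defaults):
    """Read a hash and fill in `defaults` for any field the record lacks."""
    raw = r.hgetall(key)
    if not raw:
        # HGETALL returns an empty dict when the key does not exist
        return None
    record = dict(defaults)
    for field, value in raw.items():
        if isinstance(field, bytes):
            field = field.decode("utf-8")
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        record[field] = value
    return record
```

For instance, `load_record(r, r1_user.get_key(), {'verified': 'False'})` would report the first user as unverified without changing anything stored in Redis.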


### Working with Hash fields

Redis provides many commands for manipulating the fields of a hash in the database.

To set up the database for these examples, please run the following code:

In [None]:
r_info = {
    'id': 400,
    'verified': 'True',
    'username': 'ruser',
    'fname': 'Redis',
    'lname':'User',
    'email': 'ruser@somedomain.net' }
r_user = User(**r_info)

key = r_user.get_key()
r.hmset(key, r_user.__dict__)


In a development version of our application, we added the `verified` field to our user. After discussing the change, we decided not to apply it to our system. Our test data has records with the `verified` field that we want to clean up. We can use the [`HKEYS`](https://redis.io/commands/hkeys) command to fetch the fields in our hashes, the [`HDEL`](https://redis.io/commands/hdel) command to remove the field if it is there, and the [`HEXISTS`](https://redis.io/commands/hexists) command to verify its removal:

In [None]:
print ("Initial keys: {}".format(r.hkeys(key)))
print ("Delete 'verified': {}".format(r.hdel(key, 'verified')))
print ("'verified' exists: {}".format(r.hexists(key, 'verified')))


When we look at the initial state of our hash, it has a `verified` field. We then delete the field to restore our test data to the expected format, and finally verify that the field no longer exists.
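
The same cleanup can be applied to every user record rather than to one key. The sketch below walks the keyspace with redis-py's `scan_iter()` and calls `hdel()` on each match. It assumes that every key matching the pattern is a Hash (Redis would raise a type error otherwise), and `remove_field` is a hypothetical helper name.

```python
def remove_field(r, pattern, field):
    """Delete `field` from every hash whose key matches `pattern`.

    Returns the number of hashes that actually contained the field.
    """
    removed = 0
    for key in r.scan_iter(match=pattern):
        # HDEL returns how many of the given fields it deleted (0 or 1 here)
        removed += r.hdel(key, field)
    return removed
```

With the convention used in this chapter, `remove_field(r, 'user:*', 'verified')` would clean up all user records in one pass.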

### Counting with hashes

Recall the discussion about counting votes. Like Redis Strings, the values in a Redis Hash can represent integers and floating point numbers. The [`HINCRBY`](https://redis.io/commands/hincrby) command increments an integer field by an integer argument, whereas [`HINCRBYFLOAT`](https://redis.io/commands/hincrbyfloat) increments a field by a floating point argument. Both create the field, starting from 0, if it does not exist yet.
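
As an illustration, the two commands could maintain counters inside a user record. This is only a sketch: the `logins` and `credit` field names and both helper functions are hypothetical and not part of the user records used elsewhere in this chapter.

```python
def record_login(r, key):
    """Add 1 to the integer `logins` field and return the new total."""
    return r.hincrby(key, "logins", 1)


def add_credit(r, key, amount):
    """Add a (possibly negative) float to the `credit` field."""
    return r.hincrbyfloat(key, "credit", amount)
```

Each call is a single atomic command on the server, so concurrent clients updating the same field do not overwrite each other's increments.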

### Hashes as key-value stores

The Redis keyspace is itself just a big hash table, so the Hash data structure can be thought of as a miniature Redis. In the vote counting example, we used one Redis String key for each item's vote counter. We can change that design to use a single Hash instead, in which each field corresponds to an item and its value is that item's vote counter:

In [None]:
def upvote_item(r, item_id):
    "Upvotes an item and stores it in a Hash"

    return r.hincrby('item_vote_counters', item_id, 1)

# vote once for one item and twice for another
upvote_item(r, 1)
upvote_item(r, 2)
upvote_item(r, 2)

print (r.hgetall('item_vote_counters'))

This design aggregates the different counters inside a single data structure. It provides a logical way of managing related data together, much like using a Hash to store records (which are groups of related fields). However, in this case, the Hash is also somewhat like a table in a relational database: a structure that holds multiple records, each made up of a primary key (the item id) and a single value field (the counter).
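
Keeping all the counters in one Hash also means a single `HGETALL` returns the whole "table", which the client can then rank. The following sketch sorts such a result by vote count; `top_items` is a hypothetical helper name, and it assumes every value in the Hash is an integer counter.

```python
def top_items(counters, n=3):
    """Return the `n` most-voted (item_id, votes) pairs, highest first.

    `counters` is a dict as returned by HGETALL on the counters hash.
    """
    def to_text(item):
        return item.decode("utf-8") if isinstance(item, bytes) else str(item)

    # int() accepts the bytes values that redis-py returns by default
    ranked = [(to_text(item), int(votes)) for item, votes in counters.items()]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:n]
```

For example, `top_items(r.hgetall('item_vote_counters'), n=1)` gives the current leader.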

### Reducing RAM overhead with Hashes

The major benefit of using Hashes as key-value stores is lower RAM overhead.

Each individual key in Redis, regardless of its name and value, requires about 70 bytes (on 64-bit architectures) for administrative purposes. This overhead is negligible with small datasets, but can become expensive as the volume increases. That is especially true of keys storing small values such as counters.

A Hash is a key like any other, so it carries the same overhead. However, the fields and values inside a Hash can be stored more compactly than keys in the global keyspace. This compact form takes more processing to access, but for small Hashes the CPU penalty is minimal. Two configuration settings determine the threshold between the compact (memory-efficient but more CPU-intensive) Hash encoding, known as ziplist (replaced by listpack in Redis 7.0, which keeps the `hash-max-ziplist-*` setting names as aliases), and the default hash table encoding:

In [None]:
print (r.config_get('hash-max-ziplist*'))

`hash-max-ziplist-entries` is the maximum number of fields a Hash can have while using the ziplist encoding; with the stock default of 512, Hashes of up to 512 fields are encoded as ziplists. `hash-max-ziplist-value` is the maximum length, in bytes, of any single field or value in a ziplist Hash (64 by default). Crossing either threshold triggers an automatic conversion from the ziplist encoding to the default one. The conversion is never reversed, even if the Hash later shrinks.
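
To make the two thresholds concrete, here is a simplified client-side model of the check. It is a sketch, not Redis's actual implementation: it assumes the entries limit is compared with the number of fields and that lengths are measured in bytes of the string form. `fits_compact_encoding` is a hypothetical name, and the limits are passed in explicitly rather than hard-coded.

```python
def fits_compact_encoding(record, max_entries, max_value):
    """Estimate whether a dict would stay within both compact-encoding limits.

    record      -- the fields and values that would be stored in the Hash
    max_entries -- the configured hash-max-ziplist-entries value
    max_value   -- the configured hash-max-ziplist-value value
    """
    def size(item):
        data = item if isinstance(item, bytes) else str(item).encode("utf-8")
        return len(data)

    if len(record) > max_entries:
        return False
    return all(size(field) <= max_value and size(value) <= max_value
               for field, value in record.items())
```

The limits can be taken from the `config_get()` output above (converting the returned strings with `int()`), and the Redis command `OBJECT ENCODING` reports the encoding a stored key actually uses.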

### Considerations for working with large hashes

Because Hashes can have up to 2^32-1 (4,294,967,295) fields, each potentially holding a String of up to 512 MB, they can become quite large. Commands that operate on the entire Hash, namely `HGETALL`, `HKEYS` and [`HVALS`](https://redis.io/commands/hvals), can become expensive because of the volume of data involved.

When the requirements make it necessary to fetch the entire contents of a big Hash, consider iterating over the data structure instead of reading it whole with one of the commands mentioned above. Iterating over a Hash, or scanning it, is done with the [`HSCAN`](https://redis.io/commands/hscan) command. Note, however, that `HSCAN` may return duplicates and is not guaranteed to return data added while iterating.
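
A minimal sketch of such an iteration with redis-py follows. It relies on `hscan()` returning a `(cursor, dict)` pair and on a returned cursor of 0 marking the end of the scan; `scan_hash` is a hypothetical helper name, and the `count` value is only a hint to the server about batch size.

```python
def scan_hash(r, key, count=100):
    """Yield (field, value) pairs from a Hash one HSCAN batch at a time."""
    cursor = 0
    while True:
        cursor, batch = r.hscan(key, cursor=cursor, count=count)
        # the same field may be yielded more than once across batches
        yield from batch.items()
        if cursor == 0:
            break
```

Since a field can appear more than once, whatever consumes the pairs should be safe to repeat, for example assigning into a dictionary rather than appending to a list. redis-py also ships an `hscan_iter()` method that wraps the same cursor loop.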

## Review 

In this chapter, we looked at working with the Redis hash datatype to store user records in the database. Redis provides a hash datatype that is similar to the hash table and dictionary types provided in most modern programming languages.

We looked at various commands provided by Redis to manipulate hashes and learned how to store a structured object in the database and read it back from the server. We also saw how we could manipulate the members of a hash on the server to update and delete data directly.  

This chapter has covered a wide range of hash functionality, but not every command. For additional information on Redis hashes as well as a complete command reference, see the [Hash Commands](https://redis.io/commands#hash) page at [Redis.io](http://www.redis.io).