# Introduction to NoSQL and Object Storage

This lesson walks through the create and read operations in Redis. We will also fetch data from Google Cloud Storage.

## Redis

Redis is an in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries.

We will be connecting to a Redis database hosted on Redis Labs, a cloud database service that lets you host Redis databases in the cloud.

Prerequisite: set up an account on Redis [here](https://redis.io/) and create a (free tier) cluster.

If you need guidance, refer to the screenshots below:

[Step 1](../assets/redis_create_db_step1.png)  (create database)

[Step 2](../assets/redis_create_db_step2.png)  (choose **free** cluster, leave all other settings as **default** including `Name`, `Cloud vendor`, `Region`. Click the `Create database` button below.)

[Step 3](../assets/redis_create_db_step3.png)  (click 'connect' to get connect instructions)

[Step 4](../assets/redis_create_db_step4.png)  (choose 'Redis Client' - 'Python')

[Step 5](../assets/redis_create_db_step5.png) (copy and paste the Python code into the cell below. Note: use the `Copy` button at the bottom right instead of selecting and copying the text manually. If you copy manually, your auto-generated password will not be copied over!)

We will be using the `redis-py` library to connect to the Redis database.

In [None]:
# Paste your code from Step 5 above below this line
# -------------------------------------------------
"""Basic connection example.
"""
!pip install redis google-cloud-storage  # install the redis and google-cloud-storage packages

import redis

r = redis.Redis(
    host='redis-14183.c241.us-east-1-4.ec2.cloud.redislabs.com',
    port=14183,
    decode_responses=True,
    username="default",
    password="nbyO5IvUEC5NrmtxXxo5BDmgZCGUIe7s",
)

success = r.set('foo', 'bar')
# True

result = r.get('foo')
print(result)
# >>> bar

In [4]:
# # Use either the code from Step 5 above or the code below to connect to your Redis database.
# # Replace <REDIS-URL> and <YOUR-PASSWORD> with your actual Redis database URL and password.
# # If you are using the code from Step 5, you can skip this cell.
# import redis

# r = redis.Redis(
#   host='<REDIS-URL>', # E.g. 'redis-10908.c252.ap-southeast-1-1.ec2.cloud.redislabs.com'
#   port=10908, # replace with the port of your own database
#   password='<YOUR-PASSWORD>',
#   decode_responses=True, # return str instead of bytes, matching the outputs shown in this lesson
# )
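
Hardcoding a password in a notebook makes it easy to leak when the notebook is shared or committed. As a minimal sketch, the connection settings could instead be read from environment variables. The variable names `REDIS_HOST`, `REDIS_PORT`, `REDIS_USERNAME`, and `REDIS_PASSWORD` and the helper `redis_settings_from_env` are assumptions made for this example and are not part of `redis-py`.

```python
import os


def redis_settings_from_env(env=os.environ):
    """Build keyword arguments for redis.Redis from environment variables.

    The variable names are assumptions made for this sketch.
    """
    return {
        "host": env["REDIS_HOST"],
        "port": int(env.get("REDIS_PORT", "6379")),  # 6379 is the default Redis port
        "username": env.get("REDIS_USERNAME", "default"),
        "password": env["REDIS_PASSWORD"],
        "decode_responses": True,  # return str instead of bytes
    }


# Usage (requires the variables to be set and the redis package installed):
# import redis
# r = redis.Redis(**redis_settings_from_env())
```

A missing `REDIS_HOST` or `REDIS_PASSWORD` raises a `KeyError` immediately, which is usually easier to diagnose than a failed connection later.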

A Redis database holds key-value pairs and supports commands such as GET, SET, and DEL, as well as several hundred additional commands.

- Redis keys are always strings.
- Redis values can be one of several data types. The essential ones are strings, lists, hashes, and sets. More advanced types include geospatial indexes and streams.

Many Redis commands operate in constant O(1) time, just like retrieving a value from a Python dict or any hash table.
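
The dict analogy can be made concrete. The sketch below uses a plain Python `dict` (no Redis server involved) to mirror what SET, GET, and DEL do. The comments name the corresponding `redis-py` calls, and the key `nickname` is a made-up example of a key that was never set.

```python
# A plain dict standing in for a key-value store (illustration only).
store = {}

store["name"] = "Aaron"          # like r.set("name", "Aaron")
value = store.get("name")        # like r.get("name")
missing = store.get("nickname")  # like r.get on a key that was never set: None

# like r.delete("name"), which reports how many keys were removed
removed = store.pop("name", None) is not None

print(value, missing, removed)
```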

Let's create a new key called `'name'` with the value `'Aaron'`.

In [5]:
r.set('name', 'Aaron')

True

Read the value of the key `'name'`:

In [4]:
r.get('name')

'Aaron'

We can update the value with `.set` too:

In [None]:
r.set('name', 'Bob')

True

In [None]:
r.get('name')

'Bob'

> Set a key `age` with value of `20`.
>
> Then read the value.

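To append values to a list, use `rpush`. It returns the length of the list after the push, so a result larger than the number of values pushed (as in the output below) means the key already held items from an earlier run: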

In [None]:
r.rpush("names", "Aaron", "Bob", "Charlie")

6

In [None]:
r.lindex("names", 1)

'Bob'
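
Because `rpush` appends to whatever the list already holds, re-running the cell keeps growing the list. A minimal sketch of one way to make the cell repeatable is to delete the key first. The helper name `replace_list` is hypothetical, and `r` is assumed to be the client created earlier with `decode_responses=True`.

```python
def replace_list(r, key, values):
    """Overwrite the list at `key` with `values` and return its contents."""
    r.delete(key)                # drop any items left over from earlier runs
    r.rpush(key, *values)        # append the values in order
    return r.lrange(key, 0, -1)  # 0..-1 selects the whole list


# Usage with the client from above:
# replace_list(r, "names", ["Aaron", "Bob", "Charlie"])
```

The delete and the push are two separate commands here, so this is fine for a notebook exercise but is not atomic if several clients write to the same key.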

You can use `mset` to set multiple keys at once.

In [None]:
r.mset({
    "name": "John",
    "age": 30,
})

True

In [None]:
r.mget("name", "age")

['John', '30']
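
`mget` returns the values in the same order as the requested keys, and a key that does not exist yields `None`. A minimal sketch of pairing keys with values while skipping missing keys, using the output above as literal input. The helper `pair_up` and the extra key `city` are hypothetical.

```python
def pair_up(keys, values):
    """Zip requested keys with the list returned by mget, skipping missing keys."""
    return {key: value for key, value in zip(keys, values) if value is not None}


# Literal result of r.mget("name", "age") shown above, plus a hypothetical
# key "city" that was never set, for which mget would return None.
record = pair_up(["name", "age", "city"], ["John", "30", None])
print(record)
```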

Redis `hashes` are record types structured as collections of field-value pairs. You can use hashes to represent basic objects.

```python
# Create a new hash with my name as the key
r.hset(
    'zane lim',
    mapping={
        "age": 21,
        "email": "zl@gmail.com",
        "hobby": "coding",
    },
)
```

Run the same command in the cell below, then read a single field back with `hget`:


In [None]:
r.hset(
    'zane lim',
    mapping={
        "age": 21,
        "email": "zl@gmail.com",
        "hobby": "coding",
    },
)

3

In [None]:
r.hget("zane lim", "email")

'zl@gmail.com'

Get the object back as a dictionary:

In [None]:
r.hgetall("zane lim")

{'age': '21', 'email': 'zl@gmail.com', 'hobby': 'coding'}
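
As the output shows, every field value comes back as a string, so `age` is `'21'` rather than `21`. A minimal sketch of restoring types after `hgetall`, using the output above as literal input. The `FIELD_TYPES` table and the `decode_hash` helper are hypothetical names for illustration.

```python
# Which Python type each field should be converted to (assumption for this sketch).
FIELD_TYPES = {"age": int}


def decode_hash(raw, field_types=FIELD_TYPES):
    """Convert selected string fields of an hgetall result to richer types."""
    return {
        field: field_types.get(field, str)(value)
        for field, value in raw.items()
    }


# Literal result of r.hgetall("zane lim") shown above.
raw = {"age": "21", "email": "zl@gmail.com", "hobby": "coding"}
profile = decode_hash(raw)
print(profile)
```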

> Create a new hash with your name as the key, and a mapping of `age`, `email`, `hobby`.

It is good practice to shut down your Redis cluster if you will not be using it again. Open your database and click `Delete`. See this [screenshot](../assets/redis_terminate_db.png) for a guide.

## Google Cloud Storage

Google Cloud Storage is an Object Storage service in Google Cloud.

### Bucket
- A bucket is a container for objects stored in Google Cloud Storage.
- Every object is contained in a bucket.
- Each bucket is associated with a project.
- A bucket has a unique name across all of Google Cloud Storage.

### Object
- An object is a piece of data, such as a file, that is stored in Google Cloud Storage.
- An object is also called a `blob` (binary large object) in Google Cloud Storage. 
- An object is composed of the object's data and its metadata. 
- Metadata is a collection of name-value pairs that describe the object. You can use metadata to search for objects.

We will be using the `google-cloud-storage` Python library to fetch data from the public [Landsat Collection 1](https://console.cloud.google.com/storage/browser/gcp-public-data-landsat;tab=objects?prefix=&forceOnObjectsSortingFiltering=false) dataset demonstrated earlier.

In [4]:
from google.cloud import storage

In [5]:
client = storage.Client()

In [6]:
bucket = client.get_bucket('gcp-public-data-landsat')

Note that you need to run `gcloud auth application-default login` in a terminal before the cell above will work.

If the error persists, you may also need to restart the kernel (in VSCode, click the `Restart` button).
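
For a public bucket, an alternative that avoids the login step is an anonymous client. The sketch below assumes the bucket allows unauthenticated reads (which may change over time) and wraps the calls in a hypothetical helper, `list_public_blobs`. `create_anonymous_client`, `bucket`, and `list_blobs` are methods of the `google-cloud-storage` library.

```python
def list_public_blobs(bucket_name, prefix=None, limit=10):
    """Return up to `limit` object names from a publicly readable bucket."""
    # Imported here so the helper can be defined before the package is installed.
    from google.cloud import storage

    client = storage.Client.create_anonymous_client()  # no credentials needed
    bucket = client.bucket(bucket_name)  # builds a reference without an API call
    blobs = bucket.list_blobs(prefix=prefix, max_results=limit)
    return [blob.name for blob in blobs]


# Usage (needs network access):
# list_public_blobs("gcp-public-data-landsat", prefix="LC08/01/001/002/", limit=5)
```

Unlike `get_bucket`, `client.bucket(...)` does not fetch bucket metadata, so attributes such as `location` are not populated on the returned object.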

Get bucket metadata:

In [7]:
print("Bucket name: {}".format(bucket.name))
print("Bucket location: {}".format(bucket.location))
print("Bucket storage class: {}".format(bucket.storage_class))

Bucket name: gcp-public-data-landsat
Bucket location: US
Bucket storage class: STANDARD


List blobs in a bucket:

In [8]:
blobs = bucket.list_blobs(max_results=50)  # stop after the first 50 objects

print("Blobs in {}:".format(bucket.name))
for item in blobs:
    print("\t" + item.name)

Blobs in gcp-public-data-landsat:
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_ANG.txt
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B10.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B11.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B2.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B3.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B4.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B5.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B6.TIF
	LC08/01/

Get a blob and display metadata:

In [9]:
blob = bucket.get_blob("LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF")

print("ID: {}".format(blob.id))
print("Size: {} bytes".format(blob.size))
print("Content type: {}".format(blob.content_type))
print("Public URL: {}".format(blob.public_url))

ID: gcp-public-data-landsat/LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF/1502391058568908
Size: 75085385 bytes
Content type: application/octet-stream
Public URL: https://storage.googleapis.com/gcp-public-data-landsat/LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF
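
The public URL follows a simple pattern: the host `https://storage.googleapis.com`, then the bucket name, then the object name. A minimal sketch that rebuilds it by hand is shown below. The helper `public_url` is hypothetical, and in practice the library's `blob.public_url` property shown above is the one to rely on.

```python
from urllib.parse import quote


def public_url(bucket_name, blob_name):
    """Build the public URL for an object, percent-encoding the object name."""
    # Keep "/" unescaped so the URL mirrors the object's path-like name.
    return "https://storage.googleapis.com/{}/{}".format(
        bucket_name, quote(blob_name, safe="/")
    )


name = (
    "LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/"
    "LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF"
)
print(public_url("gcp-public-data-landsat", name))
```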


Download a blob to a local directory:

In [10]:
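import os

output_file_name = "../output/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF"
os.makedirs(os.path.dirname(output_file_name), exist_ok=True)  # create ../output if it is missing
blob.download_to_filename(output_file_name)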

print("Downloaded blob {} to {}.".format(blob.name, output_file_name))

Downloaded blob LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF to ../output/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF.
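
A quick sanity check after a download is to compare the file size on disk with the `size` reported in the blob metadata. A minimal sketch, where `download_is_complete` is a hypothetical helper name:

```python
import os


def download_is_complete(path, expected_size):
    """Return True if the file at `path` exists and has `expected_size` bytes."""
    return os.path.isfile(path) and os.path.getsize(path) == expected_size


# Usage with the objects from the cells above:
# download_is_complete(output_file_name, blob.size)
```

A matching size does not prove the contents are identical, but it catches truncated downloads cheaply.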
