# Introduction to NoSQL and Object Storage

This lesson walks through create and read operations in Redis. We will also fetch data from Google Cloud Storage.

"Data structure store"
Redis doesn't just store plain text or rows of data.
It stores data structures, like these:

| Data Type      | Description                         | Example Use                                      |
| -------------- | ----------------------------------- | ------------------------------------------------ |
| **String**     | Simple value                        | `user:1 -> "Elton"`                              |
| **List**       | Ordered sequence                    | `recent_searches -> ["python", "pandas", "sql"]` |
| **Set**        | Unique unordered items              | `tags -> {"data", "ai", "ml"}`                   |
| **Hash**       | Key-value pairs (like a small JSON) | `user:1 -> {name:"Elton", age:30}`               |
| **Sorted Set** | Ordered by score                    | `leaderboard -> {"Alice":100, "Bob":90}`         |

So Redis is not just a key-value store; it's a data structure store. You can manipulate lists, sets, and more directly in memory.
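
As a rough mental model, each Redis type in the table maps onto a familiar Python structure. The sketch below uses plain Python objects only (it does not connect to Redis), the variable names are illustrative, and sorting on demand merely stands in for the ordering a sorted set maintains by score.

```python
# Plain-Python analogues of the Redis types in the table above.
# Nothing here talks to Redis; it only illustrates the shape of each type.
string_value = "Elton"                         # String
recent_searches = ["python", "pandas", "sql"]  # List: ordered, duplicates allowed
tags = {"data", "ai", "ml"}                    # Set: unique, unordered
user = {"name": "Elton", "age": 30}            # Hash: field -> value
leaderboard = {"Alice": 100, "Bob": 90}        # Sorted set: member -> score

# A sorted set keeps members ordered by score; here we sort on demand instead,
# highest score first, the way a leaderboard would be displayed.
ranking = sorted(leaderboard, key=leaderboard.get, reverse=True)
print(ranking)
```

The difference in Redis is that these structures live on the server, so several application processes can read and modify the same list or set.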


Redis vs MongoDB

| Feature              | **Redis**                                                | **MongoDB**                                               |
| -------------------- | -------------------------------------------------------- | --------------------------------------------------------- |
| **Name origin**      | **RE**mote **DI**ctionary **S**erver                     | From "humongous" (as in "huge")                           |
| **Primary Type**     | **In-memory data structure store** (key-value)           | **Document-oriented NoSQL database**                      |
| **Data Storage**     | In **RAM** (very fast; lost on restart unless persisted) | On **Disk** (persistent storage)                          |
| **Data Format**      | Key → Value (value can be string, list, set, hash, etc.) | JSON-like documents (BSON format)                         |
| **Speed**            | ⚡ Extremely fast (microseconds)                          | ⚡ Fast, but slower than Redis                             |
| **Persistence**      | Optional (snapshots or an append-only file on disk)      | Persistent by default                                     |
| **Best Use Cases**   | Caching, session storage, message queues, leaderboards   | Storing application data, user profiles, product catalogs |
| **Query Language**   | Redis commands (e.g., `SET`, `GET`, `HGETALL`)           | MongoDB Query Language (similar to JSON)                  |
| **Scaling**          | Easy to scale horizontally (clustered caching)           | Scales horizontally with sharding and replication         |
| **Schema**           | No schema (just key-value)                               | Schema-less but structured JSON documents                 |
| **Durability**       | Optional (data can be lost on restart if not persisted)  | Durable: data is safely stored on disk                    |


🧠 Simplified Explanation

🟥 Redis

Think of Redis as a super-fast temporary memory for your application.
It's like your brain's short-term memory: fast, but not meant to last forever.
Use it when you need speed and can afford to recreate the data if it is lost.

🧩 Examples:

* Caching results from a slow API or database
* Storing user login sessions
* Managing real-time counters or leaderboards
* Message broker or queue system (Pub/Sub)

🟩 MongoDB

Think of MongoDB as a permanent, flexible NoSQL database.
It's like your long-term memory: it stores complex, structured information safely.
Great for storing large collections of data with flexible schemas.

🧩 Examples:

* Storing user data (`{ name: "Elton", age: 25, country: "SG" }`)
* Product catalogs in e-commerce sites
* Blog posts, comments, and reviews
* Any structured but schema-flexible data

⚙️ Analogy

| Analogy       | Redis                                  | MongoDB                           |
| ------------- | -------------------------------------- | --------------------------------- |
| Like...       | A **whiteboard** (fast to write/erase) | A **notebook** (data stays there) |
| Data lifetime | Short-term / temporary                 | Long-term / permanent             |
| Access speed  | Lightning-fast (RAM)                   | Fast (Disk + indexes)             |


💬 Example Scenario
Let's say you're building a shopping website:

| Task                                  | Best for    | Reason                                  |
| ------------------------------------- | ----------- | --------------------------------------- |
| Store user accounts, products, orders | **MongoDB** | Needs structured, permanent data        |
| Store user session after login        | **Redis**   | Needs fast, temporary access            |
| Cache most-viewed products            | **Redis**   | Avoid reloading from MongoDB repeatedly |
| Store product catalog                 | **MongoDB** | Needs long-term persistence             |


🧩 Used together
Redis and MongoDB are often used *together*, not as competitors; they complement each other.
They play **different roles** in a modern backend or database cluster:

| Role                                  | Tool           | Why                                                      |
| ------------------------------------- | -------------- | -------------------------------------------------------- |
| **Main database (long-term storage)** | 🟩 **MongoDB** | Stores all the core, persistent data                     |
| **Cache / high-speed memory layer**   | 🟥 **Redis**   | Keeps *copies* of frequently used data for faster access |


---

## ⚙️ How they work *hand in hand*

Imagine your web app (or API) like this:

```
User → Web App → [Redis Cache] → [MongoDB Database]
```

Let's see what happens when a user requests data:

---

### 🧠 Step-by-step example (typical flow)

1. **User requests product data**
   e.g. `/product/123`

2. **App checks Redis first (cache)**

   * Does Redis have product `123` already stored in memory?
   * ✅ If **yes** → return it instantly (RAM = microseconds)
   * ❌ If **no** → go to MongoDB

3. **App queries MongoDB (main database)**

   * MongoDB fetches product `123` from disk
   * App sends it back to the user

4. **App stores that result in Redis**

   * Next time, Redis can serve it directly
   * MongoDB doesn't need to be hit again for that same query

---

### ⚡ This pattern is called:

> **"Cache-aside pattern"** or **"Lazy loading cache."**

It's one of the **most common and powerful architectures** in modern systems.
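
The four steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a production implementation: `get_product`, `load_from_db`, the `product:<id>` key format and the 300-second TTL are all hypothetical names and values, and `FakeCache` is an in-memory stand-in so the example runs without a server. Its `get` and `set(..., ex=...)` calls mirror the ones a `redis-py` client exposes, so a real client could be passed in its place.

```python
import json


class FakeCache:
    """In-memory stand-in exposing the get/set calls used below."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None on a cache miss, like Redis GET

    def set(self, key, value, ex=None):
        self._data[key] = value  # 'ex' (TTL in seconds) is accepted but ignored here
        return True


def get_product(product_id, cache, load_from_db, ttl_seconds=300):
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    key = f"product:{product_id}"
    cached = cache.get(key)
    if cached is not None:              # step 2: cache hit
        return json.loads(cached)
    product = load_from_db(product_id)  # step 3: cache miss, query the database
    if product is not None:
        cache.set(key, json.dumps(product), ex=ttl_seconds)  # step 4: fill the cache
    return product


# Demo: a dict stands in for MongoDB, and a list records every database hit.
db = {123: {"name": "Keyboard", "price": 49.9}}
db_hits = []


def load_from_db(product_id):
    db_hits.append(product_id)
    return db.get(product_id)


cache = FakeCache()
first = get_product(123, cache, load_from_db)   # miss: reads the database
second = get_product(123, cache, load_from_db)  # hit: served from the cache
print(first == second, len(db_hits))
```

The second call never reaches `load_from_db`, which is the whole point of the pattern. Serialising to JSON is one design choice among several; a Redis hash would work too.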

---

## 🏗️ Example Real-world Architecture

```
          ┌──────────────────┐
          │   User / Client  │
          └───────┬──────────┘
                  │
                  ▼
           ┌──────────────┐
           │   Web App /  │
           │   API Layer  │
           └──────┬───────┘
                  │
   ┌──────────────┼────────────────┐
   │              │                │
   ▼              ▼                ▼
Redis Cache   MongoDB Database   Other APIs
 (Fast)         (Persistent)      (Optional)
```

* Redis = **speed** layer
* MongoDB = **storage** layer
* Together → you get both **fast performance** and **safe persistence**

---

## 💬 Real-world use cases

| Scenario                       | Redis | MongoDB                          |
| ------------------------------ | ----- | -------------------------------- |
| Caching product info           | ✅     | ✅ (source of truth)              |
| User session or login token    | ✅     | ❌ (no need to persist long-term) |
| Shopping cart data (temporary) | ✅     | ❌                                |
| Order history                  | ❌     | ✅                                |
| User profile                   | ❌     | ✅                                |

---

## 🧠 Summary

| Feature       | **Redis**                   | **MongoDB**               |
| ------------- | --------------------------- | ------------------------- |
| Role          | Caching / Fast Access Layer | Persistent Storage Layer  |
| Speed         | Ultra-fast (RAM)            | Fast (Disk + Indexes)     |
| Data Lifetime | Temporary                   | Permanent                 |
| Typical Usage | Cache, sessions, queues     | Main application database |
| Together?     | ✅ Yes, often used together  | ✅ Yes, stores main data   |

---

✅ **Bottom line:**

> Use MongoDB for your *main data store*,
> and Redis for *speed and scalability* as a *cache* or *temporary memory layer* in front of it.

---


How MongoDB and Redis Cloud (the managed service from Redis Ltd., the company formerly called Redis Labs) are different but can work together.


---

## 🧩 Step 1: Two different companies, two different technologies

| Service / Product           | Who makes it                                | What it does                                                                   |
| --------------------------- | ------------------------------------------- | ------------------------------------------------------------------------------ |
| **MongoDB / MongoDB Atlas** | Company: *MongoDB Inc.*                     | A **database** for storing data permanently on disk (like JSON documents).     |
| **Redis / Redis Cloud**     | Company: *Redis Ltd.* (formerly Redis Labs) | A **cache / in-memory store** that keeps data in RAM for **very fast access**. |

* ✅ They are **separate technologies**
* ✅ Made by **different companies**
* ✅ Used for **different purposes**

---

## 🧠 Step 2: What each one is best at

| Tool        | Type                             | What it's good at                                       |
| ----------- | -------------------------------- | ------------------------------------------------------- |
| **MongoDB** | Document-oriented NoSQL database | Long-term storage, flexible structure (like JSON).      |
| **Redis**   | In-memory key-value store        | Super-fast temporary access, caching, queues, sessions. |

---

## ⚙️ Step 3: How they work *hand in hand* in real apps

They **don't connect to each other directly**.
Instead, **your application talks to both**, each for a different reason:

```
       ┌──────────────┐
       │   Your App   │
       └─────┬────────┘
             │
   ┌─────────┼─────────┐
   ▼                   ▼
Redis (Cache)       MongoDB (Database)
Fast, temporary      Permanent storage
```

### 🔹 Typical workflow

1. Your app first checks **Redis**
   👉 "Do I already have this data in memory (cache)?"

2. If not found, your app queries **MongoDB**
   👉 "Get me the full record from the main database."

3. Your app then stores that MongoDB result **back in Redis**
   👉 So next time, it's instant.

This is called the **cache-aside pattern**.

---

## 💬 Example in real life

Imagine you're running an **e-commerce site**:

| Task                                                    | Where data lives  | Why                                  |
| ------------------------------------------------------- | ----------------- | ------------------------------------ |
| All product info, users, orders                         | **MongoDB Atlas** | Must be permanent and searchable     |
| Recently viewed products, shopping cart, login sessions | **Redis Cloud**   | Must be super-fast but not permanent |

So Redis **boosts performance** and **reduces load** on MongoDB.
MongoDB remains the **source of truth** (main data).
Redis is the **speed booster** sitting in front of it.

---

## ☁️ Step 4: Local vs Cloud

You can:

* **Run MongoDB locally** or use **MongoDB Atlas** (the company's managed cloud service)
* **Run Redis locally** or use **Redis Cloud** (Redis Ltd.'s managed service)

In cloud environments, both are often hosted on separate servers or services:

```
Your App (e.g. on AWS)
 ├── connects to Redis Cloud (for cache)
 └── connects to MongoDB Atlas (for database)
```

---

## ⚡ Quick recap

| Feature         | MongoDB                    | Redis                       |
| --------------- | -------------------------- | --------------------------- |
| Company         | MongoDB Inc.               | Redis Ltd.                  |
| Service name    | MongoDB Atlas (cloud)      | Redis Cloud (cloud)         |
| Main purpose    | Long-term document storage | Fast in-memory cache        |
| Data stored     | On disk (BSON/JSON docs)   | In RAM (key-value)          |
| Works together? | ✅ Yes, app uses both       | ✅ Yes, as cache             |

---

### 🧠 In one sentence

> MongoDB stores your **main, permanent data**.
> Redis stores **temporary, high-speed copies** of frequently used data.
> Your **application** is the bridge that uses both together.



## Redis

Redis is an in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries.

We will be connecting to a Redis database hosted on Redis Cloud (from the company formerly called Redis Labs). Redis Cloud is a managed service that lets you host Redis databases in the cloud.

Prerequisite: set up an account on Redis [here](https://redis.io/) and create a (free tier) cluster.

If you need some guides, please refer to the screenshots below:

[Step 1](../assets/redis_create_db_step1.png)  (create database)

[Step 2](../assets/redis_create_db_step2.png)  (choose **free** cluster, leave all other settings as **default** including `Name`, `Cloud vendor`, `Region`. Click the `Create database` button below.)

[Step 3](../assets/redis_create_db_step3.png)  (click 'connect' to get connect instructions)

[Step 4](../assets/redis_create_db_step4.png)  (choose 'Redis Client' - 'Python')

[Step 5](../assets/redis_create_db_step5.png) (copy and paste the Python code into the cell below. Note: use the `Copy` button at the bottom right instead of selecting the text manually; if you copy manually, your auto-generated password will not be copied over!)

We will be using the `redis-py` library to connect to the Redis database.

In [1]:
# Paste your code from Step 5 above below this line
# -------------------------------------------------




bar


In [None]:
# Either use the code provided in Step 5 above or the code below to connect to your Redis database.
# Replace <REDIS-URL> and <YOUR-PASSWORD> with your actual Redis database URL and password.
# If you are using the code from Step 5, you can skip this cell.
# import redis
#
# r = redis.Redis(
#     host='<REDIS-URL>',  # e.g. 'redis-10908.c252.ap-southeast-1-1.ec2.cloud.redislabs.com'
#     port=10908,          # use the port shown for your own database
#     password='<YOUR-PASSWORD>',
#     decode_responses=True,  # return str instead of bytes, matching the outputs below
# )
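
If you would rather not paste a password into the notebook, one option is to read the connection settings from environment variables. The sketch below only builds the keyword arguments. The function name and the `REDIS_HOST`, `REDIS_PORT` and `REDIS_PASSWORD` variable names are assumptions made for illustration, not something Redis Cloud sets for you.

```python
import os


def redis_kwargs_from_env(environ=os.environ):
    """Build keyword arguments for redis.Redis(...) from environment variables."""
    return {
        "host": environ["REDIS_HOST"],                   # required
        "port": int(environ.get("REDIS_PORT", "6379")),  # 6379 is Redis's default port
        "password": environ.get("REDIS_PASSWORD"),       # None if not set
        "decode_responses": True,                        # return str instead of bytes
    }


# Example with an explicit mapping so the sketch runs without real credentials.
example_env = {
    "REDIS_HOST": "localhost",
    "REDIS_PORT": "10908",
    "REDIS_PASSWORD": "secret",
}
kwargs = redis_kwargs_from_env(example_env)
print(kwargs["host"], kwargs["port"])
# With redis-py installed, the client would then be: r = redis.Redis(**kwargs)
```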

A Redis database holds key:value pairs and supports commands such as GET, SET, and DEL, as well as several hundred additional commands.

- Redis keys are always strings.
- Redis values may be one of several data types. The most essential are strings, lists, hashes, and sets. More advanced types include geospatial items and streams.

Many Redis commands operate in constant O(1) time, just like retrieving a value from a Python dict or any hash table.

Let's create a new key called `'name'` with the value `'Aaron'`.

In [5]:
r.set('name', 'Aaron')

True

Read the value of the key `'name'`:

In [6]:
r.get('name')

'Aaron'

We can update the value with `.set` too:

In [7]:
r.set('name', 'Bob')

True

In [8]:
r.get('name')

'Bob'

> Set a key `age` with value of `20`.
>
> Then read the value.

In [9]:
r.set('age', 20)

True

In [10]:
r.get('age')

'20'
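
Redis stores the value as a string, which is why the number `20` comes back as `'20'`. The plain-Python sketch below shows the conversion back to a number. Here `raw` stands for the bytes a client receives when `decode_responses` is not enabled; that setting is an assumption about your connection.

```python
# What a client receives for GET age when decode_responses is off: raw bytes.
raw = b"20"

text = raw.decode("utf-8")  # the step decode_responses=True performs for you
age = int(text)             # convert back to a number in application code

print(repr(text), age + 1)
```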

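To push values onto a list, use `rpush`. It returns the length of the list after the push, so the `9` below means the list already held six elements from earlier runs of this cell; on a fresh key it would return `3`: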

In [13]:
r.rpush("names", "Aaron", "Bob", "Charlie")

9

Summary

| Method   | Behavior                                 | Example                                                             |
| -------- | ---------------------------------------- | ------------------------------------------------------------------- |
| `SET`    | Overwrites                               | `r.set("name", "Alice")` replaces old value                         |
| `APPEND` | Adds to end of string                    | `r.append("name", ", Alice")` turns `"Elton"` into `"Elton, Alice"` |
| `RPUSH`  | Adds to a list, keeping previous entries | `r.rpush("names", "Alice")` keeps previous values                   |

Note: `rpush` allows duplicate entries.


1️⃣ Redis Lists
Redis lists are ordered sequences of elements (like Python lists).
You can add elements to the left (front) or right (end) of the list.
That's where `lpush` and `rpush` come in.

2️⃣ `rpush` vs `lpush`

| Command     | Meaning                                  | Example                                                       | Resulting list        |
| ----------- | ---------------------------------------- | ------------------------------------------------------------- | --------------------- |
| **`rpush`** | Push to the **right / end** of the list  | `r.rpush("fruits", "apple")`<br>`r.rpush("fruits", "banana")` | `["apple", "banana"]` |
| **`lpush`** | Push to the **left / front** of the list | `r.lpush("fruits", "apple")`<br>`r.lpush("fruits", "banana")` | `["banana", "apple"]` |


Read the element at index 1 (indexes are zero-based) with `lindex`:

In [14]:
r.lindex("names", 1)

'Bob'

You can use `mset` to set multiple keys at once.

In [15]:
r.mset({
    "name": "John",
    "age": 30,
})

True

In [16]:
r.mget("name", "age")

['John', '30']
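
`mget` returns values in the same order as the keys you pass, so pairing them back up is a one-liner. The helper below is a hypothetical convenience, not part of `redis-py`, and `FakeClient` is an in-memory stand-in so the sketch runs without a server. It assumes, as Redis does, that a key that does not exist comes back as `None`.

```python
def fetch_many(client, *keys):
    """Return {key: value} for the given keys; missing keys map to None."""
    values = client.mget(*keys)  # values come back in the same order as keys
    return dict(zip(keys, values))


class FakeClient:
    """Stand-in whose mget mimics Redis: None for keys that do not exist."""

    def __init__(self, data):
        self._data = data

    def mget(self, *keys):
        return [self._data.get(key) for key in keys]


client = FakeClient({"name": "John", "age": "30"})
print(fetch_many(client, "name", "age", "email"))
```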

1️⃣ What is a Redis hash?
A hash in Redis is like a dictionary or map inside a single key.
Think of it as a key that holds multiple field-value pairs.
Perfect for storing objects or records instead of just single values.

🔹 Analogy

| Concept          | Python equivalent                                 | Redis example                                  |
| ---------------- | ------------------------------------------------- | ---------------------------------------------- |
| Single key-value | `"name": "Elton"`                                 | `SET name "Elton"`                             |
| Hash             | `{ "name": "Elton", "age": 30, "country": "SG" }` | `HSET user:1 name "Elton" age 30 country "SG"` |

So instead of having 3 separate keys (`name`, `age`, `country`), you can store all fields in one key (`user:1`).

Redis `hashes` are record types structured as collections of field-value pairs. You can use hashes to represent basic objects.

```python
# Create a new hash with my name as the key
r.hset(
    'zane lim',
    mapping={
        "age": 21,
        "email": "zl@gmail.com",
        "hobby": "coding",
    },
)
```

Run it, then read a single field of the hash back with `hget`:


In [20]:
r.hset(
    'zane lim',
    mapping={
        "age": 21,
        "email": "zl@gmail.com",
        "hobby": "coding",
    },
)

3

In [21]:
r.hget("zane lim", "email")

'zl@gmail.com'

Get the object back as a dictionary:

In [None]:
r.hgetall("zane lim")

{'age': '21', 'email': 'zl@gmail.com', 'hobby': 'coding'}
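
Note that every value in the returned dictionary is a string, including `age`. If the application needs typed values, it has to convert them itself. A minimal sketch follows; `parse_user` is a hypothetical helper, and `raw` simply copies the shape of the output above rather than calling Redis.

```python
def parse_user(fields):
    """Convert the all-string dict returned by HGETALL into typed Python values."""
    return {
        "age": int(fields["age"]),  # stored as '21', wanted as 21
        "email": fields["email"],
        "hobby": fields["hobby"],
    }


raw = {"age": "21", "email": "zl@gmail.com", "hobby": "coding"}  # shape of r.hgetall("zane lim")
user = parse_user(raw)
print(user["age"] + 1)
```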

> Create a new hash with your name as the key, and a mapping of `age`, `email`, `hobby`.

It is good practice to delete your Redis database if you will not use it again. Click into your database and hit `Delete`. See this [screenshot](../assets/redis_terminate_db.png) for a guide.

## Google Cloud Storage

Google Cloud Storage is an Object Storage service in Google Cloud.

### Bucket
- A bucket is a container for objects stored in Google Cloud Storage.
- Every object is contained in a bucket.
- Each bucket is associated with a project.
- A bucket has a unique name across all of Google Cloud Storage.

### Object
- An object is a piece of data, such as a file, that is stored in Google Cloud Storage.
- An object is also called a `blob` (binary large object) in Google Cloud Storage. 
- An object is composed of the object's data and its metadata. 
- Metadata is a collection of name-value pairs that describe the object. You can use metadata to search for objects.

We will be using the `google-cloud-storage` Python library to fetch data from the public [Landsat Collection 1](https://console.cloud.google.com/storage/browser/gcp-public-data-landsat;tab=objects?prefix=&forceOnObjectsSortingFiltering=false) dataset demonstrated just now.

In [4]:
from google.cloud import storage

In [5]:
client = storage.Client()

In [6]:
bucket = client.get_bucket('gcp-public-data-landsat')

Note that you need to run `gcloud auth application-default login` in a terminal before running the cell above.

If the error persists, you may also need to restart the kernel (in VSCode, click the `Restart` button).

Get bucket metadata:

In [7]:
print("Bucket name: {}".format(bucket.name))
print("Bucket location: {}".format(bucket.location))
print("Bucket storage class: {}".format(bucket.storage_class))

Bucket name: gcp-public-data-landsat
Bucket location: US
Bucket storage class: STANDARD


List blobs in a bucket:

In [8]:
blobs = bucket.list_blobs()

print("Blobs in {}:".format(bucket.name))
for ix, item in enumerate(blobs):
    print("\t" + item.name)
    if ix == 50:
        break

Blobs in gcp-public-data-landsat:
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_ANG.txt
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B10.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B11.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B2.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B3.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B4.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B5.TIF
	LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B6.TIF
	LC08/01/
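
The bucket is very large, so it is often more practical to list only the objects under a given name prefix and to cap the number of results. The sketch below wraps `list_blobs` with its `prefix` and `max_results` arguments. The function name `first_blob_names` is hypothetical, and `bucket` is assumed to be the object returned by `client.get_bucket(...)` earlier.

```python
def first_blob_names(bucket, prefix, limit=10):
    """Return up to `limit` object names whose names start with `prefix`."""
    # Passing max_results avoids the manual enumerate/break used above.
    blobs = bucket.list_blobs(prefix=prefix, max_results=limit)
    return [blob.name for blob in blobs]


# Usage, with the `bucket` created earlier in this notebook:
# first_blob_names(bucket, "LC08/01/001/002/", limit=5)
```

Object names are flat strings; the slashes only look like folders, which is why filtering by prefix behaves like browsing a directory.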

Get a blob and display metadata:

In [9]:
blob = bucket.get_blob("LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF")

print("ID: {}".format(blob.id))
print("Size: {} bytes".format(blob.size))
print("Content type: {}".format(blob.content_type))
print("Public URL: {}".format(blob.public_url))

ID: gcp-public-data-landsat/LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF/1502391058568908
Size: 75085385 bytes
Content type: application/octet-stream
Public URL: https://storage.googleapis.com/gcp-public-data-landsat/LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF


Download a blob to a local directory:

In [10]:
output_file_name = "../output/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF"
blob.download_to_filename(output_file_name)

print("Downloaded blob {} to {}.".format(blob.name, output_file_name))

Downloaded blob LC08/01/001/002/LC08_L1GT_001002_20160817_20170322_01_T2/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF to ../output/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF.
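
The download cell above assumes the `../output/` directory already exists. A small wrapper can create it first; this is a sketch in which `download_blob` is a hypothetical helper and `blob` is assumed to be the object returned by `bucket.get_blob(...)`.

```python
from pathlib import Path


def download_blob(blob, destination):
    """Download `blob` to `destination`, creating parent directories first."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    blob.download_to_filename(str(destination))
    return destination


# Usage, with the `blob` fetched earlier in this notebook:
# download_blob(blob, "../output/LC08_L1GT_001002_20160817_20170322_01_T2_B1.TIF")
```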
