## Cassandra:

Data Model Creation:
* Design a data model for an e-commerce platform to handle products, orders, and user information.
* Define appropriate tables (called column families in older Cassandra terminology) and primary keys to ensure efficient querying.

In [9]:
from cassandra.cluster import Cluster
import uuid
import random
from decimal import Decimal
from datetime import datetime, timedelta

In [2]:
uuid.uuid4()

UUID('07602618-ef04-4705-b968-4b1a552542ba')

In [3]:
cluster = Cluster(["127.0.0.1"],port=9042)
session = cluster.connect()

In [4]:
session.execute("""CREATE KEYSPACE IF NOT EXISTS e_commerce
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}""")

<cassandra.cluster.ResultSet at 0x182480b0490>

In [85]:
session.set_keyspace('e_commerce')

In [86]:
session.execute("""CREATE TABLE IF NOT EXISTS users (
        user_id UUID PRIMARY KEY,
        username TEXT,
        email TEXT)""")

session.execute("""CREATE TABLE IF NOT EXISTS products (
        product_id UUID PRIMARY KEY,
        product_name TEXT,
        price DECIMAL)""")

session.execute("""CREATE TABLE IF NOT EXISTS user_orders (
        user_id UUID,
        order_id UUID,
        order_date TIMESTAMP,
        product_name TEXT,
        quantity INT,
        total_amount DECIMAL,
        PRIMARY KEY (user_id, order_id))""")

<cassandra.cluster.ResultSet at 0x24a4a6542d0>
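
The `user_orders` table above clusters rows by `order_id`, a random UUID, so a user's orders are stored in UUID order rather than chronologically. A minimal sketch of one alternative follows, assuming a hypothetical table name `user_orders_by_date` and a hypothetical helper `create_user_orders_by_date` (neither is part of the original notebook). It makes `order_date` the first clustering column so each user's partition is sorted newest first, and keeps `order_id` as a tiebreaker for orders placed at the same instant.

```python
# Hypothetical alternative to user_orders: same columns, but rows inside each
# user's partition are sorted by order_date (newest first).
CREATE_USER_ORDERS_BY_DATE = """
CREATE TABLE IF NOT EXISTS user_orders_by_date (
    user_id UUID,
    order_date TIMESTAMP,
    order_id UUID,
    product_name TEXT,
    quantity INT,
    total_amount DECIMAL,
    PRIMARY KEY ((user_id), order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC, order_id ASC)
"""


def create_user_orders_by_date(session):
    """Create the sketch table using an already connected driver session."""
    return session.execute(CREATE_USER_ORDERS_BY_DATE)
```

With the notebook's session this would be called as `create_user_orders_by_date(session)` after `session.set_keyspace('e_commerce')`.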

Data Insertion and Retrieval:
* Insert sample data into the Cassandra database, including user information and product details.
* Retrieve a user's order history using CQL (Cassandra Query Language).

In [87]:
users_data = [
    (uuid.uuid4(), 'user1', 'user1@example.com'),
    (uuid.uuid4(), 'user2', 'user2@example.com'),
    (uuid.uuid4(), 'user3', 'user3@example.com')]

for user_data in users_data:
    session.execute(
        """INSERT INTO users (user_id, username, email)
        VALUES (%s, %s, %s)""",
        user_data)

products_data = [
    (uuid.uuid4(), 'Product A', Decimal('2')),
    (uuid.uuid4(), 'Product B', Decimal('2.5')),
    (uuid.uuid4(), 'Product C', Decimal('2.99'))]

for product_data in products_data:
    session.execute(
        """INSERT INTO products (product_id, product_name, price)
        VALUES (%s, %s, %s)""",
        product_data)

user_orders = [
    (uuid.uuid4(), users_data[0][0], datetime.now(), 'Product A', 1, Decimal('2')),
    (uuid.uuid4(), users_data[0][0], datetime.now(), 'Product C', 2, Decimal('5.98'))]

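# Each tuple in user_orders follows the column order of the INSERT below:
# (order_id, user_id, order_date, product_name, quantity, total_amount).
for order_data in user_orders:
    session.execute(
        """INSERT INTO user_orders (order_id, user_id, order_date, product_name, quantity, total_amount)
        VALUES (%s, %s, %s, %s, %s, %s)""",
        order_data)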

In [88]:
user_id_to_retrieve = users_data[0][0]

result = session.execute(
    """SELECT * FROM user_orders WHERE user_id = %s""",
    (user_id_to_retrieve,))

for row in result:
    print(row.order_id, row.order_date, row.product_name, row.total_amount)

65277254-7289-41e4-bab0-33040908bf06 2023-08-11 20:22:57.712000 Product A 2
a96f856e-a98f-4e86-8245-5324f930c09f 2023-08-11 20:22:57.712000 Product C 5.98
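
When the same order-history query runs many times, a prepared statement lets the driver parse it once and reuse it. The sketch below assumes the `user_orders` table defined earlier; the helper names `prepare_order_history` and `fetch_order_history` are hypothetical and not part of the original notebook. Note that prepared statements use `?` placeholders rather than the `%s` used with simple statements.

```python
ORDER_HISTORY_CQL = (
    "SELECT order_id, order_date, product_name, total_amount "
    "FROM user_orders WHERE user_id = ?"
)


def prepare_order_history(session):
    """Prepare the order-history query once so later calls can reuse it."""
    return session.prepare(ORDER_HISTORY_CQL)


def fetch_order_history(session, prepared, user_id):
    """Return one user's orders as plain tuples."""
    rows = session.execute(prepared, (user_id,))
    return [
        (row.order_id, row.order_date, row.product_name, row.total_amount)
        for row in rows
    ]
```

In the notebook this would be used as `prepared = prepare_order_history(session)` followed by `fetch_order_history(session, prepared, user_id_to_retrieve)`.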


Time-Series Data:
* Design a schema to handle time-series data, such as IoT sensor readings.
* Insert and retrieve time-series data efficiently, using appropriate time-based partitioning.
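
One common way to get time-based partitioning is to bucket readings by sensor and calendar day, so a range query touches only a few known partitions and does not need `ALLOW FILTERING`. The following is a sketch under stated assumptions, not the schema used in the cells below: the table name `sensor_readings_by_day`, the column names `day` and `reading_time`, and the helpers `day_buckets` and `read_range` are all hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical bucketed schema: one partition per (sensor, day), with readings
# sorted by time inside the partition.
CREATE_READINGS_BY_DAY = """
CREATE TABLE IF NOT EXISTS sensor_readings_by_day (
    sensor_id UUID,
    day DATE,
    reading_time TIMESTAMP,
    value DOUBLE,
    PRIMARY KEY ((sensor_id, day), reading_time)
) WITH CLUSTERING ORDER BY (reading_time DESC)
"""

RANGE_QUERY = (
    "SELECT reading_time, value FROM sensor_readings_by_day "
    "WHERE sensor_id = %s AND day = %s "
    "AND reading_time >= %s AND reading_time <= %s"
)


def day_buckets(start, end):
    """Return every calendar day touched by the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    span = (end.date() - start.date()).days
    return [start.date() + timedelta(days=offset) for offset in range(span + 1)]


def read_range(session, sensor_id, start, end):
    """Query one partition per day bucket and concatenate the results."""
    rows = []
    for day in day_buckets(start, end):
        rows.extend(session.execute(RANGE_QUERY, (sensor_id, day, start, end)))
    return rows
```

The trade-off is that the reader must know the `sensor_id` and issue one query per day in the range; in exchange, each query hits a single partition and partitions stay bounded in size.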

In [5]:
session.execute("""CREATE KEYSPACE IF NOT EXISTS time_series_data
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}""")

<cassandra.cluster.ResultSet at 0x18247f1ea90>

In [6]:
session.set_keyspace('time_series_data')

In [7]:
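# The raw timestamp is the partition key here, so every distinct timestamp
# becomes its own partition and sensor_id orders rows within it.
session.execute("""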
    CREATE TABLE IF NOT EXISTS sensor_readings (
        timestamp TIMESTAMP,
        sensor_id UUID,
        value DOUBLE,
        PRIMARY KEY ((timestamp), sensor_id)
    ) WITH CLUSTERING ORDER BY (sensor_id ASC)
""")

<cassandra.cluster.ResultSet at 0x182482732d0>

In [15]:
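for _ in range(100):
    # Backdate each reading by one day and give it a fresh random sensor ID.
    timestamp = datetime.now() - timedelta(days=1)
    sensor_id = uuid.uuid4()
    # Simulated reading in the range [18.0, 19.0].
    value = round(18 + random.random(), 2)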

    session.execute(
        """ INSERT INTO sensor_readings (timestamp, sensor_id, value)
        VALUES (%s, %s, %s)""",
        (timestamp, sensor_id, value)
    )

In [16]:
from datetime import datetime, timedelta

start_time = datetime.now() - timedelta(days=10)
end_time = datetime.now()

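# A range restriction on the partition key needs ALLOW FILTERING, which makes
# Cassandra scan every partition; acceptable for a small demo table only.
result = session.execute("""SELECT * FROM sensor_readings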
    WHERE timestamp >= %s AND timestamp <= %s
    ALLOW FILTERING""",
    (start_time, end_time)
)

for row in result:
    print(row.timestamp, row.sensor_id, row.value)

2023-08-10 21:10:05.829000 6dd6dee9-2237-4b62-9bf9-89dd24053adf 18.02
2023-08-10 21:10:03.375000 1ad48620-2337-4dc5-b48d-b61ae9c3b17a 18.29
2023-08-10 21:10:01.137000 f5fb1d45-9b5e-47fe-a0e0-853d3a1fd02c 18.79
2023-08-10 21:10:05.031000 1ec76182-5f2a-460b-96b1-1e1ebebcd1ad 18.5
2023-08-10 21:09:58.468000 cb1e8fab-a1de-4adb-8f9a-f7c065ac74f7 18.19
2023-08-10 21:10:10.934000 1a1f34be-f9d7-4990-9a7c-95f9b02900ee 18.4
2023-08-10 21:10:00.340000 1901cbd1-ad9d-475a-bcad-0efdbbf87dbc 18.43
2023-08-10 21:10:02.639000 310ed6e5-a3f4-43f1-89c2-6b58948d973a 18.55
2023-08-10 21:10:07.987000 5176032d-678e-40da-ba06-27bdcc4e522f 18.87
2023-08-10 21:10:15.080000 f3644c85-7317-4f97-81da-08ca1fa008e4 18.13
2023-08-10 21:10:11.351000 2ea1e525-a41a-4a7b-82ad-124c49d5eda8 18.62
2023-08-10 21:10:07.585000 da7e49cf-ee6c-4126-8a6b-8b99b45b3d1a 18.32
2023-08-10 21:06:36.036000 945f84f3-fd43-4a1a-852b-67a2cc764b40 101.98
2023-08-10 21:10:07.707000 5ce3ac45-3c35-47b3-b044-f1cae524a57e 18.74
2023-08-10 21:10:09.3