**Scenario**: 

Imagine a car rental company that operates nationwide and needs to store information about each car in its fleet, including the make, model, registration number, and the period for which each car is rented out. The data must be retained for `30 days`.

- All of the customers who rented a Toyota Camry off the lot in the last 25 days were overcharged, so you would like to contact them to inform them of the overcharge.

<br>

```QUERY: Find the contact details of all the customers that have rented a Toyota Camry from the lot within the last 25 days.```

We can construct a table to model this scenario using:

**Table**
rentals

**Columns**
car_make, car_model, rental_start_time, rental_end_time, registration_number, car_category, car_year, driver_license, first_name, last_name, contact_number

**Composite key of ((car_make, car_model), rental_start_time, rental_end_time)** where:

    car_make, car_model = partition key (determines which node stores the row)
    rental_start_time, rental_end_time = clustering columns (define the order of the rows within a partition)
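
To make the two roles concrete, here is a minimal pure-Python sketch of the idea. It is not Cassandra code: a dictionary stands in for routing by partition key, and a sorted list stands in for clustering order. The helper names `insert` and `rented_since`, and the sample rows, are made up for illustration.

```python
from bisect import bisect_left, insort
from collections import defaultdict

# Illustrative model only: one sorted list of rows per partition key.
partitions = defaultdict(list)

def insert(car_make, car_model, start, end, customer):
    # The partition key (car_make, car_model) selects the partition;
    # insort keeps rows ordered by the clustering columns (start, end).
    insort(partitions[(car_make, car_model)], (start, end, customer))

def rented_since(car_make, car_model, since):
    # A range scan on the first clustering column inside a single partition.
    rows = partitions[(car_make, car_model)]
    first = bisect_left(rows, (since,))
    return rows[first:]

# Hypothetical sample rows (ISO date strings sort chronologically).
insert("Toyota", "Camry", "2023-01-12", "2023-01-16", "Lisa")
insert("Toyota", "Camry", "2023-01-02", "2023-01-04", "Sam")
insert("BMW", "X5", "2023-01-10", "2023-01-11", "Ana")

print(rented_since("Toyota", "Camry", "2023-01-05"))
```

Because the lookup names the full partition key and then filters on the leading clustering column, it only ever touches one partition, which is the access pattern the query in this scenario needs.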

In [1]:
# Add the parent directory to sys.path so the notebook can import the local `config` and `db` modules
import os, sys
parent_dir = os.path.abspath('..')
if parent_dir not in sys.path:
    sys.path.append(parent_dir)
    
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster
from cassandra.cqlengine import connection, management, query
from cassandra.cqlengine.connection import log as cql_logger
from cassandra.concurrent import execute_concurrent_with_args
from config import get_settings
from db import create_session
import pandas as pd


In [2]:
session = create_session()

# Create a keyspace
session.execute("""
CREATE KEYSPACE IF NOT EXISTS cql_keyspace
WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 1 } 
AND durable_writes = true;
""")

# Create our table
session.execute("""
CREATE TABLE IF NOT EXISTS cql_keyspace.rentals (
car_make text,
car_model text,
rental_start_time timestamp,
rental_end_time timestamp,
car_category text,
car_year int,
registration_number text,
driver_license text,
first_name text,
last_name text,
contact_number text,
PRIMARY KEY ((car_make, car_model), rental_start_time, rental_end_time)
) WITH default_time_to_live = 2592000;
""")

<cassandra.cluster.ResultSet at 0x2638b8b3280>
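
The `default_time_to_live` value in the table definition is expressed in seconds. As a quick check that `2592000` matches the 30-day retention requirement, the arithmetic can be sketched as follows (the variable names are illustrative):

```python
from datetime import timedelta

# 30 days of retention, converted to the seconds that a TTL expects
retention = timedelta(days=30)
ttl_seconds = int(retention.total_seconds())

print(ttl_seconds)  # 30 * 24 * 60 * 60 = 2592000
```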

In [3]:
from data_generator import DataGenerator

statement = session.prepare("""
INSERT INTO cql_keyspace.rentals (car_make,
car_model,
rental_start_time,
rental_end_time,
car_category,
car_year,
registration_number,
driver_license,
first_name,
last_name,
contact_number) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""")


# Generate dummy records, then run the prepared INSERT once per record,
# keeping up to 50 requests in flight at a time
records = DataGenerator().generate_records(num_records=10000)
parameters = list(records)
execute_concurrent_with_args(session, statement, parameters, concurrency=50)


# Show results of the load
result = list(session.execute("SELECT * FROM cql_keyspace.rentals"))
print(len(result), f".. where {10000-len(result)} cars were rented multiple times within the last 30 days.")
pd.DataFrame(result).head()

9603 .. where 397 cars were rented multiple times within the last 30 days.


| | car_make | car_model | rental_start_time | rental_end_time | car_category | car_year | contact_number | driver_license | first_name | last_name | registration_number |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mitsubishi | F-TYPE | 2023-01-08 05:45:13.284 | 2023-01-10 06:45:13.284 | Sedan | 2017 | 493599966 | KT134219 | Brandi | Hodge | 518BSP8B |
| 1 | Toyota | fortwo | 2023-01-06 04:45:12.717 | 2023-01-07 08:45:12.717 | SUV | 1999 | 409316672 | EO121137 | Benjamin | Chavez | 6432531T |
| 2 | Jeep | Equinox | 2023-01-27 08:45:12.814 | 2023-01-31 07:45:12.814 | Coupe, Convertible | 2007 | 426130452 | ZZ568621 | Kayla | Branch | 731SA4RZ |
| 3 | BMW | Monterey | 2023-01-18 07:45:12.955 | 2023-01-27 00:45:12.955 | SUV | 2001 | 437912421 | RN798652 | Russell | Wilson | 7864AGHX |
| 4 | Toyota | Crossfire | 2023-01-12 06:45:12.575 | 2023-01-22 03:45:12.575 | Sedan, Coupe | 2011 | 424321595 | AG635848 | Julie | Hill | 882H7TCX |
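
The gap between the 10,000 generated records and the number of rows read back is consistent with Cassandra's upsert behaviour: an INSERT whose full primary key matches an existing row overwrites that row instead of adding a new one. The following pure-Python sketch mimics that behaviour with a dictionary keyed by the primary key. The `upsert` helper and the sample records are hypothetical and do not come from the data generator.

```python
# Illustrative model only: the dict key plays the role of the primary key
# ((car_make, car_model), rental_start_time, rental_end_time).
table = {}

def upsert(row):
    key = (row["car_make"], row["car_model"],
           row["rental_start_time"], row["rental_end_time"])
    table[key] = row  # the same key replaces the earlier row

# Hypothetical records: the first two share every primary key column.
records = [
    {"car_make": "Toyota", "car_model": "Camry",
     "rental_start_time": "2023-01-12", "rental_end_time": "2023-01-16",
     "first_name": "Lisa"},
    {"car_make": "Toyota", "car_model": "Camry",
     "rental_start_time": "2023-01-12", "rental_end_time": "2023-01-16",
     "first_name": "Sam"},
    {"car_make": "Jeep", "car_model": "Equinox",
     "rental_start_time": "2023-01-27", "rental_end_time": "2023-01-31",
     "first_name": "Kayla"},
]

for record in records:
    upsert(record)

print(len(records), "records written,", len(table), "rows stored")
```

If every rental needs to survive as its own row, one option is to add a column that is unique per rental (for example the registration number) to the clustering columns.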


```QUERY: Find the contact details of all the customers that have rented a Toyota Camry from the lot within the last 25 days.```


In [4]:
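from datetime import datetime, timedelta, timezone

def unix_time_millis(dt):
    # Milliseconds since the Unix epoch; expects a timezone-aware datetime
    return int(dt.timestamp() * 1000)

# Use UTC so the cutoff does not shift with the machine's local timezone.
# CQL accepts a timestamp as integer milliseconds since the epoch.
search_date = unix_time_millis(datetime.now(timezone.utc) - timedelta(days=25))

# Pass the cutoff as a bound parameter (%s) instead of formatting it into the query string
result = session.execute(
    "SELECT * FROM cql_keyspace.rentals "
    "WHERE car_make = 'Toyota' AND car_model = 'Camry' AND rental_start_time >= %s",
    (search_date,),
)
pd.DataFrame(list(result)).head()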

| | car_make | car_model | rental_start_time | rental_end_time | car_category | car_year | contact_number | driver_license | first_name | last_name | registration_number |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Toyota | Camry | 2023-01-12 05:45:13.199 | 2023-01-16 08:45:13.199 | Sedan | 2015 | 413302424 | OJ573429 | Lisa | Hill | 940MVIWU |
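
The query asks only for contact details, so the remaining step is to reduce the matching rows to name and phone number. A minimal sketch follows, assuming the rows expose columns as attributes (the Python driver's default row factory returns named tuples). The `Row` type and the `contact_details` helper are illustrative stand-ins, and the sample row mirrors the result shown above.

```python
from collections import namedtuple

# Stand-in for a driver row; only the columns used here are modelled.
Row = namedtuple("Row", ["car_make", "car_model", "first_name",
                         "last_name", "contact_number"])

def contact_details(rows):
    """Return unique (first_name, last_name, contact_number) tuples in first-seen order."""
    seen = set()
    contacts = []
    for row in rows:
        contact = (row.first_name, row.last_name, row.contact_number)
        if contact not in seen:
            seen.add(contact)
            contacts.append(contact)
    return contacts

rows = [Row("Toyota", "Camry", "Lisa", "Hill", "413302424")]
print(contact_details(rows))
```

Deduplicating matters because a customer who rented a Camry more than once in the window would otherwise be contacted several times.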
