#### CTE (Common Table Expression):
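- A CTE is a named, temporary result set defined with a WITH clause. It exists only for the single SQL statement that contains it. It improves readability and structure, and it can be referenced more than once inside that statement, but later statements cannot reuse it.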

#### Temporary Table:
- A temporary table is a session-scoped table that, by default, persists until it is dropped or the database connection closes. It can be queried, indexed, modified, and reused across multiple statements within that session.

#### Rule of thumb:
- Use a CTE for clarity within one query; use a temporary table when several statements need the same intermediate result, or when that result is large enough to benefit from an index.
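
The scope difference described above can be checked directly. The following is a minimal sketch, assuming an in-memory SQLite database from Python's standard library as a stand-in for the notebook's own connection; the three sample rows are illustrative only. The same name, customer_totals, is defined first as a CTE and then as a temporary table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 101, 500), (2, 101, 750), (3, 102, 300)],
)

# A CTE is visible only inside the statement that defines it.
cte_rows = conn.execute(
    """
    WITH customer_totals AS (
        SELECT customer_id, SUM(amount) AS total_spent
        FROM orders
        GROUP BY customer_id
    )
    SELECT customer_id, total_spent
    FROM customer_totals
    ORDER BY customer_id
    """
).fetchall()

# A later statement cannot see the CTE: the name no longer exists.
try:
    conn.execute("SELECT * FROM customer_totals")
    cte_visible_later = True
except sqlite3.OperationalError:
    cte_visible_later = False

# A temporary table with the same contents stays available
# to every later statement on this connection.
conn.execute(
    """
    CREATE TEMP TABLE customer_totals AS
    SELECT customer_id, SUM(amount) AS total_spent
    FROM orders
    GROUP BY customer_id
    """
)
temp_rows = conn.execute(
    "SELECT customer_id, total_spent FROM customer_totals ORDER BY customer_id"
).fetchall()

print(cte_rows)
print(cte_visible_later)
print(temp_rows)
conn.close()
```

The second SELECT fails because the CTE disappeared with its statement, while the temporary table answers the same query afterwards.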

In [18]:
from db_connection import get_connection
from tabulate import tabulate

In [15]:
sql = """
 SELECT * FROM orders;
"""

In [20]:
with get_connection() as connection:
    with connection.cursor() as cursor:
        cursor.execute(sql)
        rows = cursor.fetchall()
        # Read column names from the cursor instead of hard-coding them,
        # so the headers always match what SELECT * returns
        headers = [desc[0] for desc in cursor.description]

print(tabulate(rows, headers=headers, tablefmt="psql"))



+------------+---------------+--------------+----------+
|   order_id |   customer_id | order_date   |   amount |
|------------+---------------+--------------+----------|
|          1 |           101 | 2024-01-10   |      500 |
|          2 |           101 | 2024-01-15   |      750 |
|          3 |           102 | 2024-01-12   |      300 |
|          4 |           103 | 2024-01-14   |     1200 |
|          5 |           101 | 2024-01-20   |      200 |
|          6 |           102 | 2024-01-18   |      800 |
+------------+---------------+--------------+----------+


Problem: Calculate total sales per customer, then find customers who spent more than $1000.


In [None]:
sql = """
WITH customer_totals AS (
    SELECT
        customer_id,
        SUM(amount) AS total_spent,
        COUNT(*) AS total_orders
    FROM orders
    GROUP BY customer_id
)

SELECT
    customer_id,
    total_orders,
    total_spent
FROM customer_totals
WHERE total_spent > 1000
ORDER BY total_spent DESC;
"""

In [43]:
with get_connection() as connection:
    with connection.cursor() as cursor:
        cursor.execute(sql)
        rows = cursor.fetchall()
        headers = [desc[0] for desc in cursor.description]

print(tabulate(rows, headers=headers, tablefmt="psql"))



+---------------+----------------+---------------+
|   customer_id |   total_orders |   total_spent |
|---------------+----------------+---------------|
|           101 |              3 |          1450 |
|           103 |              1 |          1200 |
|           102 |              2 |          1100 |
+---------------+----------------+---------------+
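
For comparison, the same aggregation could be materialized as a temporary table and then reused by more than one statement. This is a sketch under assumptions: it uses an in-memory SQLite database from the standard library rather than the notebook's connection, it reloads the six orders rows shown earlier, and the index name idx_customer_totals_spent is made up for illustration.

```python
import sqlite3

# In-memory SQLite stand-in, loaded with the orders rows from the table above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER, customer_id INTEGER, order_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2024-01-10", 500),
        (2, 101, "2024-01-15", 750),
        (3, 102, "2024-01-12", 300),
        (4, 103, "2024-01-14", 1200),
        (5, 101, "2024-01-20", 200),
        (6, 102, "2024-01-18", 800),
    ],
)

# Step 1: compute the per-customer totals once and keep them for the session.
conn.execute(
    """
    CREATE TEMP TABLE customer_totals AS
    SELECT
        customer_id,
        SUM(amount) AS total_spent,
        COUNT(*) AS total_orders
    FROM orders
    GROUP BY customer_id
    """
)

# Step 2 (optional): a temporary table can be indexed like a regular table.
conn.execute(
    "CREATE INDEX idx_customer_totals_spent ON customer_totals (total_spent)"
)

# Step 3: first reuse, the same filter as the CTE query above.
big_spenders = conn.execute(
    """
    SELECT customer_id, total_orders, total_spent
    FROM customer_totals
    WHERE total_spent > 1000
    ORDER BY total_spent DESC
    """
).fetchall()

# Step 4: second reuse in a separate statement, without re-aggregating orders.
average_spent = conn.execute(
    "SELECT AVG(total_spent) FROM customer_totals"
).fetchone()[0]

print(big_spenders)
print(average_spent)
conn.close()
```

With a CTE, the second question would require repeating the WITH clause in a new statement; here both statements read the already-computed table.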


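Problem: Calculate monthly sales trends and identify the months whose revenue is above the monthly average. The query reads from a daily_sales table with sale_date, revenue, and quantity columns.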

In [70]:
sql = """
WITH monthly_sales AS (
    SELECT
        DATE_TRUNC('month', sale_date) AS month,
        SUM(revenue) AS monthly_revenue,
        SUM(quantity) AS monthly_quantity
    FROM daily_sales
    GROUP BY DATE_TRUNC('month', sale_date)
),
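-- Second CTE: built on top of the first one; returns exactly one row
average_metrics AS (
    SELECT
        AVG(monthly_revenue) AS avg_revenue,
        AVG(monthly_quantity) AS avg_quantity  -- computed but not used in the filter below
    FROM monthly_sales
)
SELECT
    m.month,
    m.monthly_revenue,
    m.monthly_quantity
FROM monthly_sales m
-- average_metrics has a single row, so the CROSS JOIN attaches the averages to every month
CROSS JOIN average_metrics a
WHERE
    m.monthly_revenue > a.avg_revenue
ORDER BY m.month;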
"""

In [71]:
with get_connection() as connection:
    with connection.cursor() as cursor:
        cursor.execute(sql)
        rows = cursor.fetchall()
        headers = [desc[0] for desc in cursor.description]

print(tabulate(rows, headers=headers, tablefmt="psql"))


+---------------------------+-------------------+--------------------+
| month                     |   monthly_revenue |   monthly_quantity |
|---------------------------+-------------------+--------------------|
| 2024-02-01 00:00:00+05:30 |              1350 |                 27 |
| 2024-03-01 00:00:00+05:30 |              1300 |                 26 |
+---------------------------+-------------------+--------------------+
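
Each query cell above repeats the same execute, fetch, and read-headers steps. One way to factor that out is a small helper; the sketch below is an assumption, not part of the original notebook. The name fetch_table is hypothetical, and an in-memory SQLite connection stands in for get_connection() so the example runs on its own. It relies only on the standard DB-API cursor methods and the cursor.description attribute already used in the cells.

```python
import sqlite3


def fetch_table(connection, sql, params=None):
    """Run a query and return (headers, rows) using a DB-API cursor.

    Hypothetical helper: works with any DB-API 2.0 connection.
    """
    cursor = connection.cursor()
    try:
        if params is None:
            cursor.execute(sql)
        else:
            cursor.execute(sql, params)
        rows = cursor.fetchall()
        # Column names come from the cursor, so they always match the query.
        headers = [desc[0] for desc in cursor.description]
    finally:
        cursor.close()
    return headers, rows


# Stand-in database with a single illustrative row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, amount INTEGER)")
conn.execute("INSERT INTO orders VALUES (1, 500)")

headers, rows = fetch_table(conn, "SELECT order_id, amount FROM orders")
print(headers)
print(rows)
conn.close()
```

The returned pair can be passed straight to tabulate(rows, headers=headers, tablefmt="psql") as in the cells above.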
