**RANK** vs. **DENSE_RANK** vs. **ROW_NUMBER:**

|Function|Tie Handling|Result Range|Use Case|
|--------|-----------|----------|--------|
|ROW_NUMBER()|No ties. Assigns a unique, consecutive integer starting from 1 to every row.|1, 2, 3, 4, 5...|Paginating results; selecting the "first" or "latest" record per group.|
|RANK()|Skips ranks. Assigns the same rank to rows with identical values (ties), then skips the subsequent rank(s) before assigning the next unique rank.|1, 1, 3, 4, 4, 6...|Competition ranking where shared ranks consume the next spots (e.g., Olympic medals).|
|DENSE_RANK()|No skips. Assigns the same rank to rows with identical values (ties) and does not skip the subsequent rank(s); the next rank is always the next consecutive integer.|1, 1, 2, 3, 3, 4...|Assigning salary tiers or levels where you want a clean sequence of ranks.|
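
The three sequences in the table can be reproduced on a small data set with ties. The following is a minimal sketch that runs the SQL through Python's built-in `sqlite3` module, assuming the bundled SQLite is version 3.25 or later (the first release with window functions). The `scores` table and its rows are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: two players tie at 95 and two tie at 85.
conn.execute("CREATE TABLE scores (player TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 95), ("B", 95), ("C", 90), ("D", 85), ("E", 85), ("F", 80)],
)

ranking_rows = conn.execute(
    """
    SELECT
        player,
        score,
        -- player is added as a tie-breaker so the numbering is deterministic
        ROW_NUMBER() OVER (ORDER BY score DESC, player) AS row_num,
        RANK()       OVER (ORDER BY score DESC)         AS rnk,
        DENSE_RANK() OVER (ORDER BY score DESC)         AS dense_rnk
    FROM scores
    ORDER BY score DESC, player
    """
).fetchall()

for row in ranking_rows:
    print(row)
```

Reading the last three columns top to bottom gives 1, 2, 3, 4, 5, 6 for ROW_NUMBER(), 1, 1, 3, 4, 4, 6 for RANK(), and 1, 1, 2, 3, 3, 4 for DENSE_RANK().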

Q. **Provide a scenario where choosing RANK() over DENSE_RANK() would lead to an incorrect business metric.**

In [None]:
SELECT
    Category,
    TotalRevenue,
    -- DENSE_RANK assigns rank based on TotalRevenue, ordered descending.
    DENSE_RANK() OVER (ORDER BY TotalRevenue DESC) AS RevenueRank
FROM
    MonthlyCategorySales
ORDER BY
    TotalRevenue DESC;

|Category|TotalRevenue|RevenueRank|
|--------|------------|-----------|
|Electronics|150,000|1|
|Furniture|120,000|2|
|Apparel|100,000|3|
|Home Goods|100,000|3|
|Books|90,000|4|
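
One way to make the question concrete is a report that should list every category in the top four revenue tiers. That business rule is an assumption made for this sketch, not something stated in the notebook. With the revenue figures from the result table, RANK() gives Books rank 5 because the tie at rank 3 consumes rank 4, so a `RevenueRank <= 4` filter drops Books. DENSE_RANK() keeps it. The sketch uses Python's `sqlite3` module and assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MonthlyCategorySales (Category TEXT, TotalRevenue INTEGER)")
conn.executemany(
    "INSERT INTO MonthlyCategorySales VALUES (?, ?)",
    [
        ("Electronics", 150000),
        ("Furniture", 120000),
        ("Apparel", 100000),
        ("Home Goods", 100000),
        ("Books", 90000),
    ],
)


def top_four_tiers(rank_function):
    """Return the categories whose rank is 4 or better under the given function."""
    # rank_function only ever receives the two fixed strings used below,
    # never user input, so interpolating it into the SQL is safe here.
    query = f"""
        SELECT Category
        FROM (
            SELECT
                Category,
                TotalRevenue,
                {rank_function}() OVER (ORDER BY TotalRevenue DESC) AS RevenueRank
            FROM MonthlyCategorySales
        ) AS ranked
        WHERE RevenueRank <= 4
        ORDER BY TotalRevenue DESC, Category
    """
    return [row[0] for row in conn.execute(query)]


top_by_rank = top_four_tiers("RANK")
top_by_dense_rank = top_four_tiers("DENSE_RANK")

print("RANK:      ", top_by_rank)
print("DENSE_RANK:", top_by_dense_rank)
```

Under this assumed rule, the RANK() version under-reports the fourth tier: it returns four categories and omits Books, while the DENSE_RANK() version returns all five.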

2. **LEAD and LAG Functions:**

2.(a) **What are the LEAD() and LAG() functions used for?**

**LAG() Function**

**Purpose:** Retrieves a column value from the row preceding the current row (i.e., looking backward).

**Syntax:** LAG(column_name, offset, default_value) OVER (PARTITION BY ... ORDER BY ...)

When omitted, offset defaults to 1 and default_value to NULL.

**Common Use Case:** Calculating the difference between the current metric and the previous period's metric (e.g., current month's sales versus last month's sales).

**LEAD() Function**

**Purpose:** Retrieves a column value from the row following the current row (i.e., looking forward).

**Syntax:** LEAD(column_name, offset, default_value) OVER (PARTITION BY ... ORDER BY ...)

When omitted, offset defaults to 1 and default_value to NULL.

**Common Use Case:** Calculating the difference between the current metric and the metric of the next period (e.g., current month's sales versus next month's projected sales).
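
The effect of the offset and default_value arguments is easiest to see on a tiny series. The sketch below runs LAG() and LEAD() side by side through Python's `sqlite3` module, assuming SQLite 3.25 or later. The `readings` table, its columns, and the chosen defaults (0 and -1) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical four-row series ordered by step.
conn.execute("CREATE TABLE readings (step INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],
)

offset_rows = conn.execute(
    """
    SELECT
        step,
        value,
        LAG(value)         OVER (ORDER BY step) AS prev_value,       -- offset 1, default NULL
        LAG(value, 2, 0)   OVER (ORDER BY step) AS two_back_or_zero, -- offset 2, default 0
        LEAD(value)        OVER (ORDER BY step) AS next_value,       -- offset 1, default NULL
        LEAD(value, 2, -1) OVER (ORDER BY step) AS two_ahead_or_neg1 -- offset 2, default -1
    FROM readings
    ORDER BY step
    """
).fetchall()

for row in offset_rows:
    print(row)
```

The first row has no predecessor, so `prev_value` comes back as NULL (`None` in Python) while `two_back_or_zero` falls back to 0. The last row shows the same pattern for LEAD().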

In [None]:
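SELECT
    customer_id,
    order_id,
    amount,
    -- Amount of the previous order by date (NULL for the first row)
    LAG(amount, 1) OVER (ORDER BY order_date) AS previous_amount,
    -- Absolute change versus the previous order
    amount - LAG(amount, 1) OVER (ORDER BY order_date) AS amount_change,
    -- Percentage change; 100.0 avoids integer division, NULLIF avoids division by zero
    (amount - LAG(amount, 1) OVER (ORDER BY order_date)) * 100.0
        / NULLIF(LAG(amount, 1) OVER (ORDER BY order_date), 0) AS percentage
FROM sales
ORDER BY order_date;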


In [None]:
INSERT INTO sales(order_id, customer_id, order_date, amount) 
VALUES (1, 101, '2024-01-01', 200),
       (2, 101, '2024-01-05', 350),
       (3, 101, '2024-02-10', 400),
       (4, 102, '2024-01-07', 500),
       (5, 102, '2024-03-01', 450),
       (6, 103, '2024-01-20', 600); 
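
With these rows loaded, it matters whether the window is partitioned. The sketch below compares LAG() over all orders with LAG() restarted per customer via PARTITION BY customer_id. It uses Python's `sqlite3` module and assumes SQLite 3.25 or later. The column types in CREATE TABLE are assumptions, since the notebook does not show the table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the notebook only shows the INSERT statement.
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER, customer_id INTEGER, order_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales(order_id, customer_id, order_date, amount) VALUES (?, ?, ?, ?)",
    [
        (1, 101, "2024-01-01", 200),
        (2, 101, "2024-01-05", 350),
        (3, 101, "2024-02-10", 400),
        (4, 102, "2024-01-07", 500),
        (5, 102, "2024-03-01", 450),
        (6, 103, "2024-01-20", 600),
    ],
)

lag_rows = conn.execute(
    """
    SELECT
        order_id,
        customer_id,
        amount,
        -- Previous order across all customers
        LAG(amount) OVER (ORDER BY order_date) AS prev_overall,
        -- Previous order of the same customer
        LAG(amount) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_same_customer
    FROM sales
    ORDER BY order_id
    """
).fetchall()

for row in lag_rows:
    print(row)
```

Order 3 (customer 101, amount 400) illustrates the difference: without partitioning its previous amount is 600, which belongs to customer 103, while with PARTITION BY it is 350, the customer's own earlier order.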

In [None]:
SELECT
  first_name,
  last_name,
  -- Replace a NULL marital_status with a readable placeholder
  COALESCE(marital_status, 'Unknown') AS marital_status
FROM persons;

* Both functions depend on the ORDER BY clause inside OVER, because that ordering defines which rows count as "preceding" and "following". PARTITION BY is optional and restarts the sequence within each group.
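
The dependence on ORDER BY can be checked directly: reversing the sort direction turns "preceding" into "following", so LAG() over a descending order returns the same values as LEAD() over an ascending order. This is a minimal sketch using Python's `sqlite3` module, assuming SQLite 3.25 or later. The `metrics` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical three-period series.
conn.execute("CREATE TABLE metrics (period INTEGER, value INTEGER)")
conn.executemany("INSERT INTO metrics VALUES (?, ?)", [(1, 100), (2, 120), (3, 90)])

direction_rows = conn.execute(
    """
    SELECT
        period,
        LAG(value)  OVER (ORDER BY period ASC)  AS lag_ascending,
        LAG(value)  OVER (ORDER BY period DESC) AS lag_descending,
        LEAD(value) OVER (ORDER BY period ASC)  AS lead_ascending
    FROM metrics
    ORDER BY period
    """
).fetchall()

for row in direction_rows:
    print(row)
```

In every row `lag_descending` equals `lead_ascending`, which is why an accidental DESC in the window silently flips a "previous period" calculation into a "next period" one.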

**Use Case 1: Calculating Month-over-Month (MoM) Difference using LAG()**

A common use of LAG() is to compare the current value to the value from the previous period.

**Goal: Calculate the dollar value change in sales compared to the immediate previous month.**

In [None]:
SELECT
    SaleMonth,
    MonthlyRevenue,
    -- 1. Get the revenue from the previous row (offset 1)
    LAG(MonthlyRevenue, 1) OVER (ORDER BY SaleMonth) AS PreviousMonthRevenue,

    -- 2. Calculate the difference (Current - Previous)
    MonthlyRevenue - LAG(MonthlyRevenue, 1) OVER (ORDER BY SaleMonth) AS MoM_Revenue_Change
FROM
    MonthlySales
ORDER BY
    SaleMonth;
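
The query above repeats the LAG() expression for each derived column. One possible variation, sketched here rather than taken from the notebook, computes it once in a common table expression and reuses the result. It runs through Python's `sqlite3` module, assuming SQLite 3.25 or later. The three `MonthlySales` rows and the column types are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; the notebook does not show the contents of MonthlySales.
conn.execute("CREATE TABLE MonthlySales (SaleMonth TEXT, MonthlyRevenue INTEGER)")
conn.executemany(
    "INSERT INTO MonthlySales VALUES (?, ?)",
    [("2024-01", 1000), ("2024-02", 1200), ("2024-03", 900)],
)

mom_rows = conn.execute(
    """
    WITH with_previous AS (
        -- Compute the previous month's revenue once
        SELECT
            SaleMonth,
            MonthlyRevenue,
            LAG(MonthlyRevenue, 1) OVER (ORDER BY SaleMonth) AS PreviousMonthRevenue
        FROM MonthlySales
    )
    SELECT
        SaleMonth,
        MonthlyRevenue,
        PreviousMonthRevenue,
        -- NULL for the first month, because there is nothing to compare against
        MonthlyRevenue - PreviousMonthRevenue AS MoM_Revenue_Change
    FROM with_previous
    ORDER BY SaleMonth
    """
).fetchall()

for row in mom_rows:
    print(row)
```

The first month has no predecessor, so both `PreviousMonthRevenue` and `MoM_Revenue_Change` are NULL. Wrapping the change in COALESCE would turn that into 0, at the cost of hiding the fact that no comparison was possible.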

**Use Case 2: Forecasting Next Period's Gap using LEAD()**

The LEAD() function is useful for analyzing sequential gaps, identifying when the next event will occur, or calculating the required inventory for the next period.

**Goal: Calculate how much of the current month's inventory would remain after covering the next month's demand.**

In [None]:
SELECT
    Month,
    CurrentInventory,
    MonthlyDemand,
    -- 1. Get the demand from the next row (offset 1)
    LEAD(MonthlyDemand, 1) OVER (ORDER BY Month) AS NextMonthDemand,

    -- 2. Calculate the inventory left after covering next month's demand
    --    (the default of 0 applies to the last row, which has no next month)
    CurrentInventory - LEAD(MonthlyDemand, 1, 0) OVER (ORDER BY Month) AS Inventory_vs_Next_Demand_Gap
FROM
    InventoryLog
ORDER BY
    Month;
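
The third argument of LEAD() decides what the final row reports. The sketch below compares a default of 0 with the implicit default of NULL on made-up inventory figures. It uses Python's `sqlite3` module and assumes SQLite 3.25 or later. The `InventoryLog` rows and column types are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; the notebook does not show the contents of InventoryLog.
conn.execute(
    "CREATE TABLE InventoryLog ("
    "Month INTEGER, CurrentInventory INTEGER, MonthlyDemand INTEGER)"
)
conn.executemany(
    "INSERT INTO InventoryLog VALUES (?, ?, ?)",
    [(1, 500, 300), (2, 400, 450), (3, 600, 200)],
)

gap_rows = conn.execute(
    """
    SELECT
        Month,
        LEAD(MonthlyDemand, 1) OVER (ORDER BY Month) AS NextMonthDemand,
        -- Default 0: the last month reports its full inventory as the gap
        CurrentInventory - LEAD(MonthlyDemand, 1, 0) OVER (ORDER BY Month) AS gap_default_zero,
        -- Default NULL: the last month reports no gap at all
        CurrentInventory - LEAD(MonthlyDemand, 1) OVER (ORDER BY Month) AS gap_default_null
    FROM InventoryLog
    ORDER BY Month
    """
).fetchall()

for row in gap_rows:
    print(row)
```

For the last month, the default of 0 yields a gap of 600, which reads as a large surplus, while the NULL default makes it explicit that next month's demand is unknown. Which behaviour is appropriate depends on how the report is consumed.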