# Module 7 — Window Functions & Subqueries (All-in-One)

This notebook pairs teaching notes (Markdown) with annotated, runnable SQL examples against the `farmers_market` database.

## Setup — Connect to `farmers_market`

> Note: credentials here are for your local demo only. Do **not** use `root` or hard-coded passwords in production.
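
One way to follow the note above is to read credentials from environment variables instead of typing them into the notebook. The sketch below is only an illustration: the helper `load_db_config` and the variable names `FM_DB_HOST`, `FM_DB_USER`, `FM_DB_PASSWORD`, and `FM_DB_NAME` are hypothetical, not part of this notebook or of MySQL Connector.

```python
import os


def load_db_config(env=os.environ):
    """Build keyword arguments for mysql.connector.connect() from a mapping.

    The FM_DB_* names are hypothetical; pick whatever your deployment uses.
    """
    user = env.get("FM_DB_USER")
    password = env.get("FM_DB_PASSWORD")
    if not user or not password:
        # Fail early instead of silently falling back to root or a literal password.
        raise RuntimeError("Set FM_DB_USER and FM_DB_PASSWORD before connecting")
    return {
        "host": env.get("FM_DB_HOST", "localhost"),
        "user": user,
        "password": password,
        "database": env.get("FM_DB_NAME", "farmers_market"),
    }
```

With the variables set in your shell, the connection call would become `mysql.connector.connect(**load_db_config())`.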

In [1]:
import mysql.connector
import pandas as pd

# Establish connection to local MySQL server
conn = mysql.connector.connect(
    host="localhost",
    user="root",       # classroom demo only
    password="William2025!!",   # replace in real deployments
    database="farmers_market"
)
print(f"Connected to {conn.database}!")

def run_query(sql: str, params: tuple = None, preview: int = 10):
    """Execute SQL and return a pandas DataFrame (optionally preview head)."""
    cur = conn.cursor()
    cur.execute(sql, params or ())
    rows = cur.fetchall()
    cols = [d[0] for d in cur.description] if cur.description else []
    cur.close()
    df = pd.DataFrame(rows, columns=cols)
    return df.head(preview) if preview is not None else df

Connected to farmers_market!


---
## 1) Introduction

**Goal.** Learn to compute analytical metrics that are hard or inefficient with plain `GROUP BY` by using **window functions** and **subqueries**.

- **Window functions**: compute values over a *window* (set) of related rows while still returning the original row (no collapse like `GROUP BY`).
- **Subqueries**: queries nested inside another query (scalar, list, or table), including **correlated** subqueries that reference the outer row.

We’ll use these tables from your schema:
- `vendor_inventory(market_date, vendor_id, product_id, quantity, original_price)`
- `customer_purchases(market_date, transaction_time, customer_id, product_id, quantity, cost_to_customer_per_qty)`
- `vendor(vendor_id, vendor_name, ...)`
- `product(product_id, product_name, product_category_id, product_qty_type, ...)`

---
## 2) Anatomy of a Window Function

General pattern:
```sql
<function>() OVER (
    PARTITION BY <cols>   -- optional, divides rows into groups/windows
    ORDER BY <cols>       -- optional, defines ordering inside each window
    ROWS/RANGE ...        -- optional, frame clause for moving calcs
)
```

Common categories:
- **Aggregate windows:** `SUM()`, `AVG()`, `MIN()`, `MAX()`, `COUNT()`
- **Ranking windows:** `ROW_NUMBER()`, `RANK()`, `DENSE_RANK()`, `NTILE(n)`
- **Value windows:** `LAG()`, `LEAD()`, `FIRST_VALUE()`, `LAST_VALUE()`

> MySQL 8+ supports these window functions.

### Example — Compare `GROUP BY` vs Window `SUM() OVER()`

In [2]:
# Per-vendor per-date totals using GROUP BY
run_query('''
SELECT 
    market_date, vendor_id,
    SUM(quantity) AS total_qty
FROM vendor_inventory
GROUP BY market_date, vendor_id
ORDER BY market_date, vendor_id
LIMIT 20;
''')

Unnamed: 0,market_date,vendor_id,total_qty
0,2019-04-03,7,40.0
1,2019-04-03,8,34.0
2,2019-04-06,7,40.0
3,2019-04-06,8,39.0
4,2019-04-10,7,30.0
5,2019-04-10,8,37.0
6,2019-04-13,7,30.0
7,2019-04-13,8,38.0
8,2019-04-17,7,40.0
9,2019-04-17,8,39.0


In [3]:
# Window version: total per vendor_id per date *across the same rows* (no collapse)
# (Shows same total repeated per row in the partition)
run_query('''
SELECT 
    market_date, vendor_id, product_id, quantity,
    SUM(quantity) OVER (PARTITION BY market_date, vendor_id) AS total_qty_window
FROM vendor_inventory
ORDER BY market_date, vendor_id, product_id
LIMIT 20;
''')

Unnamed: 0,market_date,vendor_id,product_id,quantity,total_qty_window
0,2019-04-03,7,4,40.0,40.0
1,2019-04-03,8,5,16.0,34.0
2,2019-04-03,8,7,8.0,34.0
3,2019-04-03,8,8,10.0,34.0
4,2019-04-06,7,4,40.0,40.0
5,2019-04-06,8,5,23.0,39.0
6,2019-04-06,8,7,8.0,39.0
7,2019-04-06,8,8,8.0,39.0
8,2019-04-10,7,4,30.0,30.0
9,2019-04-10,8,5,23.0,37.0


---
## 3) Ranking Windows (ROW_NUMBER, RANK, DENSE_RANK, NTILE)

These functions assign ranks or buckets within each partition.
- `ROW_NUMBER()` — strict sequence (1,2,3,…); tied rows still get distinct numbers, in an arbitrary order unless the `ORDER BY` breaks the tie.
- `RANK()` — gaps on ties (1,1,3,…).
- `DENSE_RANK()` — no gaps on ties (1,1,2,…).
- `NTILE(n)` — splits ordered rows into `n` buckets as evenly as possible.
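
To see the tie behaviour side by side, here is a minimal sketch that runs the three ranking functions on a tiny table. It uses Python's built-in `sqlite3` (SQLite 3.25 or newer) as a stand-in for MySQL, so it runs without the `farmers_market` server. The table name `daily_qty` is hypothetical; the first four rows mirror the 2019-04-06 totals shown later in this module, and product `9` is made up so that a row follows the tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute("CREATE TABLE daily_qty (product_id INTEGER, qty REAL)")
conn.executemany(
    "INSERT INTO daily_qty VALUES (?, ?)",
    # products 7 and 8 tie at 8.0; product 9 is an invented extra row
    [(4, 40.0), (5, 23.0), (7, 8.0), (8, 8.0), (9, 5.0)],
)

rows = conn.execute("""
SELECT
    product_id,
    qty,
    ROW_NUMBER() OVER (ORDER BY qty DESC, product_id) AS rn,  -- tie broken by product_id
    RANK()       OVER (ORDER BY qty DESC)             AS rk,  -- 3, 3, then 5
    DENSE_RANK() OVER (ORDER BY qty DESC)             AS drk  -- 3, 3, then 4
FROM daily_qty
ORDER BY qty DESC, product_id
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

The tied products share `rk = 3` and `drk = 3`; the row after them gets `rk = 5` but `drk = 4`, while `rn` keeps counting 3, 4, 5.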

### 3.1) Top products per vendor by quantity (ROW_NUMBER)

This query retrieves the **top 3 products for each vendor**, ranked by the total quantity recorded in `vendor_inventory`.

**Step-by-step breakdown:**

1. **Group and Summarize**  
   - `GROUP BY vi.vendor_id, vi.product_id` collapses the data into one row per **vendor–product** pair.  
   - `SUM(vi.quantity) AS qty` computes the **total quantity** for each vendor–product.

2. **Rank Within Each Vendor**  
   - `ROW_NUMBER() OVER (PARTITION BY vi.vendor_id ORDER BY SUM(vi.quantity) DESC)`  
     assigns a **rank (`rn`)** within each vendor’s products, ordering from **highest to lowest total quantity**.

3. **Filter Top 3 Products**  
   - The outer query uses `WHERE rn <= 3` to keep only the **top 3 products per vendor**.  
   - `ORDER BY vendor_id, rn` ensures results are listed vendor by vendor, and sorted by rank.

**Final Output Columns:**
- `vendor_id` → the vendor  
- `product_id` → product stocked by that vendor  
- `qty` → total quantity for that vendor–product  
- `rn` → the rank of the product **within the vendor’s portfolio** (1 = highest total quantity, 2 = second, etc.)

In [4]:
run_query('''
SELECT *
FROM (
  SELECT
      vi.vendor_id,
      vi.product_id,
      SUM(vi.quantity) AS qty,
      ROW_NUMBER() OVER (PARTITION BY vi.vendor_id ORDER BY SUM(vi.quantity) DESC) AS rn
  FROM vendor_inventory AS vi
  GROUP BY vi.vendor_id, vi.product_id
) t
WHERE rn <= 3
ORDER BY vendor_id, rn;
''')

Unnamed: 0,vendor_id,product_id,qty,rn
0,4,16,13650.0,1
1,7,4,5230.0,1
2,7,3,3480.0,2
3,7,2,1556.23,3
4,8,5,2850.0,1
5,8,7,1172.0,2
6,8,8,1145.0,3


### 3.2) Rank products per date by quantity (RANK vs DENSE_RANK)

### Query Explanation: Ranking Products by Quantity per Market Date

This query lists the products stocked on each **market_date**, ranked by the **total quantity** in inventory on that date.

**Step-by-step breakdown:**

1. **Group by Market Date and Product**  
   - `GROUP BY market_date, product_id` ensures that totals are calculated **per product on each market date**.  
   - `SUM(quantity) AS qty` computes the **total inventory quantity** for that product on that date.

2. **Apply Two Ranking Functions**  
   - `RANK() OVER (PARTITION BY market_date ORDER BY SUM(quantity) DESC)`  
     assigns a rank to each product **within the same market_date**, ordered by total quantity (highest first).  
     - If two products tie, they receive the **same rank**, but the next rank is **skipped**.  
       (e.g., if two products tie for rank 1, the next product gets rank 3).  
   - `DENSE_RANK() OVER (PARTITION BY market_date ORDER BY SUM(quantity) DESC)`  
     is similar, but it does **not skip ranks** after ties.  
       (e.g., if two products tie for rank 1, the next product gets rank 2).

3. **Ordering and Limiting Results**  
   - `ORDER BY market_date, rk, product_id` sorts the output by date, then by rank, then by product ID to break ties.  
   - `LIMIT 80` restricts the result to the first 80 rows (the `run_query` helper then previews only the first 10).

**Final Output Columns:**  
- `market_date` → the date of the market  
- `product_id` → product stocked on that date  
- `qty` → total inventory quantity for that product on that date  
- `rk` → rank with gaps if there are ties  
- `drk` → rank without gaps (dense ranking)

**Example (from result):**  
- On `2019-04-03`, products `4` and `5` are ranked `1` and `2` respectively.  
- On `2019-04-06`, products `7` and `8` tie at `8.0` and both receive rank `3` from `RANK()` and from `DENSE_RANK()`. No product ranks below them on that date, so no gap is visible.  
- The two functions differ only for rows that come after a tie in the ordering expression (here, `SUM(quantity)`): `RANK()` skips a number, while `DENSE_RANK()` continues consecutively.


In [11]:
run_query('''
SELECT
    market_date, product_id, SUM(quantity) AS qty,
    RANK()        OVER (PARTITION BY market_date ORDER BY SUM(quantity) DESC) AS rk,
    DENSE_RANK()  OVER (PARTITION BY market_date ORDER BY SUM(quantity) DESC) AS drk
FROM vendor_inventory
GROUP BY market_date, product_id
ORDER BY market_date, rk, product_id
LIMIT 80;
''')

Unnamed: 0,market_date,product_id,qty,rk,drk
0,2019-04-03,4,40.0,1,1
1,2019-04-03,5,16.0,2,2
2,2019-04-03,8,10.0,3,3
3,2019-04-03,7,8.0,4,4
4,2019-04-06,4,40.0,1,1
5,2019-04-06,5,23.0,2,2
6,2019-04-06,7,8.0,3,3
7,2019-04-06,8,8.0,3,3
8,2019-04-10,4,30.0,1,1
9,2019-04-10,5,23.0,2,2


### 3.3) Split vendors into quartiles by daily inventory value (NTILE)

### Query Explanation: Vendor Inventory Value Quartiles

This query calculates each vendor’s **inventory value per market date** by multiplying `quantity * original_price` and summing it.  
The `NTILE(4)` function then divides vendors into **quartiles (1–4)** for each market date, ranking them by inventory value in descending order.  

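- `value_quartile = 1` → highest-value bucket (the top 25% of vendors when a date has at least four vendors)  
- `value_quartile = 2` → next bucket, and so on  

The output shows how vendors compare to each other in terms of inventory value on each market date. On dates with fewer than four vendors, as in the sample output where only vendors 7 and 8 appear, `NTILE(4)` fills only buckets 1 and 2, so the labels are not true quartiles.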
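
How `NTILE(4)` distributes rows depends on how many rows the partition holds. The sketch below probes that with Python's built-in `sqlite3` (SQLite 3.25 or newer) as a stand-in for MySQL. The helper `quartiles` and its one-column table `v` are hypothetical; the two-vendor values come from the 2019-04-03 output, and the six-vendor list is made up.

```python
import sqlite3


def quartiles(values):
    """Return the NTILE(4) bucket of each value, listed from highest to lowest."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE v (inventory_value REAL)")
    conn.executemany("INSERT INTO v VALUES (?)", [(x,) for x in values])
    rows = conn.execute(
        "SELECT NTILE(4) OVER (ORDER BY inventory_value DESC) "
        "FROM v ORDER BY inventory_value DESC"
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


two_vendors = quartiles([428.0, 160.0])                  # vendors 8 and 7 on 2019-04-03
six_vendors = quartiles([600, 500, 400, 300, 200, 100])  # invented values

print(two_vendors)  # only buckets 1 and 2 are used
print(six_vendors)  # the extra rows go to the earliest buckets
```

With six rows, the first two buckets hold two rows each and the last two hold one each, which is what "as evenly as possible" means in practice.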


In [8]:
run_query('''
SELECT
    market_date,
    vendor_id,
    ROUND(SUM(quantity * IFNULL(original_price,0)), 2) AS inventory_value,
    NTILE(4) OVER (PARTITION BY market_date ORDER BY SUM(quantity * IFNULL(original_price,0)) DESC) AS value_quartile
FROM vendor_inventory
GROUP BY market_date, vendor_id
ORDER BY market_date, value_quartile, vendor_id;
''')

Unnamed: 0,market_date,vendor_id,inventory_value,value_quartile
0,2019-04-03,8,428.0,1
1,2019-04-03,7,160.0,2
2,2019-04-06,8,437.5,1
3,2019-04-06,7,160.0,2
4,2019-04-10,8,401.5,1
5,2019-04-10,7,120.0,2
6,2019-04-13,8,396.5,1
7,2019-04-13,7,120.0,2
8,2019-04-17,8,449.0,1
9,2019-04-17,7,160.0,2


---
## 4) Aggregate Windows — Running Totals & Moving Averages

Window aggregates with a frame enable time-series analytics.

- **Running total** up to current row:
  $$ \text{running\_sum}_t = \sum_{i \le t} x_i $$

- **Trailing moving average** (example uses trailing 3):
  $$ \text{MA}_t = \frac{1}{3} (x_t + x_{t-1} + x_{t-2}) $$
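
The two formulas can be checked by hand in plain Python. This is a small sketch, not part of the notebook's SQL: `qty_day` holds vendor 4's first five daily totals from the outputs in this section, and `trailing_mean` is a hypothetical helper that mimics a `ROWS 2 PRECEDING` frame, including the shorter frames at the start of the series.

```python
from itertools import accumulate

# Vendor 4's first five market dates (values taken from the query outputs)
qty_day = [120.0, 140.0, 100.0, 120.0, 140.0]

# Running total: sum of every value up to and including the current one
running_sum = list(accumulate(qty_day))


def trailing_mean(xs, window=3):
    """Average of the current value and up to (window - 1) values before it."""
    out = []
    for t in range(len(xs)):
        frame = xs[max(0, t - window + 1): t + 1]  # shorter at the start
        out.append(round(sum(frame) / len(frame), 2))
    return out


ma3 = trailing_mean(qty_day)

print(running_sum)  # [120.0, 260.0, 360.0, 480.0, 620.0]
print(ma3)          # [120.0, 130.0, 120.0, 120.0, 120.0]
```

Note that the divisor is 3 only from the third row onward; the first two rows divide by the number of rows available, which matches the SQL output.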

### 4.1) Running total of quantity by vendor across dates

### Query Explanation: Running Total of Quantities

This query calculates two things for each vendor per market date:

1. **`qty_day`** → total quantity in the vendor’s inventory on that specific date.  
2. **`running_qty`** → cumulative sum of `qty_day` for the vendor from its first market date through the current one (running total).

`ROWS UNBOUNDED PRECEDING` is shorthand for `ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`, so the frame starts at the vendor’s first date and extends to the current row.


In [5]:
run_query('''
SELECT
    vendor_id,
    market_date,
    SUM(quantity) AS qty_day,
    SUM(SUM(quantity)) OVER (
        PARTITION BY vendor_id
        ORDER BY market_date
        ROWS UNBOUNDED PRECEDING
    ) AS running_qty
FROM vendor_inventory
GROUP BY vendor_id, market_date
ORDER BY vendor_id, market_date;
''')

Unnamed: 0,vendor_id,market_date,qty_day,running_qty
0,4,2019-06-01,120.0,120.0
1,4,2019-06-05,140.0,260.0
2,4,2019-06-08,100.0,360.0
3,4,2019-06-12,120.0,480.0
4,4,2019-06-15,140.0,620.0
5,4,2019-06-19,120.0,740.0
6,4,2019-06-22,120.0,860.0
7,4,2019-06-26,140.0,1000.0
8,4,2019-06-29,100.0,1100.0
9,4,2019-07-03,300.0,1400.0


### 4.2) Trailing 3-market-date moving average of quantity by vendor

### Query Explanation: 3-Period Moving Average

This query calculates:

1. **`qty_day`** → total quantity in each vendor’s inventory on a given date.  
2. **`qty_ma3`** → the moving average of `qty_day` over the current and two preceding market dates.  

The `ROWS 2 PRECEDING` frame counts rows, not calendar days: the average covers the current row plus up to two previous rows for the same vendor. A vendor’s first two rows average over fewer values, which is why the first `qty_ma3` equals `qty_day`.


In [None]:
run_query('''
SELECT
    vendor_id,
    market_date,
    SUM(quantity) AS qty_day,
    ROUND(AVG(SUM(quantity)) OVER (
        PARTITION BY vendor_id
        ORDER BY market_date
        ROWS 2 PRECEDING
    ), 2) AS qty_ma3
FROM vendor_inventory
GROUP BY vendor_id, market_date
ORDER BY vendor_id, market_date;
''')

Unnamed: 0,vendor_id,market_date,qty_day,qty_ma3
0,4,2019-06-01,120.0,120.0
1,4,2019-06-05,140.0,130.0
2,4,2019-06-08,100.0,120.0
3,4,2019-06-12,120.0,120.0
4,4,2019-06-15,140.0,120.0
5,4,2019-06-19,120.0,126.67
6,4,2019-06-22,120.0,126.67
7,4,2019-06-26,140.0,126.67
8,4,2019-06-29,100.0,120.0
9,4,2019-07-03,300.0,180.0


---
## 5) Value Windows — Period-over-Period with LAG/LEAD

Use `LAG()`/`LEAD()` to access prior/next row values within a partition, enabling change calculations.

### 5.1) Day-over-day change in inventory value per vendor

### Query Explanation: Value Change with LAG()

This query calculates:

1. **`value_day`** → total inventory value per vendor per date.  
2. **`prev_value`** → the inventory value on the vendor’s previous market date, fetched with `LAG()` (`NULL` on the vendor’s first date).  
3. **`delta_value`** → difference between the current and previous values.  

This highlights how each vendor’s inventory value changes from one market date to the next (positive, negative, or no change).
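
The edge rows are where `LAG()` and `LEAD()` need the most care, so here is a minimal sketch of both on three rows. It uses Python's built-in `sqlite3` (SQLite 3.25 or newer) as a stand-in for MySQL; the table `daily` and the column alias `next_value` are assumptions for illustration, and the three values are vendor 4's first three `value_day` figures from the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute(
    "CREATE TABLE daily (vendor_id INTEGER, market_date TEXT, value_day REAL)"
)
conn.executemany(
    "INSERT INTO daily VALUES (?, ?, ?)",
    [(4, "2019-06-01", 60.0), (4, "2019-06-05", 70.0), (4, "2019-06-08", 50.0)],
)

rows = conn.execute("""
SELECT
    market_date,
    value_day,
    LAG(value_day)  OVER w AS prev_value,              -- NULL on the first row
    LEAD(value_day) OVER w AS next_value,              -- NULL on the last row
    value_day - LAG(value_day) OVER w AS delta_value   -- NULL propagates
FROM daily
WINDOW w AS (PARTITION BY vendor_id ORDER BY market_date)
ORDER BY market_date
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

The named `WINDOW w` clause avoids repeating the same `PARTITION BY ... ORDER BY` in every call. SQL `NULL` comes back as Python `None`, which pandas then displays as an empty or `NaN` cell.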


In [6]:
run_query('''
WITH daily AS (
  SELECT
      vi.vendor_id,
      vi.market_date,
      ROUND(SUM(vi.quantity * IFNULL(vi.original_price,0)), 2) AS value_day
  FROM vendor_inventory AS vi
  GROUP BY vi.vendor_id, vi.market_date
)
SELECT
    vendor_id,
    market_date,
    value_day,
    LAG(value_day, 1) OVER (PARTITION BY vendor_id ORDER BY market_date) AS prev_value,
    ROUND(value_day - LAG(value_day, 1) OVER (PARTITION BY vendor_id ORDER BY market_date), 2) AS delta_value
FROM daily
ORDER BY vendor_id, market_date;
''')

Unnamed: 0,vendor_id,market_date,value_day,prev_value,delta_value
0,4,2019-06-01,60.0,,
1,4,2019-06-05,70.0,60.0,10.0
2,4,2019-06-08,50.0,70.0,-20.0
3,4,2019-06-12,60.0,50.0,10.0
4,4,2019-06-15,70.0,60.0,10.0
5,4,2019-06-19,60.0,70.0,-10.0
6,4,2019-06-22,60.0,60.0,0.0
7,4,2019-06-26,70.0,60.0,10.0
8,4,2019-06-29,50.0,70.0,-20.0
9,4,2019-07-03,150.0,50.0,100.0


### 5.2) Next purchase price per product (LEAD)

### Query Explanation: Using LEAD() for Next Value

This query calculates:

1. **`avg_price`** → the average selling price per product on each market date.  
2. **`next_avg_price`** → the average price on the *next market date* using `LEAD()`.  

This allows comparison of today’s price with the following date’s price, making it easier to track future price changes for each product.


In [15]:
run_query('''
SELECT
    product_id,
    market_date,
    ROUND(AVG(cost_to_customer_per_qty), 2) AS avg_price,
    LEAD(ROUND(AVG(cost_to_customer_per_qty), 2), 1) OVER (
        PARTITION BY product_id
        ORDER BY market_date
    ) AS next_avg_price
FROM customer_purchases
GROUP BY product_id, market_date
ORDER BY product_id, market_date;
''')

Unnamed: 0,product_id,market_date,avg_price,next_avg_price
0,1,2019-07-03,6.99,6.99
1,1,2019-07-06,6.99,6.99
2,1,2019-07-10,6.99,6.99
3,1,2019-07-13,6.99,6.99
4,1,2019-07-17,6.99,6.99
5,1,2019-07-20,6.99,6.99
6,1,2019-07-24,6.99,6.99
7,1,2019-07-27,6.99,6.99
8,1,2019-07-31,6.99,6.99
9,1,2019-08-03,6.99,6.99


---
## 6) Subqueries — Scalar, List, Table & Correlated

Types we’ll demo:
- **Scalar subquery** (returns one value)
- **IN subquery** (returns a list)
- **Derived table** (subquery in FROM)
- **Correlated subquery** (references outer row)

### 6.1) Scalar subquery — compare to global average price

### Query Explanation: Subquery for Global Average

This query calculates:

1. **`avg_price`** → the average selling price per product.  
2. **`global_avg_price`** → the overall average price across *all purchase rows* (not the average of the per-product averages), computed once by a scalar subquery.  

The result shows each product’s average price compared against the global market average.
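
A scalar subquery over the whole table averages purchase rows, so products with many purchases weigh more than products with few. The sketch below makes that visible on four made-up rows, using Python's built-in `sqlite3` as a stand-in for MySQL. The prices are invented for illustration and the alias `per_product` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute(
    "CREATE TABLE customer_purchases "
    "(product_id INTEGER, cost_to_customer_per_qty REAL)"
)
conn.executemany(
    "INSERT INTO customer_purchases VALUES (?, ?)",
    [(1, 4.0), (1, 4.0), (1, 4.0), (2, 10.0)],  # invented rows
)

# Scalar subquery: one value, repeated on every output row
rows = conn.execute("""
SELECT
    product_id,
    AVG(cost_to_customer_per_qty) AS avg_price,
    (SELECT AVG(cost_to_customer_per_qty) FROM customer_purchases) AS global_avg_price
FROM customer_purchases
GROUP BY product_id
ORDER BY product_id
""").fetchall()

# For contrast: the average of the per-product averages
avg_of_avgs = conn.execute("""
SELECT AVG(avg_price)
FROM (
    SELECT AVG(cost_to_customer_per_qty) AS avg_price
    FROM customer_purchases
    GROUP BY product_id
) AS per_product
""").fetchone()[0]

print(rows)         # global_avg_price is 5.5 on both rows
print(avg_of_avgs)  # 7.0
conn.close()
```

Which of the two "global averages" is appropriate depends on the question being asked; the query in this section uses the row-weighted one.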


In [7]:
run_query('''
SELECT
    cp.product_id,
    ROUND(AVG(cp.cost_to_customer_per_qty), 2) AS avg_price,
    (SELECT ROUND(AVG(cost_to_customer_per_qty), 2) FROM customer_purchases) AS global_avg_price
FROM customer_purchases AS cp
GROUP BY cp.product_id
ORDER BY cp.product_id;
''')

Unnamed: 0,product_id,avg_price,global_avg_price
0,1,6.99,7.66
1,2,3.48,7.66
2,3,0.5,7.66
3,4,3.94,7.66
4,5,6.5,7.66
5,7,18.0,7.66
6,8,18.0,7.66
7,16,0.49,7.66


### 6.2) IN subquery — products purchased on the latest market date

### Query Explanation: Subquery with `IN`

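1. **Inner-most subquery** → `(SELECT MAX(market_date) FROM customer_purchases)`  
   - Finds the latest purchase date.  

2. **Middle subquery** →  
   ```sql
   SELECT DISTINCT product_id
   FROM customer_purchases
   WHERE market_date = (latest date)  -- placeholder for the inner-most subquery
   ```
   - Returns the list of products purchased on that date.  

3. **Outer query**  
   - Keeps the rows of `product` whose `product_id` is `IN` that list and orders them by `product_id`.  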


In [18]:
run_query('''
SELECT product_id, product_name
FROM product
WHERE product_id IN (
    SELECT DISTINCT product_id
    FROM customer_purchases
    WHERE market_date = (SELECT MAX(market_date) FROM customer_purchases)
)
ORDER BY product_id;
''')

Unnamed: 0,product_id,product_name
0,4,Banana Peppers - Jar
1,5,Whole Wheat Bread
2,7,Apple Pie
3,8,Cherry Pie


### 6.3) Derived table — top products by revenue, then join names

### Query Explanation: Subquery with `JOIN`

1. **Subquery (`t`)**  
   - Groups purchases by `product_id`.  
   - Calculates total `revenue` = `SUM(quantity * cost_to_customer_per_qty)`.  
   - Orders by revenue (highest first) and limits to the **top 5 products**.  

2. **Outer query**  
   - Joins the subquery result (`t`) with the `product` table to retrieve product names.  
   - Orders final output by revenue in descending order.  

✅ The result shows the **top 5 revenue-generating products** with their names and total revenue.  


In [3]:
run_query('''
SELECT
    p.product_name,
    t.revenue
FROM (
    SELECT
        product_id,
        ROUND(SUM(quantity * cost_to_customer_per_qty), 2) AS revenue
    FROM customer_purchases
    GROUP BY product_id
    ORDER BY revenue DESC
    LIMIT 5
) AS t
JOIN product AS p ON p.product_id = t.product_id
ORDER BY revenue DESC;
''')

Unnamed: 0,product_name,revenue
0,Cherry Pie,18324.0
1,Apple Pie,17838.0
2,Whole Wheat Bread,13468.0
3,Banana Peppers - Jar,11855.0
4,Jalapeno Peppers - Organic,3192.52


### 6.4) Correlated subquery — vendor’s share of daily quantity

### Query Explanation: Correlated Subquery for Daily Percentages

1. For each `vendor_id` and `market_date`, the query:
   - Calculates the vendor’s **total quantity** (`vendor_qty`).  
   - Divides it by the **total quantity of all vendors on that same day** using a correlated subquery.  

2. The result gives each vendor’s **share of the day’s total inventory quantity (`pct_of_day`)**, expressed as a percentage.  

Output shows how much each vendor contributed to the market’s total stocked quantity on each date.
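
The same share can also be written with a window function instead of a correlated subquery, which ties this section back to the earlier ones. The sketch below compares the two forms using Python's built-in `sqlite3` (SQLite 3.25 or newer) as a stand-in for MySQL; the four rows are the 2019-04-03 inventory rows shown in Section 2, and the variable names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute(
    "CREATE TABLE vendor_inventory "
    "(market_date TEXT, vendor_id INTEGER, product_id INTEGER, quantity REAL)"
)
conn.executemany(
    "INSERT INTO vendor_inventory VALUES (?, ?, ?, ?)",
    [
        ("2019-04-03", 7, 4, 40.0),
        ("2019-04-03", 8, 5, 16.0),
        ("2019-04-03", 8, 7, 8.0),
        ("2019-04-03", 8, 8, 10.0),
    ],
)

# Form 1: correlated subquery, re-evaluated for each outer market_date
by_subquery = conn.execute("""
SELECT vi.vendor_id, vi.market_date,
       ROUND(100 * SUM(vi.quantity) /
             (SELECT SUM(vj.quantity)
              FROM vendor_inventory AS vj
              WHERE vj.market_date = vi.market_date), 2) AS pct_of_day
FROM vendor_inventory AS vi
GROUP BY vi.vendor_id, vi.market_date
ORDER BY vi.market_date, vi.vendor_id
""").fetchall()

# Form 2: window aggregate over the grouped rows of the same date
by_window = conn.execute("""
SELECT vendor_id, market_date,
       ROUND(100 * SUM(quantity) /
             SUM(SUM(quantity)) OVER (PARTITION BY market_date), 2) AS pct_of_day
FROM vendor_inventory
GROUP BY vendor_id, market_date
ORDER BY market_date, vendor_id
""").fetchall()

print(by_subquery)
print(by_window)
conn.close()
```

Both forms return the same shares for this data (about 54.05% for vendor 7 and 45.95% for vendor 8). Which one performs better on a real table is something to measure rather than assume.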


In [4]:
run_query('''
SELECT
    vi.vendor_id,
    vi.market_date,
    ROUND(SUM(vi.quantity), 2) AS vendor_qty,
    ROUND(
        100 * SUM(vi.quantity) /
        (SELECT SUM(vj.quantity)
         FROM vendor_inventory AS vj
         WHERE vj.market_date = vi.market_date),
    2) AS pct_of_day
FROM vendor_inventory AS vi
GROUP BY vi.vendor_id, vi.market_date
ORDER BY vi.market_date, pct_of_day DESC;
''')

Unnamed: 0,vendor_id,market_date,vendor_qty,pct_of_day
0,7,2019-04-03,40.0,54.05
1,8,2019-04-03,34.0,45.95
2,7,2019-04-06,40.0,50.63
3,8,2019-04-06,39.0,49.37
4,8,2019-04-10,37.0,55.22
5,7,2019-04-10,30.0,44.78
6,8,2019-04-13,38.0,55.88
7,7,2019-04-13,30.0,44.12
8,7,2019-04-17,40.0,50.63
9,8,2019-04-17,39.0,49.37


---
## 7) Filtering & Categorization with Windows

We can filter *after* computing window columns (wrap windows in a subquery/CTE), or use `CASE` to bucket rows.
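
The reason for the wrapping is that a window function cannot appear in `WHERE`: window values are computed after row filtering. The sketch below shows the failing and the working form with Python's built-in `sqlite3` (SQLite 3.25 or newer) as a stand-in; MySQL also rejects the first form, with its own error message. The table `totals` is hypothetical, and its rows are vendor–product totals taken from the Section 3.1 output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
conn.execute("CREATE TABLE totals (vendor_id INTEGER, product_id INTEGER, qty REAL)")
conn.executemany(
    "INSERT INTO totals VALUES (?, ?, ?)",
    [(7, 4, 5230.0), (7, 3, 3480.0), (8, 5, 2850.0), (8, 7, 1172.0)],
)

# Attempt 1: filter directly on the window function (not allowed)
try:
    conn.execute("""
        SELECT vendor_id, product_id
        FROM totals
        WHERE ROW_NUMBER() OVER (PARTITION BY vendor_id ORDER BY qty DESC) = 1
    """).fetchall()
    direct_filter_error = None
except sqlite3.OperationalError as exc:
    direct_filter_error = str(exc)

# Attempt 2: compute the window column in a CTE, then filter outside it
top_per_vendor = conn.execute("""
WITH ranked AS (
    SELECT vendor_id, product_id, qty,
           ROW_NUMBER() OVER (PARTITION BY vendor_id ORDER BY qty DESC) AS rn
    FROM totals
)
SELECT vendor_id, product_id, qty
FROM ranked
WHERE rn = 1
ORDER BY vendor_id
""").fetchall()

print(direct_filter_error)
print(top_per_vendor)
conn.close()
```

A derived table in `FROM`, as used in the queries of this section, works the same way as the CTE here.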

### 7.1) Keep only each vendor’s top product by quantity (filter on ROW_NUMBER)

### Query Explanation: Top Product per Vendor

1. The inner query:
   - Aggregates total quantity (`SUM(quantity)`) per `vendor_id` and `product_id`.  
   - Assigns a **row number (`ROW_NUMBER()`)** for each product within a vendor, ordered by total quantity descending.  

2. The outer query filters to only `rn = 1`, keeping the **highest-quantity product per vendor**.  

Output shows each vendor’s **top product by inventory quantity** and its total quantity.


In [5]:
run_query('''
SELECT vendor_id, product_id, qty, rn
FROM (
  SELECT
      vi.vendor_id,
      vi.product_id,
      SUM(vi.quantity) AS qty,
      ROW_NUMBER() OVER (PARTITION BY vi.vendor_id ORDER BY SUM(vi.quantity) DESC) AS rn
  FROM vendor_inventory AS vi
  GROUP BY vi.vendor_id, vi.product_id
) x
WHERE rn = 1
ORDER BY vendor_id;
''')

Unnamed: 0,vendor_id,product_id,qty,rn
0,4,16,13650.0,1
1,7,4,5230.0,1
2,8,5,2850.0,1


### 7.2) Categorize vendors by daily inventory value using CASE + window rank

### Query Explanation: Categorize Vendors by Daily Inventory Value (CASE + Window RANK)

This query ranks vendors **within each market_date** by their daily **inventory_value** (sum of `quantity * original_price`), then labels tiers.

**How it works:**
1. **Inner query**
   - `GROUP BY market_date, vendor_id` to get one row per vendor per day.
   - `SUM(quantity * IFNULL(original_price, 0))` → `inventory_value` (a missing price counts as 0).
   - `RANK() OVER (PARTITION BY market_date ORDER BY SUM(...) DESC)` → `rk` (1 = highest value on that date).

2. **Outer query**
   - Uses `CASE` on `rk` to assign tiers:
     - `rk <= 3` → **Top tier**
     - `rk <= 6` → **Middle tier**
     - else → **Long tail**
   - Orders by date, then rank.

**Output columns:**  
`market_date`, `vendor_id`, `inventory_value`, `rk` (rank within the date), `tier_label` (tier based on rank).


In [6]:
run_query('''
SELECT
    market_date,
    vendor_id,
    inventory_value,
    rk,
    CASE
        WHEN rk <= 3 THEN 'Top tier'
        WHEN rk <= 6 THEN 'Middle tier'
        ELSE 'Long tail'
    END AS tier_label
FROM (
  SELECT
      vi.market_date,
      vi.vendor_id,
      ROUND(SUM(vi.quantity * IFNULL(vi.original_price,0)), 2) AS inventory_value,
      RANK() OVER (PARTITION BY vi.market_date ORDER BY SUM(vi.quantity * IFNULL(vi.original_price,0)) DESC) AS rk
  FROM vendor_inventory AS vi
  GROUP BY vi.market_date, vi.vendor_id
) t
ORDER BY market_date, rk, vendor_id;
''', preview=30)

Unnamed: 0,market_date,vendor_id,inventory_value,rk,tier_label
0,2019-04-03,8,428.0,1,Top tier
1,2019-04-03,7,160.0,2,Top tier
2,2019-04-06,8,437.5,1,Top tier
3,2019-04-06,7,160.0,2,Top tier
4,2019-04-10,8,401.5,1,Top tier
5,2019-04-10,7,120.0,2,Top tier
6,2019-04-13,8,396.5,1,Top tier
7,2019-04-13,7,120.0,2,Top tier
8,2019-04-17,8,449.0,1,Top tier
9,2019-04-17,7,160.0,2,Top tier
