# 3230 — Customer Purchasing Behavior Analysis

**Source:** LeetCode problem post. ([Leetcode][1])

---

## Schema

**Transactions** table

|           Column | Type    |
| ---------------: | :------ |
|   transaction_id | int     |
|      customer_id | int     |
|       product_id | int     |
| transaction_date | date    |
|           amount | decimal |

**Products** table

|     Column | Type    |
| ---------: | :------ |
| product_id | int     |
|   category | varchar |
|      price | decimal |

---

## Problem

Write a query to analyze each customer's purchasing behavior. For **each customer**, return the following columns:

1. `customer_id`
2. `total_amount` — total amount the customer spent (rounded to 2 decimal places).
3. `transaction_count` — number of transactions by the customer.
4. `unique_categories` — the number of distinct product categories the customer purchased.
5. `avg_transaction_amount` — average amount per transaction (rounded to 2 decimal places).
6. `top_category` — the most frequently purchased product category for that customer. If there is a tie on frequency, choose the category whose most recent transaction (for that customer) is later (i.e., choose the tied category with the newest `transaction_date`).
7. `loyalty_score` — computed as `(transaction_count * 10) + (total_amount / 100)`, rounded to 2 decimal places.

Return rows ordered by `loyalty_score` **descending**, then by `customer_id` **ascending**.

Round `total_amount`, `avg_transaction_amount`, and `loyalty_score` to 2 decimal places.

---

## Example

**Input**

Transactions:

| transaction_id | customer_id | product_id | transaction_date | amount |
| -------------: | ----------: | ---------: | :--------------: | -----: |
|              1 |         101 |          1 |    2023-01-01    | 100.00 |
|              2 |         101 |          2 |    2023-01-15    | 150.00 |
|              3 |         102 |          1 |    2023-01-01    | 100.00 |
|              4 |         102 |          3 |    2023-01-22    | 200.00 |
|              5 |         101 |          3 |    2023-02-10    | 200.00 |

Products:

| product_id | category |  price |
| ---------: | :------: | -----: |
|          1 |     A    | 100.00 |
|          2 |     B    | 150.00 |
|          3 |     C    | 200.00 |

**Output**

| customer_id | total_amount | transaction_count | unique_categories | avg_transaction_amount | top_category | loyalty_score |
| ----------: | -----------: | ----------------: | ----------------: | ---------------------: | :----------: | ------------: |
|         101 |       450.00 |                 3 |                 3 |                 150.00 |       C      |         34.50 |
|         102 |       300.00 |                 2 |                 2 |                 150.00 |       C      |         23.00 |

**Explanation (brief)**

* Customer 101 spent 100 + 150 + 200 = 450.00 across 3 transactions → avg 150.00. Categories A, B, C each purchased once; tie broken by most recent transaction (C on 2023-02-10). Loyalty = 3*10 + 450/100 = 34.50.
* Customer 102 spent 100 + 200 = 300.00 across 2 transactions → avg 150.00. Categories A and C tie; choose C since its transaction is more recent (2023-01-22). Loyalty = 2*10 + 300/100 = 23.00.
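
The arithmetic and the tie-break in the explanation can be reproduced with a small plain-Python reference. This is only a sketch for checking the rules by hand, not the SQL answer. The function name `summarize` and the `(category, transaction_date, amount)` row layout are assumptions made for illustration, with `product_id` already mapped to its category.

```python
from collections import Counter
from datetime import date


def summarize(rows):
    """Compute the per-customer metrics for ONE customer.

    rows: list of (category, transaction_date, amount) tuples (hypothetical layout).
    """
    count = len(rows)
    total = sum(amount for _, _, amount in rows)

    # Purchase frequency per category and the latest purchase date per category.
    freq = Counter(category for category, _, _ in rows)
    latest = {}
    for category, day, _ in rows:
        latest[category] = max(day, latest.get(category, day))

    # Highest frequency wins; ties go to the category purchased most recently.
    top = max(freq, key=lambda c: (freq[c], latest[c]))

    return {
        "total_amount": round(total, 2),
        "transaction_count": count,
        "unique_categories": len(freq),
        "avg_transaction_amount": round(total / count, 2),
        "top_category": top,
        "loyalty_score": round(count * 10 + total / 100, 2),
    }


# Customer 101 from the example input.
customer_101 = [
    ("A", date(2023, 1, 1), 100.00),
    ("B", date(2023, 1, 15), 150.00),
    ("C", date(2023, 2, 10), 200.00),
]
print(summarize(customer_101))
```

For customer 101 this yields `top_category` of `C` and a `loyalty_score` of 34.5, matching the expected output row.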

---


```sql
WITH CTE AS (
    SELECT
        t.transaction_id,
        t.customer_id,
        t.product_id,
        t.transaction_date,
        p.category,
        -- how many times this customer bought this category
        COUNT(*) OVER (PARTITION BY t.customer_id, p.category) AS category_count,
        -- spend comes from Transactions.amount (Transactions has no price column)
        t.amount
    FROM Transactions t
    LEFT JOIN Products p
      ON t.product_id = p.product_id
),

CTE2 AS (
    -- rank each customer's rows: highest category_count first, tie-break by most recent transaction_date
    SELECT
        *,
        ROW_NUMBER() OVER (
            PARTITION BY customer_id
            ORDER BY category_count DESC, transaction_date DESC
        ) AS rnk
    FROM CTE
)

SELECT
    c.customer_id,
    ROUND(SUM(c.amount), 2) AS total_amount,
    COUNT(DISTINCT c.transaction_id) AS transaction_count,
    COUNT(DISTINCT c.category) AS unique_categories,
    ROUND(AVG(c.amount), 2) AS avg_transaction_amount,
    -- the rnk = 1 row carries the top category for this customer
    MAX(CASE WHEN c.rnk = 1 THEN c.category END) AS top_category,
    -- loyalty_score = transaction_count * 10 + total_amount / 100
    ROUND(COUNT(DISTINCT c.transaction_id) * 10 + SUM(c.amount) / 100.0, 2) AS loyalty_score
FROM CTE2 c
GROUP BY c.customer_id
-- required output order: loyalty_score descending, then customer_id ascending
ORDER BY loyalty_score DESC, c.customer_id ASC;
```
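
The query above ranks individual transaction rows. An alternative sketch aggregates to one row per (customer, category) first and ranks those rows, which makes the tie-break easier to read. The harness below runs that variant against the example input using Python's built-in `sqlite3` module. The SQLite table definitions (ISO-format `TEXT` dates, `REAL` amounts) and the CTE names `per_category`, `ranked`, and `totals` are assumptions for local testing, not part of the problem statement. Window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (
    transaction_id INTEGER, customer_id INTEGER, product_id INTEGER,
    transaction_date TEXT, amount REAL
);
CREATE TABLE Products (product_id INTEGER, category TEXT, price REAL);
INSERT INTO Transactions VALUES
    (1, 101, 1, '2023-01-01', 100.00),
    (2, 101, 2, '2023-01-15', 150.00),
    (3, 102, 1, '2023-01-01', 100.00),
    (4, 102, 3, '2023-01-22', 200.00),
    (5, 101, 3, '2023-02-10', 200.00);
INSERT INTO Products VALUES (1, 'A', 100.00), (2, 'B', 150.00), (3, 'C', 200.00);
""")

QUERY = """
WITH per_category AS (
    -- one row per (customer, category): purchase count and latest purchase date
    SELECT t.customer_id, p.category,
           COUNT(*) AS purchases,
           MAX(t.transaction_date) AS last_purchase
    FROM Transactions t
    JOIN Products p ON p.product_id = t.product_id
    GROUP BY t.customer_id, p.category
),
ranked AS (
    -- most purchases first; ties go to the most recently purchased category
    SELECT customer_id, category,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id
               ORDER BY purchases DESC, last_purchase DESC
           ) AS rnk
    FROM per_category
),
totals AS (
    SELECT t.customer_id,
           SUM(t.amount) AS total_amount,
           COUNT(*) AS transaction_count,
           COUNT(DISTINCT p.category) AS unique_categories,
           AVG(t.amount) AS avg_amount
    FROM Transactions t
    JOIN Products p ON p.product_id = t.product_id
    GROUP BY t.customer_id
)
SELECT s.customer_id,
       ROUND(s.total_amount, 2) AS total_amount,
       s.transaction_count,
       s.unique_categories,
       ROUND(s.avg_amount, 2) AS avg_transaction_amount,
       r.category AS top_category,
       ROUND(s.transaction_count * 10 + s.total_amount / 100.0, 2) AS loyalty_score
FROM totals s
JOIN ranked r ON r.customer_id = s.customer_id AND r.rnk = 1
ORDER BY loyalty_score DESC, s.customer_id ASC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

If two tied categories also share the same latest `transaction_date`, the problem statement does not say which one wins, so both this variant and the row-level ranking may pick either.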
