**⭐ 1. What This Pattern Solves**

Transform long-format data into a wide, aggregated format.

Compute metrics per key across multiple categories (like SQL pivot).

Common in analytics pipelines for reporting or feature engineering.

Works well for ETL stages that require summarization per group.

**⭐ 2. SQL Equivalent**

In [0]:
%sql
SELECT
    user_id,
    SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END) AS clicks,
    SUM(CASE WHEN event_type = 'view' THEN 1 ELSE 0 END) AS views
FROM events
GROUP BY user_id;
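
For comparison, a rough Python counterpart of this query might look like the sketch below. The sample `events` rows and the `EVENT_TYPES` tuple are illustrative assumptions, not part of the original. Each matching row adds 1 to its counter, and every user starts with a 0 for each column, mirroring the `ELSE 0` branch.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the `events` table.
events = [
    {"user_id": 1, "event_type": "click"},
    {"user_id": 1, "event_type": "view"},
    {"user_id": 1, "event_type": "click"},
    {"user_id": 2, "event_type": "view"},
]

# The pivoted columns are fixed up front, as in the SQL CASE expressions.
EVENT_TYPES = ("click", "view")

# Each new user starts with an all-zero row.
counts = defaultdict(lambda: dict.fromkeys(EVENT_TYPES, 0))
for row in events:
    user_counts = counts[row["user_id"]]  # creates the zero row on first sight
    if row["event_type"] in user_counts:  # other event types are ignored
        user_counts[row["event_type"]] += 1

report = dict(counts)
print(report)  # {1: {'click': 2, 'view': 1}, 2: {'click': 0, 'view': 1}}
```

Note that this version counts rows, whereas the template in section 4 sums a value column; both are instances of the same pattern.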

**⭐ 3. Core Idea**

Group rows by a key, then turn each distinct category into its own column, filled with an aggregate (sum, count, etc.) of the matching rows.
Essentially “row → column + aggregate”.

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
from collections import defaultdict

data = [
    {"key": "A", "category": "X", "value": 10},
    {"key": "A", "category": "Y", "value": 5},
    {"key": "B", "category": "X", "value": 7},
]

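# Outer dict maps key -> inner dict; inner dict maps category -> running sum (starts at 0)
pivot = defaultdict(lambda: defaultdict(int))

# Single pass: add each row's value to its (key, category) cell
for row in data:
    pivot[row["key"]][row["category"]] += row["value"]

# Convert to regular dicts so later lookups cannot silently create keys
pivot = {k: dict(v) for k, v in pivot.items()}
print(pivot)  # {'A': {'X': 10, 'Y': 5}, 'B': {'X': 7}}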


**⭐ 5. Detailed Example**

In [0]:
from collections import defaultdict

data = [
    {"user": "Alice", "action": "click", "count": 2},
    {"user": "Alice", "action": "view", "count": 3},
    {"user": "Bob", "action": "click", "count": 1},
]

# Sum counts per (user, action) pair
pivot = defaultdict(lambda: defaultdict(int))
for row in data:
    pivot[row["user"]][row["action"]] += row["count"]

pivot = {k: dict(v) for k, v in pivot.items()}
print(pivot)

Output (formatted for readability):

{
    'Alice': {'click': 2, 'view': 3},
    'Bob': {'click': 1}
}
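
In the result above, Bob has no 'view' entry, so the rows are not yet uniformly wide. If a downstream report needs every column on every row, one option is to fill the gaps explicitly. The sketch below assumes 0 is an acceptable default for a missing action; the names `columns` and `wide_rows` are illustrative.

```python
pivot = {
    "Alice": {"click": 2, "view": 3},
    "Bob": {"click": 1},
}

# Collect every action seen for any user, sorted for a stable column order.
columns = sorted({action for actions in pivot.values() for action in actions})

# One flat row per user; missing actions are filled with 0 (an assumed default).
wide_rows = [
    {"user": user, **{col: actions.get(col, 0) for col in columns}}
    for user, actions in pivot.items()
]
print(wide_rows)
# [{'user': 'Alice', 'click': 2, 'view': 3}, {'user': 'Bob', 'click': 1, 'view': 0}]
```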


**⭐ 6. Mini Practice Problems**

Pivot sales data per store_id for each product_id, summing quantity.

Aggregate website events per session_id for each event_type.

Transform employee_id × month × hours_worked into a wide-format dictionary.
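
As a possible starting point for these exercises, the template can be wrapped in a small helper. The function name `pivot_sum` and the sample sales rows are hypothetical; this is one sketch of the first problem, not the only solution.

```python
from collections import defaultdict


def pivot_sum(rows, key_field, category_field, value_field):
    """Sum value_field per (key_field, category_field) pair into a nested dict."""
    result = defaultdict(lambda: defaultdict(int))
    for row in rows:
        result[row[key_field]][row[category_field]] += row[value_field]
    # Return plain dicts so callers cannot create keys by accident.
    return {key: dict(cats) for key, cats in result.items()}


# Hypothetical sales rows for the first exercise.
sales = [
    {"store_id": "S1", "product_id": "P1", "quantity": 4},
    {"store_id": "S1", "product_id": "P1", "quantity": 1},
    {"store_id": "S1", "product_id": "P2", "quantity": 2},
    {"store_id": "S2", "product_id": "P1", "quantity": 6},
]

sales_pivot = pivot_sum(sales, "store_id", "product_id", "quantity")
print(sales_pivot)  # {'S1': {'P1': 5, 'P2': 2}, 'S2': {'P1': 6}}
```

The other two problems should only need different field names passed to the same helper.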

**⭐ 7. Full Data Engineering Scenario**

Problem Statement:
You have a log of user interactions: (user_id, action_type, count). You need a report per user showing total counts for each action type.

Expected Output:

In [0]:
{
    "user1": {"click": 10, "view": 5},
    "user2": {"click": 3, "view": 7},
}


In [0]:
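from collections import defaultdict

# Sample input (illustrative); in practice `logs` comes from the upstream source
logs = [
    {"user_id": "user1", "action_type": "click", "count": 10},
    {"user_id": "user1", "action_type": "view", "count": 5},
    {"user_id": "user2", "action_type": "click", "count": 3},
    {"user_id": "user2", "action_type": "view", "count": 7},
]

pivot = defaultdict(lambda: defaultdict(int))
for row in logs:
    pivot[row["user_id"]][row["action_type"]] += row["count"]

pivot = {k: dict(v) for k, v in pivot.items()}
print(pivot)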


**⭐ 8. Time & Space Complexity**

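Time Complexity: O(n). Each row is visited once, and each dictionary update takes O(1) time on average.

Space Complexity: O(k * c), where k is the number of unique keys and c is the maximum number of distinct categories per key. Only (key, category) pairs that actually occur are stored, so the total is also at most n.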
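
To make the space bound concrete, one can count the cells actually stored. The sketch below uses made-up rows and, as a simplification, takes c to be the total number of distinct categories across all keys.

```python
from collections import defaultdict

# Made-up (key, category, value) rows for illustration.
rows = [
    ("A", "X", 10), ("A", "Y", 5), ("A", "X", 1),
    ("B", "X", 7),
]

pivot = defaultdict(lambda: defaultdict(int))
for key, category, value in rows:
    pivot[key][category] += value

n = len(rows)                                              # input rows
k = len(pivot)                                             # unique keys
c = len({cat for cats in pivot.values() for cat in cats})  # distinct categories
cells = sum(len(cats) for cats in pivot.values())          # stored (key, category) pairs

print(n, k, c, cells)  # 4 2 2 3
```

Here 3 cells are stored, which is below both n = 4 and k * c = 4.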

**⭐ 9. Common Pitfalls & Mistakes**

❌ Rescanning the data once per category (nested loops) instead of accumulating in a single pass, which costs O(n*c) rather than O(n).
❌ Using plain dicts without initializing the inner dictionary, which raises KeyError on the first update.
✔ Correct approach: defaultdict(lambda: defaultdict(int)).
✔ Convert to plain dicts for final reporting: nested defaultdicts print noisily and silently create keys on lookup.
✔ Avoid recomputing aggregates inside loops; accumulate directly.
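
The KeyError pitfall and the defaultdict lookup side effect can be reproduced in a few lines. This is a minimal sketch with made-up rows; the `setdefault` variant is shown only as one alternative when defaultdict is not wanted.

```python
from collections import defaultdict

rows = [("A", "X", 10), ("A", "Y", 5), ("B", "X", 7)]

# Pitfall: a plain dict has no inner dictionary yet, so the first update fails.
plain = {}
try:
    plain["A"]["X"] += 10
except KeyError as exc:
    error = repr(exc)  # "KeyError('A')"

# Alternative fix without defaultdict: create the inner dict and counter on demand.
fixed = {}
for key, category, value in rows:
    inner = fixed.setdefault(key, {})
    inner[category] = inner.get(category, 0) + value

# Side effect: merely reading a missing key on a defaultdict inserts it.
dd = defaultdict(lambda: defaultdict(int))
_ = dd["ghost"]["X"]       # creates dd["ghost"] and sets dd["ghost"]["X"] to 0
has_ghost = "ghost" in dd  # True, which is why converting to plain dicts helps

print(error, fixed, has_ghost)
```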