**1. What This Pattern Solves**

This pattern aggregates numeric values grouped by a key.

**Used for:**

SUM of amounts per customer

Total views per day

Total bytes transferred per host

Aggregating transaction amounts

Summing durations, counts, distances

Transforming Bronze logs → Silver aggregates

Pre-aggregating before writing to Delta

Rollups in Python (small/medium datasets)

**2. SQL Equivalent**
SELECT key, SUM(value)
FROM table
GROUP BY key;

**3. Core Idea**

Maintain a dictionary:

key = grouping key

value = running total

**4. Template Code (MEMORIZE THIS)**

In [0]:
agg = {}
for row in rows:
    key = row[key_field]
    val = row[value_field]
    agg[key] = agg.get(key, 0) + val

In [0]:
## Condensed form, assuming data yields (key, value) pairs
agg = {}
for k, x in data:
    agg[k] = agg.get(k, 0) + x
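
The template can also be wrapped in a small helper so the key and value fields are passed as arguments. This is only a minimal sketch: the function name `sum_by_key`, its parameters, and the sample host/bytes rows are hypothetical and not part of the original template.

```python
def sum_by_key(rows, key_field, value_field):
    """Return {key: running total of value_field} for a list of dict rows."""
    agg = {}
    for row in rows:
        key = row[key_field]
        val = row[value_field]
        # .get() supplies 0 the first time a key is seen
        agg[key] = agg.get(key, 0) + val
    return agg


# Made-up sample rows for illustration
sample = [
    {"host": "h1", "bytes": 100},
    {"host": "h2", "bytes": 50},
    {"host": "h1", "bytes": 250},
]
totals = sum_by_key(sample, "host", "bytes")
print(totals)  # {'h1': 350, 'h2': 50}
```

Keeping the loop body identical to the template means the helper adds no new logic; it only names the two fields.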

**5. Detailed Example**

In [0]:
## You have transactions

rows = [
  {"cust":"a", "amt": 10},
  {"cust":"b", "amt": 20},
  {"cust":"a", "amt": 5}
]


In [0]:
## Apply pattern
agg = {}
for r in rows:
    cust = r["cust"]
    amt = r["amt"]
    agg[cust] = agg.get(cust, 0) + amt

print(agg)  # Output: {'a': 15, 'b': 20}

**6. Mini Practice Problems**

In [0]:
## Problem 1 — Total spend per user
rows = [ {"u":"a","amt":5}, {"u":"b","amt":10}, {"u":"a","amt":7} ]

In [0]:
## Problem 2 — Count events by type : Same as frequency map, but using this pattern.
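
One possible approach to Problem 2 is to add 1 per row instead of adding a field value. The problem gives no data, so the sample events and the `type` field name below are assumptions made for illustration.

```python
# Made-up sample events; the "type" field name is an assumption
events = [
    {"type": "click"},
    {"type": "view"},
    {"type": "click"},
]

counts = {}
for e in events:
    # Same pattern as summation, with a constant increment of 1
    counts[e["type"]] = counts.get(e["type"], 0) + 1

print(counts)  # {'click': 2, 'view': 1}
```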

In [0]:
## Problem 3 — Sum of durations per day : Logs like:
{"day":"2025-01-01", "dur":120}

**7. Full Data Engineering Problem**

In [0]:
## Problem: Given API logs:
logs = [
  {"url":"/home", "ms":120},
  {"url":"/login", "ms":80},
  {"url":"/home", "ms":200},
  {"url":"/products", "ms":350}
]

In [0]:
## Compute the total response time per URL. Solution Skeleton:
agg = {}
for r in logs:
    url = r["url"]
    ms  = r["ms"]
    agg[url] = agg.get(url, 0) + ms

**8. Time & Space Complexity**

Time: O(n), where n = number of rows (each row costs one average-case O(1) dictionary update)

Space: O(k) where k = unique keys
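
To make the n versus k distinction concrete, here is a small sketch with made-up data: 1000 rows spread over 3 hosts. The loop visits every row once, but the dictionary ends up holding only one entry per unique key.

```python
n = 1000
hosts = ["h1", "h2", "h3"]  # k = 3 unique keys (made-up sample)
rows = [{"host": hosts[i % 3], "bytes": 1} for i in range(n)]

agg = {}
for r in rows:  # n iterations, one dictionary update each
    agg[r["host"]] = agg.get(r["host"], 0) + r["bytes"]

print(len(rows), len(agg))  # 1000 3
```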

**9. Common Pitfalls**

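❌ Forgetting .get(): agg[key] += val raises KeyError the first time a key appears
❌ Using lists instead of a dictionary for aggregation (a list has no key lookup, so every update needs a scan)
❌ Using pandas for simple iteration (slow for interview coding)
❌ Assuming sorted input: dictionary aggregation never needs sorting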

✔ Use dictionary-based, counter-style summation.
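
The first pitfall can be reproduced in a few lines. The sketch below uses made-up rows and also shows `collections.defaultdict(int)` from the standard library as an alternative to `.get()`; this is a swapped-in variant, not the template these notes ask you to memorize.

```python
from collections import defaultdict

# Made-up sample rows
rows = [{"cust": "a", "amt": 10}, {"cust": "a", "amt": 5}]

# Pitfall: += on a key that is not in the dict yet raises KeyError
agg = {}
missing_key = None
try:
    for r in rows:
        agg[r["cust"]] += r["amt"]
except KeyError as err:
    missing_key = err.args[0]
    print("KeyError for key:", missing_key)

# Alternative: defaultdict(int) starts every new key at 0
totals = defaultdict(int)
for r in rows:
    totals[r["cust"]] += r["amt"]

print(dict(totals))  # {'a': 15}
```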