**1. What This Pattern Solves**

- Count errors
- Count duplicates
- Count events by category
- Build fact aggregations
- Top-K problems
- Daily user counts
- CDC row-change counts

Used constantly in logs, JSON, ETL pipelines, and Silver-layer aggregations.

**2. SQL Equivalent**
SELECT key, COUNT(*)
FROM table
GROUP BY key;
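
To make the correspondence concrete, here is a minimal sketch that runs the same GROUP BY through Python's built-in sqlite3 module and compares it with a dictionary count. The table name `events`, the column name `category` (standing in for `key`), and the sample rows are assumptions made up for illustration.

```python
import sqlite3

# Hypothetical sample rows; one column holding the grouping key.
rows = [("a",), ("b",), ("a",), ("c",), ("a",)]

# Load the rows into an in-memory SQLite table (illustrative names).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (category TEXT)")
conn.executemany("INSERT INTO events (category) VALUES (?)", rows)

# The SQL form: one (key, count) pair per group.
sql_counts = dict(
    conn.execute(
        "SELECT category, COUNT(*) FROM events GROUP BY category"
    ).fetchall()
)
conn.close()

# The same aggregation done with a plain dictionary.
dict_counts = {}
for (category,) in rows:
    dict_counts[category] = dict_counts.get(category, 0) + 1

print(sql_counts == dict_counts)  # True
```

Both approaches produce one count per distinct key; the dictionary version simply does the grouping in application memory.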

**3. Core Idea**

Use a dictionary where:

key = item

value = count

**4. Template Code (MEMORIZE THIS)**

In [0]:
freq = {}
for x in arr:
    freq[x] = freq.get(x, 0) + 1
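
For comparison, the standard library offers two shortcuts that produce the same counts as the template. The sketch below assumes a small made-up input list `arr`; the variable names `freq_dd` and `freq_counter` are illustrative only.

```python
from collections import Counter, defaultdict

arr = ["a", "b", "a", "c", "a"]  # hypothetical sample input

# The template, repeated here so the block runs on its own.
freq = {}
for x in arr:
    freq[x] = freq.get(x, 0) + 1

# defaultdict(int) supplies 0 for a missing key, so .get() is not needed.
freq_dd = defaultdict(int)
for x in arr:
    freq_dd[x] += 1

# Counter performs the whole counting loop in one call.
freq_counter = Counter(arr)

print(freq == dict(freq_dd) == dict(freq_counter))  # True
```

The hand-written loop is still worth memorizing, since it works in any setting and makes the mechanics explicit; the library versions are conveniences on top of the same idea.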

**5. Detailed Example**

In [0]:
## You have log events:

events = ["200","200","404","500","200"]

## Apply template:

freq = {}
for code in events:
    freq[code] = freq.get(code, 0) + 1

## Output:
print(freq)
# {'200': 3, '404': 1, '500': 1}
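
Once the counts exist, a Top-K question reduces to sorting the (key, count) pairs. This is a minimal sketch on the same events list; the cutoff `k = 2` and the tie-breaking rule (key ascending) are assumptions chosen for illustration.

```python
events = ["200", "200", "404", "500", "200"]

# Count with the template.
freq = {}
for code in events:
    freq[code] = freq.get(code, 0) + 1

k = 2  # hypothetical cutoff

# Sort by count descending, then by key ascending so ties are deterministic.
top_k = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
print(top_k)  # [('200', 3), ('404', 1)]
```

Sorting here touches only the k distinct keys, not the n raw events, so it does not conflict with the advice against sorting before counting.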


**6. Mini Practice Problems**

Solve from scratch:

Count how many times each user appears in:
["alice","bob","alice","sam"]

Count characters in the string "banana".

Count error types in:
["timeout","ok","timeout","fail"]

**7. Full DE Problem**

In [0]:
## Given a list of API logs:

logs = [
  {"user":"a", "status":200},
  {"user":"b", "status":200},
  {"user":"a", "status":500},
  {"user":"c", "status":200},
  {"user":"b", "status":500}
]

## Count how many times each status code appears.
## Solution skeleton:
freq = {}
for r in logs:
    code = r["status"]
    freq[code] = freq.get(code, 0) + 1
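
The skeleton above generalizes to any field of the records. The following sketch wraps it in a helper; the function name `count_by` and its parameters are hypothetical, introduced here only for illustration.

```python
def count_by(records, field):
    """Count how often each value of `field` appears across `records`."""
    freq = {}
    for r in records:
        value = r[field]
        freq[value] = freq.get(value, 0) + 1
    return freq


logs = [
    {"user": "a", "status": 200},
    {"user": "b", "status": 200},
    {"user": "a", "status": 500},
    {"user": "c", "status": 200},
    {"user": "b", "status": 500},
]

status_counts = count_by(logs, "status")
user_counts = count_by(logs, "user")

print(status_counts)  # {200: 3, 500: 2}
print(user_counts)    # {'a': 2, 'b': 2, 'c': 1}
```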


**8. Complexity**

Time: O(n), one pass with average O(1) dictionary lookups and updates

Space: O(k) where k = unique keys

**9. Pitfalls**

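Forgetting .get(): freq[x] += 1 raises KeyError the first time a key is seen

Using list .count() once per element → O(n²), because every call rescans the whole list

Sorting before counting (wasteful): it adds O(n log n) work that the dictionary does not need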
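
The first two pitfalls can be reproduced in a few lines. This is a minimal sketch on a made-up input list; the names `hit_key_error`, `slow`, and `fast` are illustrative.

```python
arr = ["x", "y", "x", "z", "x"]  # hypothetical sample input

# Pitfall 1: incrementing a key that does not exist yet raises KeyError.
freq = {}
try:
    freq["x"] += 1
except KeyError:
    hit_key_error = True
else:
    hit_key_error = False

# Pitfall 2: list.count() rescans the whole list on every call,
# so calling it once per element is O(n^2) overall.
slow = {x: arr.count(x) for x in arr}

# Single pass with .get(): same result in O(n).
fast = {}
for x in arr:
    fast[x] = fast.get(x, 0) + 1

print(hit_key_error)  # True
print(slow == fast)   # True
```

On five elements the difference is invisible, but the quadratic version becomes the bottleneck quickly as the input grows.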