**⭐ 1. What This Pattern Solves**

Quickly counts occurrences of elements in a dataset (e.g., logs, events, transactions).

Useful for aggregations, frequency analysis, and histogram creation.

Simplifies ETL tasks where summarizing categorical data is required.

Eliminates manual loops for counting, reducing boilerplate and errors.

**⭐ 2. SQL Equivalent**

In [0]:
%sql
SELECT item, COUNT(*) AS frequency
FROM my_table  -- placeholder name; TABLE itself is a reserved word
GROUP BY item;

**⭐ 3. Core Idea**

Use a hash map to maintain element → count mapping.

The standard-library Counter class (collections.Counter) handles initialization and incrementing automatically.
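
To make the hash-map idea concrete, here is a minimal sketch that compares a hand-written dictionary count with Counter. The helper name `count_with_dict` is hypothetical and exists only for illustration.

```python
from collections import Counter

def count_with_dict(items):
    """Count occurrences with a plain dict (the manual hash-map approach)."""
    counts = {}
    for item in items:
        # Treat a missing key as 0, then increment
        counts[item] = counts.get(item, 0) + 1
    return counts

data = ['a', 'b', 'a', 'c', 'b', 'a']
manual = count_with_dict(data)
auto = Counter(data)

print(manual)                # {'a': 3, 'b': 2, 'c': 1}
print(manual == dict(auto))  # True: both hold the same element -> count mapping
```

Both approaches build the same mapping; Counter simply removes the initialization step from your own code.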

**⭐ 4. Template Code (MEMORIZE THIS)**

In [0]:
from collections import Counter

# Count elements in a list
data = ['a', 'b', 'a', 'c', 'b', 'a']
counts = Counter(data)

# Access counts
print(counts['a'])  # 3
print(counts.most_common(2))  # [('a', 3), ('b', 2)]

**⭐ 5. Detailed Example**

In [0]:
from collections import Counter

transactions = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']

transaction_count = Counter(transactions)
print(transaction_count)
print(transaction_count.most_common(1))

Counter({'apple': 3, 'banana': 2, 'orange': 1})
[('apple', 3)]

**Step-by-step:**

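Counter is a dict subclass in which a missing key reads as 0, so no explicit initialization is needed.

It iterates through transactions once and increments the count for each element.

most_common() returns (element, count) pairs sorted by descending frequency, which makes top-k retrieval easy.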
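
The following short sketch reuses the transactions list from the example to illustrate these steps, in particular how a missing key behaves. The key 'kiwi' is an arbitrary value chosen because it does not appear in the data.

```python
from collections import Counter

transactions = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
transaction_count = Counter(transactions)

# Looking up a missing key returns 0 and does not insert the key
print(transaction_count['kiwi'])    # 0
print('kiwi' in transaction_count)  # False

# most_common(k) returns the k highest counts as (element, count) pairs
print(transaction_count.most_common(2))  # [('apple', 3), ('banana', 2)]
```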

**⭐ 6. Mini Practice Problems**

Count the frequency of words in a log file list.

Find the top 3 most frequent error codes from a list of codes.

Merge two Counters of product sales and get total counts.

**⭐ 7. Full Data Engineering Scenario**

Problem Statement:

A streaming pipeline receives event types: login, logout, purchase, click.

Generate a real-time frequency table of events for an analytics dashboard.

In [0]:
# Target output shape (illustrative counts):
# {'login': 120, 'logout': 80, 'purchase': 50, 'click': 300}

from collections import Counter

# incoming batch of events
events = ['login', 'click', 'purchase', 'login', 'click']

event_counter = Counter(events)

# For streaming, you could update a global Counter per batch
global_counter = Counter()
global_counter.update(event_counter)
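
As a rough sketch of the per-batch update, the loop below feeds several batches into one running Counter. It assumes a single in-memory process; the `process_batch` helper and the `batches` list are hypothetical stand-ins for whatever the real stream delivers.

```python
from collections import Counter

# Running totals across all batches seen so far
global_counter = Counter()

def process_batch(events, totals):
    """Add one batch of event types to the running totals (mutates totals)."""
    totals.update(events)  # update() adds to existing counts; it does not replace them
    return totals

# Hypothetical batches standing in for data arriving from the stream
batches = [
    ['login', 'click', 'purchase', 'login', 'click'],
    ['click', 'logout', 'click'],
]

for batch in batches:
    process_batch(batch, global_counter)

print(dict(global_counter))
# {'login': 2, 'click': 4, 'purchase': 1, 'logout': 1}
```

Passing the raw batch to update() skips the intermediate per-batch Counter. Building one first, as in the cell above, is still useful when the per-batch counts are needed on their own.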


**⭐ 8. Time & Space Complexity**

Time Complexity: O(n) — iterate through list once.

Space Complexity: O(k) — store counts for k unique elements.
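
A small illustration of the difference between n and k, using made-up data: the input is scanned once, but only one entry per distinct element is stored.

```python
from collections import Counter

# 10,000 input elements (n) drawn from only 4 distinct event types (k)
events = ['login', 'logout', 'purchase', 'click'] * 2500
counts = Counter(events)

print(len(events))  # 10000 elements scanned once
print(len(counts))  # 4 keys stored
```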

**⭐ 9. Common Pitfalls & Mistakes**

❌ Using dict.get() manually for counting — verbose and error-prone.
❌ Forgetting that Counter allows zero and negative counts: subtract() keeps them, while the - operator drops them.
❌ Indexing most_common(1)[0] without checking for an empty result: most_common() returns [] for an empty Counter, so the index raises IndexError.
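
The sketch below shows how negative counts and empty Counters behave in practice. The `stock` and `sold` values are invented for illustration.

```python
from collections import Counter

stock = Counter({'apple': 2, 'banana': 1})
sold = Counter({'apple': 3})

# subtract() modifies in place and keeps zero and negative counts
remaining = stock.copy()
remaining.subtract(sold)
print(remaining['apple'])  # -1

# The - operator returns a new Counter and drops zero or negative results
print(stock - sold)  # Counter({'banana': 1})

# most_common() on an empty Counter returns [], so indexing it would fail
empty = Counter()
top = empty.most_common(1)
print(top)  # []
most_frequent = top[0] if top else None  # guard before indexing
print(most_frequent)  # None
```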

✔ Prefer Counter for in-memory frequency aggregation.
✔ Use update() for merging counts efficiently.
✔ Use elements() if raw expansion of counts is needed.
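
A brief sketch of merging and expanding counts, assuming two hypothetical per-store sales Counters named `store_a` and `store_b`:

```python
from collections import Counter

store_a = Counter({'apple': 3, 'banana': 1})
store_b = Counter({'apple': 2, 'orange': 4})

# + returns a new Counter and leaves both inputs unchanged
total = store_a + store_b
print(total)  # Counter({'apple': 5, 'orange': 4, 'banana': 1})

# update() merges in place, which avoids building a new Counter each time
running = Counter()
running.update(store_a)
running.update(store_b)
print(running == total)  # True

# elements() expands counts back into repeated items
print(sorted(Counter({'a': 2, 'b': 1}).elements()))  # ['a', 'a', 'b']
```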