# 347. Top K Frequent Elements

## Topic Alignment
- **Role Relevance**: Identifying heavy hitters is essential for feature selection and monitoring drift in ML systems.
- **Scenario**: Helps surface top queries, entities, or categorical features for model introspection.

## Metadata Summary
- Source: [LeetCode - Top K Frequent Elements](https://leetcode.com/problems/top-k-frequent-elements/)
- Tags: `Array`, `Hash Table`, `Heap`
- Difficulty: Medium
- Recommended Priority: High

## Problem Statement
Given an integer array `nums` and an integer `k`, return the `k` most frequent elements. You may return the answer in any order. The algorithm must run in better than `O(n log n)` time.

## Constraints
- `1 <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`
- `1 <= k <= number of unique elements in nums`

## Progressive Hints
- Hint 1: Count element frequencies with a hash map.
- Hint 2: Use a bucket array indexed by frequency or a heap to extract top counts quickly.
- Hint 3: Bucket sort delivers linear time by grouping numbers by occurrence count.

## Solution Overview
Compute frequencies with a hash map, then bucket numbers by their frequencies and scan buckets from high to low to collect the top `k` elements in linear time.

## Detailed Explanation
1. Count each number's occurrences using a dictionary.
2. Create `n + 1` buckets, where `n = len(nums)` and index `i` stores the numbers appearing exactly `i` times; no count can exceed `n`.
3. Iterate buckets in reverse order to gather the most frequent elements.
4. Stop once `k` elements have been collected.
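
To make the steps above concrete, the sketch below builds the frequency map and the buckets for a small sample input. The array `[1, 1, 1, 2, 2, 3]` and the helper name `build_buckets` are illustrative assumptions, not part of the original solution.

```python
from collections import Counter
from typing import Dict, List, Tuple


def build_buckets(nums: List[int]) -> Tuple[Dict[int, int], List[List[int]]]:
    """Return the frequency map and the frequency-indexed buckets (hypothetical helper)."""
    freq = Counter(nums)  # Step 1: count occurrences.
    # Step 2: index i holds the numbers that appear exactly i times.
    buckets: List[List[int]] = [[] for _ in range(len(nums) + 1)]
    for num, count in freq.items():
        buckets[count].append(num)
    return dict(freq), buckets


freq, buckets = build_buckets([1, 1, 1, 2, 2, 3])
print(freq)     # {1: 3, 2: 2, 3: 1}
print(buckets)  # [[], [3], [2], [1], [], [], []]
```

Scanning `buckets` from the last index down to 1 visits `1`, then `2`, then `3`, so `k = 2` stops after collecting `[1, 2]`.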

## Complexity Trade-off Table
| Approach | Time Complexity | Space Complexity | Notes |
| --- | --- | --- | --- |
| Heap of size k | O(n log k) | O(n) | Works well when k is small. |
| Bucket sort on frequency | O(n) | O(n) | Meets the strict runtime requirement. |
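
For comparison with the first row of the table, here is a minimal sketch of the heap alternative, assuming Python's standard `heapq` module and a min-heap capped at `k` entries. The function name `top_k_frequent_heap` is hypothetical and chosen only for this illustration.

```python
import heapq
from collections import Counter
from typing import List, Tuple


def top_k_frequent_heap(nums: List[int], k: int) -> List[int]:
    """Return the k most frequent elements using a min-heap of size at most k."""
    freq = Counter(nums)
    heap: List[Tuple[int, int]] = []  # (count, num); the smallest count sits at heap[0].
    for num, count in freq.items():
        if len(heap) < k:
            heapq.heappush(heap, (count, num))
        elif count > heap[0][0]:
            # Evict the least frequent candidate in O(log k).
            heapq.heapreplace(heap, (count, num))
    return [num for _, num in heap]


print(sorted(top_k_frequent_heap([1, 1, 1, 2, 2, 3], 2)))  # [1, 2]
```

Each distinct value costs at most one `O(log k)` heap operation, which is where the `O(n log k)` bound comes from; the hash map still needs `O(n)` space in the worst case.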

```python
from collections import defaultdict
from typing import List


def topKFrequent(nums: List[int], k: int) -> List[int]:
    """Return the k elements with highest frequency using bucket sort."""
    freq = defaultdict(int)
    for num in nums:
        freq[num] += 1  # Count each occurrence.

    buckets = [[] for _ in range(len(nums) + 1)]
    for num, count in freq.items():
        buckets[count].append(num)  # Place number in bucket by frequency.

    result = []
    for count in range(len(buckets) - 1, 0, -1):
        for num in buckets[count]:
            result.append(num)
            if len(result) == k:
                return result
    return result
```


## Complexity Analysis
- Time Complexity: `O(n)` because counting and bucket traversal are linear in input size.
- Space Complexity: `O(n)` for the frequency map and buckets.
- Bottleneck: Allocating and scanning the `n + 1` buckets; both steps remain linear in `n`.

## Edge Cases & Pitfalls
- When `k` equals the number of unique elements, every distinct element is returned; the loops still work.
- Negative numbers are handled naturally by dictionary keys.
- When many elements share frequencies, order within buckets is arbitrary.

## Follow-up Variants
- Stream the data and maintain top-k using a min-heap.
- Return not just the elements but also their relative frequencies.
- Extend to strings or more complex keys by reusing the same counting logic.
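
As a rough sketch of the last two variants, the same bucket logic can return each element together with its relative frequency and accept any hashable key. The name `top_k_with_share` and the sample animal strings are assumptions made for illustration; the streaming variant is not covered here.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def top_k_with_share(items: Iterable[Hashable], k: int) -> List[Tuple[Hashable, float]]:
    """Return up to k (key, relative frequency) pairs, most frequent first."""
    counts = Counter(items)
    total = sum(counts.values())  # Number of items seen.
    # Index i holds the keys that appear exactly i times.
    buckets: List[List[Hashable]] = [[] for _ in range(total + 1)]
    for key, count in counts.items():
        buckets[count].append(key)

    result: List[Tuple[Hashable, float]] = []
    for count in range(total, 0, -1):
        for key in buckets[count]:
            result.append((key, count / total))
            if len(result) == k:
                return result
    return result


words = ["cat", "dog", "cat", "bird", "cat", "dog", "cat", "dog"]
print(top_k_with_share(words, 2))  # [('cat', 0.5), ('dog', 0.375)]
```

Because the counting step only needs hashable keys, nothing else changes when moving from integers to strings or tuples.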

## Takeaways
- Combining hash maps with bucket structures yields linear-time heavy hitter detection.
- Frequency analysis is central to monitoring feature drift in production ML.
- Flexibility in output order simplifies implementation without harming utility.

## Similar Problems
| Problem ID | Problem Title | Technique |
| --- | --- | --- |
| 692 | Top K Frequent Words | Hash map + heap |
| 451 | Sort Characters By Frequency | Hash map + bucket sort |
| 973 | K Closest Points to Origin | Heap selection |