# Top K Elements

This notebook covers problems that involve finding the K largest, smallest, or most frequent elements using heaps and related techniques such as quick select.

## Key Concepts
- Min heap and max heap operations
- Using heaps to maintain K elements efficiently
- Frequency counting with heaps
- Quick select algorithm
- Priority queue patterns

## Problems (10 total)
Problems are ordered from easier to more challenging.

In [None]:
# Setup - Run this cell first!
import sys

sys.path.insert(0, '..')

from dsa_helpers import check, hint

# Quick reference:
# - check(function_name) - Run tests for your solution
# - check(function_name, verbose=True) - See detailed test output
# - check(function_name, performance=True) - Run performance tests
# - hint("problem_name") - Get progressive hints (call multiple times for more)
# - hint("problem_name", reset=True) - Reset hints and start over

---
## Problem 1: Top K Frequent Elements

### Description
Given an integer array `nums` and an integer `k`, return the `k` most frequent elements. You may return the answer in any order.

### Constraints
- `1 <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`
- `k` is in the range `[1, the number of unique elements in the array]`
- It is guaranteed that the answer is unique

### Examples

**Example 1:**
```
Input: nums = [1, 1, 1, 2, 2, 3], k = 2
Output: [1, 2]
```

**Example 2:**
```
Input: nums = [1], k = 1
Output: [1]
```

In [None]:
def top_k_frequent(nums: list[int], k: int) -> list[int]:
    """
    Find the k most frequent elements.

    Args:
        nums: List of integers
        k: Number of top frequent elements to return

    Returns:
        List of k most frequent elements (any order)
    """
    # Your implementation here
    pass

In [None]:
# Test your solution
check(top_k_frequent)

In [None]:
# Need help? Get progressive hints
hint("top_k_frequent")

---
## Problem 2: Kth Largest Element in an Array

### Description
Given an integer array `nums` and an integer `k`, return the `k`th largest element in the array.

Note that it is the `k`th largest element in the sorted order, not the `k`th distinct element.

Can you solve it without sorting?

### Constraints
- `1 <= k <= nums.length <= 10^5`
- `-10^4 <= nums[i] <= 10^4`

### Examples

**Example 1:**
```
Input: nums = [3, 2, 1, 5, 6, 4], k = 2
Output: 5
```

**Example 2:**
```
Input: nums = [3, 2, 3, 1, 2, 4, 5, 5, 6], k = 4
Output: 4
```

In [None]:
def kth_largest_element(nums: list[int], k: int) -> int:
    """
    Find the kth largest element in the array.

    Args:
        nums: List of integers
        k: The k value (1-indexed from largest)

    Returns:
        The kth largest element
    """
    # Your implementation here
    pass

In [None]:
check(kth_largest_element)

In [None]:
hint("kth_largest_element")

---
## Problem 3: K Closest Points to Origin

### Description
Given an array of `points` where `points[i] = [xi, yi]` represents a point on the X-Y plane and an integer `k`, return the `k` closest points to the origin `(0, 0)`.

The distance between two points on the X-Y plane is the Euclidean distance (i.e., `sqrt((x1 - x2)^2 + (y1 - y2)^2)`).

You may return the answer in any order. The answer is guaranteed to be unique (except for the order of its points).
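Because the square root is monotonically increasing, comparing squared distances ranks points the same way as comparing true Euclidean distances. The sketch below illustrates this on the points from Example 1; `squared_distance` is a hypothetical helper name, not part of `dsa_helpers`, and the snippet is only an illustration of the distance comparison, not a full solution.

```python
def squared_distance(point: list[int]) -> int:
    """Squared Euclidean distance from the origin (no sqrt needed)."""
    x, y = point
    return x * x + y * y


points = [[1, 3], [-2, 2]]

# Ordering by squared distance matches ordering by true distance.
closest = min(points, key=squared_distance)

print([squared_distance(p) for p in points])  # [10, 8]
print(closest)                                # [-2, 2]
```

Working with integers also avoids floating-point rounding when two points are nearly equidistant.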

### Constraints
- `1 <= k <= points.length <= 10^4`
- `-10^4 <= xi, yi <= 10^4`

### Examples

**Example 1:**
```
Input: points = [[1, 3], [-2, 2]], k = 1
Output: [[-2, 2]]
Explanation: Distance of (1, 3) = sqrt(10), (-2, 2) = sqrt(8). Closest is (-2, 2).
```

**Example 2:**
```
Input: points = [[3, 3], [5, -1], [-2, 4]], k = 2
Output: [[3, 3], [-2, 4]]
```

In [None]:
def k_closest_points(points: list[list[int]], k: int) -> list[list[int]]:
    """
    Find k closest points to origin.

    Args:
        points: List of [x, y] coordinates
        k: Number of closest points to return

    Returns:
        List of k closest points (any order)
    """
    # Your implementation here
    pass

In [None]:
check(k_closest_points)

In [None]:
hint("k_closest_points")

---
## Problem 4: Minimum Cost to Connect Ropes

### Description
Given `n` ropes of different lengths represented by an array `ropes`, connect them into one rope. The cost to connect two ropes is equal to the sum of their lengths. Return the minimum total cost to connect all the ropes.

### Constraints
- `1 <= ropes.length <= 10^4`
- `1 <= ropes[i] <= 10^4`

### Examples

**Example 1:**
```
Input: ropes = [1, 2, 3, 4, 5]
Output: 33
Explanation:
- Connect 1 + 2 = 3, cost = 3. Ropes: [3, 3, 4, 5]
- Connect 3 + 3 = 6, cost = 6. Ropes: [4, 5, 6]
- Connect 4 + 5 = 9, cost = 9. Ropes: [6, 9]
- Connect 6 + 9 = 15, cost = 15. Ropes: [15]
Total cost = 3 + 6 + 9 + 15 = 33
```

**Example 2:**
```
Input: ropes = [1, 3, 11, 5]
Output: 33
```

**Example 3:**
```
Input: ropes = [5]
Output: 0
Explanation: Only one rope, no connections needed.
```

In [None]:
def connect_ropes(ropes: list[int]) -> int:
    """
    Find minimum cost to connect all ropes.

    Args:
        ropes: List of rope lengths

    Returns:
        Minimum total cost to connect all ropes
    """
    # Your implementation here
    pass

In [None]:
check(connect_ropes)

In [None]:
hint("connect_ropes")

---
## Problem 5: Top K Frequent Words

### Description
Given an array of strings `words` and an integer `k`, return the `k` most frequent strings.

Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.

### Constraints
- `1 <= words.length <= 500`
- `1 <= words[i].length <= 10`
- `words[i]` consists of lowercase English letters
- `k` is in the range `[1, the number of unique words]`

### Examples

**Example 1:**
```
Input: words = ["i","love","leetcode","i","love","coding"], k = 2
Output: ["i","love"]
Explanation: "i" and "love" are the two most frequent words.
Note that "i" comes before "love" because it is lexicographically smaller.
```

**Example 2:**
```
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4
Output: ["the","is","sunny","day"]
```

In [None]:
def top_k_frequent_words(words: list[str], k: int) -> list[str]:
    """
    Find k most frequent words, sorted by frequency then alphabetically.

    Args:
        words: List of words
        k: Number of top frequent words to return

    Returns:
        List of k most frequent words in specified order
    """
    # Your implementation here
    pass

In [None]:
check(top_k_frequent_words)

In [None]:
hint("top_k_frequent_words")

---
## Problem 6: Sort Characters by Frequency

### Description
Given a string `s`, sort it in decreasing order based on the frequency of the characters. The frequency of a character is the number of times it appears in the string.

Return the sorted string. If there are multiple answers, return any of them.

### Constraints
- `1 <= s.length <= 5 * 10^5`
- `s` consists of uppercase and lowercase English letters and digits

### Examples

**Example 1:**
```
Input: s = "tree"
Output: "eert"
Explanation: 'e' appears twice while 'r' and 't' both appear once.
So 'e' must appear before both 'r' and 't'. "eetr" is also valid.
```

**Example 2:**
```
Input: s = "cccaaa"
Output: "aaaccc" or "cccaaa"
Explanation: Both 'c' and 'a' appear three times, so both are valid.
```

**Example 3:**
```
Input: s = "Aabb"
Output: "bbAa" or "bbaA"
Explanation: 'A' and 'a' are treated as two different characters.
```

In [None]:
def sort_by_frequency(s: str) -> str:
    """
    Sort characters by frequency in descending order.

    Args:
        s: Input string

    Returns:
        String with characters sorted by frequency (highest first)
    """
    # Your implementation here
    pass

In [None]:
check(sort_by_frequency)

In [None]:
hint("sort_by_frequency")

---
## Problem 7: Kth Largest Element in a Stream

### Description
Design a class to find the `k`th largest element in a stream. Note that it is the `k`th largest element in the sorted order, not the `k`th distinct element.

Implement the `KthLargest` class (the stub below names it `KthLargestInStream`):
- `KthLargest(int k, int[] nums)` Initializes the object with the integer `k` and the stream of integers `nums`.
- `int add(int val)` Appends the integer `val` to the stream and returns the element representing the `k`th largest element in the stream.

### Constraints
- `1 <= k <= 10^4`
- `0 <= nums.length <= 10^4`
- `-10^4 <= nums[i] <= 10^4`
- `-10^4 <= val <= 10^4`
- At most `10^4` calls will be made to `add`
- It is guaranteed that there will be at least `k` elements when you search for the `k`th element

### Examples

**Example 1:**
```
Input:
["KthLargest", "add", "add", "add", "add", "add"]
[[3, [4, 5, 8, 2]], [3], [5], [10], [9], [4]]
Output: [null, 4, 5, 5, 8, 8]

Explanation:
KthLargest kthLargest = new KthLargest(3, [4, 5, 8, 2]);
kthLargest.add(3);   // return 4
kthLargest.add(5);   // return 5
kthLargest.add(10);  // return 5
kthLargest.add(9);   // return 8
kthLargest.add(4);   // return 8
```

In [None]:
class KthLargestInStream:
    """
    Class to find kth largest element in a stream.
    """

    def __init__(self, k: int, nums: list[int]):
        """
        Initialize with k and initial numbers.

        Args:
            k: The k value for kth largest
            nums: Initial list of numbers
        """
        # Your implementation here
        pass

    def add(self, val: int) -> int:
        """
        Add a value and return current kth largest.

        Args:
            val: Value to add to stream

        Returns:
            Current kth largest element
        """
        # Your implementation here
        pass

# Wrapper function for testing
def kth_largest_in_stream(k: int, nums: list[int], operations: list[int]) -> list[int]:
    """
    Test wrapper for KthLargestInStream class.

    Args:
        k: The k value
        nums: Initial numbers
        operations: List of values to add

    Returns:
        List of kth largest after each add operation
    """
    kth = KthLargestInStream(k, nums)
    return [kth.add(val) for val in operations]

In [None]:
check(kth_largest_in_stream)

In [None]:
hint("kth_largest_in_stream")

---
## Problem 8: K Closest Numbers

### Description
Given a sorted integer array `arr`, two integers `k` and `x`, return the `k` closest integers to `x` in the array. The result should also be sorted in ascending order.

An integer `a` is closer to `x` than an integer `b` if:
- `|a - x| < |b - x|`, or
- `|a - x| == |b - x|` and `a < b`
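The two closeness rules above can be expressed as a single tuple comparison: distance to `x` first, then the value itself as the tie-breaker. This is a minimal sketch of that ordering only; `closeness_key` is a hypothetical helper name introduced for illustration.

```python
def closeness_key(value: int, x: int) -> tuple[int, int]:
    """Smaller distance to x ranks first; ties go to the smaller value."""
    return (abs(value - x), value)


x = 3
# 2 and 4 are both at distance 1 from x, so the tie goes to 2.
print(closeness_key(2, x) < closeness_key(4, x))  # True
# 3 is at distance 0, so it is closer than 2.
print(closeness_key(3, x) < closeness_key(2, x))  # True
```

Python compares tuples element by element, so the second component is only consulted when the distances are equal.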

### Constraints
- `1 <= k <= arr.length`
- `1 <= arr.length <= 10^4`
- `arr` is sorted in ascending order
- `-10^4 <= arr[i], x <= 10^4`

### Examples

**Example 1:**
```
Input: arr = [1, 2, 3, 4, 5], k = 4, x = 3
Output: [1, 2, 3, 4]
```

**Example 2:**
```
Input: arr = [1, 2, 3, 4, 5], k = 4, x = -1
Output: [1, 2, 3, 4]
```

In [None]:
def closest_numbers(arr: list[int], k: int, x: int) -> list[int]:
    """
    Find k closest numbers to x in sorted array.

    Args:
        arr: Sorted list of integers
        k: Number of closest elements to return
        x: Target value

    Returns:
        List of k closest numbers, sorted in ascending order
    """
    # Your implementation here
    pass

In [None]:
check(closest_numbers)

In [None]:
hint("closest_numbers")

---
## Problem 9: Maximum Distinct Elements

### Description
Given an array of integers `nums` and an integer `k`, you can remove exactly `k` elements from the array. Return the maximum number of distinct elements that can remain in the array after the removals.

### Constraints
- `1 <= nums.length <= 10^5`
- `1 <= nums[i] <= 10^5`
- `0 <= k <= nums.length`

### Examples

**Example 1:**
```
Input: nums = [5, 5, 4], k = 1
Output: 2
Explanation: Remove one 5, array becomes [5, 4]. 2 distinct elements.
```

**Example 2:**
```
Input: nums = [4, 3, 1, 1, 3, 3, 2], k = 3
Output: 4
Explanation: Remove two 3s and one 1, array becomes [4, 1, 3, 2]. All 4 remaining elements are distinct.
```

In [None]:
def maximum_distinct_elements(nums: list[int], k: int) -> int:
    """
    Find maximum distinct elements after removing k elements.

    Args:
        nums: List of integers
        k: Number of elements to remove

    Returns:
        Maximum number of distinct elements possible
    """
    # Your implementation here
    pass

In [None]:
check(maximum_distinct_elements)

In [None]:
hint("maximum_distinct_elements")

---
## Problem 10: Sum of Elements Between K1 and K2 Smallest

### Description
Given an array of integers `nums` and two integers `k1` and `k2`, find the sum of all elements between the `k1`th smallest and `k2`th smallest elements (exclusive).

### Constraints
- `1 <= nums.length <= 10^4`
- `1 <= k1 < k2 <= nums.length`
- `-10^4 <= nums[i] <= 10^4`

### Examples

**Example 1:**
```
Input: nums = [1, 3, 12, 5, 15, 11], k1 = 3, k2 = 6
Output: 23
Explanation: Sorted: [1, 3, 5, 11, 12, 15]
3rd smallest = 5, 6th smallest = 15
Sum of elements between: 11 + 12 = 23
```

**Example 2:**
```
Input: nums = [3, 5, 8, 7], k1 = 1, k2 = 4
Output: 12
Explanation: Sorted: [3, 5, 7, 8]
1st smallest = 3, 4th smallest = 8
Sum of elements between: 5 + 7 = 12
```

In [None]:
def sum_of_elements(nums: list[int], k1: int, k2: int) -> int:
    """
    Find sum of elements between k1th and k2th smallest.

    Args:
        nums: List of integers
        k1: First boundary (smaller k)
        k2: Second boundary (larger k)

    Returns:
        Sum of elements strictly between k1th and k2th smallest
    """
    # Your implementation here
    pass

In [None]:
check(sum_of_elements)

In [None]:
hint("sum_of_elements")

---
## Summary

Congratulations on completing the Top K Elements problems!

### Key Takeaways
1. **Min heap of size K** efficiently tracks the K largest elements - the root is the Kth largest
2. **Max heap of size K** efficiently tracks the K smallest elements - the root is the Kth smallest
3. **Frequency problems** combine Counter/dict with heaps for efficient top-K by frequency
4. **Quick select** finds the Kth element in O(n) average time (O(n^2) worst case) without a full sort
5. **Heap size management** - cap the heap at K elements so space stays O(K) and each push or pop costs O(log K)
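As a recap of takeaways 1 and 4, here is one possible sketch of both approaches to finding the Kth largest value. The function names are hypothetical, both functions assume `1 <= k <= len(values)`, and the quick select variant shown builds new lists rather than partitioning in place, which keeps it short at the cost of extra memory.

```python
import heapq
import random


def kth_largest_heap(values: list[int], k: int) -> int:
    """Kth largest via a min heap capped at k elements: O(n log k)."""
    heap: list[int] = []
    for value in values:
        if len(heap) < k:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the current kth largest, so swap it in.
            heapq.heapreplace(heap, value)
    return heap[0]  # root of the min heap is the kth largest


def kth_largest_quickselect(values: list[int], k: int) -> int:
    """Kth largest via quick select with a random pivot: O(n) on average."""
    candidates = list(values)  # work on a copy, leave the input untouched
    while True:
        pivot = random.choice(candidates)
        larger = [v for v in candidates if v > pivot]
        equal_count = sum(1 for v in candidates if v == pivot)
        if k <= len(larger):
            candidates = larger  # answer is among the larger values
        elif k <= len(larger) + equal_count:
            return pivot  # answer is the pivot itself
        else:
            k -= len(larger) + equal_count
            candidates = [v for v in candidates if v < pivot]


sample = [3, 2, 3, 1, 2, 4, 5, 5, 6]
print(kth_largest_heap(sample, 4))         # 4
print(kth_largest_quickselect(sample, 4))  # 4
```

The heap version also works on streams because it never needs the whole input at once; quick select needs all values up front.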

### Python Heap Tips
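- `heapq` implements a min heap only
- For a max heap, negate values on push and again on pop: `heappush(heap, -val)`, then `-heappop(heap)`
- `heapq.nlargest(k, iterable)` and `heapq.nsmallest(k, iterable)` are built-in helpers that return the K largest or smallest items as a sorted list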
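The tips above can be tried out in a few lines. This is a small illustrative sketch using only standard-library `heapq` calls; the sample values are made up.

```python
import heapq

values = [5, 1, 8, 3]

# Max heap by negation: store -val, negate again when reading.
max_heap = [-v for v in values]
heapq.heapify(max_heap)
largest = -max_heap[0]              # peek without removing
first = -heapq.heappop(max_heap)
second = -heapq.heappop(max_heap)
print(largest, first, second)       # 8 8 5

# Built-in helpers return sorted lists.
top_two = heapq.nlargest(2, values)
bottom_two = heapq.nsmallest(2, values)
print(top_two)                      # [8, 5]
print(bottom_two)                   # [1, 3]

# Both helpers accept a key function, e.g. rank words by length.
longest = heapq.nlargest(1, ["heap", "priority", "k"], key=len)
print(longest)                      # ['priority']
```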

### Next Steps
Move on to **13_k_way_merge.ipynb** for merging sorted structures efficiently!