## [Top K Frequent Elements in Array](https://www.geeksforgeeks.org/problems/top-k-frequent-elements-in-array/1)

#### Given a non-empty array nums[] of N integers, find the k elements that have the highest frequency in the array. If two numbers have the same frequency, the larger number takes precedence.

- Example 1:
    - Input:
    - N = 6
    - nums = {1,1,1,2,2,3}
    - k = 2
    - Output: {1, 2}
- Example 2:
    - Input:
    - N = 8
    - nums = {1,1,2,2,3,3,3,4}
    - k = 2
    - Output: {3, 2}
    - Explanation: Elements 1 and 2 have the same frequency, i.e., 2. Because 2 > 1, the answer includes 2 rather than 1.
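
To make the ranking rule concrete, here is a minimal sketch (not part of the original notebook) that counts frequencies with `collections.Counter` and orders the distinct values of Example 2 by frequency, breaking ties in favour of the larger number. The variable names `freq` and `ranking` are only illustrative.

```python
from collections import Counter

nums = [1, 1, 2, 2, 3, 3, 3, 4]

# Count how often each value occurs: {1: 2, 2: 2, 3: 3, 4: 1}
freq = Counter(nums)

# Rank distinct values by (frequency, value), both descending,
# so that equal frequencies are resolved in favour of the larger number
ranking = sorted(freq, key=lambda num: (freq[num], num), reverse=True)

print(ranking)      # [3, 2, 1, 4]
print(ranking[:2])  # [3, 2]
```

Taking the first `k` entries of this ranking gives the expected output for Example 2.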

**Method #1:** Sorting Method
- Time Complexity: `O(n * log n)`
- Space Complexity: `O(n)`

In [7]:
def top_k_frequency_sort(arr, k):
    freq_count = {}
    for num in arr:
        if num not in freq_count:
            freq_count[num] = 1
        else:
            freq_count[num] += 1
    num_count_arr = [(key, value) for key, value in freq_count.items()]
    result = sorted(num_count_arr, key=lambda x: x[0], reverse=True)        # secondary key first: sort by num (descending)
    result = sorted(result, key=lambda x: x[1], reverse=True)           # then by count (descending); sorted() is stable, so ties keep the num order
    return [num[0] for num in result][:k]

In [8]:
arr = [1, 1, 2, 2, 3, 3, 3, 4]
k = 2

In [9]:
top_k_frequency_sort(arr, k)

[3, 2]

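**Method #2:** Using a max-heap (simulated by negating the keys, since Python's `heapq` is a min-heap)
- Time Complexity: `O(n + d * log d)`, where `d` is the number of distinct elements: counting takes `O(n)`, the `d` pushes cost `O(d * log d)`, and the `k` pops cost `O(k * log d)`. In the worst case (`d = n`) this is `O(n * log n)`.
- Space Complexity: `O(n)`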

In [10]:
import heapq

def top_k_frequency_heap_opt(arr, k):
    # Create frequency dictionary
    num_freq = {}
    for num in arr:
        num_freq[num] = num_freq.get(num, 0) + 1
    
    # Build a min-heap of (-frequency, -number); negating both keys makes the
    # highest frequency (and, on ties, the larger number) pop first
    pq = []
    for num, freq in num_freq.items():
        heapq.heappush(pq, (-freq, -num))               # REMEMBER
    
    # Get top k elements    
    return [-heapq.heappop(pq)[1] for _ in range(k)]    # REMEMBER

In [11]:
top_k_frequency_heap_opt(arr, k)

[3, 2]
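
The heap above stores every distinct value before popping `k` of them. As an alternative sketch, not taken from the original notebook, the standard-library function `heapq.nlargest` can select the top `k` entries directly; in CPython it keeps a heap of at most `k` items while scanning the input, which is the usual way to approach `O(n + d * log k)` for `d` distinct values. The function name `top_k_nlargest` is hypothetical.

```python
import heapq
from collections import Counter

def top_k_nlargest(arr, k):
    # Count occurrences of each value
    freq = Counter(arr)
    # freq.items() yields (num, count) pairs; rank them by (count, num)
    # so that ties on count are resolved in favour of the larger num
    top = heapq.nlargest(k, freq.items(), key=lambda item: (item[1], item[0]))
    # nlargest returns the pairs in descending order of the key
    return [num for num, _ in top]

print(top_k_nlargest([1, 1, 2, 2, 3, 3, 3, 4], 2))  # [3, 2]
print(top_k_nlargest([1, 1, 1, 2, 2, 3], 2))        # [1, 2]
```

If `k` exceeds the number of distinct values, `heapq.nlargest` simply returns all of them instead of raising an error.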

`Simplified Version:`

In [12]:
from heapq import heappop, heapify

arr = [1, 1, 2, 2, 3, 3, 3, 4]
k = 2

def top_k(arr, k):
    num_freq = {}
    for num in arr:
        num_freq[num] = num_freq.get(num, 0) + 1
        
    result = [(freq, num) for num, freq in num_freq.items()]
    result = sorted(result, key=lambda x: (x[0], x[1]), reverse=True)   # first by `freq`, next by `num`
    
    return [x[1] for x in result[:k]]
    
print(top_k(arr, k))

def top_k_heap(arr, k):
    num_freq = {}
    for num in arr:
        num_freq[num] = num_freq.get(num, 0) + 1
    pq = [(-freq, -num) for num, freq in num_freq.items()]
    heapify(pq)
    
    return [-heappop(pq)[1] for _ in range(k)]

print(top_k_heap(arr, k))

[3, 2]
[3, 2]


In [None]:
# easy to remember
import heapq
arr = [1, 1, 2, 2, 3, 3, 3, 4]
k = 2

def test1(arr, k):
    num_freq = {}
    for num in arr:
        num_freq[num] = num_freq.get(num, 0) + 1
    result = [(freq, num) for num, freq in num_freq.items()]
    result.sort(key=lambda x: (-x[0], -x[1]))
    return [x[1] for x in result][:k]
print(test1(arr, k))

def test2(arr, k):
    num_freq = {}
    for num in arr:
        num_freq[num] = num_freq.get(num, 0) + 1
    result = [(-freq, -num) for num, freq in num_freq.items()]
    heapq.heapify(result)
    return [-heapq.heappop(result)[1] for _ in range(k)]
print(test2(arr, k))

In [13]:
# import heapq

# def top_k_frequency_heap_opt(arr, k):
#     # Step 1: Count frequencies of each element in the array
#     freq_count = {}
#     for num in arr:
#         if num not in freq_count:
#             freq_count[num] = 1
#         else:
#             freq_count[num] += 1
    
#     # Step 2: Use a heap to keep track of the top k elements by frequency
#     heap = []
#     for key, value in freq_count.items():
#         heapq.heappush(heap, (value, key))                              # REMEMBER: Push (frequency, element) into the heap
        
#         # If heap size exceeds k, remove the smallest element (min-heap)
#         if len(heap) > k:
#             heapq.heappop(heap)
    
#     # Step 3: Extract the elements from the heap
#     return [x[1] for x in heapq.nlargest(k, heap)]  # Return elements in descending order by frequency