<h1>347. Top K Frequent Elements</h1>
<hr>

<div><p>Given a non-empty array of integers, return the <b><i>k</i></b> most frequent elements.</p>

<p><strong>Example 1:</strong></p>

<pre><strong>Input: </strong>nums = <span id="example-input-1-1">[1,1,1,2,2,3]</span>, k = <span id="example-input-1-2">2</span>
<strong>Output: </strong><span id="example-output-1">[1,2]</span>
</pre>

<div>
<p><strong>Example 2:</strong></p>

<pre><strong>Input: </strong>nums = <span id="example-input-2-1">[1]</span>, k = <span id="example-input-2-2">1</span>
<strong>Output: </strong><span id="example-output-2">[1]</span></pre>
</div>

<p><b>Note: </b></p>

<ul>
	<li>You may assume <i>k</i> is always valid, 1 ≤ <i>k</i> ≤ number of unique elements.</li>
	<li>Your algorithm's time complexity <b>must be</b> better than O(<i>n</i> log <i>n</i>), where <i>n</i> is the array's size.</li>
	<li>It's guaranteed that the answer is unique, in other words the set of the top k frequent elements is unique.</li>
	<li>You can return the answer in any order.</li>
</ul>
</div>

<p>&nbsp;</p>
<a href="https://leetcode.com/problems/top-k-frequent-elements/">Source</a> 
<hr>

<h4>Code</h4>
<p>Naive approach: O(n log n)</p>

In [4]:
# Naive solution in O(n log n)
from collections import Counter

def top_k_frequent(nums, k):
    """Simple solution, but sorting every unique element costs O(n log n) in the worst case."""
    frequency = sorted(Counter(nums).items(), key=lambda item: -item[1])  # list of (element, count) tuples
    return [tup[0] for tup in frequency[:k]]

In [5]:
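# Variation using Counter.most_common()
from collections import Counter

def top_k_frequent(nums, k):
    """Variation of the above solution using the built-in method most_common().

    Note: CPython implements most_common(k) with heapq.nlargest(), so in practice
    this runs in O(n log k) rather than O(n log n).
    """
    frequency = Counter(nums).most_common(k)
    return [tup[0] for tup in frequency]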
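
<p>To make the intermediate values concrete, here is a small trace of what <code>Counter</code> and <code>most_common()</code> produce on Example 1. It is only an illustration; the variable names <code>counts</code> and <code>pairs</code> are not part of the solutions above.</p>

```python
from collections import Counter

nums = [1, 1, 1, 2, 2, 3]
counts = Counter(nums)

# Counter maps each value to its number of occurrences.
print(counts)                  # Counter({1: 3, 2: 2, 3: 1})

# most_common(k) returns (element, count) pairs, most frequent first.
pairs = counts.most_common(2)
print(pairs)                   # [(1, 3), (2, 2)]

# Keep only the elements to get the final answer.
print([element for element, _ in pairs])   # [1, 2]
```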

<p>&nbsp;</p>
<p>Using heaps: O(n log k)</p>

In [6]:
from heapq import nsmallest
from collections import Counter

def top_k_frequent(nums, k):
    """Using the built-in function nsmallest() from heapq.
    Note: nsmallest() does not need to heapify every entry; it only maintains a heap of size k.

    Time Complexity: O(n log k)
    Space Complexity: O(n)
    """
    tup_list = [(-freq, key) for key, freq in Counter(nums).items()]  # negate counts so the most frequent sort first
    return [tup[1] for tup in nsmallest(k, tup_list)]   # results come back ordered from most to least frequent

In [7]:
from heapq import nsmallest
from collections import Counter

def top_k_frequent(nums, k):
    """Variation of the above solution: rank the keys directly with a key function.
    """
    frequency = Counter(nums)
    return nsmallest(k, frequency, key=lambda x: -frequency[x])

In [None]:
from heapq import nlargest
from collections import Counter

def top_k_frequent(nums, k):
    """Using nlargest() instead of nsmallest()
    """
    tup_list = [(freq, key) for key, freq in Counter(nums).items()]
    return [tup[1] for tup in nlargest(k, tup_list)]

In [None]:
from heapq import nlargest
from collections import Counter

def top_k_frequent(nums, k):
    """Using nlargest() with a key function instead of building tuples.
    """
    frequency = Counter(nums)
    return nlargest(k, frequency, key=lambda x: frequency[x])
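
<p>The solutions above delegate the heap handling to <code>nsmallest()</code> and <code>nlargest()</code>. As a rough sketch of the same idea done by hand, the block below keeps a min-heap that never grows beyond <i>k</i> entries. The function name <code>top_k_frequent_heap</code> is hypothetical and is not claimed to match how <code>heapq</code> implements these helpers internally.</p>

```python
from collections import Counter
from heapq import heappush, heappushpop

def top_k_frequent_heap(nums, k):
    """Hypothetical variant: keep a min-heap of at most k (freq, key) pairs."""
    heap = []
    for key, freq in Counter(nums).items():
        if len(heap) < k:
            heappush(heap, (freq, key))
        else:
            # Push the new pair, then pop the smallest one: the heap stays at size k,
            # so each operation costs O(log k).
            heappushpop(heap, (freq, key))
    # The heap now holds the k most frequent elements, in heap order (not sorted).
    return [key for _, key in heap]

print(sorted(top_k_frequent_heap([1, 1, 1, 2, 2, 3], 2)))  # [1, 2]
```

<p>Since the problem accepts the answer in any order, the result is left in heap order; the example sorts it only to make the printed output predictable.</p>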

<p>&nbsp;</p>
<p>Using bucket sort: O(n)</p>

In [109]:
from collections import Counter
from functools import reduce
from itertools import chain
from operator import iconcat # operator +=

def top_k_frequent(nums, k):
    """The frequency of any element cannot exceed n --> create n + 1 buckets, indexed by frequency.
    Time Complexity: O(n)
    Space Complexity: O(n)
    """
    bucket = [[] for _ in range(len(nums)+1)]
    frequency = Counter(nums).items()
    for key, freq in frequency:
        bucket[freq].append(key)  # bucket[f] holds every element that occurs exactly f times

    # Flatten the buckets (least frequent first); the three options are equivalent.
    result = [item for sublist in bucket for item in sublist]  # Option 1
    result = reduce(iconcat, bucket, [])                       # Option 2
    result = list(chain(*bucket))                              # Option 3 (chain is lazy, so convert to a list)

    return result[-k:]  # the k most frequent elements sit at the end
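
<p>As an illustration of the bucket layout, the sketch below traces Example 1 step by step. The variable name <code>flat</code> is introduced only for this trace.</p>

```python
from collections import Counter

nums = [1, 1, 1, 2, 2, 3]

# One bucket per possible frequency, from 0 up to len(nums).
bucket = [[] for _ in range(len(nums) + 1)]
for key, freq in Counter(nums).items():
    bucket[freq].append(key)

print(bucket)      # [[], [3], [2], [1], [], [], []]

# Flattening keeps the buckets in order of increasing frequency.
flat = [item for sublist in bucket for item in sublist]
print(flat)        # [3, 2, 1]

# The last k entries are the k most frequent elements.
print(flat[-2:])   # [2, 1]
```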

<p>&nbsp;</p>
<p>Using <a href="https://nbviewer.jupyter.org/github/adrien-perelloyb/leetcode/blob/main/lessons/selection_algo.ipynb">Quick Select</a> (itself based on <a href="https://nbviewer.jupyter.org/github/adrien-perelloyb/leetcode/blob/main/lessons/sorting_algo.ipynb">Quick Sort</a>)</p>

In [44]:
from collections import Counter
from random import randint

def top_k_frequent(nums, k):
    """Do a partial sort, from least frequent to most frequent, until the element at
    index (n - k) is in the position it would occupy in a fully sorted array.
    All elements on its left are less frequent.
    All elements on its right are at least as frequent.

    Time Complexity: O(n) on average, O(n²) in the worst case (improbable with a random pivot)
    Space Complexity: O(n) for the frequency map and the list of unique elements,
    plus the recursion stack (O(log n) deep on average)
    """
    
    def partition(left, right, pivot_idx):
        pivot_freq = frequency[elements[pivot_idx]]
        elements[pivot_idx], elements[right] = elements[right], elements[pivot_idx]  # move pivot to end
        pivot_idx = left

        for i in range(left, right):  # move all less frequent elements to the left
            if frequency[elements[i]] < pivot_freq:
                elements[pivot_idx], elements[i] = elements[i], elements[pivot_idx]
                pivot_idx += 1
        elements[right], elements[pivot_idx] = elements[pivot_idx], elements[right]  # move pivot to its final place
        return pivot_idx

    def quick_select(start, end, k):
        if start == end:  # base case: the list contains only one element
            return elements[start]
        pivot_idx = randint(start, end)               # select a random pivot_index       
        pivot_idx = partition(start, end, pivot_idx)  # find the pivot position in a sorted list   
        if k == pivot_idx:                            # if the pivot is in its final sorted position
            return elements[pivot_idx]
        if k < pivot_idx:                             # if not, go left
            end = pivot_idx - 1
        else:                                         # or go right
            start = pivot_idx + 1
        return quick_select(start, end, k)


    frequency = Counter(nums)
    elements = list(frequency.keys())
    n = len(elements)
    quick_select(0, n - 1, n - k)   # the k-th most frequent element sits at index n - k when sorted by increasing frequency
    return elements[n - k:]
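
<p>Because the recursive call in <code>quick_select</code> is the last thing the function does, the same search can be written as a loop, which removes the recursion stack from the space cost. The block below is a sketch of that idea under the same partition scheme; the names <code>top_k_frequent_iterative</code>, <code>target</code> and <code>store</code> are hypothetical.</p>

```python
from collections import Counter
from random import randint

def top_k_frequent_iterative(nums, k):
    """Hypothetical loop-based variant of the quick select approach."""
    frequency = Counter(nums)
    elements = list(frequency)
    n = len(elements)
    target = n - k                 # index the k-th most frequent element must reach
    left, right = 0, n - 1

    while left < right:
        # Lomuto partition around a random pivot, ordered by increasing frequency.
        pivot_idx = randint(left, right)
        pivot_freq = frequency[elements[pivot_idx]]
        elements[pivot_idx], elements[right] = elements[right], elements[pivot_idx]
        store = left
        for i in range(left, right):
            if frequency[elements[i]] < pivot_freq:
                elements[store], elements[i] = elements[i], elements[store]
                store += 1
        elements[store], elements[right] = elements[right], elements[store]

        if store == target:        # pivot landed exactly on the target index
            break
        if target < store:         # target is in the left part
            right = store - 1
        else:                      # target is in the right part
            left = store + 1

    return elements[target:]

print(sorted(top_k_frequent_iterative([1, 1, 1, 2, 2, 3], 2)))  # [1, 2]
```

<p>The pivot is random, so the order of the returned elements can vary between runs; the example sorts the result before printing it.</p>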

<h4>Check</h4>

In [110]:
nums, k = [1,1,1,2,2,3], 2
top_k_frequent(nums, k)

[2, 1]

In [46]:
nums, k = [1], 1
top_k_frequent(nums, k)

[1]

In [93]:
nums, k = [], 3
for i in range(4,0,-1):
    nums.extend([randint(0,20)]*i)
print(nums)
top_k_frequent(nums, k)

[11, 11, 11, 11, 4, 4, 4, 19, 19, 7]


[19, 11, 4]
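
<p>The answer may be returned in any order, so the outputs above differ in ordering between implementations. One possible way to check any of them automatically is to compare sets against a fully sorted reference. The helpers <code>reference_top_k</code>, <code>check</code> and <code>candidate</code> below are hypothetical names used only for this sketch.</p>

```python
from collections import Counter

def reference_top_k(nums, k):
    """Reference answer as a set, computed by fully sorting the counts."""
    ranked = sorted(Counter(nums).items(), key=lambda item: item[1], reverse=True)
    return {key for key, _ in ranked[:k]}

def check(func, nums, k):
    """Return True if func yields the same set of elements as the reference."""
    return set(func(nums, k)) == reference_top_k(nums, k)

def candidate(nums, k):
    # Stand-in for any top_k_frequent implementation from this notebook.
    return [key for key, _ in Counter(nums).most_common(k)]

print(check(candidate, [1, 1, 1, 2, 2, 3], 2))                     # True
print(check(candidate, [11, 11, 11, 11, 4, 4, 4, 19, 19, 7], 3))   # True
```

<p>This comparison relies on the problem's guarantee that the set of top <i>k</i> elements is unique; with ties at the cut-off, two correct implementations could legitimately return different sets.</p>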