## 973. K Closest Points to Origin 

### Description

Given an array `points` where $points[i] = [x_i, y_i]$ represents a point on the X-Y plane and an integer $k$, return the $k$ closest points to the origin $(0, 0)$.

The distance between two points on the X-Y plane is the Euclidean distance (i.e., $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$).

You may return the answer in any order. The answer is guaranteed to be unique (except for the order that it is in).

**Example**

Input: $points = [[1,3],[-2,2]], k = 1$

Output: $[[-2,2]]$

Explanation:

The distance between $(1, 3)$ and the origin is $\sqrt{10}$.

The distance between $(-2, 2)$ and the origin is $\sqrt{8}$.

Since $\sqrt{8} < \sqrt{10}$, $(-2, 2)$ is closer to the origin.
We only want the closest $k = 1$ points to the origin, so the answer is just $[[-2,2]]$.
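
The comparison in this example can be checked with a short sketch. The helper name `distance_to_origin` is an assumption made for illustration; it is not part of the problem statement.

```python
import math


def distance_to_origin(point):
    # math.hypot(x, y) computes sqrt(x**2 + y**2), the Euclidean
    # distance from (x, y) to the origin.
    x, y = point
    return math.hypot(x, y)


# (-2, 2) is at distance sqrt(8), (1, 3) is at distance sqrt(10).
print(distance_to_origin([-2, 2]) < distance_to_origin([1, 3]))  # prints True
```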


Input: $points = [[3,3],[5,-1],[-2,4]], k = 2$

Output: $[[3,3],[-2,4]]$

Explanation: The answer $[[-2,4],[3,3]]$ would also be accepted.

### Algorithm

**Naive approach**

When we encounter a problem like this, sorting is the first thing that comes to mind: we sort the list by the Euclidean distance between each point and the origin, then pick the first $k$ elements. With a comparison-based sorting algorithm, the run time is $O(n \log n)$.
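
A minimal sketch of the sorting approach follows. The function name `k_closest_by_sorting` is hypothetical, and the sketch sorts by squared distance, which gives the same order as the true distance because the square root is monotonic.

```python
def k_closest_by_sorting(points, k):
    # Sort by squared distance to the origin, then keep the first k points.
    # Skipping the square root does not change the ordering.
    return sorted(points, key=lambda p: p[0] ** 2 + p[1] ** 2)[:k]


print(k_closest_by_sorting([[1, 3], [-2, 2]], 1))  # prints [[-2, 2]]
```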

**Optimization**

Notice that the program only needs to return $k$ points, and their order does not matter. We can exploit this by using a min-heap, keyed on Euclidean distance, whose size is capped at $n-k$, where $n$ is the number of points. After each push, if the heap holds more than $n-k$ elements, we extract the minimum and add it to the list we want to return. An extracted point is the smallest of $n-k+1$ points, so at most $k-1$ points in the whole input can be closer to the origin than it is; every extracted point therefore belongs to the $k$ closest. Exactly $k$ extractions happen over the whole run, so the list holds the complete answer.

**Pseudocode**

```
func KClosest(points, k):
    n = size(points)
    result = []
    heap = {}
    for x,y in points, do:
        euclidean_dist = sqrt((x-0)^2+(y-0)^2)
        heap.push((euclidean_dist, (x,y)))
        if size(heap) > n-k:
            (dist, point) = extract_min(heap)
            result.add(point)
    return result
```

```python
import heapq
import math


def kClosest(points, k):
    n = len(points)
    result = []
    min_heap = []

    for point in points:
        # Distance from the point to the origin (0, 0).
        euclidean_dist = math.sqrt(point[0] ** 2 + point[1] ** 2)
        heapq.heappush(min_heap, (euclidean_dist, point))
        # Once the heap exceeds n - k entries, its minimum is one of the k closest.
        if len(min_heap) > n - k:
            result.append(heapq.heappop(min_heap)[1])

    return result


points = [[3, 3], [5, -1], [-2, 4]]
k = 2

KClosest_points = kClosest(points, k)
print(f"Top {k} closest points to origin are: {KClosest_points}")
```

```
Top 2 closest points to origin are: [[3, 3], [-2, 4]]
```


### Time complexity

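The program iterates through every point. In each iteration it computes the Euclidean distance ($O(1)$), pushes the point onto the heap, and, if the heap size exceeds $n-k$, performs an extract-min. The heap never holds more than $n-k+1$ elements, so each heap operation costs $O(\log(n-k))$ and the total time complexity is $O(n \log(n-k))$. This beats sorting when $k$ is close to $n$; reaching $O(n \log k)$ requires a heap capped at $k$ elements instead.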
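
For comparison, here is a sketch of a variant whose heap is capped at $k$ elements, which gives $O(n \log k)$. This is a different technique from the one above: it keeps a max-heap of the $k$ closest points seen so far. The function name `k_closest_max_heap` is hypothetical, and the sketch assumes $1 \le k \le n$. Since `heapq` only provides a min-heap, distances are negated to simulate a max-heap.

```python
import heapq


def k_closest_max_heap(points, k):
    # Assumes 1 <= k <= len(points).
    # Entries are (negated squared distance, point), so the heap root is
    # the farthest point among those currently kept.
    max_heap = []
    for x, y in points:
        neg_dist = -(x * x + y * y)
        if len(max_heap) < k:
            heapq.heappush(max_heap, (neg_dist, [x, y]))
        elif neg_dist > max_heap[0][0]:
            # The new point is closer than the farthest kept point: swap them.
            heapq.heapreplace(max_heap, (neg_dist, [x, y]))
    return [point for _, point in max_heap]


print(sorted(k_closest_max_heap([[3, 3], [5, -1], [-2, 4]], 2)))  # prints [[-2, 4], [3, 3]]
```

The heap never grows beyond $k$ entries, so each push or replace costs $O(\log k)$.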

### Playbook for top $k$ problems

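The heap data structure is extremely useful when solving such problems. With a min-heap, push every item and extract the minimum whenever the size condition below is met:

   - If we want the top $k$ greatest, the condition is $size(heap) > k$. The extracted items are discarded, and the $k$ items left in the heap are the answer.
   - If we want the top $k$ smallest, the condition is $size(heap) > n-k$. The extracted items are the answer.


The code snippet for the top $k$ smallest case looks like this: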

```
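heap = {}
result = []

for item in list l:
    do something here
    heap.push(item)
    if condition:
        # top k smallest: the extracted minimum is part of the answer
        # (for top k greatest, discard it and return the heap contents instead)
        result.add(extract_min(heap))

return result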
```
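
The top $k$ greatest case can be sketched the same way. The function name `k_largest` is hypothetical; the sketch keeps a min-heap of at most $k$ items and discards the minimum whenever the heap grows past $k$, so the $k$ largest items are what remains.

```python
import heapq


def k_largest(items, k):
    heap = []
    for item in items:
        heapq.heappush(heap, item)
        if len(heap) > k:
            # Discard the smallest item; it cannot be among the k largest.
            heapq.heappop(heap)
    # The heap now holds the k largest items, in heap order (not sorted).
    return heap


print(sorted(k_largest([5, 1, 9, 3, 7], 2)))  # prints [7, 9]
```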

### Similar problems

- Leetcode 347. Top K Frequent Elements

- Leetcode 703. Kth Largest Element in a Stream
