# Airplanes API Question from Jason Sinn

Create an API `def closestN(airplane_coordinates: Array[(x, y)], airport_loc: (x, y), num_airplanes: Int): Array[(x, y)]` where `type((x, y)) = tuple(int, int)`. `closestN` takes the airplane coordinates and an airport location, then returns the `num_airplanes` airplanes closest to the airport.

## Clarifications
1. `airplane_coordinates` is not sorted
2. Use Cartesian (Euclidean) distance
3. x, y can be negative numbers
4. Resulting array's order does not matter
5. (New!) Don't use the DataFrame structure
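
Because the result order does not matter and several planes can sit at the same distance, more than one answer can be correct. The sketch below is one possible way to check a candidate answer under these clarifications. The helper name `is_valid_answer` is hypothetical, and it assumes `num_airplanes` is non-negative.

```python
from collections import Counter


def is_valid_answer(result, airplane_coordinates, airport_loc, num_airplanes):
    """Order-insensitive check that `result` holds the closest planes (hypothetical helper)."""

    def dist_sq(p):
        # Squared Euclidean distance; it ranks points the same way as the true distance.
        return (p[0] - airport_loc[0]) ** 2 + (p[1] - airport_loc[1]) ** 2

    # Every returned plane must come from the input (respecting duplicates).
    if Counter(result) - Counter(airplane_coordinates):
        return False
    # Compare distances rather than coordinates so that ties are accepted.
    expected = sorted(dist_sq(p) for p in airplane_coordinates)[:num_airplanes]
    return sorted(dist_sq(p) for p in result) == expected
```

Comparing sorted distances instead of coordinates means that, for the sample data, any two of the four planes at distance 1 pass the check for `num_airplanes = 2`.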

## Rough Notes from Gabe
- A high-level solution would be to calculate the Cartesian distance from every coordinate to the airport in a single O(n) pass (keeping track of the minimum along the way), then scan the results again for the minimum
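
As a rough baseline for the note above, here is a minimal sketch that computes every distance and then picks the smallest ones. It swaps the "track the minimum, then rescan" idea for a full sort, so it runs in O(n log n) rather than O(n). The names `squared_distance` and `closest_n_sorted` are hypothetical, and squared distances are used because they rank points the same way as true distances.

```python
def squared_distance(p, q):
    # Squared Euclidean distance; ranks points the same as the true distance.
    return (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2


def closest_n_sorted(airplane_coordinates, airport_loc, num_airplanes):
    # Sort every plane by distance to the airport, then keep the first few.
    ranked = sorted(airplane_coordinates, key=lambda p: squared_distance(p, airport_loc))
    return ranked[:max(num_airplanes, 0)]


planes = [(0, 1), (1, 0), (0, -1), (-1, 0), (1, 1), (1, -1), (-1, 1), (-1, -1)]
print(closest_n_sorted(planes, (0, 0), 2))  # [(0, 1), (1, 0)]
```

Python's `sorted` is stable, so planes at equal distance keep their input order.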

# Code

## Sample Data

In [1]:
test_input_1 = [
    (0, 1),
    (1, 0),
    (0, -1),
    (-1, 0),
    (1, 1),
    (1, -1),
    (-1, 1),
    (-1, -1)
]
result_input_1 = [
    (0, 1),
    (1, 0),
    (0, -1),
    (-1, 0),
    (1, 1),
    (1, -1),
    (-1, 1),
    (-1, -1)
]
test_origin_1 = (0, 0)

## Global Imports

In [2]:
import pandas as pd

In [3]:
import numpy as np

## General Functions

In [5]:
def closestN(airplane_coordinates, airport_loc, num_airports):
    # Dispatches to the chosen solution; `solution_1` must be defined before this is called.
    return solution_1(airplane_coordinates, airport_loc, num_airports)

In [6]:
def euclidean_distance(p, q):
    result = np.sqrt(np.square(q[0] - p[0]) + np.square(q[1] - p[1]))
    print(result)
    return result

In [7]:
num_airports = 2

## Solution 1
DataFrame solution

**Jason said to solve this without dataframes**

In [8]:
df = pd.DataFrame(test_input_1, columns=['x', 'y'])

In [9]:
def euclid_dist_helper(row, airport):
    return euclidean_distance((row['x'], row['y']), airport)

In [10]:
df['distance'] = df.apply(lambda x: euclid_dist_helper(x, test_origin_1), axis=1)

1.0
1.0
1.0
1.0
1.4142135623730951
1.4142135623730951
1.4142135623730951
1.4142135623730951


In [11]:
current_min = df['distance'].min()

In [12]:
closest_planes_df = df.sort_values(by=['distance'], ascending=True).head(num_airports)

In [13]:
closest_planes_df.drop(columns=['distance'])

   x  y
0  0  1
1  1  0


In [14]:
# Pair up the x and y columns into a list of (x, y) tuples.
result = list(zip(closest_planes_df['x'], closest_planes_df['y']))

In [15]:
print(result)

[(0, 1), (1, 0)]


# Solution 2

Iterate through the list of airplane coordinates in O(n) and calculate each one's distance to the airport. For the first m = num_airplanes coordinates, store (coordinate, distance) in a **priority queue backed by a max-heap**, where priority is set by distance. This lets us peek at the element with the greatest distance in a prio_queue of length m in O(1) time, and insert/delete in O(log(m)) time. Once the initial prio_queue of length m is built, iterate through the rest of the coordinates and, whenever one is closer than the current maximum, replace that maximum with it. The whole pass costs O(n log(m)).

## Notes
- Python offers `heapq` as a library; however, it only implements a min-heap
- Should look into heap / BST implementation options, then implement your own priority queue
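
Since Python's built-in heap module is a min-heap, one way to try out Solution 2 before writing a custom priority queue is to push negated distances, so the farthest kept plane ends up at the top of the heap. The following is only a sketch under that assumption. The name `closest_n_heap` is hypothetical, and it compares squared distances rather than true distances, which gives the same ranking.

```python
import heapq


def closest_n_heap(airplane_coordinates, airport_loc, num_airplanes):
    if num_airplanes <= 0:
        return []
    # Entries are (-squared_distance, (x, y)), so heap[0] is the farthest kept plane.
    heap = []
    for x, y in airplane_coordinates:
        dist_sq = (x - airport_loc[0]) ** 2 + (y - airport_loc[1]) ** 2
        if len(heap) < num_airplanes:
            # Still filling the first m slots.
            heapq.heappush(heap, (-dist_sq, (x, y)))
        elif dist_sq < -heap[0][0]:
            # Closer than the current farthest: swap it in, O(log m).
            heapq.heapreplace(heap, (-dist_sq, (x, y)))
    return [point for _, point in heap]
```

The returned list is in heap order, not distance order, which is acceptable because the result order does not matter.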

## Priority Queue Implementation

In [21]:
def heapify():
    return 'hello'

In [22]:
def get_max():
    return 'hello'

In [23]:
def remove_max():
    return 'hello'

In [20]:
def insert():
    return 'hello'
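
The stubs above could eventually be filled in along the following lines. This is a minimal sketch of an array-backed binary max-heap, not a finished implementation. The class name `MaxHeap`, the `(priority, item)` entry layout, and the private `_sift_up` / `_sift_down` helpers are assumptions made for illustration; only the four method names come from the stubs.

```python
class MaxHeap:
    """Array-backed binary max-heap of (priority, item) pairs (illustrative sketch)."""

    def __init__(self, entries=()):
        self._data = list(entries)
        self.heapify()

    def __len__(self):
        return len(self._data)

    def _sift_up(self, i):
        # Move the entry at index i up until its parent has a priority at least as large.
        while i > 0:
            parent = (i - 1) // 2
            if self._data[i][0] <= self._data[parent][0]:
                break
            self._data[i], self._data[parent] = self._data[parent], self._data[i]
            i = parent

    def _sift_down(self, i):
        # Move the entry at index i down until both children have smaller-or-equal priority.
        n = len(self._data)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self._data[child][0] > self._data[largest][0]:
                    largest = child
            if largest == i:
                return
            self._data[i], self._data[largest] = self._data[largest], self._data[i]
            i = largest

    def heapify(self):
        # Bottom-up construction: sift down every non-leaf node.
        for i in range(len(self._data) // 2 - 1, -1, -1):
            self._sift_down(i)

    def get_max(self):
        # Peek at the largest entry in O(1); raises IndexError when empty.
        return self._data[0]

    def insert(self, priority, item):
        # Append at the end, then restore the heap property in O(log m).
        self._data.append((priority, item))
        self._sift_up(len(self._data) - 1)

    def remove_max(self):
        # Pop the largest entry and restore the heap property in O(log m).
        top = self._data[0]
        last = self._data.pop()
        if self._data:
            self._data[0] = last
            self._sift_down(0)
        return top


heap = MaxHeap([(1, "a"), (5, "b"), (3, "c")])
heap.insert(4, "d")
print(heap.remove_max())  # (5, 'b')
print(heap.get_max())     # (4, 'd')
```

For the airplanes problem, the priority would be the distance to the airport and the item would be the coordinate, so `get_max` returns the farthest plane currently kept.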