### Case Study

- Predict **cupcake purchases** from the **proximity of customers** (a customer's location, or other customers in the same region).
- Case scenario: Assume that a new customer has just purchased his first cupcake, and we want to estimate how many cupcakes he may purchase from us in the following year.

### Code Structure 
This follows the same logic as `ml-knn-distance`:

1. Write a function to calculate the distance of one neighbor from another
2. Write a function that returns the distance between one neighbor and all others (using `map`)
3. Return a selected number of nearest neighbors
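
Step 1 relies on the straight-line (Euclidean) distance between two customers. Writing the selected individual as $(x_s, y_s)$ and a neighbor as $(x_n, y_n)$ (symbols introduced here only for illustration), the quantity the `distance` function computes is:

$$
d = \sqrt{(x_n - x_s)^2 + (y_n - y_s)^2}
$$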

In [5]:
import math

def distance(selected_individual, neighbor):
    distance_squared = (neighbor['x'] - selected_individual['x'])**2 + (neighbor['y'] - selected_individual['y'])**2
    return math.sqrt(distance_squared)

def distance_between_neighbors(selected_individual, neighbor):
    neighbor_with_distance = neighbor.copy()
    neighbor_with_distance['distance'] = distance(selected_individual, neighbor)
    return neighbor_with_distance

def distance_all(selected_individual, neighbors):
    remaining_neighbors = filter(lambda neighbor: neighbor != selected_individual, neighbors)
    return list(map(lambda neighbor: distance_between_neighbors(selected_individual, neighbor), remaining_neighbors))

In [6]:
neighbors = [{'name': 'Bob', 'x': 4, 'y': 8, 'purchases': 52}, {'name': 'Suzie', 'x': 1, 'y': 11, 'purchases': 70}, 
             {'name': 'Fred', 'x': 5, 'y': 8, 'purchases': 60}, {'name': 'Edgar', 'x': 6, 'y': 13, 'purchases': 20},
             {'name': 'Steven', 'x': 3, 'y': 6, 'purchases': 32}, {'name': 'Natalie', 'x': 5, 'y': 4, 'purchases': 45}]
bob = neighbors[0]
suzie = neighbors[1]
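
As a quick sanity check of the distance calculation, the sketch below works through the Bob-to-Suzie distance by hand. The coordinates are copied from the `neighbors` list; the names `bob_xy`, `suzie_xy`, and `bob_to_suzie` are made up for this illustration and are not part of the original notebook.

```python
import math

# Coordinates copied from the `neighbors` list (illustrative names, not from the notebook).
bob_xy = {'x': 4, 'y': 8}
suzie_xy = {'x': 1, 'y': 11}

dx = suzie_xy['x'] - bob_xy['x']  # 1 - 4 = -3
dy = suzie_xy['y'] - bob_xy['y']  # 11 - 8 = 3

# sqrt((-3)**2 + 3**2) = sqrt(18)
bob_to_suzie = math.sqrt(dx**2 + dy**2)

print(round(bob_to_suzie, 2))  # 4.24
```

Fred, one unit to the right of Bob, is much closer than Suzie, which is why he appears as Bob's nearest neighbor.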

In [7]:
import plotly

plotly.offline.init_notebook_mode(connected=True)
trace0 = dict(x=list(map(lambda neighbor: neighbor['x'],neighbors)), 
              y=list(map(lambda neighbor: neighbor['y'],neighbors)),
              text=list(map(lambda neighbor: neighbor['name'] + ': ' + str(neighbor['purchases']),neighbors)),
              mode='markers')
plotly.offline.iplot(dict(data=[trace0], layout={'xaxis': {'dtick': 1}, 'yaxis': {'dtick': 1}}))

In [9]:
def nearest_neighbors(selected_individual, neighbors, number = None):
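    # Default to returning every neighbor; an explicit 0 now returns an empty list.
    number = len(neighbors) if number is None else number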
    neighbor_distances = distance_all(selected_individual, neighbors)
    sorted_neighbors = sorted(neighbor_distances, key=lambda neighbor: neighbor['distance'])
    return sorted_neighbors[:number]

In [10]:
bob = neighbors[0]
nearest_neighbor_to_bob = nearest_neighbors(bob, neighbors, 1)
nearest_neighbor_to_bob

[{'distance': 1.0, 'name': 'Fred', 'purchases': 60, 'x': 5, 'y': 8}]

In [11]:
# new customer with location {'x': 4, 'y': 3}
nearest_three_neighbors = nearest_neighbors({'x': 4, 'y': 3}, neighbors, 3)
nearest_three_neighbors

[{'distance': 1.4142135623730951,
  'name': 'Natalie',
  'purchases': 45,
  'x': 5,
  'y': 4},
 {'distance': 3.1622776601683795,
  'name': 'Steven',
  'purchases': 32,
  'x': 3,
  'y': 6},
 {'distance': 5.0, 'name': 'Bob', 'purchases': 52, 'x': 4, 'y': 8}]

In [12]:
# predict the new customer's future purchases as the mean of the 3 nearest neighbors' purchases
purchases = list(map(lambda neighbor: neighbor['purchases'],nearest_three_neighbors))
average = sum(purchases)/len(purchases)
average # 43.0

43.0
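
The cells above can also be folded into a single helper. The following is a minimal sketch rather than the notebook's own implementation: the function name `predict_purchases`, the parameter `k`, and the variable `customers` are assumptions made for illustration, and `math.hypot` stands in for the explicit square-root formula.

```python
import math

def predict_purchases(new_customer, customers, k=3):
    """Average the purchases of the k customers closest to new_customer.

    Hypothetical helper: sorts by Euclidean distance, keeps the k nearest,
    and returns the mean of their 'purchases' values.
    """
    by_distance = sorted(
        customers,
        key=lambda c: math.hypot(c['x'] - new_customer['x'],
                                 c['y'] - new_customer['y']),
    )
    nearest = by_distance[:k]
    return sum(c['purchases'] for c in nearest) / len(nearest)

# Same data as the `neighbors` list in the notebook.
customers = [
    {'name': 'Bob', 'x': 4, 'y': 8, 'purchases': 52},
    {'name': 'Suzie', 'x': 1, 'y': 11, 'purchases': 70},
    {'name': 'Fred', 'x': 5, 'y': 8, 'purchases': 60},
    {'name': 'Edgar', 'x': 6, 'y': 13, 'purchases': 20},
    {'name': 'Steven', 'x': 3, 'y': 6, 'purchases': 32},
    {'name': 'Natalie', 'x': 5, 'y': 4, 'purchases': 45},
]

print(predict_purchases({'x': 4, 'y': 3}, customers))  # 43.0
```

With `k=3` the helper picks Natalie, Steven, and Bob, matching the three neighbors found above, so the estimate is (45 + 32 + 52) / 3 = 43.0.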