# Weekly Challenge 15

*Original URL* https://community.alteryx.com/t5/Weekly-Challenge/Challenge-15-Warehouse-Shipped-Miles/td-p/36744 and [**My Alteryx Approach**](https://github.com/dsmdavid/Alteryx-Weekly-Challenge/tree/master/submitted/sub_Challenge%2315)

## Brief
Based on last week’s warehouse distribution exercise, we want to calculate the total shipped miles per item.  The products are available from 3 different warehouses. 

The objective is to find the total distance travelled (straight line miles) for each item based on it being shipped from the closest warehouse.  


In [1]:
import pandas as pd
import os
import numpy as np
from geopy.distance import geodesic, great_circle  # vincenty was removed in geopy 2.0

## Approach I want to follow:
1. Read the files, find the closest warehouse to each location.
1. Multiply each store's distance by the number of items assigned to it.
1. Summarize.
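
Steps 2 and 3 can be sketched on toy data before touching the real files. This is a minimal illustration in plain Python, not the notebook's pandas code: the distances, the assignment triples, and the helper name `shipped_miles_per_item` are all made up for the example.

```python
from collections import defaultdict

# Hypothetical inputs: miles from each store to its closest warehouse,
# and (store, item, units assigned) triples.
nearest_miles = {"A": 100.0, "B": 50.0}
assignments = [("A", 1, 3), ("B", 1, 2), ("A", 2, 1)]


def shipped_miles_per_item(assignments, nearest_miles):
    """Sum units * distance-to-closest-warehouse for every item."""
    totals = defaultdict(float)
    for store, item, assigned in assignments:
        totals[item] += assigned * nearest_miles[store]
    return dict(totals)


totals = shipped_miles_per_item(assignments, nearest_miles)
print(totals)
```

Item 1 ships 3 units over 100 miles and 2 units over 50 miles, so its total is 400 miles; item 2 ships a single unit over 100 miles.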

## 1. Read and combine

In [2]:
#Read the files
os.chdir(os.path.join(os.getcwd(), '15_files'))

In [3]:
os.listdir()

['01_storesPriority.csv',
 '02_assignedItems.csv',
 '15_solutions.csv',
 'warehouseLocations.csv']

In [4]:
#Load the data:
input_dfStores = pd.read_csv("./01_storesPriority.csv", encoding="latin")     
input_dfItems = pd.read_csv("./02_assignedItems.csv", encoding = "latin")
input_dfWarehouse = pd.read_csv("./warehouseLocations.csv", encoding = "latin")

In [5]:
#Get LatLon points for distances:
input_dfWarehouse['loc'] = input_dfWarehouse.apply(lambda row: (row['Lat'], row['Lon']), axis =1)
input_dfStores['loc'] = input_dfStores.apply(lambda row: (row['Lat'], row['Lon']), axis =1)

In [6]:
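# Temporary one-row frame: one column per warehouse holding its (lat, lon) tuple,
# plus a constant 'join' key used to attach it to every store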
temp = input_dfWarehouse[['Warehouse','loc']].transpose()
temp.columns = temp.loc['Warehouse',:]
temp.drop(labels='Warehouse', inplace=True)
temp['join'] = 1
temp

Warehouse,Main,Second,Third,join
loc,"(34.737177, -86.603266)","(31.767002, -106.49205800000001)","(38.825775, -104.831478)",1


In [7]:
input_dfStores['join'] = 1

new_df = input_dfStores.merge(temp, on='join')

new_df.reset_index(drop=True, inplace=True)
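
The constant `join` column is a common way to build a cross join. As an alternative sketch, assuming pandas 1.2 or later, `merge(how="cross")` produces every store/warehouse pair without the helper column. The two small frames below are illustrative stand-ins (the store coordinates and the `wh_loc` column name are made up), and the result is in long format, one row per pair, rather than the wide format used in this notebook.

```python
import pandas as pd

# Illustrative stand-ins for input_dfStores and input_dfWarehouse.
stores = pd.DataFrame(
    {"Store": ["A", "B"], "loc": [(35.0, -90.0), (32.0, -105.0)]}
)
warehouses = pd.DataFrame(
    {
        "Warehouse": ["Main", "Second", "Third"],
        "wh_loc": [
            (34.737177, -86.603266),
            (31.767002, -106.492058),
            (38.825775, -104.831478),
        ],
    }
)

# how="cross" pairs every store row with every warehouse row.
pairs = stores.merge(warehouses, how="cross")
print(pairs[["Store", "Warehouse"]])
```

With the long format, the closest warehouse per store could then be found with a `groupby('Store')` on a computed distance column.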

In [8]:
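def calc_distance(point1, point2):
    # Distance in miles between two (lat, lon) tuples.
    # geodesic uses an ellipsoidal Earth model (WGS-84 by default);
    # great_circle assumes a sphere.
#    return vincenty(point1,point2).miles  # no longer available in geopy >= 2.0
    return geodesic(point1, point2).miles
#    return great_circle(point1, point2).miles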


def shorter_distance_to_warehouse(row):
    return min([calc_distance(row['loc'],row[name]) for name in ["Main","Second", "Third"]])
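
The function above keeps only the shortest distance. If the name of the closest warehouse is also of interest, a sketch along the following lines returns both. To stay dependency-free it swaps geopy for a hand-written haversine (spherical) formula with an assumed mean Earth radius of 3958.8 miles, so its numbers will differ slightly from `geodesic`; `haversine_miles` and `nearest_warehouse` are hypothetical helper names.

```python
from math import asin, cos, radians, sin, sqrt

# Warehouse coordinates as shown in the output of cell 6.
WAREHOUSES = {
    "Main": (34.737177, -86.603266),
    "Second": (31.767002, -106.492058),
    "Third": (38.825775, -104.831478),
}


def haversine_miles(p1, p2, radius=3958.8):
    """Great-circle distance in miles between two (lat, lon) points.

    Assumes a spherical Earth; the radius is an assumed mean value.
    """
    lat1, lon1 = map(radians, p1)
    lat2, lon2 = map(radians, p2)
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * radius * asin(sqrt(a))


def nearest_warehouse(store_loc, warehouses=WAREHOUSES):
    """Return (name, miles) of the warehouse closest to store_loc."""
    name = min(warehouses, key=lambda n: haversine_miles(store_loc, warehouses[n]))
    return name, haversine_miles(store_loc, warehouses[name])


print(nearest_warehouse((34.0, -86.0)))
```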

In [9]:
new_df['shorter_distance'] = new_df.apply(shorter_distance_to_warehouse, axis=1)

In [10]:
new_df = new_df[['Store', 'shorter_distance']]

## Distance to closest warehouse

In [11]:
new_df.head()

Unnamed: 0,Store,shorter_distance
0,A,623.173245
1,B,91.462267
2,C,1065.81719
3,D,559.448123
4,E,96.659644


## 2. Combine distance from warehouse with number of items

In [12]:
distances_df = input_dfItems.merge(new_df, on ='Store')

distances_df['sumPerStore'] = distances_df['Assigned']*distances_df['shorter_distance']

## 3. Summarize

In [13]:
# Select the column before aggregating so only sumPerStore is summed
test = distances_df.groupby(by='Item')['sumPerStore'].sum()
test

Item
1     468252.136867
2     521754.915047
3     419114.863699
4     472036.061198
5     613912.020098
6     338841.852195
7     654964.615885
8     465743.921370
9     386097.719547
10    386261.377667
Name: sumPerStore, dtype: float64

## 4. Compare to existing results

In [14]:

#Load existing results ('Item' stays a regular column so join(on='Item') can match it to test's index)
solution_df = pd.read_csv("./15_solutions.csv", encoding = "latin")

solution_df = solution_df.join(test, on='Item')
solution_df['Diff'] = solution_df['TotalMiles'] - solution_df['sumPerStore']
solution_df

Unnamed: 0,Item,TotalMiles,sumPerStore,Diff
0,1,467764.318078,468252.136867,-487.818788
1,2,520950.356459,521754.915047,-804.558587
2,3,418529.434085,419114.863699,-585.429614
3,4,471483.324599,472036.061198,-552.736599
4,5,613089.857982,613912.020098,-822.162116
5,6,338432.743812,338841.852195,-409.108383
6,7,654162.976293,654964.615885,-801.639592
7,8,465070.542813,465743.92137,-673.378557
8,9,385441.880195,386097.719547,-655.839351
9,10,385761.522225,386261.377667,-499.855442


In [15]:
# The results are close, but not the same (the geopy totals are roughly 0.1-0.2% higher), due to the differences in the distance calculation between Alteryx & geopy
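
To put the gap in perspective, the relative difference per item can be computed directly from the comparison table above. The sketch below copies the `TotalMiles` and `sumPerStore` values from that table; the variable names are illustrative. The notebook attributes the gap to the distance calculation, but it does not state which Earth model Alteryx uses, so the cause is not verified here.

```python
# (Item, TotalMiles from the solution file, sumPerStore from this notebook)
rows = [
    (1, 467764.318078, 468252.136867),
    (2, 520950.356459, 521754.915047),
    (3, 418529.434085, 419114.863699),
    (4, 471483.324599, 472036.061198),
    (5, 613089.857982, 613912.020098),
    (6, 338432.743812, 338841.852195),
    (7, 654162.976293, 654964.615885),
    (8, 465070.542813, 465743.921370),
    (9, 385441.880195, 386097.719547),
    (10, 385761.522225, 386261.377667),
]

# Percentage by which the notebook's total exceeds the reference total.
rel_diff_pct = {
    item: 100 * (computed - expected) / expected
    for item, expected, computed in rows
}

for item, pct in rel_diff_pct.items():
    print(f"Item {item:>2}: {pct:+.3f}%")
```

Every item lands between 0.1% and 0.2%, which is a consistent, small offset rather than a handful of stores being assigned to a different warehouse.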