### Install wheels for Basemap
- install PROJ: https://proj.org/install.html#install
- on that page, find the Windows section, click OSGeo4W, download the 64-bit installer, and follow the Windows instructions to install PROJ
- install the basemap wheel and the pyproj wheel from: https://www.lfd.uci.edu/~gohlke/pythonlibs/
- find: Basemap: a matplotlib toolkit for plotting 2D data on maps based on GEOS.
- find: Pyproj: an interface to the PROJ library for cartographic transformations.
- Important: pip install numpy --upgrade

### Install wheels for geopandas 
Installing geopandas and its dependencies manually
refer to: https://stackoverflow.com/questions/34427788/how-to-successfully-install-pyproj-and-geopandas

1. First and most important: do not try to directly pip install or conda install any of the dependencies – if you do, they will fail in some way later, often silently or obscurely, making troubleshooting difficult. If any are already installed, uninstall them now.

2. Download the wheels for GDAL, Fiona, pyproj, rtree, and shapely from Gohlke. Make sure you choose the wheel files that match your architecture (64-bit) and Python version (2.7 or 3.5). If Gohlke mentions any prerequisites in his descriptions of those 5 packages, install the prerequisites now (there might be a C++ redistributable or something similar listed there)

3. If OSGeo4W, GDAL, Fiona, pyproj, rtree, or shapely is already installed, uninstall it now. The GDAL wheel contains a complete GDAL installation – don’t use it alongside OSGeo4W or other distributions.

4. Open a command prompt and change directories to the folder where you downloaded these 5 wheels.

5. pip install the GDAL wheel file you downloaded. Your actual command will be something like: pip install GDAL-1.11.2-cp27-none-win_amd64.whl

6. Add the new GDAL path to the Windows PATH environment variable, something like C:\Anaconda\Lib\site-packages\osgeo

7. pip install your Fiona wheel file, then your pyproj wheel file, then rtree, and then shapely.

8. Now that GDAL and geopandas’s dependencies are all installed, you can just pip install geopandas from the command prompt.
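
Once the wheels are installed, a quick import check can confirm that each package is visible to the interpreter. The snippet below is a minimal sketch, not part of the original instructions: the helper name `check_imports` is hypothetical, and the list uses the usual import names (`osgeo` is the import name of GDAL).

```python
import importlib

def check_imports(module_names):
    """Return {module_name: True/False} depending on whether each import succeeds."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            status[name] = True
        except Exception:
            # ImportError, or a loader error such as a missing DLL on Windows
            status[name] = False
    return status

# Import names of the packages installed above (assumed list, adjust as needed).
packages = ['osgeo', 'fiona', 'pyproj', 'rtree', 'shapely', 'geopandas']
for name, ok in check_imports(packages).items():
    print('{:<10} {}'.format(name, 'OK' if ok else 'MISSING'))
```

A package reported as MISSING usually means the wheel did not match the Python version or architecture, or that a required DLL is not on PATH.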

# MilkRun Initial Routing Modeling

In [125]:
# import general packages:
from openpyxl import load_workbook
import win32com.client
import numpy as np
import pandas as pd
from pandas import Grouper
from pandas import Timestamp
import os
import io
import datetime as dt
import time 
import feather
import itertools
from math import sqrt
import csv
import dask.dataframe as dd
from datetime import datetime
import timestring
from IPython.core.display import display, HTML
from collections import Counter
from collections import defaultdict

# import modeling packages
from sklearn.cluster import AffinityPropagation
from sklearn.cluster import KMeans
from sklearn import preprocessing, datasets
from sklearn.metrics import pairwise_distances_argmin
from scipy.spatial.distance import cdist,pdist
from scipy import stats
from scipy.sparse import *

# import visualization packages:
from matplotlib import pyplot as plt
# from mpl_toolkits.basemap import Basemap
import seaborn as sns
# import ggplot
%matplotlib inline

# checking path and dir
os.chdir('C:\\Users\\u279014\\Documents\\H_Drive\\7.AA Models\\12.Logistic_Optimization\\data')
os.getcwd()

'C:\\Users\\u279014\\Documents\\H_Drive\\7.AA Models\\12.Logistic_Optimization\\data'

In [126]:
from __future__ import print_function
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp

In [127]:
def riding_distance(riding_distance_matrix, geo):
    """
    Build a distance matrix for the given locations by looking up riding distances.
    :param
        riding_distance_matrix: DataFrame of pairwise riding distances; index and column labels are zip codes of type str
        geo: DataFrame with a 'zip_code' column (values are cast to str before the lookup)
    :returns distance_mat: numpy.ndarray with shape (n, n), n = len(geo), in the same units as riding_distance_matrix
    """
    d_matrix = []
    zipcodes = geo['zip_code'].apply(lambda x: str(x))
    for i in zipcodes:
        d_row = []
        for j in zipcodes:
            d_row.append(riding_distance_matrix.loc[i,j])
        d_matrix.append(d_row)
    return np.asarray(d_matrix)
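
The nested loop in `riding_distance` can also be written as a single label-based selection. The sketch below assumes the same inputs (a square DataFrame with unique string zip-code labels, and a frame with a `zip_code` column); the function name `riding_distance_vectorized` and the toy distances are made up for illustration.

```python
import pandas as pd

def riding_distance_vectorized(riding_distance_matrix, geo):
    """Same lookup as the nested loop, done with one .loc selection."""
    zipcodes = geo['zip_code'].astype(str).tolist()
    # .loc with two label lists returns the sub-matrix in the requested order;
    # a repeated zip code yields a repeated row/column, as in the loop version.
    return riding_distance_matrix.loc[zipcodes, zipcodes].to_numpy()

# Toy data: these distances are invented, not taken from the real matrix.
labels = ['17201', '43512', '43551']
toy_matrix = pd.DataFrame([[0.0, 10.0, 20.0],
                           [11.0, 0.0, 5.0],
                           [21.0, 6.0, 0.0]],
                          index=labels, columns=labels)
toy_geo = pd.DataFrame({'zip_code': [43551, 17201, 43551]})
result = riding_distance_vectorized(toy_matrix, toy_geo)
print(result.shape)
print(result)
```

A zip code that is missing from the matrix raises a `KeyError` in both versions, which is a useful early signal that the distance file needs to be refreshed.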

In [128]:
def load_riding_distance_matrix(path,file):
    riding_distance_matrix = pd.read_excel(os.path.join(path,file)).set_index('zipcode')
    riding_distance_matrix.columns = riding_distance_matrix.columns.astype('str')
    riding_distance_matrix.index = riding_distance_matrix.index.astype('str')
    return riding_distance_matrix

##  Modeling Start >>>>>>
## 1. Data_prep
### 1.1 load saved supplier-cluster dataset (CSV)

### dictionary for osk_hub 

In [129]:
cass_zip_cluster = pd.read_csv('cass_zip_cluster.csv')
cluster_copy = cass_zip_cluster.copy() # make a copy of original dataset
cluster_copy = cluster_copy[cluster_copy.label != -1] # drop rows with label (cluster) = -1, which do not belong to any group
cluster_copy['shipping_date'] = '10-01-2019'

In [130]:
cluster_copy['zip_code'] = cluster_copy.zip_code.astype('str')

In [131]:
cluster_copy.sort_values(by='ship_weight')

Unnamed: 0,zip_code,longitude,latitude,cluster,shipper_name,shipper_state,ship_weight,miles,billed_amount,label,shipping_date
33,30093,-84.17940,33.909952,south_east,DEUTZ CORPORATION,GA,0,0,26.45,5,10-01-2019
56,46619,-86.31341,41.667797,mid_west,JLG CO STAND,IN,0,0,47.73,6,10-01-2019
76,53932,-89.05837,43.407179,mid_west,"ROBBINS MANUFACTURING, INC.",WI,0,0,28.41,6,10-01-2019
15,18020,-75.32938,40.656498,north_east,BOSCH REXROTH,PA,1,144,900.00,3,10-01-2019
51,45891,-84.57871,40.874092,other,EATON CORPORATION PLC,OH,75,450,67.17,6,10-01-2019
...,...,...,...,...,...,...,...,...,...,...,...
58,46619,-86.31341,41.667797,mid_west,JLG PA CO ST,IN,253144,6504,19755.90,6,10-01-2019
37,43402,-83.65795,41.388519,other,"ROSENBOOM MACHINE & TOOL, INC.",OH,649690,9066,22226.42,6,10-01-2019
52,46052,-86.46592,40.047966,mid_west,KOUNS ENGINE,IN,685515,10803,39007.86,6,10-01-2019
60,47201,-85.94560,39.185341,mid_west,CUMMINS,IN,718000,12305,44870.62,6,10-01-2019


In [132]:
cluster_copy = cluster_copy[cluster_copy.ship_weight < 45000].reset_index(drop=True)
cluster_copy

Unnamed: 0,zip_code,longitude,latitude,cluster,shipper_name,shipper_state,ship_weight,miles,billed_amount,label,shipping_date
0,11413,-73.75141,40.670138,north_east,CEVA FM,NY,8763,246,1185.72,0,10-01-2019
1,11413,-73.75141,40.670138,north_east,CEVA LOGISTI,NY,5030,246,1192.71,0,10-01-2019
2,11435,-73.80986,40.700068,north_east,ROSCO INC,NY,1070,486,226.59,0,10-01-2019
3,11735,-73.44151,40.725968,north_east,TAPESWITCH,NY,1156,264,194.31,0,10-01-2019
4,15767,-78.97017,40.954059,north_east,BFG MANUFACTURING SERVICES,PA,1792,656,401.32,1,10-01-2019
...,...,...,...,...,...,...,...,...,...,...,...
61,60069,-87.92717,42.188074,mid_west,HYDRAFORCE,IL,14864,666,1944.72,6,10-01-2019
62,60139,-88.07891,41.920228,mid_west,HYDAC INTERNATIONAL,IL,1794,3900,422.92,6,10-01-2019
63,60188,-88.13688,41.918578,mid_west,NACA LOGISTI,IL,14589,1302,1849.21,6,10-01-2019
64,60191,-87.97688,41.962979,mid_west,POWER GREAT,IL,6878,1294,648.44,6,10-01-2019


### 1.2 choose supplier-cluster to run milkrun Model

### Select top n supplier-cluster

In [133]:
rank = 1 # option for choosing supplier-cluster to run milkrun
label_no = Counter(cluster_copy.label).most_common()[rank-1][0]
cluster = cluster_copy[cluster_copy.label == label_no]

# prepend one warehouse (depot) row to the selected cluster; the Chambersburg row is the one used below
greenville = pd.DataFrame([['54942',-88.53557,44.293820,'mid_west','GREENVILLE_WH','WI',0,0,0,999,'01-01-2019']], 
                          columns=cluster.columns)

chanbersburg = pd.DataFrame([['17201',-77.6614, 39.93112,'east','CHANBERSBURG_WH','PA',0,0,0,999,'01-01-2019']], 
                          columns=cluster.columns)


# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
cass_zip_cluster_copy = pd.concat([chanbersburg, cluster]).reset_index(drop = True)

In [134]:
cass_zip_cluster_copy

Unnamed: 0,zip_code,longitude,latitude,cluster,shipper_name,shipper_state,ship_weight,miles,billed_amount,label,shipping_date
0,17201,-77.6614,39.93112,east,CHANBERSBURG_WH,PA,0,0,0.0,999,01-01-2019
1,43512,-84.36539,41.29037,other,DEFIANCE METAL PRODUCTS,OH,7331,1752,1068.53,6,10-01-2019
2,43551,-83.58904,41.540724,other,KENAKORE SOL,OH,7504,1564,955.59,6,10-01-2019
3,45242,-84.36042,39.242559,other,PMP INDUSTRI,OH,635,424,102.37,6,10-01-2019
4,45414,-84.19381,39.820807,other,WURTH ELECTR,OH,1116,1224,283.44,6,10-01-2019
5,45869,-84.38731,40.452556,other,"THIEMAN QUALITY METAL FAB, INC.",OH,1134,1712,348.0,6,10-01-2019
6,45891,-84.57871,40.874092,other,EATON CORPORATION PLC,OH,75,450,67.17,6,10-01-2019
7,46383,-87.03165,41.47339,mid_west,DYNATECT,IN,20756,589,2131.6,6,10-01-2019
8,46383,-87.03165,41.47339,mid_west,DYNATECT MFG,IN,29098,2380,8658.98,6,10-01-2019
9,46619,-86.31341,41.667797,mid_west,JLG CO STAND,IN,0,0,47.73,6,10-01-2019


### 1.3 Samples Initialization with small selections: 100 locations

In [135]:
path = r'C:\Users\u279014\Documents\H_Drive\7.AA Models\12.Logistic_Optimization\data'
file = r'riding_distance_matrix.xlsx'
riding_distance_matrix = load_riding_distance_matrix(path,file)

In [136]:
cass_zip_toy = cass_zip_cluster_copy[:100]

In [137]:
distance_matrix_toy = riding_distance(riding_distance_matrix, cass_zip_toy)

In [138]:
distance_matrix_toy.shape

(28, 28)

In [139]:
# distance_matrix_toy = distance_on_sphere_numpy(cass_zip_toy)
# df_distance_matrix = pd.DataFrame(distance_matrix_toy,index=cass_zip_toy.zip_code,columns=cass_zip_toy.zip_code)

In [140]:
unique_cass_zip_toy = cass_zip_toy.drop_duplicates(subset=['zip_code'])
unique_distance_matrix_toy = riding_distance(riding_distance_matrix, unique_cass_zip_toy)
# unique_distance_matrix_toy = distance_on_sphere_numpy(unique_cass_zip_toy)
df_unique_distance_matrix = pd.DataFrame(unique_distance_matrix_toy,
                                         index=unique_cass_zip_toy.zip_code,
                                         columns=unique_cass_zip_toy.zip_code)

ship_wight_list_toy = cass_zip_toy.ship_weight.tolist()
sum(ship_wight_list_toy)

276913

## 2. Model_Prep
### I. Initializing Opt-model

In [141]:
def create_data_model(distance_matrix=0, 
                      ship_weight_list = 0, 
                      each_vehicle_capacity = 45000, 
                      num_vehicles = 30,
                      nrLocations = 9):
    """Stores the data for the problem."""
    data = {}
    data['distance_matrix']=distance_matrix
    data['demands'] = ship_weight_list
    data['vehicle_capacities'] = [each_vehicle_capacity]*num_vehicles
    data['num_vehicles'] = num_vehicles
    data['depot']=0
    data['nrLocations'] = nrLocations
    return data
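
OR-Tools' routing solver works with integer arc costs, while the riding-distance matrix in this notebook holds one-decimal values such as 438.2. One common approach, sketched here as an assumption rather than as what the notebook does, is to scale and round the matrix before passing it to `create_data_model`. The helper name `to_integer_matrix` and the `scale` parameter are hypothetical.

```python
def to_integer_matrix(distance_matrix, scale=10):
    """Scale float distances and round them to integers.

    scale=10 keeps 0.1-mile resolution; reported route distances
    then have to be divided by the same factor.
    """
    return [[int(round(d * scale)) for d in row] for row in distance_matrix]

# Two values taken from the distance matrix shown in this notebook.
toy = [[0.0, 438.2],
       [438.1, 0.0]]
scaled = to_integer_matrix(toy)
print(scaled)
```

The shipment weights in `demands` are already integers in this data set, so only the distances would need this treatment.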

### II. Customized model output_NCv-2

In [142]:
""" optimize algorithm for accurate route """
def print_solution_3(data, manager, routing, assignment):
    """Prints assignment on console."""
    total_distance = 0
    total_load = 0
    
    vehicle_routes = dict() # for list out the same truck pick zipcodes

    for vehicle_id in range(data['num_vehicles']):
        index = routing.Start(vehicle_id)
        plan_output = 'Route for vehicle {}:\n'.format(vehicle_id)
        plan_output_backward = 'Route for vehicle {}:\n'.format(vehicle_id) # if backward is shorter path
        route_distance = 0
        route_load = 0
        edge_distance = []
        while not routing.IsEnd(index):
            node_index = manager.IndexToNode(index)
            route_load += data['demands'][node_index]
            plan_output += ' {0} Load({1}) -> '.format(node_index, route_load)
            plan_output_backward += ' {0} Load({1}) <- '.format(node_index, route_load) # if backward is shorter path
            
            previous_index = index            
            index = assignment.Value(routing.NextVar(index))
            
            if vehicle_id in vehicle_routes:
                vehicle_routes[vehicle_id].append(node_index)   # adding zipcodes to same truck
            else:
                vehicle_routes[vehicle_id] = [node_index]
            
            route_distance += routing.GetArcCostForVehicle(previous_index, index, vehicle_id)
            edge_distance.append(routing.GetArcCostForVehicle(previous_index, index, vehicle_id))
        
        # add the destination (depot) to the route

        # Open-route adjustment: the truck does not have to start at the depot.
        # If the leg from the depot to the first supplier is at least as long as the leg
        # from the last supplier back to the depot, the truck starts at the first supplier,
        # so the first leg is removed from the VRP route
        if edge_distance[0] >= edge_distance[-1]:
            vehicle_routes[vehicle_id].append(0)
            vehicle_routes[vehicle_id].pop(0)
            route_distance = route_distance - edge_distance[0]
            plan_output += ' {0} Load({1})\n'.format(manager.IndexToNode(index),route_load)
            plan_output += 'Distance of the route: {} miles\n'.format(route_distance)
            plan_output += 'Load of the route: {}\n'.format(route_load)
            # print(plan_output)
            print(plan_output.replace('0 Load(0) ->  ',''))
            total_distance += route_distance
            total_load += route_load
        
        # otherwise the truck starts from the last supplier: remove the last leg and reverse the route
        else:
            route_distance = route_distance - edge_distance[-1]
            vehicle_routes[vehicle_id] = vehicle_routes[vehicle_id][::-1]
            plan_output_backward += ' {0} Load({1})\n'.format(manager.IndexToNode(index),route_load)
            plan_output_backward += 'Distance of the route: {} miles\n'.format(route_distance)
            plan_output_backward += 'Load of the route: {}\n'.format(route_load)
            print(plan_output_backward)
            total_distance += route_distance
            total_load += route_load
    print('Total distance of all routes: {} miles'.format(total_distance))
    print('Total load of all routes: {}'.format(total_load))
    return vehicle_routes
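
The open-route adjustment inside `print_solution_3` can be isolated into a small pure function, which makes the rule easier to test. This is a sketch of the same rule with a hypothetical helper name (`trim_depot_leg`) and made-up leg lengths.

```python
def trim_depot_leg(nodes, edge_distance):
    """
    nodes: node ids visited by one vehicle, starting at the depot, e.g. [0, 3, 5]
    edge_distance: arc lengths along depot -> ... -> depot (one arc per node)
    Returns (ordered_nodes, distance) for an open route that ends at the depot.
    """
    total = sum(edge_distance)
    if edge_distance[0] >= edge_distance[-1]:
        # Drop the outbound leg: start at the first supplier, finish at the depot.
        return nodes[1:] + [nodes[0]], total - edge_distance[0]
    # Drop the return leg and reverse, so the truck starts at the last supplier.
    return nodes[::-1], total - edge_distance[-1]

print(trim_depot_leg([0, 3, 5], [100, 20, 60]))
print(trim_depot_leg([0, 3, 5], [50, 20, 90]))
```

Note that the reversed route reuses the forward arc lengths. That is exact only for a symmetric matrix; the riding distances here are close to symmetric but not identical (17201 to 43512 is 438.2, the reverse is 438.1).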

### III. Running Opt_Model: initialize truck_max_capacity & total truck_available

In [143]:
num_v = 30
num_stops = 7
v_capacity = 45000
n_route_location = 5

In [144]:
# Initiate data problem
_data = create_data_model(distance_matrix=distance_matrix_toy,
                         ship_weight_list=ship_wight_list_toy,
                         each_vehicle_capacity=v_capacity,
                         num_vehicles=num_v,
                        nrLocations=n_route_location)

In [145]:
# Create routing index manager
manager = pywrapcp.RoutingIndexManager(len(_data['distance_matrix']),_data['num_vehicles'],_data['depot'])

In [146]:
# Create Routing Model
routing = pywrapcp.RoutingModel(manager)

In [147]:
# Register transit callback
def distance_callback(from_index, to_index):
    from_node = manager.IndexToNode(from_index)
    to_node = manager.IndexToNode(to_index)
    return _data['distance_matrix'][from_node][to_node]

transit_callback_index = routing.RegisterTransitCallback(distance_callback)

In [148]:
# Define cost of each arc
routing.SetArcCostEvaluatorOfAllVehicles(transit_callback_index)

In [149]:
# dimension_name = 'Distance'
# routing.AddDimension(transit_callback_index,
#         0,  # no slack
#         int(np.sum(data['distance_matrix'])),  # vehicle maximum travel distance
#         True,  # start cumul to zero
#         dimension_name)
# distance_dimension = routing.GetDimensionOrDie(dimension_name)
# distance_dimension.SetGlobalSpanCostCoefficient(5)

## <<< try adding dimension for stops limitation >>>

In [150]:
# Add count_stops constraint
count_stop_callback = routing.RegisterUnaryTransitCallback(lambda index: 1)
dimension_name = 'Counter'
routing.AddDimension(count_stop_callback,
                     0,
                     v_capacity,
                     True,
                     'Counter'
                    )

True

In [151]:
counter_dimension = routing.GetDimensionOrDie(dimension_name)

# add a solver constraint limiting the number of stops per vehicle
for vehicle_id in range(num_v):
    index = routing.End(vehicle_id)
    solver = routing.solver()
    solver.Add(counter_dimension.CumulVar(index) <= num_stops)

#    solver.Add(counter_dimension.CumulVar(index).SetRange(3, 7)) 
#    Above >> [unsuccessful] set a range of stops  

In [152]:
# Add Capacity constraint
def demand_callback(from_index):
    from_code = manager.IndexToNode(from_index)
    return _data['demands'][from_code]

demand_callback_index = routing.RegisterUnaryTransitCallback(demand_callback)

routing.AddDimensionWithVehicleCapacity(demand_callback_index,
        0,  # null capacity slack
        _data['vehicle_capacities'],  # vehicle maximum capacities
        True,  # start cumul to zero
        'Capacity')

# Allow each non-depot node to be dropped at a fixed penalty (disjunction),
# e.g. when its weight cannot be fitted within the truck capacity
penalty = 1000
for node in range(1, len(_data['distance_matrix'])):
    routing.AddDisjunction([manager.NodeToIndex(node)], penalty)
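
After solving, it can be worth re-checking the returned routes against the capacity and stop limits independently of the solver. The function below is a minimal sketch under stated assumptions: `check_routes` is a hypothetical helper, node 0 is the depot, and only supplier stops are counted (whether the depot counts toward the limit has to be aligned with how the `Counter` dimension is defined).

```python
def check_routes(routes, demands, capacity, max_stops):
    """Return a list of (vehicle_id, problem) tuples; an empty list means all routes pass."""
    problems = []
    for vehicle_id, nodes in sorted(routes.items()):
        stops = [n for n in nodes if n != 0]      # node 0 is the depot
        load = sum(demands[n] for n in stops)
        if load > capacity:
            problems.append((vehicle_id, 'capacity'))
        if len(stops) > max_stops:
            problems.append((vehicle_id, 'stops'))
    return problems

# Toy routes; the weights are borrowed from rows shown earlier in this notebook.
toy_routes = {24: [27, 11, 0], 25: [7, 8, 0]}
toy_demands = {0: 0, 7: 20756, 8: 29098, 11: 1006, 27: 161}
print(check_routes(toy_routes, toy_demands, capacity=45000, max_stops=7))
```

In this toy case truck 25 carries 49,854 and is flagged, which is the kind of route the capacity dimension is meant to rule out.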

In [153]:
# Setting first solution heuristic.
search_parameters = pywrapcp.DefaultRoutingSearchParameters()
search_parameters.first_solution_strategy = (routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)

In [154]:
# Solve the problem.
assignment = routing.SolveWithParameters(search_parameters)

In [155]:
if assignment:
    route_dictionary = print_solution_3(_data,manager,routing,assignment)

Route for vehicle 0:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 1:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 2:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 3:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 4:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 5:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 6:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 7:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 8:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 9:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 10:
 0 Load(0)
Distance of the route: 0 miles
Load of the route: 0

Route for vehicle 11:
 0 Load(0)
Distance of the route: 0 miles


## 3. Result Visualization to PowerBI

In [156]:
def route_schedule(route_dictionary):
    """ generate truck:pick_node map as a DataFrame """
    rows = []
    for truck, nodes in route_dictionary.items():
        if len(nodes) == 1: # skip dummy trucks (e.g. #0, #1) that only visit the depot
            continue
        # collect rows in a list; DataFrame.append was removed in pandas 2.0
        rows.extend({'truck_number': truck, 'pick_node': node} for node in nodes)
    return pd.DataFrame(rows, columns=['truck_number', 'pick_node'])

In [157]:
route_schedule = route_schedule(route_dictionary)

In [158]:
route_schedule

Unnamed: 0,truck_number,pick_node
0,24,27
1,24,11
2,24,15
3,24,14
4,24,23
5,24,0
6,25,26
7,25,24
8,25,25
9,25,7


### Note: input of Graph must be unique distance matrix 

In [159]:
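def distance_index(df,x):
    '''
    param:
        df: distance matrix with UNIQUE index & columns
        x: pair (current stop zip code, next stop zip code)
    return:
        distance from the current stop to the next stop;
        0 if the pair is not in the matrix (e.g. the last stop has no next stop)
    '''
    src, dst = tuple(x)
    try:
        return df.loc[src, dst]
    except (KeyError, TypeError):
        return 0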

In [160]:
cass_zip_toy.head()

Unnamed: 0,zip_code,longitude,latitude,cluster,shipper_name,shipper_state,ship_weight,miles,billed_amount,label,shipping_date
0,17201,-77.6614,39.93112,east,CHANBERSBURG_WH,PA,0,0,0.0,999,01-01-2019
1,43512,-84.36539,41.29037,other,DEFIANCE METAL PRODUCTS,OH,7331,1752,1068.53,6,10-01-2019
2,43551,-83.58904,41.540724,other,KENAKORE SOL,OH,7504,1564,955.59,6,10-01-2019
3,45242,-84.36042,39.242559,other,PMP INDUSTRI,OH,635,424,102.37,6,10-01-2019
4,45414,-84.19381,39.820807,other,WURTH ELECTR,OH,1116,1224,283.44,6,10-01-2019


In [161]:
route_in_weight = route_schedule.merge(cass_zip_toy,left_on='pick_node',right_index=True,how='left')
route_in_weight['next_zip_code'] = route_in_weight.groupby(['truck_number'])['zip_code'].shift(-1)
route_in_weight['next_shipper_name'] = route_in_weight.groupby(['truck_number'])['shipper_name'].shift(-1)

route_in_weight['milk_run_distance'] = route_in_weight[['zip_code','next_zip_code']].apply(lambda x: round(distance_index(df_unique_distance_matrix,x)),axis=1)
route_in_weight['stop_number'] = route_in_weight.groupby('truck_number').cumcount()

In [162]:
route_in_weight

Unnamed: 0,truck_number,pick_node,zip_code,longitude,latitude,cluster,shipper_name,shipper_state,ship_weight,miles,billed_amount,label,shipping_date,next_zip_code,next_shipper_name,milk_run_distance,stop_number
0,24,27,61109,-89.05595,42.213439,mid_west,"BERGSTROM, INC.",IL,161,2130,282.32,6,10-01-2019,53027.0,HARTFORD FIN,104.0,0
1,24,11,53027,-88.37332,43.313361,mid_west,HARTFORD FIN,WI,1006,758,142.35,6,10-01-2019,53154.0,AAA SALES,51.0,1
2,24,15,53154,-87.8992,42.884347,mid_west,AAA SALES,WI,25000,715,2100.0,6,10-01-2019,53154.0,A A A SALES,0.0,2
3,24,14,53154,-87.8992,42.884347,mid_west,A A A SALES,WI,2514,715,315.5,6,10-01-2019,60069.0,HYDRAFORCE,55.0,3
4,24,23,60069,-87.92717,42.188074,mid_west,HYDRAFORCE,IL,14864,666,1944.72,6,10-01-2019,17201.0,CHANBERSBURG_WH,663.0,4
5,24,0,17201,-77.6614,39.93112,east,CHANBERSBURG_WH,PA,0,0,0.0,999,01-01-2019,,,0.0,5
6,25,26,60191,-87.97688,41.962979,mid_west,POWER GREAT,IL,6878,1294,648.44,6,10-01-2019,60139.0,HYDAC INTERNATIONAL,12.0,0
7,25,24,60139,-88.07891,41.920228,mid_west,HYDAC INTERNATIONAL,IL,1794,3900,422.92,6,10-01-2019,60188.0,NACA LOGISTI,2.0,1
8,25,25,60188,-88.13688,41.918578,mid_west,NACA LOGISTI,IL,14589,1302,1849.21,6,10-01-2019,46383.0,DYNATECT,82.0,2
9,25,7,46383,-87.03165,41.47339,mid_west,DYNATECT,IN,20756,589,2131.6,6,10-01-2019,46619.0,JLG CO STAND,45.0,3
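
The `next_zip_code` and `stop_number` columns above come from per-truck `groupby` operations: `shift(-1)` pulls the following row's value within each truck, and `cumcount()` numbers the stops from 0. A small self-contained sketch with a handful of zip codes from the table (the frame `stops` is illustrative only):

```python
import pandas as pd

stops = pd.DataFrame({
    'truck_number': [24, 24, 24, 25, 25],
    'zip_code': ['61109', '53027', '17201', '60191', '60139'],
})

# Within each truck, the next stop is the zip code of the following row;
# the last stop of every truck has no successor and becomes NaN.
stops['next_zip_code'] = stops.groupby('truck_number')['zip_code'].shift(-1)

# Sequential stop number per truck, starting at 0.
stops['stop_number'] = stops.groupby('truck_number').cumcount()
print(stops)
```

Because the shift is done per group, the last stop of truck 24 does not pick up the first stop of truck 25.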


In [163]:
route_in_weight.to_csv(r'C:\Users\u279014\Documents\H_Drive\7.AA Models\12.Logistic_Optimization\data\route_in_weight.csv',index=True,index_label='time_sequence')
# route_in_weight.to_csv(r'S:\CORP-Share\DEPT\IT\DT-AA\FY20\GPSC\UseCases\8. Logistics Route Optimization\route_in_weight.csv',index=True,index_label='time_sequence')

##  Analytical Result: Miles & Cost Saving Comparison

In [164]:
# distance matrix
df_unique_distance_matrix

zip_code,17201,43512,43551,45242,45414,45869,45891,46383,46619,46788,...,53932,54110,54456,54481,60056,60069,60139,60188,60191,61109
17201,0.0,438.2,393.7,424.4,408.0,425.1,448.2,582.7,539.7,477.3,...,797.2,822.3,924.1,874.4,654.3,661.9,648.9,650.4,646.2,709.9
43512,438.1,0.0,48.5,162.3,112.2,73.9,38.8,156.8,118.2,48.3,...,383.0,408.1,509.9,460.2,240.1,247.7,234.7,236.1,232.0,295.7
43551,390.8,50.7,0.0,187.8,137.7,103.5,87.5,199.8,156.8,89.8,...,414.3,439.4,541.2,491.5,271.3,279.0,266.0,267.4,263.2,326.9
45242,423.9,161.7,185.6,0.0,50.0,100.8,142.0,278.6,258.2,191.5,...,475.0,500.0,601.9,552.1,332.0,339.6,326.7,328.1,323.9,387.6
45414,407.9,112.1,136.0,50.5,0.0,51.3,92.4,231.1,214.4,141.9,...,463.4,488.4,590.2,540.5,320.4,328.0,315.1,316.5,312.3,376.0
45869,426.3,73.7,102.0,101.3,51.1,0.0,35.2,173.9,157.2,84.7,...,400.2,425.2,527.1,477.3,257.2,264.8,251.9,253.3,249.1,312.8
45891,450.0,38.6,85.3,142.7,92.6,35.2,0.0,146.8,130.1,49.6,...,373.1,398.2,500.0,450.2,230.1,237.8,224.8,226.2,222.0,285.7
46383,583.6,157.6,204.3,279.7,231.3,173.9,147.3,0.0,45.3,121.2,...,228.6,253.6,355.5,305.7,85.6,93.2,80.3,81.7,77.5,139.9
46619,542.2,118.3,156.4,259.1,214.9,157.6,130.9,45.4,0.0,104.9,...,261.5,286.5,388.3,338.6,118.5,126.1,113.1,114.6,110.4,174.1
46788,476.8,48.5,88.1,190.1,140.0,82.6,47.5,120.9,104.2,0.0,...,347.2,372.2,474.0,424.3,204.2,211.8,198.9,200.3,196.1,259.8


In [165]:
# routing work-order
route_in_weight[['truck_number','shipper_name','zip_code','milk_run_distance','next_shipper_name','next_zip_code','ship_weight','miles']]

Unnamed: 0,truck_number,shipper_name,zip_code,milk_run_distance,next_shipper_name,next_zip_code,ship_weight,miles
0,24,"BERGSTROM, INC.",61109,104.0,HARTFORD FIN,53027.0,161,2130
1,24,HARTFORD FIN,53027,51.0,AAA SALES,53154.0,1006,758
2,24,AAA SALES,53154,0.0,A A A SALES,53154.0,25000,715
3,24,A A A SALES,53154,55.0,HYDRAFORCE,60069.0,2514,715
4,24,HYDRAFORCE,60069,663.0,CHANBERSBURG_WH,17201.0,14864,666
5,24,CHANBERSBURG_WH,17201,0.0,,,0,0
6,25,POWER GREAT,60191,12.0,HYDAC INTERNATIONAL,60139.0,6878,1294
7,25,HYDAC INTERNATIONAL,60139,2.0,NACA LOGISTI,60188.0,1794,3900
8,25,NACA LOGISTI,60188,82.0,DYNATECT,46383.0,14589,1302
9,25,DYNATECT,46383,45.0,JLG CO STAND,46619.0,20756,589


In [166]:
route_in_weight.shipper_state.unique()

array(['IL', 'WI', 'PA', 'IN', 'OH'], dtype=object)

In [167]:
total_tmc_miles = route_in_weight.miles.sum()
total_milk_miles = route_in_weight.milk_run_distance.sum()
miles_saving = (total_tmc_miles-total_milk_miles)
print('-original_miles:{0} \n-milkrun_miles:{1}\n-miles reduction:{2}'.format(total_tmc_miles,total_milk_miles,miles_saving))

-original_miles:36011 
-milkrun_miles:4025.0
-miles reduction:31986.0


##  <<<<<<  Modeling Completed

## Financial Impact >>>>>>

In [168]:
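def load_data(path,file,sheet_name = None):
    # read the requested sheets (returned as a dict of DataFrames); use the argument, not the global sheet_names
    df = pd.read_excel(os.path.join(path,file),sheet_name=sheet_name)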
    df = pd.concat(df[frame] for frame in df.keys())
    df.reset_index(drop=True, inplace=True)
    df.to_feather(os.path.join(path,'tmc_feather'))
    return feather.read_dataframe(os.path.join(path,'tmc_feather'))
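
When `pd.read_excel` is given a list of sheet names (or `None`), it returns a dict that maps each sheet name to a DataFrame, which is why `load_data` concatenates the values. The sketch below imitates that dict with made-up frames so the stacking step can be seen without the Excel file; the `lane` and `rate` columns and their values are hypothetical.

```python
import pandas as pd

# Stand-in for the dict returned by pd.read_excel(..., sheet_name=[...]).
sheets = {
    'Phase 1': pd.DataFrame({'lane': ['IL-WI'], 'rate': [700.0]}),
    'Phase 2': pd.DataFrame({'lane': ['OH-WI', 'IN-WI'], 'rate': [1100.0, 850.0]}),
}

# Stack all sheets into one frame and renumber the rows from 0.
combined = pd.concat(list(sheets.values()), ignore_index=True)
print(combined)
```

`ignore_index=True` has the same effect as the separate `reset_index(drop=True)` call in `load_data`; a clean 0..n-1 index is also what `to_feather` expects.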

In [169]:
path = r'C:\Users\u279014\Documents\H_Drive\7.AA Models\12.Logistic_Optimization\data'
file = r'TMC_freight_rate.xlsx'
sheet_names = ['Phase 1','Phase 2','Phase 3','Phase 4','Phase 5']

In [170]:
df = load_data(path=path,file=file,sheet_name=sheet_names)

In [171]:
# standardize dataframe column names
def col_name(df):
    """
    Trim the data_frame column names to a uniform format:
    lower case, hyphens removed, spaces replaced with underscores,
    parentheses and double quotes removed
    param df:
        raw data set from the share drive
    return:
        the same data set with the new column names
    """
    # regex=False treats '(' and ')' as literal characters
    df.columns = df.columns.str.strip().str.lower().str.replace('-','',regex=False).str.replace(' ', '_',regex=False).\
                    str.replace('(', '',regex=False).str.replace(')', '',regex=False).str.replace('"','',regex=False)
    return df
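
To see what the chain of replacements in `col_name` does to a single header, the same steps can be applied to one string. This is a sketch: `normalize_column_name` is a hypothetical helper, and the raw header is a guess at what the spreadsheet column might look like, chosen so that it maps to a column name used later in `clean_tmc`.

```python
def normalize_column_name(name):
    """Apply the same steps as col_name to one header string."""
    name = name.strip().lower()
    name = name.replace('-', '')      # hyphens are removed, not replaced
    name = name.replace(' ', '_')     # spaces become underscores
    for ch in '()"':                  # parentheses and double quotes are dropped
        name = name.replace(ch, '')
    return name

raw_header = ' Market Rate Over Jan 2019-Mar 2020 '   # assumed raw header
print(normalize_column_name(raw_header))
```

Since the hyphen is removed before spaces are replaced, a header written as "2019 - Mar" (with spaces around the hyphen) would end up with a double underscore, which is worth keeping in mind when column lookups fail.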

In [172]:
""" Slice tmc """
def clean_tmc(df, sink_state = 'WI', source_states = ('IL',)):
    """
    parameter: 
        df: original TMC dataset
        sink_state: destination state code; only one state allowed
        source_states: list-like of shipping state codes; multiple states allowed
    return:
        cleaned TMC data with the mean freight_cost from each source state to sink_state
    """
    # standardize col names
    df = col_name(df)
    
    # drop rows if all cols are nan
    df.dropna(how='all',subset=['market_rate_over_quarter_decmar',
       'market_rate_over_jan_2019mar_2020',
       'market_rate_all_offers_jan_2019_mar_2020_no_fb',
       'market_rate_all_offers_jan_2019_mar_2020_with_fb'],inplace=True)
    
    # generate freight_cost = market_rate_all_offers_jan_2019_mar_2020_no_fb or max of all
    df['freight_cost'] = np.round(np.where(df.market_rate_all_offers_jan_2019_mar_2020_no_fb.isnull(),
                               np.max(df,axis=1),
                               df.market_rate_all_offers_jan_2019_mar_2020_no_fb),2)  
    df['source_state'] = df.lane.apply(lambda x: x[:2]) # find source state short code
    df['sink_state'] = df.lane.apply(lambda x: x[-2:]) # find sink state short code
    
    df = df[df.source_state.isin(source_states)] # slice only source state
    df = df[df.sink_state.str.contains(sink_state)] # slice to include destination state only
    df = df.groupby(['source_state','sink_state'])['freight_cost'].mean().reset_index() # average duplicate source states to the same destination
    return df
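
The last steps of `clean_tmc` (split the lane into source and sink state, filter, then average per state pair) can be tried on a toy table. The lane format `'IL-WI'` and all rates below are assumptions for illustration; the real `lane` column only needs to start with the source code and end with the sink code.

```python
import pandas as pd

rates = pd.DataFrame({
    'lane': ['IL-WI', 'IL-WI', 'OH-WI', 'OH-PA'],      # hypothetical lane strings
    'freight_cost': [700.0, 760.0, 1150.0, 900.0],     # made-up rates
})

rates['source_state'] = rates['lane'].str[:2]    # first two characters
rates['sink_state'] = rates['lane'].str[-2:]     # last two characters

# Keep only the requested source states and the single destination state.
to_wi = rates[rates['source_state'].isin(['IL', 'OH']) & (rates['sink_state'] == 'WI')]

# Several lanes can share a state pair, so average them.
avg = to_wi.groupby(['source_state', 'sink_state'])['freight_cost'].mean().reset_index()
print(avg)
```

The exact comparison `== 'WI'` is slightly stricter than the `str.contains(sink_state)` used in `clean_tmc`; for two-letter codes the two give the same rows.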

In [173]:
# generate cleaned TMC dataset
source_states = cluster.shipper_state.unique()
tmc = clean_tmc(df, sink_state='WI', source_states = source_states)

In [174]:
tmc

Unnamed: 0,source_state,sink_state,freight_cost
0,IL,WI,729.42
1,IN,WI,869.9
2,OH,WI,1162.51
3,WI,WI,653.1


In [175]:
# updating full truck load cost
route_in_weight['milk_run_cost'] = 0
TL_cost = np.max(tmc.freight_cost)
route_in_weight.loc[route_in_weight.groupby('truck_number').tail(1).index,'milk_run_cost'] = TL_cost
route_in_weight.to_csv(r'C:\Users\u279014\Documents\H_Drive\7.AA Models\12.Logistic_Optimization\data\route_in_weight.csv',index=True,index_label='time_sequence')

In [176]:
truck_used = len(route_in_weight.truck_number.unique())
total_tmc_billed = route_in_weight.billed_amount.sum()
total_milk_cost = round(np.max(tmc.freight_cost)*truck_used,2)
# total_milk_cost = round(float(tmc.freight_cost)*truck_used,2)
cost_saving = round((total_tmc_billed - total_milk_cost),2)
print('-original_cost:{0} \n-milkrun_cost:{1}\n-cost reduction:{2}'.format(total_tmc_billed,total_milk_cost,cost_saving))

-original_cost:32256.949999999993 
-milkrun_cost:5812.55
-cost reduction:26444.4
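
The figures above can be reproduced by hand: the milk-run cost is the highest lane rate in `tmc` (OH to WI, 1162.51) multiplied by the number of trucks used, and 5812.55 / 1162.51 corresponds to five trucks. A minimal sketch of that arithmetic (the helper name `milk_run_savings` is hypothetical):

```python
def milk_run_savings(billed_total, rate_per_truck, trucks_used):
    """Return (milk_run_cost, cost_saving), both rounded to cents."""
    milk_cost = round(rate_per_truck * trucks_used, 2)
    return milk_cost, round(billed_total - milk_cost, 2)

# Billed total and rate are taken from the outputs above; 5 trucks is inferred from them.
milk_cost, saving = milk_run_savings(32256.95, 1162.51, 5)
print(milk_cost, saving)
```

Pricing every truck at the maximum rate makes the milk-run cost a conservative estimate, so the reported saving is, under that assumption, a lower bound.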


### Add potential Oshkosh Hubs to the route

In [177]:
import sys
sys.path.insert(0, '../main')

In [178]:
import clustering_main as cm

In [179]:
def hub_dict(path, file, destination_list, route_in_weight, inbound_indicator='INBOUND'):
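    """
    param:
        path: folder containing the invoice file
        file: Cass FY19 Invoice Detail.csv
        destination_list: list of destination city names
        route_in_weight: routing result DataFrame, used to filter shippers by name and zip code
        inbound_indicator: str
    return:
        hub_dict: dictionary, {supplier_name: {osk_warehouses...}}
        df_hub_dict: DataFrame of shipper/destination pairs for shippers on the routes
    """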
    _data = cm.ETL_data(path=path).col_name(file=file)
    _data = _data[_data.inbound_outbound_indicator == inbound_indicator]
    df_hub_dict = _data[_data.destination_city.isin(destination_list)][['shipper_name', 'shipper_city', 'shipper_state', 'shipper_zip', 'destination_city', 'destination_state', 'destination_zip']]
    df_hub_dict = df_hub_dict.drop_duplicates(subset=['shipper_name', 'destination_city'])
    df_hub_dict = df_hub_dict[df_hub_dict.shipper_name.isin(set(route_in_weight.shipper_name))]
    df_hub_dict = df_hub_dict[df_hub_dict.shipper_zip.isin(set(route_in_weight.zip_code))]

    hub_dict = defaultdict(set)
    for sn, dc in zip(_data.shipper_name, _data.destination_city):
        if dc in destination_list:
            hub_dict[sn].add(dc)
        else:
            pass
    return hub_dict, df_hub_dict
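
The dictionary part of `hub_dict` is a `defaultdict(set)` that collects, for each supplier, the candidate hub cities it already ships to. A stand-alone sketch of that step (the helper name `build_hub_dict` is hypothetical; the supplier and city pairs are illustrative, with `CHICAGO` added as a made-up non-hub destination):

```python
from collections import defaultdict

def build_hub_dict(shipper_names, destination_cities, destination_list):
    """Map each shipper to the set of hub cities (from destination_list) it ships to."""
    hubs = defaultdict(set)
    for name, city in zip(shipper_names, destination_cities):
        if city in destination_list:
            hubs[name].add(city)     # a set ignores repeated shipper/city pairs
    return hubs

shippers = ['HARTFORD FIN', 'HARTFORD FIN', 'EATON CORPORATION PLC', 'HARTFORD FIN', 'DYNATECT MFG']
cities = ['GREENVILLE', 'OSHKOSH', 'MILWAUKEE', 'GREENVILLE', 'CHICAGO']
hubs = build_hub_dict(shippers, cities, ['MILWAUKEE', 'OSHKOSH', 'GREENVILLE'])
print(dict(hubs))
```

Shippers that only deliver to cities outside `destination_list` never get a key, so a membership test such as `'DYNATECT MFG' in hubs` tells whether a supplier has any candidate hub.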

In [180]:
path = 'C:\\Users\\u279014\\Documents\\H_Drive\\7.AA Models\\12.Logistic_Optimization\\data'
file = 'Cass FY19 Invoice Detail.csv'
destination_list = ['MILWAUKEE', 'OSHKOSH', 'GREENVILLE']

In [181]:
hub_dictionary, df_hub_dictionary = hub_dict(path=path, file=file, destination_list=destination_list, route_in_weight=route_in_weight)

In [182]:
df_hub_dictionary.to_csv('hub_dictionary.csv', index=False)

In [183]:
route_in_weight

Unnamed: 0,truck_number,pick_node,zip_code,longitude,latitude,cluster,shipper_name,shipper_state,ship_weight,miles,billed_amount,label,shipping_date,next_zip_code,next_shipper_name,milk_run_distance,stop_number,milk_run_cost
0,24,27,61109,-89.05595,42.213439,mid_west,"BERGSTROM, INC.",IL,161,2130,282.32,6,10-01-2019,53027.0,HARTFORD FIN,104.0,0,0.0
1,24,11,53027,-88.37332,43.313361,mid_west,HARTFORD FIN,WI,1006,758,142.35,6,10-01-2019,53154.0,AAA SALES,51.0,1,0.0
2,24,15,53154,-87.8992,42.884347,mid_west,AAA SALES,WI,25000,715,2100.0,6,10-01-2019,53154.0,A A A SALES,0.0,2,0.0
3,24,14,53154,-87.8992,42.884347,mid_west,A A A SALES,WI,2514,715,315.5,6,10-01-2019,60069.0,HYDRAFORCE,55.0,3,0.0
4,24,23,60069,-87.92717,42.188074,mid_west,HYDRAFORCE,IL,14864,666,1944.72,6,10-01-2019,17201.0,CHANBERSBURG_WH,663.0,4,0.0
5,24,0,17201,-77.6614,39.93112,east,CHANBERSBURG_WH,PA,0,0,0.0,999,01-01-2019,,,0.0,5,1162.51
6,25,26,60191,-87.97688,41.962979,mid_west,POWER GREAT,IL,6878,1294,648.44,6,10-01-2019,60139.0,HYDAC INTERNATIONAL,12.0,0,0.0
7,25,24,60139,-88.07891,41.920228,mid_west,HYDAC INTERNATIONAL,IL,1794,3900,422.92,6,10-01-2019,60188.0,NACA LOGISTI,2.0,1,0.0
8,25,25,60188,-88.13688,41.918578,mid_west,NACA LOGISTI,IL,14589,1302,1849.21,6,10-01-2019,46383.0,DYNATECT,82.0,2,0.0
9,25,7,46383,-87.03165,41.47339,mid_west,DYNATECT,IN,20756,589,2131.6,6,10-01-2019,46619.0,JLG CO STAND,45.0,3,0.0


In [184]:
df_hub_dictionary[df_hub_dictionary.shipper_name.isin(set(route_in_weight.shipper_name))]

Unnamed: 0,shipper_name,shipper_city,shipper_state,shipper_zip,destination_city,destination_state,destination_zip
2593,"ROBBINS MANUFACTURING, INC.",FALL RIVER,WI,53932,MILWAUKEE,WI,53207
5620,EATON CORPORATION PLC,VAN WERT,OH,45891,MILWAUKEE,WI,53207
5811,"MILCUT, INC.",MENOMONEE FA,WI,53051,GREENVILLE,WI,54942
5854,HARTFORD FIN,HARTFORD,WI,53027,GREENVILLE,WI,54942
5872,"MILCUT, INC.",MENOMONEE FA,WI,53051,OSHKOSH,WI,54902
5899,DYNATECT MFG,VALPARAISO,IN,46383,GREENVILLE,WI,54942
5938,EATON CORPORATION PLC,VAN WERT,OH,45891,GREENVILLE,WI,54942
6074,RHINEHART FI,SPENCERVILLE,IN,46788,OSHKOSH,WI,54902
6087,HARTFORD FIN,HARTFORD,WI,53027,OSHKOSH,WI,54902
6123,"ROBBINS MANUFACTURING, INC.",FALL RIVER,WI,53932,OSHKOSH,WI,54901
