# LOGISTICS PROBLEM IMPLEMENTATION USING METAHEURISTICS
TWO MAIN COMPONENTS AND THEIR ATTRIBUTES: 
1. ITEMS 
    1. item
    2. size 
    3. delivery_loc
    4. delivery_time (deadline: 1st or 2nd round)
    5. delivery_dispatch (which round it is actually dispatched)
    6. delivery_number (on that route)
    7. bin

2. VEHICLES
    1. bin
    2. size 
    3. available_space
    4. delivery_time_dispatch (in which round will the vehicle be dispatched)
    5. route_dist
    6. bin_penalty
    7. route_cost
    8. distance_per_item

PROBLEM METHODOLOGY:
Items have to be delivered in the desired delivery time (either round 1 or round 2). If an item cannot be packed and delivered on time, it is added (with a penalty) to the next day's list for the same delivery time. The items are grouped by bin, and a route is generated for each bin. The position of each item's delivery location in the generated route order is recorded as that item's delivery number. The cost of the delivery and the distance per item are then calculated.
A fitness test is then applied to rank the outcomes, from which we select a solution.
The combination of items placed in a bin is checked to ensure that the bin is not overfilled (constraint).
Note: the number of bins is fixed.
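
The capacity constraint can be checked directly from the item and vehicle tables. The following is a minimal sketch, assuming the column names listed above (`size`, `bin`); the helper name `overfilled_bins` and the toy allocation are hypothetical and not part of the notebook.

```python
import pandas as pd

def overfilled_bins(df_items, df_vehicles):
    """Return the bins whose packed item sizes exceed the bin size."""
    # Total size packed into each bin; rows whose bin is NaN are dropped by groupby
    load = df_items.groupby('bin')['size'].sum()
    capacity = df_vehicles.set_index('bin')['size']
    # Bins that hold no items get a load of 0
    load = load.reindex(capacity.index, fill_value=0)
    mask = (load > capacity).to_numpy()
    return capacity.index[mask].tolist()

# Hypothetical allocation; sizes are taken from the tables in this notebook
items = pd.DataFrame({'item': [1, 3, 4], 'size': [3, 8, 6], 'bin': [0, 2, 2]})
vehicles = pd.DataFrame({'bin': [0, 1, 2], 'size': [4, 3, 9]})
print(overfilled_bins(items, vehicles))  # bin 2 holds 14 units but has size 9
```

An empty list means that every bin respects its capacity.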


IMPLEMENTATION STEPS:
1. Group the items which have to be delivered in the first and second round
2. Bin packing
3. Group items by bin
4. Generate a route for the items in the bin according to their delivery_loc
5. Set each item's delivery_number to the position of its delivery_loc in the route
6. Calculate cost
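
Steps 1 and 3 are both plain group-by operations. As a small sketch of step 1, assuming the item columns described above and using the item data shown later in this notebook (the variable name `items_by_round` is hypothetical):

```python
import pandas as pd

# Item data as shown in the 1D item table of this notebook
df = pd.DataFrame({
    'item': [0, 1, 2, 3, 4],
    'size': [4, 3, 6, 8, 6],
    'delivery_time': [2, 1, 2, 1, 1],
})

# Step 1: one list of items per delivery round
items_by_round = df.groupby('delivery_time')['item'].apply(list).to_dict()
print(items_by_round)
```

Once the `bin` column has been filled in, grouping on `'bin'` instead of `'delivery_time'` gives step 3 in the same way.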

In [1]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import math as math 
from math import floor
from random import randint
import csv as csv
#to shuffle dataframe
from sklearn.utils import shuffle 
from IPython.display import display, HTML
import scipy 
from scipy.special import comb # comb(n,k, exact=True); scipy.misc.comb is no longer available in current SciPy
from math import inf
from math import exp, expm1
import decimal
import random
# from scipy import special
# from scipy.special import comb
CSS = """
.output {
    flex-direction: row;
}
"""

HTML('<style>{}</style>'.format(CSS))

### Read-in and check data

In [2]:
def read_data(fileName):
    df = pd.read_csv(fileName)
    return df
    
def check_packaging(df):
    rows, cols = df.shape #size of the data set
    return (rows, cols)

def data_check(df, n=3):#n number of items to check 
    df_top_n = df.head(n)
    return (df_top_n)

def check_ns(df):
    ns = df.describe()
    return ns

In [3]:
###FILE NAMES
#ITEMS: 
items = 'items'
bins = 'bins'
items_2D = 'items2D'
bins_2D = 'bins2D'
city_dist = 'city'
dist_mat = 'distance_matrix'

#### 1D Item data

In [4]:
df_items = read_data('%s.csv'%items)
print("rows(%s) x cols(%s) "%check_packaging(df_items))
print()
print("%s"%data_check(df_items))
print()
print(check_ns(df_items))
print()
df_items.set_index('item')

rows(5) x cols(7) 

   item  size  bin  delivery_loc  delivery_time  delivery_dispatch  \
0     0     4  NaN             4              2                NaN   
1     1     3  NaN             2              1                NaN   
2     2     6  NaN             1              2                NaN   

   delivery_number  
0              NaN  
1              NaN  
2              NaN  

           item      size  bin  delivery_loc  delivery_time  \
count  5.000000  5.000000  0.0      5.000000       5.000000   
mean   2.000000  5.400000  NaN      3.000000       1.400000   
std    1.581139  1.949359  NaN      1.581139       0.547723   
min    0.000000  3.000000  NaN      1.000000       1.000000   
25%    1.000000  4.000000  NaN      2.000000       1.000000   
50%    2.000000  6.000000  NaN      3.000000       1.000000   
75%    3.000000  6.000000  NaN      4.000000       2.000000   
max    4.000000  8.000000  NaN      5.000000       2.000000   

       delivery_dispatch  delivery_number  
co

item,size,bin,delivery_loc,delivery_time,delivery_dispatch,delivery_number
0,4,,4,2,,
1,3,,2,1,,
2,6,,1,2,,
3,8,,5,1,,
4,6,,3,1,,


#### Vehicle data

In [5]:
df_vehicles = read_data('%s.csv'%bins)
print("rows(%s) x cols(%s) "%check_packaging(df_vehicles))
print()
print("%s"%data_check(df_vehicles))
print()
print(check_ns(df_vehicles))
print()
df_vehicles.set_index('bin')

rows(3) x cols(8) 

   bin  size  available_space  delivery_time_dispatch  route_dist  \
0    0     4                4                     NaN         NaN   
1    1     3                3                     NaN         NaN   
2    2     9                9                     NaN         NaN   

   bin_penalty  route_cost  distance_per_item  
0          1.0         NaN                NaN  
1          0.8         NaN                NaN  
2          2.0         NaN                NaN  

       bin      size  available_space  delivery_time_dispatch  route_dist  \
count  3.0  3.000000         3.000000                     0.0         0.0   
mean   1.0  5.333333         5.333333                     NaN         NaN   
std    1.0  3.214550         3.214550                     NaN         NaN   
min    0.0  3.000000         3.000000                     NaN         NaN   
25%    0.5  3.500000         3.500000                     NaN         NaN   
50%    1.0  4.000000         4.000000           

bin,size,available_space,delivery_time_dispatch,route_dist,bin_penalty,route_cost,distance_per_item
0,4,4,,,1.0,,
1,3,3,,,0.8,,
2,9,9,,,2.0,,


#### City data 
(Not all states are connected)

In [6]:
df_cityDist = read_data('%s.csv'%city_dist)
print("rows(%s) x cols(%s) "%check_packaging(df_cityDist))
print()
print("%s"%data_check(df_cityDist))
print()
print(check_ns(df_cityDist))
print()
num_cities = df_cityDist.shape[0]
# df_cityDist.set_index('city')
print('Number of cities including depot: ', num_cities)

rows(6) x cols(6) 

     0    1    2    3    4   5
0  NaN  NaN  1.0  3.0  NaN NaN
1  NaN  NaN  2.0  3.0  1.0 NaN
2  1.0  2.0  NaN  4.0  NaN NaN

              0    1         2         3    4    5
count  2.000000  3.0  4.000000  4.000000  1.0  2.0
mean   2.000000  2.0  2.250000  3.000000  1.0  2.0
std    1.414214  1.0  1.258306  0.816497  NaN  0.0
min    1.000000  1.0  1.000000  2.000000  1.0  2.0
25%    1.500000  1.5  1.750000  2.750000  1.0  2.0
50%    2.000000  2.0  2.000000  3.000000  1.0  2.0
75%    2.500000  2.5  2.500000  3.250000  1.0  2.0
max    3.000000  3.0  4.000000  4.000000  1.0  2.0

Number of cities including depot:  6


#### Symmetric distance matrix

In [7]:
df_distMat = read_data('%s.csv'%dist_mat)
print("rows(%s) x cols(%s) "%check_packaging(df_distMat))
print()
print("%s"%data_check(df_distMat))
print()
print(check_ns(df_distMat))
print()
num_cities = df_distMat.shape[0]
# df_cityDist.set_index('city')
print('Number of cities including depot: ', num_cities)

rows(6) x cols(6) 

      0     1     2     3     4     5
0   NaN  36.0  32.0  54.0  20.0  40.0
1  36.0   NaN  22.0  58.0  54.0  67.0
2  32.0  22.0   NaN  36.0  42.0  71.0

              0          1          2          3          4         5
count   5.00000   5.000000   5.000000   5.000000   5.000000   5.00000
mean   36.40000  47.400000  40.600000  58.000000  42.200000  63.00000
std    12.36123  18.132843  18.487834  20.736441  13.236314  21.05944
min    20.00000  22.000000  22.000000  36.000000  20.000000  40.00000
25%    32.00000  36.000000  32.000000  50.000000  42.000000  45.00000
50%    36.00000  54.000000  36.000000  54.000000  45.000000  67.00000
75%    40.00000  58.000000  42.000000  58.000000  50.000000  71.00000
max    54.00000  67.000000  71.000000  92.000000  54.000000  92.00000

Number of cities including depot:  6


# BIN PACKING PROBLEM
<font color='royalblue'>The Bin Packing Problem (BPP) component entails packing the $n$ deliverable items into the minimum number of bins, without exceeding any bin's fixed capacity and with minimum wasted space.
 In this case our bins are the subset of available delivery vehicles into which the items are to be packed.
Note that the items to be distributed are not all the same size.</font>

# One dimensional
<font color='grass'>
In order to simplify the problem, we initially consider the problem to be one dimensional. Only one dimension of the item is not fixed, for example length, whilst the other two dimensions (width and height) remain constant.
</font>



In [8]:
#bin summary
def bin_summary(df_vehicles):
    number_of_bins = df_vehicles.shape[0]
    unused = 0
    partial = 0
    max_fill = 0
    for i in range(number_of_bins):
        av_space = df_vehicles.loc[i,'available_space']
        bin_size = df_vehicles.loc[i, 'size']
        if av_space == bin_size:
            unused = unused + 1
        elif av_space == 0:
            max_fill = max_fill +1
        elif av_space < bin_size and av_space!=0:
            partial = partial+1
            
    print("Number of bins : %d" %(number_of_bins))
    print("Number of partial filled bins : %d" %(partial))
    print("Number of unused bins : %d" %(unused))
    print("Number of max filled bins : %d" %(max_fill))
    return None

In [9]:
#number of items not accounted for
def unpacked_items(df_items):
    number_of_items = df_items.shape[0]
    not_packed = 0
    for i in range(number_of_items):
        if pd.isna(df_items.loc[i, 'bin']): #NaN never compares equal to itself, so test with isna
            not_packed = not_packed +1        
    return not_packed

# GREEDY APPROACH
<font color='darkorange'>A technique which makes locally optimal choices with hope of obtaining the global optimum. 
</font>

## Fit algorithms:  

### First-fit algorithm:
<font color='rebeccapurple'>
The algorithm scans the bins for the first bin with enough space to fit the item. If the current bin has adequate space, the item is allocated to that bin; otherwise the next bin is checked. For the next item, we iterate over the bins again from the FIRST bin (hence first-fit algorithm).
</font>
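
As a compact illustration of the scan described above, here is a list-based sketch that is independent of the DataFrame version in this notebook. The function name `first_fit_lists` is hypothetical; the sizes are the round-1 items and bin sizes from the tables above.

```python
def first_fit_lists(item_sizes, bin_capacities):
    """Return the bin index chosen for each item (None if no bin fits)
    together with the space left in each bin."""
    remaining = list(bin_capacities)  # free space left in each bin
    allocation = []
    for size in item_sizes:
        chosen = None
        # Always restart the scan from the first bin
        for j, space in enumerate(remaining):
            if space >= size:
                chosen = j
                remaining[j] -= size
                break
        allocation.append(chosen)
    return allocation, remaining

allocation, remaining = first_fit_lists([3, 8, 6], [4, 3, 9])
print(allocation, remaining)  # [0, 2, None] [1, 3, 1]
```

The result depends on the order of the items, which is why shuffling the items later yields different candidate solutions.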

In [10]:
# INPUT: subset of items for a particular delivery time. Note these are in 'raw form'; they have to be re-indexed.
        # original items and vehicle dataframes
#OUTPUT: returns the updated item data frame, where the subset of items were allocated bins
        # updated bin which has the summary of space left in the bin
def first_fit(df_subItems, df_items, df_vehicles): #CHANGE1
    number_of_subItems = df_subItems.shape[0]#CHANGE2
    number_of_bins = df_vehicles.shape[0]
    df_subItems.index = range(0,number_of_subItems)#reindexsubitems    #CHANGE3
    for i in range(number_of_subItems): #CHANGE4
        item_no = df_subItems.loc[i,'item']#CHANGE5
        j = 0
        item_allocated = False
        item_size = df_subItems.loc[i,'size']#CHANGE6
        while j< number_of_bins and item_allocated == False:
            available_bin_space = df_vehicles.loc[j,'available_space']
            if available_bin_space >= item_size: #if adequate space in the bin for the item
                item_allocated = True
                df_vehicles.loc[j,'available_space'] = df_vehicles.loc[j,'available_space'] - item_size #update available space
                bin_num = df_vehicles.loc[j,'bin']
                df_items.loc[item_no,'bin'] = bin_num#set the allocated bin for the item  #CHANGE7
            elif available_bin_space < item_size: #if NOT adequate space in the curr bin for the item
                #move to the next bin
                j = j+1 
                if j>=number_of_bins: #if none of the bins are large enough to house the item
                    item_allocated = True
                    df_items.loc[item_no,'bin'] = np.nan     #CHANGE8
    return df_items, df_vehicles

In [11]:
# df_items

In [12]:
# temp1 = df_items[df_items.delivery_time == 1]
# temp1

In [13]:
# df_vehicles

In [14]:
# items, bins = first_fit(temp1, df_items, df_vehicles)
# display(items)


In [15]:
display(bins)

'bins'

### Next-fit algorithm
<font color='rebeccapurple'>After an item has been allocated, the search for the next item's bin starts from the current bin (the one last allocated), NOT from the very first bin. Bins are checked in a loop, wrapping around to the first bin after the last one, and the search stops at the bin just before the current bin. The search space therefore runs from the current bin around to the previous bin.</font>

In [16]:
def next_fit(df_subItems, df_items, df_vehicles):
    number_of_subItems = df_subItems.shape[0]
    number_of_bins = df_vehicles.shape[0]
    df_subItems.index = range(0,number_of_subItems)#reindexsubitems
    j = 0 #current bin: the search for each item starts here
    for i in range(number_of_subItems):
        item_no = df_subItems.loc[i,'item']
        item_allocated = False
        item_size = df_subItems.loc[i,'size']
        bins_checked = 0 #number of bins tried for this item
        while item_allocated == False and bins_checked < number_of_bins:
            available_bin_space = df_vehicles.loc[j,'available_space']
            if available_bin_space >= item_size: #if adequate space in the bin for the item
                item_allocated = True
                df_vehicles.loc[j,'available_space'] = df_vehicles.loc[j,'available_space'] - item_size #update available space
                bin_num = df_vehicles.loc[j,'bin']
                df_items.loc[item_no,'bin'] = bin_num #set the allocated bin for the item
            else: #if NOT adequate space in the curr bin for the item
                #move to the next bin, wrapping around to the first bin after the last
                j = (j+1) % number_of_bins
                bins_checked = bins_checked + 1
        if item_allocated == False: #a full loop found no bin large enough to house the item
            df_items.loc[item_no,'bin'] = np.nan #j is back at the bin the search started from
    return df_items, df_vehicles

### Best-fit algorithm
<font color='rebeccapurple'>Allocate the item to the bin in which the least space is left over after the item is placed (minimum wastage).
</font>
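
A list-based sketch of the best-fit choice, independent of the DataFrame version in this notebook. The function name `best_fit_lists` and the item order are hypothetical; the bin sizes come from the vehicle table above.

```python
def best_fit_lists(item_sizes, bin_capacities):
    """Place each item in the feasible bin that leaves the least free space."""
    remaining = list(bin_capacities)  # free space left in each bin
    allocation = []
    for size in item_sizes:
        # Bins that can still hold the item
        candidates = [j for j, space in enumerate(remaining) if space >= size]
        if candidates:
            # Tightest fit: smallest leftover space after placing the item
            chosen = min(candidates, key=lambda j: remaining[j] - size)
            remaining[chosen] -= size
        else:
            chosen = None  # no bin is large enough
        allocation.append(chosen)
    return allocation, remaining

print(best_fit_lists([3, 6, 8], [4, 3, 9]))  # ([1, 2, None], [4, 0, 3])
```

Replacing `min` with `max` in the selection turns the same sketch into a worst-fit rule.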

In [17]:
def best_fit(df_subItems, df_items, df_vehicles):
    number_of_subItems = df_subItems.shape[0]
    number_of_bins = df_vehicles.shape[0]
    df_subItems.index = range(0,number_of_subItems)#reindexsubitems

    for i in range(number_of_subItems):
        item_no = df_subItems.loc[i,'item']
        j = 0
        item_allocated = False
        item_size = df_subItems.loc[i,'size']
        df_vehicles = df_vehicles.sort_values(by='available_space', ascending=1).reset_index(drop=True)#after each iteration, order bins as per min space, so less wastage
        while j< number_of_bins and item_allocated == False:
            available_bin_space = df_vehicles.loc[j,'available_space']
            if available_bin_space >= item_size: #if adequate space in the bin for the item
                item_allocated = True
                df_vehicles.loc[j,'available_space'] = df_vehicles.loc[j,'available_space'] - item_size #update available space
                bin_num = df_vehicles.loc[j,'bin']
                df_items.loc[item_no,'bin'] = bin_num#set the allocated bin for the item
            elif available_bin_space < item_size: #if NOT adequate space in the curr bin for the item
                #move to the next bin
                j = j+1 
                if j>=number_of_bins: #if none of the bins are large enough to house the item
                    item_allocated = True
                    df_items.loc[item_no,'bin'] = np.nan
    return df_items, df_vehicles

### Worst-fit algorithm
<font color='orange'>Allocate the item to the bin in which the most space is left over after the item is placed (maximum wastage).
</font>

In [18]:
def worst_fit(df_subItems, df_items, df_vehicles):
    number_of_subItems = df_subItems.shape[0]
    number_of_bins = df_vehicles.shape[0]
    df_subItems.index = range(0,number_of_subItems)#reindexsubitems

    for i in range(number_of_subItems):
        item_no = df_subItems.loc[i,'item']
        j = 0
        item_allocated = False
        item_size = df_subItems.loc[i,'size']
        df_vehicles = df_vehicles.sort_values(by='available_space', ascending=0).reset_index(drop=True)#after each iteration, order bins by most available space first, so the item goes to the emptiest bin
        while j< number_of_bins and item_allocated == False:
            available_bin_space = df_vehicles.loc[j,'available_space']
            if available_bin_space >= item_size: #if adequate space in the bin for the item
                item_allocated = True
                df_vehicles.loc[j,'available_space'] = df_vehicles.loc[j,'available_space'] - item_size #update available space
                bin_num = df_vehicles.loc[j,'bin']
                df_items.loc[item_no,'bin'] = bin_num#set the allocated bin for the item
            elif available_bin_space < item_size: #if NOT adequate space in the curr bin for the item
                #move to the next bin
                j = j+1 
                if j>=number_of_bins: #if none of the bins are large enough to house the item
                    item_allocated = True
                    df_items.loc[item_no,'bin'] = np.nan
    return df_items, df_vehicles

# PROBABILISTIC APPROACH
<font color='darkorange'>A technique which makes probabilistic choices with hope of obtaining the global optimum. 
</font>

## GENETIC ALGORITHM:  
##### From: Paper[GGA_FALKENEUR]
<font color='rebeccapurple'>
The Grouping Genetic Algorithm (GGA) is a modification of the genetic algorithm that suits the structure of grouping problems, where the aim is to find a good partition of a set, i.e. to group the elements of the set together.
4 Steps:
1. ENCODING: obtain the initial population for each bin by randomising the order of the items and applying the first-fit algorithm. All the solutions are parents.
2. CROSSOVER: Has 5 sub-steps: 
    1. Select the same random crossing site in two parents.
    2. Inject the contents of the first parent's crossing section into the second parent at its crossing site, creating children.
    3. Remove replicated elements.
    4. Adapt the results so that the constraints are adhered to.
    5. Rank the fitness of the children.
3. MUTATION: Shuffle a small group of items among groups.
4. INVERSION: do a random bit swap.

NOTE: AN UNEDUCATED SEARCH SPACE IS CREATED
</font>
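
Sub-step 2.5 ranks the children by fitness, but this section does not say which measure is used. One option from the grouping-GA literature for bin packing is to average the squared fill ratio over the bins in use, which rewards well-filled bins. The sketch below assumes that measure; the function name `fill_fitness`, the exponent `k=2` and the example numbers are illustrative assumptions, not the notebook's method.

```python
def fill_fitness(fills, capacities, k=2):
    """Average of (fill / capacity) ** k over the bins that are in use."""
    used = [(f, c) for f, c in zip(fills, capacities) if f > 0]
    if not used:
        return 0.0
    return sum((f / c) ** k for f, c in used) / len(used)

# Two ways of packing 12 units into two bins of capacity 9 (hypothetical numbers)
concentrated = fill_fitness([9, 3], [9, 9])  # one full bin, one nearly empty
spread = fill_fitness([6, 6], [9, 9])        # load spread evenly
print(concentrated > spread)  # True: the concentrated packing ranks higher
```

With `k > 1`, a packing that fills some bins completely scores higher than one that spreads the same load evenly, which pushes the search towards emptying bins.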

## STEP 1: ENCODING
1. Generate the initial population for the $k$ bins. For $n$ items, generate $n$ parents. 
2. Apply the first-fit algorithm to each of the $n$ lists. 
3. For each list, we generate a solution for each of the $k$ bins.
4. TABLE: $x$ = LIST$_i$ bins and $y$ = items, filled with bin $j$ accordingly, where $i \in \{0, \dots, n-1\}$ and $j \in \{0, \dots, k-1\}$.
5. OUTPUT: A population of bins to which items are allocated
6. <font color='red'>NOTE: LISTS = THE VARIOUS COMBINATIONS OF FIRST FIT ALGORITHM APPLIED</font>
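
The encoding steps above can be sketched without DataFrames as follows. This is a minimal sketch, assuming one shuffled first-fit pass per parent; the helper names `first_fit_order` and `initial_population` and the `seed` parameter are hypothetical, and the sizes are the round-1 items of this notebook.

```python
import random

def first_fit_order(order, sizes, capacities):
    """First-fit over the items in the given order; returns {item: bin or None}."""
    remaining = list(capacities)
    bins = {}
    for item in order:
        bins[item] = None
        for j, space in enumerate(remaining):
            if space >= sizes[item]:
                bins[item] = j
                remaining[j] -= sizes[item]
                break
    return bins

def initial_population(sizes, capacities, seed=0):
    """n items -> n parents, each produced by first-fit on a shuffled item order."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    items = list(sizes)
    population = []
    for _ in range(len(items)):
        order = items[:]
        rng.shuffle(order)
        population.append(first_fit_order(order, sizes, capacities))
    return population

sizes = {1: 3, 3: 8, 4: 6}  # item -> size
population = initial_population(sizes, [4, 3, 9])
for parent in population:
    print(parent)
```

Each dictionary plays the role of one `list` column in the population table: it maps every item to the bin it was allocated to, or to `None` when no bin could hold it.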

#### Encoding Initial population

In [19]:
#creates a list of new column names
def rename_col(num_cols):
    string = 'list'
    s1 = []
    for i in range(num_cols): # builds the names 'list0', 'list1', ...
        updated_name = string + str(i)
        s1.append(updated_name)
    return s1


In [20]:
#INPUT: vehicle dataframe - vehicles and its capacity
    #        item dataframe - item size
    #        dimension - 1 if 1D items/vehicles and 2 if 2D items/vehicles
def encoding(df_subset_items, df_items, df_vehicles, dimension):
    num_items = df_items.shape[0]# total number of items
    n = df_subset_items.shape[0] #number of items in subset of items
    k = df_vehicles.shape[0]#number of bins
    updated_name = rename_col(n)
    
    if dimension == 1:
        newDF_items = pd.DataFrame(np.zeros((num_items, 3+n))) #initialise a new dataframe of size of the number of total items
        newDF_bins = pd.DataFrame(np.zeros((k,2+n))) #initialise a new dataframe
        newDF_items.rename(columns={0:'item'}, inplace=True)
        newDF_items.rename(columns={1:'size'}, inplace=True)
        newDF_items.rename(columns={2:'delivery_time'}, inplace=True)
        newDF_bins.rename(columns={0:'bin'}, inplace=True)
        newDF_bins.rename(columns={1:'size'}, inplace=True)
        
        newDF_items.loc[:,'item'] = df_items.loc[:,'item']
        newDF_items.loc[:,'size'] = df_items.loc[:,'size']
        newDF_items.loc[:,'delivery_time'] = df_items.loc[:,'delivery_time']#standard to all is the items with their respective sizes
        newDF_bins.loc[:,'bin'] = df_vehicles.loc[:,'bin']#standard to all is the bins with their respective sizes
        newDF_bins.loc[:,'size'] = df_vehicles.loc[:,'size']

        
        #use 1d first fit
        for i in range(n):
            
            df_subset_items_copy = df_subset_items.copy()# make a copy of the item list
            df_vehicles_copy = df_vehicles.copy()# make a copy of the bin list
            df_subset_items_copy_temp = shuffle(df_subset_items_copy) #shuffle the copy of items
            df_items_all = df_items.copy()
            df_items_all, df_vehicles_copy = first_fit(df_subset_items_copy_temp, df_items_all, df_vehicles_copy) #assigns items to bins using ff
            col_name = updated_name[i]
            newDF_items.rename(columns={i+3:'%s'%col_name}, inplace=True)
            newDF_items.loc[:,'%s'%col_name] = df_items_all.loc[:,'bin']
            newDF_bins.rename(columns={i+2:'%s'%col_name}, inplace=True)
            newDF_bins.loc[:,'%s'%col_name] = df_vehicles_copy.loc[:,'available_space']
            
            
    elif dimension == 2:
        newDF_items = pd.DataFrame(np.zeros((num_items, 4+n))) #initialise a new dataframe of size of the number of total items
        newDF_bins = pd.DataFrame(np.zeros((k,3+(2*n)))) #3 fixed columns plus an x and a y space column per list
        newDF_items.rename(columns={0:'item'}, inplace=True)
        newDF_items.rename(columns={1:'x_size'}, inplace=True)
        newDF_items.rename(columns={2:'y_size'}, inplace=True)
        newDF_items.rename(columns={3:'delivery_time'}, inplace=True)
        newDF_bins.rename(columns={0:'bin'}, inplace=True)
        newDF_bins.rename(columns={1:'x_size'}, inplace=True)
        newDF_bins.rename(columns={2:'y_size'}, inplace=True)
        
        #the columns were renamed above, so they are addressed by name from here on
        newDF_items.loc[:,'item'] = df_subset_items.loc[:,'item']
        newDF_items.loc[:,'x_size'] = df_subset_items.loc[:,'x_size']
        newDF_items.loc[:,'y_size'] = df_subset_items.loc[:,'y_size']
        newDF_items.loc[:,'delivery_time'] = df_subset_items.loc[:,'delivery_time']#standard to all is the items with their respective sizes
        newDF_bins.loc[:,'bin'] = df_vehicles.loc[:,'bin']#standard to all is the bins with their respective sizes
        newDF_bins.loc[:,'x_size'] = df_vehicles.loc[:,'x_size']
        newDF_bins.loc[:,'y_size'] = df_vehicles.loc[:,'y_size']

        x_space = 'av_x_space'
        y_space = 'av_y_space'
        #use 2d first fit
        #NOTE: first_fit as defined above reads 'size' and 'available_space'; a 2D variant is needed for this branch to run
        for i in range(n): 
            m = str(i)
            x = x_space + m
            y = y_space + m
            df_subset_items_copy = df_subset_items.copy()# make a copy of the item list
            df_vehicles_copy = df_vehicles.copy()# make a copy of the bin list
            df_subset_items_copy_temp = shuffle(df_subset_items_copy) #shuffle the copy of items
            df_items_all = df_items.copy()
            df_items_all, df_vehicles_copy = first_fit(df_subset_items_copy_temp, df_items_all, df_vehicles_copy) #assigns items to bins using ff
            col_name = updated_name[i]
            newDF_items.rename(columns={i+4:'%s'%col_name}, inplace=True)
            newDF_items.loc[:,'%s'%col_name] = df_items_all.loc[:,'bin'] #bin list for items
            newDF_bins.rename(columns={(2*i)+3:'%s'%x}, inplace=True)
            newDF_bins.loc[:,'%s'%x] = df_vehicles_copy.loc[:,'av_x_space']
            newDF_bins.rename(columns={(2*i)+4:'%s'%y}, inplace=True)
            newDF_bins.loc[:,'%s'%y] = df_vehicles_copy.loc[:,'av_y_space']
    return newDF_items, newDF_bins

In [21]:
df_items1 = df_items[df_items.delivery_time == 1]
display(df_items)


Unnamed: 0,item,size,bin,delivery_loc,delivery_time,delivery_dispatch,delivery_number
0,0,4,,4,2,,
1,1,3,,2,1,,
2,2,6,,1,2,,
3,3,8,,5,1,,
4,4,6,,3,1,,


In [22]:
display(df_items1)


Unnamed: 0,item,size,bin,delivery_loc,delivery_time,delivery_dispatch,delivery_number
1,1,3,,2,1,,
3,3,8,,5,1,,
4,4,6,,3,1,,


In [23]:
display(df_vehicles)

Unnamed: 0,bin,size,available_space,delivery_time_dispatch,route_dist,bin_penalty,route_cost,distance_per_item
0,0,4,4,,,1.0,,
1,1,3,3,,,0.8,,
2,2,9,9,,,2.0,,


In [24]:
items_GA, bins_GA = encoding(df_items1,df_items, df_vehicles,dimension=1)


In [25]:
items_GA#[items_GA.loc[:,'delivery_time'] == 1]

Unnamed: 0,item,size,delivery_time,list0,list1,list2
0,0,4,2,,,
1,1,3,1,0.0,0.0,0.0
2,2,6,2,,,
3,3,8,1,,,
4,4,6,1,2.0,2.0,2.0


In [26]:
bins_GA

Unnamed: 0,bin,size,list0,list1,list2
0,0,4,1,1,1
1,1,3,3,3,3
2,2,9,3,3,3


## CROSSOVER: 
### Parent set per bin
Create a dataframe of parents for each bin: a boolean representation, per parent, of whether each item is in that bin. TABLE: X = parents/lists, Y = items, filled with True and False. Crossover creates $2 \cdot {}^{p}C_{2}$ children, where $p$ is the number of parents (two children per pair of parents).
STEPS: 
1. Get the number of bins (vehicles)
2. For each bin create a parent dataframe, i.e. the initial population in boolean representation for that particular bin.
3. Create the children dataframe, which is the result of crossing over the parents. 

<font color='red'>NOTE: The crossover is happening per bin.</font>
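
The boolean parent table for one bin can also be built in a single vectorised comparison. The sketch below assumes the population layout displayed in this notebook (an `item` column plus `list0`, `list1`, ... columns); the function name `bin_membership` is hypothetical.

```python
import numpy as np
import pandas as pd
from math import comb

# Round-1 population as displayed in this notebook (NaN = item not packed)
pop = pd.DataFrame({
    'item': [1, 3, 4],
    'list0': [0.0, np.nan, 2.0],
    'list1': [0.0, np.nan, 2.0],
    'list2': [0.0, np.nan, 2.0],
})

def bin_membership(df_population, bin_number):
    """True where a list places the item in bin_number (rows = items, columns = lists)."""
    lists = df_population.set_index('item').filter(like='list')
    # NaN never equals a bin number, so unpacked items come out as False
    return lists.eq(bin_number)

print(bin_membership(pop, 0))
print(2 * comb(3, 2))  # number of children produced by 3 parents: 6
```

For bin 0 this gives True for item 1 and False for items 3 and 4 in every list.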

In [27]:
#for a particular bin, creates a boolean representation of the bin for all the lists
# INPUT: df_population is the population of a particular delivery round (NOTE: IT IS A SUBSET)
        # bin_number for which the parents exist
        # mandatory_fields are the number of columns at the start of the list which are common to all parents
def parents(df_population, bin_number, mandotary_fields):
    df_population_copy = df_population.copy()
    pop_rows = df_population.shape[0]
    pop_cols = df_population.shape[1]-mandotary_fields
    num_items  = pop_rows #number of items
    newDF = pd.DataFrame(np.zeros((pop_rows,pop_cols))) #initialise a new dataframe
    newDF.iloc[:,:] = False
    newDF.index = df_population.loc[:,'item']
    update_col_name = rename_col(pop_cols)
    for j in range(pop_cols):#iterates all the lists
        #get the rows in which this list assigns the item to the specific bin
        list_num = 'list'+str(j)
        temp = df_population_copy.loc[df_population_copy['%s'%list_num]==bin_number]
        #get the item values in array format
        temp1 = temp.loc[:,'%s'%'item'].values#creates an array of items in that bin
        col = update_col_name[j]
        newDF.rename(columns={j:'%s'%col}, inplace=True)
        for k in range(len(temp1)):
            #boolean fill the new dataframe
            item_val = temp1[k]
            newDF.loc[item_val, col] = True #a single .loc call avoids chained assignment
            
            
    return newDF
                

In [28]:
pop = items_GA[items_GA.loc[:,'delivery_time'] == 1]
pop

Unnamed: 0,item,size,delivery_time,list0,list1,list2
1,1,3,1,0.0,0.0,0.0
3,3,8,1,,,
4,4,6,1,2.0,2.0,2.0


In [29]:
bool_parents = parents(pop,0,3)
bool_parents

item,list0,list1,list2
1,True,True,True
3,False,False,False
4,False,False,False


In [30]:
# #creates a dataframe with only bn info of the population
# def parent_bins(df_population):
#     #1D scenario:
#     df_parent_bins = df_population.iloc[:,2:]
#     return df_parent_bins

### OFFSPRINGS
Select a random position and cross over the parents at this point.
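
A one-point crossover on two boolean parents can be sketched as follows. The function name `one_point_crossover` and the example parents are hypothetical; in practice the crossing point could be drawn with `random.randint(1, len(parent_a) - 1)`.

```python
def one_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length parents at `point`."""
    child_1 = parent_a[:point] + parent_b[point:]  # top of A, bottom of B
    child_2 = parent_b[:point] + parent_a[point:]  # top of B, bottom of A
    return child_1, child_2

a = [True, False, False, True]
b = [False, True, True, False]
print(one_point_crossover(a, b, 2))
# ([True, False, True, False], [False, True, False, True])
```

Each pair of parents yields two children, which is where the factor of two in the number of children comes from.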

In [31]:
# #offsprings of the parents: For each combination of parents, there is two offsprings
# #INPUT: parent dataframe
# #OUTPUT: children dataframe - the crossover of the parent combinations 
# def offsprings(df_parent):
#     pointer_children = 0
#     num_items= df_parent.shape[0]
#     combination_of_children = (comb(num_items,2,exact=True))*2#number of combinations. Multiply by two as from 2parents we have two children
#     random_cross_section = floor(num_items/2)#randint(0,num_items)
#     df_offspring = pd.DataFrame(np.zeros((num_items,combination_of_children)))#initializes a dataframe of size [r,c] = [num_items,num_children(combinations of crossovers)]
    
#     for i in range(num_items):
#         partA1 = df_parent.iloc[0:random_cross_section,i]#top part of parent 1
#         partB1 = df_parent.iloc[random_cross_section:, i]#bottom part of parent 1
#         for j in range(i, num_items):
#             if i != j:
#             #crossover between the two indices producing two children
#                 partA2 = df_parent.iloc[0:random_cross_section,j]#top part of parent 2
#                 partB2 = df_parent.iloc[random_cross_section:, j]#Bottom part of parent 2
#                 df_offspring.iloc[0:random_cross_section,pointer_children] = partA1 #top part of parent 1 
#                 df_offspring.iloc[random_cross_section:,pointer_children] = partB2 #Bottom part of parent 2
#                 df_offspring.iloc[0:random_cross_section,pointer_children+1] = partA2 #top part of parent 2
#                 df_offspring.iloc[random_cross_section:,pointer_children+1] = partB1#Bottom part of parent 1
#                 pointer_children = pointer_children + 2
#         i = i +1
#     return df_offspring

In [32]:
#offspring of the parents: for each combination of two parents, there are two offspring
#INPUT: parent dataframe (rows = items, columns = parents)
#OUTPUT: children dataframe - the crossover of the parent combinations 
def offsprings(df_parent):
    pointer_children = 0
    num_items = df_parent.shape[0]
    num_parents = df_parent.shape[1]
    combination_of_children = (comb(num_parents,2,exact=True))*2#number of combinations. Multiply by two as from 2 parents we have two children
    random_cross_section = floor(num_items/2)#randint(0,num_items)
    parent_values = df_parent.values.astype(bool)#plain boolean array, avoids index alignment on assignment
    children = np.zeros((num_items,combination_of_children), dtype=bool)#array of size [r,c] = [num_items,num_children(combinations of crossovers)]
    
    for i in range(num_parents):
        for j in range(i+1, num_parents):#each unordered pair of parents is crossed once
            #crossover between the two parents producing two children
            children[0:random_cross_section,pointer_children] = parent_values[0:random_cross_section,i] #T:P1 | B:P2
            children[random_cross_section:,pointer_children] = parent_values[random_cross_section:,j] #T:P1 | B:P2
            
            children[0:random_cross_section,pointer_children+1] = parent_values[0:random_cross_section,j] # T:P2 | B:P1
            children[random_cross_section:,pointer_children+1] = parent_values[random_cross_section:,i] # T:P2 | B:P1
            
            pointer_children = pointer_children + 2
    df_offspring = pd.DataFrame(children, index=df_parent.index)
    return df_offspring

In [36]:
df_offspring = offsprings(bool_parents)
df_offspring

In [None]:
bool_parents

### BIN PACKING according to delivery number: 
In order to pack items to meet delivery times, we follow this implementation:
1. Create two subset item dataframes: delivery round 1 and delivery round 2 
2. Pack fixed number of bins with 

The algorithm returns the item dataframe with the bin each item is allocated to and its delivery dispatch round. A bin dataframe is also returned for each dispatch round. 

# The Traveling Salesman Problem (TSP)
#### The TSP depicts a salesman who has to visit $n$ cities, returning to his start (home) city, whilst not visiting any of the nodes more than once.
# Greedy Approach
<font color='darkorange'>A technique which makes locally optimal choices with hope of obtaining the global optimum. 
</font>
 
Stores data in an adjacency matrix.

Pseudo-code for nearest neighbour algorithm:
1. From the start node (V0), find the nearest neighbour (Vn): in the zeroth row, from col 1 to the end, find the smallest distance.
2. Put the next visited neighbour in the terminal set -> T. Add the new vertex to the possible set -> P. (Note: this must be assigned to Vn)
3. Do this while D(V0) = 0

<font color='blue'>Amended to take in an array of states to visit and return the order of the route and the distance
</font>
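
The greedy rule described above can be sketched compactly with a plain NumPy distance matrix. This is a simplified illustration, not the notebook's `nearest_neighbour_arr`: the name `nearest_neighbour_sketch`, the demo matrix and the choice to drop duplicate states are assumptions made for the example.

```python
import numpy as np

def nearest_neighbour_sketch(states, dist):
    """Greedy tour: start at the depot (index 0), always move to the closest unvisited state."""
    unvisited = list(dict.fromkeys(int(s) for s in states))  # drop duplicate states, keep order
    route, total, curr = [0], 0.0, 0
    while unvisited:
        nxt = min(unvisited, key=lambda s: dist[curr, s])  # locally optimal choice
        total += dist[curr, nxt]
        route.append(nxt)
        unvisited.remove(nxt)
        curr = nxt
    total += dist[curr, 0]  # return leg to the depot
    route.append(0)
    return route, total

# Hypothetical symmetric distance matrix; row/column 0 is the depot
dist_demo = np.array([
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
], dtype=float)
route_demo, length_demo = nearest_neighbour_sketch([1, 2, 3], dist_demo)
# route_demo is [0, 1, 3, 2, 0] with length 18.0
```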

In [None]:
# # Input: Array of states that need to be visited by the bin
#     #arr_states does not include the depots
# # OUTPUT: Array with order of states to vist and distance

# def nearest_neighbour_arr(arr_states, df_distMatrix):
#         num_states = len(arr_states)
#         total_states = df_distMatrix.shape[0]
        
#         #creates an array with depot at the begining
#         x0 = np.array([0])
#         arr_all_states = np.concatenate((x0,arr_states,x0), axis = 0)
        
#         #create a possible dataframe of length of the num_states
#         possible = pd.DataFrame(np.zeros((num_states+2,1)))# Accounts for the vertices that have been visited
#         possible.index.name = 'possible'
#         possible = possible.reindex(arr_all_states)
#         possible.iloc[:,:] = True
#         possible.iloc[0,0]= False
        
#         #distance calculator
#         distances = pd.DataFrame(np.zeros((num_states+2,1)))# The distance from Vertex i to the next vertex
#         distances.index.name = 'distances'
#         distances = distances.reindex(arr_all_states)
#         distances.iloc[:,:] = 0
        
#         #terminal set 
#         terminal_set = pd.DataFrame(np.zeros((num_states,1)))# The set/ordering of vertices that are encountered on the route
#         terminal_set.loc[0,0] = 0
#         terminal_set.loc[num_states,0] = 0
        
#         #create a distance matrix with only the columns that can be visited
#         df_distMatrix.fillna(inf,inplace = True)
#         df_curr_distMatrix = pd.DataFrame(np.zeros((total_states, total_states)))
#         df_curr_distMatrix.iloc[:,:] = inf
        
#         for i in range(num_states):
#             state = arr_states[i]
#             df_curr_distMatrix.iloc[:,state] = df_distMatrix.iloc[:, state]
#         #add the depot 
#         df_curr_distMatrix.iloc[:,0] = df_distMatrix.iloc[:,0]#begining depot
#         #df_curr_distMatrix.iloc[:,total_states+1] = df_distMatrix.iloc[:,0]#end depot
        
#         #find the nearest neighbour using this updated distance matrix
#         count = 0
#         new_curr = 0 #depot is the first vertex
#         #Note: iloc indexes from 0 to n-1 
#         while( (possible.iloc[num_states+1,0]== True) and (count<num_states+2)): 
#             curr = new_curr
#             if count < num_states:
#                 nearest_vertex = df_curr_distMatrix.loc[curr,1:].idxmin()#gives the minimum values index ie column
#                 nearest_vertex = int(nearest_vertex)
#                 used_not = possible.loc[nearest_vertex,0]
#                 if used_not == True: #if the next vertex is not used
#                     dist = df_curr_distMatrix.iloc[curr,nearest_vertex]
#                     distances.loc[curr,0] = dist
#                     possible.loc[nearest_vertex,0]=False
#                     terminal_set.iloc[count,0] = nearest_vertex
#                     df_curr_distMatrix.iloc[:,nearest_vertex] = inf
#                     new_curr = nearest_vertex
#                     count = count+1
#                 elif used_not==False:#if the next vertex is used
#                     df_curr_distMatrix.iloc[curr,nearest_vertex]=inf
#             elif count == num_states:
#                 nearest_vertex = 0
#                 dist = df_curr_distMatrix.loc[curr,0]
#                 distances.iloc[num_states,0] = dist
#                 possible.iloc[num_states+1,0] = False
#                 terminal_set.loc[count,0] = 0
#                 count = count +1
#         arr_terminal_route_temp  = terminal_set.loc[:,0].values
#         x0 = np.array([0])
#         arr_terminal_route = np.concatenate((x0, arr_terminal_route_temp), axis = 0)
#         distance = distances.iloc[0:num_states+1,0].sum() 
#         return arr_terminal_route, distance


In [None]:
# INPUT: Array of states that need to be visited by the bin
#     arr_states does not include the depots
# OUTPUT: Array with the order of states to visit, and the route distance

def nearest_neighbour_arr(arr_states, df_distMatrix):
    num_states = len(arr_states)
    total_states = df_distMatrix.shape[0]

    #creates an array with depot at the begining
    x0 = np.array([0])
    all_states = np.concatenate((x0,arr_states), axis = 0)
    arr_all_states = np.concatenate((x0,arr_states,x0), axis = 0)

    #distance calculator
    distances = pd.DataFrame(np.zeros((num_states+1,1)))# The distance from Vertex i to the next vertex
    distances.index.name = 'distances'
    # distances = distances.reindex(all_states)
    distances.index = range(0, len(all_states))
    distances.iloc[:,:] = 0
    

    #terminal set 
    terminal_set = pd.DataFrame(np.zeros((num_states+1,1)))# The set/ordering of vertices that are encountered on the route


    #create a distance matrix with only the columns that can be visited
    df_distMatrix.fillna(inf,inplace = True)
    df_curr_distMatrix = pd.DataFrame(np.zeros((total_states, total_states)))
    df_curr_distMatrix.iloc[:,:] = inf

    for i in range(num_states):
        state = arr_states[i]
        df_curr_distMatrix.iloc[:,state] = df_distMatrix.iloc[:, state]
    #add the depot 
    df_curr_distMatrix.iloc[:,0] = df_distMatrix.iloc[:,0]#begining depot
    
    new_curr = 0 #depot is the first vertex

    for i in range(num_states+1):
        curr = new_curr
        if i < num_states:
            nearest_vertex = df_curr_distMatrix.iloc[curr, 1:].idxmin()
            # NaN never compares equal to itself, so `== np.nan` is always False; test with pd.isna
            if pd.isna(nearest_vertex):
                break  # no candidate vertex left (reassigning i would not end a for loop)
            nearest_vertex = int(nearest_vertex)
            dist = df_curr_distMatrix.iloc[curr, nearest_vertex]
            df_curr_distMatrix.iloc[:, nearest_vertex] = np.inf  # mark the vertex as visited
            distances.iloc[i,0] = dist
            terminal_set.iloc[i,0]  = nearest_vertex
            new_curr = nearest_vertex
        elif i == num_states:
            # last step: return from the final state to the depot
            dist = df_curr_distMatrix.iloc[curr,0]
            distances.iloc[i,0] = dist
            terminal_set.iloc[i,0]  = 0


    arr_terminal_route_temp  = terminal_set.loc[:,0].values
    x0 = np.array([0])
    arr_terminal_route = np.concatenate((x0, arr_terminal_route_temp), axis = 0)
    distance = distances.iloc[0:num_states+1,0].sum() 

    arr_route_set = arr_terminal_route 
    route_dist = distance
    return arr_terminal_route, distance


#  METAHEURISTIC: SIMULATED ANNEALING
<font color='darkorange'>Simulated annealing (SA) is a probabilistic technique for approximating the global optimum of a given function. Specifically, it is a metaheuristic to approximate global optimization in a large search space. It is often used when the search space is discrete (e.g., all tours that visit a given set of cities).
</font>
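
A minimal sketch of how SA could be applied to a route is given below, assuming a segment-reversal (2-opt style) neighbourhood and a geometric cooling schedule. The function names, the default parameters (`t_start`, `t_end`, `cooling`, `seed`) and the demo matrix are all assumptions made for illustration; this is not the notebook's SA implementation.

```python
import math
import random

def route_length(route, dist):
    """Total distance along consecutive pairs of the route."""
    return sum(dist[a][b] for a, b in zip(route[:-1], route[1:]))

def simulated_annealing_sketch(route, dist, t_start=10.0, t_end=1e-3, cooling=0.995, seed=0):
    """Improve a depot-to-depot route by reversing random interior segments."""
    rng = random.Random(seed)
    current = list(route)
    current_len = route_length(current, dist)
    best, best_len = list(current), current_len
    t = t_start
    while t > t_end and len(current) > 3:
        # pick two interior positions (the depots at both ends stay fixed)
        i, j = sorted(rng.sample(range(1, len(current) - 1), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        cand_len = route_length(candidate, dist)
        delta = cand_len - current_len
        # always accept improvements; accept worse routes with probability exp(-delta / t)
        if delta < 0 or rng.random() < math.exp(-delta / t):
            current, current_len = candidate, cand_len
            if current_len < best_len:
                best, best_len = list(current), current_len
        t *= cooling  # geometric cooling
    return best, best_len

# Hypothetical distance matrix; index 0 is the depot
sa_dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
sa_start = [0, 1, 2, 3, 0]
sa_route, sa_len = simulated_annealing_sketch(sa_start, sa_dist)
```

Because the best route seen so far is tracked separately, the returned route is never longer than the starting route, even though worse candidates are accepted along the way.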

# ROUTE 
### Components
1. Group by value in a column
2. Convert a dataframe column to an array
3. Finding the corresponding value in a column and assigning the order number
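
The three components can be illustrated on a toy items dataframe. The column names follow the attribute list at the top of the notebook; the demo values, the example route and the use of `Series.map` are assumptions for this sketch and differ from the loop-based `assign_order` used in the notebook.

```python
import pandas as pd

# Hypothetical items: three of them packed in bin 0, one in bin 1
df_items_demo = pd.DataFrame({
    'item': [0, 1, 2, 3],
    'delivery_loc': [4, 2, 7, 5],
    'bin': [0, 1, 0, 0],
    'delivery_number': [0, 0, 0, 0],
})

# 1. Group by value in a column: keep only the items in bin 0
in_bin0 = df_items_demo[df_items_demo['bin'] == 0]

# 2. Convert a dataframe column to an array
locs = in_bin0['delivery_loc'].values

# 3. Map each location to its position on a route (depot 0 at both ends, excluded from numbering)
route = [0, 5, 4, 7, 0]  # assumed output of a routing method
order = {state: pos for pos, state in enumerate(route[1:-1], start=1)}
df_items_demo.loc[in_bin0.index, 'delivery_number'] = in_bin0['delivery_loc'].map(order)
```

Items outside the selected bin keep their previous `delivery_number`, since only the rows in `in_bin0.index` are updated.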

In [None]:
# ## Displays rows of a dataframe where the values in a particular column are as specified
# # INPUT: DataFrame, column name, value grouping by 
# # OUTPUT: Subset of the dataframe where the column specified has the value by which it's being grouped
# def groupby_value(df, col_name, value):
#     df_grouped = df[df[col_name] == value]  # df.col_name would look for a column literally named 'col_name'
#     return df_grouped

In [None]:
##PUT ITEMS IN ARRAY WHICH ARE GROUPED BY SOME VALUE IN A COLUMN
#INPUT: a grouped by bin item dataframe
        #original bin dataframe
        #bin number by which items are grouped 
#OUTPUT: return an array of delivery locations of that bin
        #to fill the route of that bin in the bin_dataframe
def dfCol_arr(df_grouped_items):
    arr_delivery_loc = df_grouped_items.loc[:,'delivery_loc'].values
    return arr_delivery_loc


In [None]:
## Using the subset of the grouped values dataframe, find the corresponding location in the array and fill in the order/delivery number on the delivery route
# INPUT: Grouped dataframe, original dataframe, array- route, key_col_name, col_update_name, search_col
        #df = df_items
        #df_group = subset of items in the same bin
        #arr_route = route produced by greedy algorithm/ SA metaheuristic including the depot at the start and end.
        #key_col = delivery_loc
        #search_col = item
        #update_col = delivery_number
# OUTPUT: Updated original dataframe
    #df = df_items with delivery number allocated to each item

def assign_order(df, df_group, arr_route, key_col, search_col, update_col): #key_col = delivery_loc,#search_col = items #update_col = delivery_number
    arr_len = len(arr_route)
    for i in range(1, arr_len-1):#exclude depots
        state = arr_route[i]
        item_row = df_group.loc[df_group['%s'%key_col] == state].index[0]#row number of location of the state
        item = df_group.loc[item_row, '%s'%search_col]
        df.loc[item, '%s'%update_col] = i #assign order
    return df

Once all items have been assigned to a bin, group the items by bin and get all delivery locations in array format. Using the greedy algorithm or the SA metaheuristic, generate a route and the cost of the route. 
Set the corresponding delivery number for each state in the items dataframe. 
Set the route cost for the particular bin in the bin dataframe. 
<font color='red'>SA to be amended!
</font>

In [None]:
#INPUT: df_bins
#OUTPUT: df_bins, only that which have been used

def bins_used(df_bins, dimension):
    if dimension == 1:
        bins_used_subset = df_bins[df_bins['size'] > df_bins['available_space']]
        bins_used_subset.index = range(0,bins_used_subset.shape[0])
    elif dimension == 2:
        # element-wise OR on boolean Series needs `|`; Python's `or` raises a ValueError here
        bins_used_subset = df_bins[(df_bins['x_size'] > df_bins['av_x_space']) | (df_bins['y_size'] > df_bins['av_y_space'])]
        bins_used_subset.index = range(0,bins_used_subset.shape[0])
    return bins_used_subset

In [None]:
df_bins_dispatch1 = bins_used(df_bins_dispatch1,1)
df_bins_dispatch2 = bins_used(df_bins_dispatch2,1)
display(df_bins_dispatch1)
display(df_bins_dispatch2)

In [None]:
df_bins_dispatch1

In [None]:
# INPUT: item (where all items have been assigned bins) and bin dataframe(only contains bins which are used for a particular dispatch round)
    #Note: the bins are a subset: only those that are used, and used in that particular dispatch round
# OUTPUT: updated item and bin dataframes, with delivery number and cost of the route

def del_order_cost(df_items, df_bins, df_dist_matrix, TS_method):
    delivery_dispatch_round = df_bins.loc[0,'delivery_time_dispatch']
    #only use a subset of used bins
    num_bins = df_bins.shape[0]
    for k in range(num_bins):
        bin_no = df_bins.loc[k, 'bin']
        df_item_subset = df_items[(df_items['bin'] == bin_no)  & (df_items['delivery_dispatch'] == delivery_dispatch_round)] #subset of items in the bin of value bin_no
        arr_del_loc = dfCol_arr(df_item_subset)# array of all delivery locations
        #ROUTE AND ROUTE DISTANCE
        #nearest neighbour
        arr_route_set, route_dist = TS_method(arr_del_loc, df_dist_matrix)
        #Simulated annealing
#         arr_route_set, route_dist = simulated_anealing(adj_mat, num_cities, distances, possible,terminal_set)
         #route distance and cost
        # df_bins is re-indexed 0..n-1 by bins_used, so address the row by k rather than by bin_no
        df_bins.loc[k, 'route_dist'] = route_dist #setting the route distance
        bin_penalty_factor = df_bins.loc[k, 'bin_penalty'] #the penalty associated with the bin
        df_bins.loc[k, 'route_cost'] = route_dist*bin_penalty_factor  #route cost = route dist X bin penalty
        #delivery number on route
        df_items = assign_order(df_items, df_item_subset, arr_route_set, 'delivery_loc', 'item', 'delivery_number')
    return df_items, df_bins


In [None]:
display(df_itemsA)
display(df_bins_dispatch1)
display(df_bins_dispatch2)

In [None]:
df_items = df_itemsA
df_bins = df_bins_dispatch1
df_dist_matrix = df_distMat
TS_method = nearest_neighbour_arr

# FULL IMPLEMENTATION
INPUT: 
1. df_items
2. df_vehicles
3. df_distanceMatrix
4. BP_method
5. TS_method
6. dimension: of the problem (1 or 2)

OUTPUT
1. df_items : with the bin each item is allocated to and its delivery_dispatch round
2. df_bin1 : with all dispatch 1 bins / distance / cost
3. df_bin2 : with all dispatch 2 bins / distance / cost

In [None]:
def packing_and_routing(df_items, df_vehicles, df_distanceMatrix, BP_method, TS_method, dimension):
    df_items1, df_bin1, df_bin2 = priority_binPacking(df_items, df_vehicles, BP_method)
#     display(df_items1)
# #     display(df_bin1)
#     display(df_bin2)
    df_items2, df_bin1A = del_order_cost(df_items1, df_bin1, df_distanceMatrix, TS_method)
    df_items3, df_bin2A = del_order_cost(df_items2, df_bin2, df_distanceMatrix,TS_method)
    return df_items3, df_bin1A, df_bin2A

In [None]:
df_items3, df_bin1A, df_bin2A = packing_and_routing(df_items, df_vehicles, df_distMat, first_fit, nearest_neighbour_arr, 1)

In [None]:
df_items3

In [None]:
df_bin1A

In [None]:
df_bin2A