<h1>Introduction</h1>

The Coding Elves are helping Santa find the shortest sleigh route for delivering gifts to the nine locations around the world listed below. This notebook, the app.py file, and the JSON output were handed to the Software Engineering members of the team to integrate into a web application that Santa can use for his planning!

Importing the libraries and data

In [3]:
import pandas as pd
import geopy
from geopy.distance import great_circle
import random
import numpy as np
import plotly.express as px
from dash import Dash, dcc, html, Input, Output
import json
import plotly.graph_objects as go



In [4]:
df_final_cities = pd.read_csv('/Users/admin/Desktop/GitHub/new_repos/jingle-jam-2023/notebooks/final_cities.csv', encoding = "iso-8859-1")


<h2>Preprocessing</h2>

Display the data and its summary to check whether any preprocessing is needed. All nine rows are complete (no null values) and the coordinate columns are already numeric, so the data can be used as is.

In [5]:
display(df_final_cities)
df_final_cities.info()

             City  Latitude  Longitude
0          Athens   37.9838    23.7275
1           Cairo   30.0444    31.2357
2         Hialeah   25.8576   -80.2781
3         Lincoln   40.8136   -96.7026
4       Cleveland   41.4993   -81.6944
5         Bangkok   13.7563   100.5018
6         Gilbert   33.3528  -111.7890
7  Corpus Christi   27.8006   -97.3964
8           Osaka   34.6937   135.5022


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 9 entries, 0 to 8
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype  
---  ------     --------------  -----  
 0   City       9 non-null      object 
 1   Latitude   9 non-null      float64
 2   Longitude  9 non-null      float64
dtypes: float64(2), object(1)
memory usage: 348.0+ bytes


<h2>Calculating Distance</h2>

This function calculates the great-circle distance between two locations in miles. The cell below tests it on the first two cities in the dataframe. I checked the result for Athens, Greece and Cairo, Egypt against the distance reported by Google, and the two agree.

In [6]:
def calculate_distance(location1, location2):
    return great_circle(location1, location2).mi


In [7]:
name_1 = df_final_cities.iloc[0, 0]
lat_long_1 = df_final_cities.iloc[0, 1], df_final_cities.iloc[0, 2]

name_2 = df_final_cities.iloc[1, 0]
lat_long_2 = df_final_cities.iloc[1, 1], df_final_cities.iloc[1, 2]

distance = calculate_distance(lat_long_1, lat_long_2)
print(f"Distance between {name_1} and {name_2}: {distance} miles")

Distance between Athens and Cairo: 696.4479817894902 miles
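
As a cross-check that does not depend on geopy, the same distance can be reproduced with the haversine formula. This is only a sketch: the function name haversine_miles and the Earth radius constant (a mean radius of 6371.009 km converted to miles) are assumptions made for illustration, not part of the notebook.

```python
import math

# Assumed mean Earth radius in km, converted to miles
EARTH_RADIUS_MILES = 6371.009 / 1.609344


def haversine_miles(location1, location2):
    """Great-circle distance in miles between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, location1)
    lat2, lon2 = map(math.radians, location2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


athens = (37.9838, 23.7275)
cairo = (30.0444, 31.2357)
athens_to_cairo = haversine_miles(athens, cairo)
print(f"Athens to Cairo: {athens_to_cairo:.1f} miles")
```

The result should land within a mile of the value printed above, which is a quick sanity check that latitude and longitude are being passed in the right order.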


<h2>Prepping the Data</h2>

Here I add a number column that gives each city an integer ID from 0 to 8. Routes are later expressed as orderings of these IDs.

In [8]:
number_list = [0,1,2,3,4,5,6,7,8]

df_final_cities['number'] = number_list


In [9]:
display(df_final_cities)

             City  Latitude  Longitude  number
0          Athens   37.9838    23.7275       0
1           Cairo   30.0444    31.2357       1
2         Hialeah   25.8576   -80.2781       2
3         Lincoln   40.8136   -96.7026       3
4       Cleveland   41.4993   -81.6944       4
5         Bangkok   13.7563   100.5018       5
6         Gilbert   33.3528  -111.7890       6
7  Corpus Christi   27.8006   -97.3964       7
8           Osaka   34.6937   135.5022       8


<h2>Evaluation</h2>

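This evaluation function returns the total distance of a given route, in miles, by summing the great-circle distance between each pair of consecutive stops. The route is treated as a one-way path: the leg from the last stop back to the first is not counted.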

In [10]:
def fitness(route):
    total_distance = 0
    for i in range(len(route) - 1):

        location1 = route[i]
        location2 = route[i+1]

        lat_long_1 = (df_final_cities.loc[df_final_cities['number'] == location1, 'Latitude'].iloc[0], 
                      df_final_cities.loc[df_final_cities['number'] == location1, 'Longitude'].iloc[0])
        lat_long_2 = (df_final_cities.loc[df_final_cities['number'] == location2, 'Latitude'].iloc[0], 
                      df_final_cities.loc[df_final_cities['number'] == location2, 'Longitude'].iloc[0])

        total_distance += calculate_distance(lat_long_1, lat_long_2)
    return total_distance
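
The fitness function is called many times during optimization, and each call repeats the same dataframe lookups. One possible alternative, sketched here under assumptions, is to compute all pairwise distances once and sum lookups from that table. The names COORDS, build_distance_matrix and fitness_from_matrix are hypothetical, and a haversine helper stands in for geopy so the sketch runs on its own.

```python
import math

# (latitude, longitude) for each city, indexed by its 'number' in the dataframe
COORDS = [
    (37.9838, 23.7275),    # 0 Athens
    (30.0444, 31.2357),    # 1 Cairo
    (25.8576, -80.2781),   # 2 Hialeah
    (40.8136, -96.7026),   # 3 Lincoln
    (41.4993, -81.6944),   # 4 Cleveland
    (13.7563, 100.5018),   # 5 Bangkok
    (33.3528, -111.789),   # 6 Gilbert
    (27.8006, -97.3964),   # 7 Corpus Christi
    (34.6937, 135.5022),   # 8 Osaka
]

EARTH_RADIUS_MILES = 6371.009 / 1.609344  # assumed mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def build_distance_matrix(coords):
    """Return a square table where entry [i][j] is the distance from city i to city j."""
    n = len(coords)
    return [[haversine_miles(coords[i], coords[j]) for j in range(n)] for i in range(n)]


DIST = build_distance_matrix(COORDS)


def fitness_from_matrix(route, dist=DIST):
    """Total one-way length of a route, summed over consecutive stops."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))


baseline_length = fitness_from_matrix(list(range(9)))
print(round(baseline_length, 1))
```

With only nine cities the speed difference is small, but the table also makes it easy to inspect individual legs of a route.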

<h2>Creating a Baseline Route</h2>

This code creates a baseline route and then calls the fitness function to evaluate the total distance flown in Santa's sleigh in that route.

In [11]:
baseline_route = [0,1,2,3,4,5,6,7,8]

baseline_model = fitness(baseline_route)

print(baseline_model)

34369.700652266736


<h2>Optimization</h2>

Next we start creating functions to optimize Santa's route. This function creates 10 randomized routes, and the code below calls the function and calculates the distance traveled for each route.

In [12]:
def create_population():
    location_numbers = list(range(9))

    population_size = 10 
    population = []

    for pop_number in range(population_size):
        optimized_route = location_numbers[:]
        random.shuffle(optimized_route)
        population.append(optimized_route)
    return population


In [13]:
initial_population = create_population()
for i in initial_population:
    result = fitness(i)
    print(result)

37241.432697569115
43050.68708946316
28822.711808539985
38076.18849815266
24912.348562437634
26704.396914669305
35537.88269988939
39738.25646799222
41891.40487723439
34635.10952559888


This function performs tournament selection: for each slot in the new population it draws tournament_size routes at random (without replacement) and keeps the one with the shortest total distance as the winner.

In [14]:
def selection_optimize(population, fitness_func, tournament_size=10):
    selected = []
    for j in range(len(population)):
        contenders = random.sample(population, tournament_size)
        winner = min(contenders, key=fitness_func)
        selected.append(winner)
    return selected
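
The effect of tournament_size can be seen with a small experiment. This is a sketch under assumptions: toy_fitness is a hypothetical stand-in for the real distance-based fitness, used only so the example runs without the dataframe.

```python
import random


def selection_optimize(population, fitness_func, tournament_size=10):
    """Tournament selection: one winner per slot in the population."""
    selected = []
    for _ in range(len(population)):
        contenders = random.sample(population, tournament_size)
        selected.append(min(contenders, key=fitness_func))
    return selected


def toy_fitness(route):
    # Hypothetical stand-in: sum of jumps between consecutive city IDs
    return sum(abs(a - b) for a, b in zip(route, route[1:]))


random.seed(0)
population = [random.sample(range(9), 9) for _ in range(10)]
best_score = min(toy_fitness(route) for route in population)

# Every tournament sees the whole population, so every winner has the best score
full = selection_optimize(population, toy_fitness, tournament_size=10)

# Smaller tournaments let weaker routes win occasionally
partial = selection_optimize(population, toy_fitness, tournament_size=5)

print(sorted(toy_fitness(route) for route in full))
print(sorted(toy_fitness(route) for route in partial))
```

When tournament_size equals the population size, selection returns only best-scoring routes and the population loses its diversity. A smaller value, such as the 5 used in the main loop, keeps some variety among the parents.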

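This function creates a child route from two parent routes using ordered crossover: a random slice of the first parent is copied into the child at the same positions, and the remaining positions are filled with the missing cities in the order they appear in the second parent.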

In [15]:
def ordered_crossover(parent1, parent2):
    size = len(parent1)
    start, end = sorted(random.sample(range(size), 2))
    offspring = [None] * size
    offspring[start:end] = parent1[start:end]
    fill_values = [item for item in parent2 if item not in offspring]
    for i in range(size):
        if offspring[i] is None:
            offspring[i] = fill_values.pop(0)
    return offspring
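
A quick way to see that ordered crossover always yields a valid route is to run it on two easily recognizable parents. This sketch repeats the function so it runs on its own; the parent routes are made up for illustration.

```python
import random


def ordered_crossover(parent1, parent2):
    size = len(parent1)
    start, end = sorted(random.sample(range(size), 2))
    offspring = [None] * size
    # Copy a slice of parent1 into the same positions of the child
    offspring[start:end] = parent1[start:end]
    # Fill the gaps with the remaining cities, in parent2's order
    fill_values = [item for item in parent2 if item not in offspring]
    for i in range(size):
        if offspring[i] is None:
            offspring[i] = fill_values.pop(0)
    return offspring


random.seed(1)
parent1 = [0, 1, 2, 3, 4, 5, 6, 7, 8]
parent2 = [8, 7, 6, 5, 4, 3, 2, 1, 0]
child = ordered_crossover(parent1, parent2)
print(child)

# Every child should visit each city exactly once
children = [ordered_crossover(parent1, parent2) for _ in range(100)]
print(all(sorted(c) == parent1 for c in children))
```

Because the fill step only uses cities that are not already in the child, no city is duplicated or dropped, which a plain single-point crossover would not guarantee.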

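This function mutates a child route: each position is, with probability mutation_rate, swapped with another randomly chosen position. Because cities are only swapped, the result is still a valid route.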

In [16]:
def swap_mutation(route, mutation_rate):
    mutated_route = route[:]
    for i in range(len(route)):
        if random.random() < mutation_rate:
            swap_index = random.randint(0, len(route) - 1)
            mutated_route[i], mutated_route[swap_index] = mutated_route[swap_index], mutated_route[i]
    return mutated_route
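
The two extreme mutation rates make the behaviour easy to check. This sketch repeats the function so it runs on its own; the example route is arbitrary.

```python
import random


def swap_mutation(route, mutation_rate):
    mutated_route = route[:]  # work on a copy so the input is left untouched
    for i in range(len(route)):
        if random.random() < mutation_rate:
            swap_index = random.randint(0, len(route) - 1)
            mutated_route[i], mutated_route[swap_index] = mutated_route[swap_index], mutated_route[i]
    return mutated_route


random.seed(2)
route = [1, 0, 5, 8, 3, 4, 6, 7, 2]

unchanged = swap_mutation(route, 0.0)   # no position is ever selected
shuffled = swap_mutation(route, 1.0)    # every position triggers a swap

print(unchanged)
print(shuffled)
```

At a rate of 0.1, as used in the main loop, a nine-city route receives a little under one swap per call on average, so most children change only slightly.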

This cell runs the genetic algorithm using the functions above. In each of 250 generations it selects parents by tournament, builds a new population through crossover and mutation, and appends the best route of that population to a list. The next cell picks the best route from that list.

In [17]:
num_generations = 250 
population_size = 10  
mutation_rate = 0.1  
best_route_per_generation = []

# Initialize population with random routes
selection_population = create_population()

for generation in range(num_generations):

    # Select parents from the current population by tournament
    optimized_selection = selection_optimize(selection_population, fitness, tournament_size=5)

    # Breed a full new population from pairs of selected parents
    next_generation = []
    while len(next_generation) < population_size:
        route_1, route_2 = random.sample(optimized_selection, 2)
        crossover_route = ordered_crossover(route_1, route_2)
        next_generation.append(crossover_route)

    # Mutate the children and carry them forward, so that each generation
    # builds on the previous one
    selection_population = [swap_mutation(route, mutation_rate) for route in next_generation]

    # Record the best route of this generation
    best_route = min(selection_population, key=fitness)
    best_route_fitness = fitness(best_route)
    best_route_per_generation.append((best_route, best_route_fitness))



In [18]:
best_overall_route = min(best_route_per_generation, key=lambda x: x[1])
print(best_overall_route)



([1, 0, 5, 8, 3, 4, 6, 7, 2], 19005.62425395951)
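
With only nine cities there are 9! = 362,880 possible orderings, so an exhaustive search is feasible and gives a reference point for the genetic algorithm's result. The sketch below is not part of the original workflow: the names COORDS, route_length and brute_force_shortest are hypothetical, and a haversine helper stands in for geopy so it runs on its own.

```python
import itertools
import math

# (latitude, longitude) for each city, indexed by its 'number' in the dataframe
COORDS = [
    (37.9838, 23.7275), (30.0444, 31.2357), (25.8576, -80.2781),
    (40.8136, -96.7026), (41.4993, -81.6944), (13.7563, 100.5018),
    (33.3528, -111.789), (27.8006, -97.3964), (34.6937, 135.5022),
]
EARTH_RADIUS_MILES = 6371.009 / 1.609344  # assumed mean Earth radius


def haversine_miles(a, b):
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


# Precompute every pairwise distance once
DIST = [[haversine_miles(p, q) for q in COORDS] for p in COORDS]


def route_length(route):
    """One-way length of a route, matching the fitness definition used here."""
    return sum(DIST[a][b] for a, b in zip(route, route[1:]))


def brute_force_shortest():
    best_route, best_length = None, float("inf")
    for perm in itertools.permutations(range(len(COORDS))):
        # A route and its reverse have the same length, so skip one of each pair
        if perm[0] > perm[-1]:
            continue
        length = route_length(perm)
        if length < best_length:
            best_route, best_length = list(perm), length
    return best_route, best_length


exact_route, exact_length = brute_force_shortest()
ga_route = [1, 0, 5, 8, 3, 4, 6, 7, 2]  # route reported by the run above

print("Exhaustive search:", exact_route, round(exact_length, 1))
print("Genetic algorithm:", ga_route, round(route_length(ga_route), 1))
```

Comparing the two lengths shows how close a given run came to the true shortest path. Exhaustive search stops being practical quickly as cities are added, which is where the genetic algorithm becomes useful.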


This code takes the route found above and matches the numbers to the cities found in the existing dataframe. It then creates a new dataframe with the cities in the correct order. This is to create data visualizations.

In [19]:
best_of_routes = best_overall_route[0]
optimized_route_data = []

for k in best_of_routes:
    city_row = df_final_cities[df_final_cities['number'] == k]
    city_data = {
        'city': city_row['City'].iloc[0],
        'latitude': city_row['Latitude'].iloc[0],
        'longitude': city_row['Longitude'].iloc[0]
    }
    optimized_route_data.append(city_data)

df_optimized_route = pd.DataFrame(optimized_route_data)
print(df_optimized_route)

             city  latitude  longitude
0           Cairo   30.0444    31.2357
1          Athens   37.9838    23.7275
2         Bangkok   13.7563   100.5018
3           Osaka   34.6937   135.5022
4         Lincoln   40.8136   -96.7026
5       Cleveland   41.4993   -81.6944
6         Gilbert   33.3528  -111.7890
7  Corpus Christi   27.8006   -97.3964
8         Hialeah   25.8576   -80.2781


<h2>Route Visualizations</h2>

This map overlays the optimized route on the baseline route so the two can be compared.

In [20]:
# Baseline route: cities in their original order
fig = go.Figure(go.Scattermapbox(
    mode = "markers+lines",
    lon = df_final_cities['Longitude'],
    lat = df_final_cities['Latitude'],
    text = df_final_cities['City'],
    name = 'Baseline Route',
    marker = {'size': 10}))

# Optimized route found by the genetic algorithm
fig.add_trace(go.Scattermapbox(
    mode = "markers+lines",
    lon = df_optimized_route['longitude'],
    lat = df_optimized_route['latitude'],
    text = df_optimized_route['city'],
    name = 'Optimized Route',
    marker = {'size': 10}))

fig.update_layout(
    margin ={'l':0,'t':0,'b':0,'r':0},
    mapbox = {
        'style': "open-street-map",
        'center': {'lon': -20, 'lat': -20},
        'zoom': 1})

fig.show()


This map is a scatter plot that marks each delivery location, without any route lines.

In [21]:
# Read the Mapbox token and close the file afterwards
with open(".mapbox_token") as token_file:
    mapbox_access_token = token_file.read()

fig = go.Figure(go.Scattermapbox(
        lat=df_final_cities['Latitude'],
        lon=df_final_cities['Longitude'],
        mode='markers',
        marker=go.scattermapbox.Marker(
            size=9
        ),
        text=df_final_cities['City'],
    ))

fig.update_layout(
    autosize=True,
    hovermode='closest',
    mapbox=dict(
        accesstoken=mapbox_access_token,
        bearing=0,
        center=dict(
            lat=38.92,
            lon=-77.07, 
        ),
        pitch=0,
        zoom=1
    ),
)



fig.show()

In [22]:
# Read the Mapbox token and close the file afterwards
with open(".mapbox_token") as token_file:
    mapbox_access_token = token_file.read()


The cells below contain the Dash code that goes into app.py, which the Software Engineers integrate into the web application.

In [23]:
app = Dash(__name__)

In [24]:
app.layout = html.Div([
    html.H4('Mapping Deliveries Around the World'),
    dcc.RadioItems(
        id='map', 
        options=["Scatter", "Route Map"],
        value="Route Map",
        inline=True
    ),
    dcc.Graph(id="graph"),

])


@app.callback(
    Output("graph", "figure"),
    Input("map", "value"))
def display_trace_scattermapbox(selected_map):
    if selected_map == 'Route Map':
        # Baseline route: cities in their original order
        fig = go.Figure(go.Scattermapbox(
            mode = "markers+lines",
            lon = df_final_cities['Longitude'],
            lat = df_final_cities['Latitude'],
            text = df_final_cities['City'],
            name = 'Baseline Route',
            marker = {'size': 10}))

        # Optimized route found by the genetic algorithm
        fig.add_trace(go.Scattermapbox(
            mode = "markers+lines",
            lon = df_optimized_route['longitude'],
            lat = df_optimized_route['latitude'],
            text = df_optimized_route['city'],
            name = 'Optimized Route',
            marker = {'size': 10}))

        fig.update_layout(
            margin ={'l':0,'t':0,'b':0,'r':0},
            mapbox = {
                'style': "open-street-map",
                'center': {'lon': -20, 'lat': -20},
                'zoom': 1})
    else:
        # 'Scatter': markers only, no route lines
        fig = go.Figure(go.Scattermapbox(
            lat=df_final_cities['Latitude'],
            lon=df_final_cities['Longitude'],
            mode='markers',
            marker=go.scattermapbox.Marker(
                size=9
            ),
            text=df_final_cities['City'],
        ))

        fig.update_layout(
            autosize=True,
            hovermode='closest',
            mapbox=dict(
                accesstoken=mapbox_access_token,
                bearing=0,
                center=dict(
                    lat=38.92,
                    lon=-77.07,
                ),
                pitch=0,
                zoom=1
            ),
        )
    return fig


app.run_server(debug=True)

<h2>JSON Formatting for Application Integration</h2>

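This section collects the run configuration, the baseline and optimized routes, and the app layout settings into one dictionary and writes it to output.json for the Software Engineers.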

In [25]:
data_to_convert = {
    "data_config": {
        "csv_file_path": "/Users/admin/Desktop/GitHub/new_repos/jingle-jam-2023/notebooks/final_cities.csv",
        "number_list": number_list,
        "population_size": 10,
        "num_generations": 250,
        "mutation_rate": 0.1
    },
    "best_route": {
        "route": best_overall_route[0],
        "fitness": best_overall_route[1],
        "baseline_model": baseline_model,
        "baseline_route": baseline_route
    },
    "optimized_route_data": optimized_route_data,
    "dash_app_layout": {
        "title": "Mapping Deliveries Around the World",
        "radio_items_options": ["Route Map", "Scatter"],
        "default_value": "Route Map"
    }
}

In [26]:
json_data = json.dumps(data_to_convert, indent=4)


In [27]:
with open('output.json', 'w') as json_file:
    json.dump(data_to_convert, json_file, indent=4)
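
On the consuming side, the file can be read back and sanity-checked before it is used. The following is a sketch only: load_route_output is a hypothetical helper, and the small sample payload stands in for the real output.json while reusing its key names.

```python
import json
import os
import tempfile


def load_route_output(path):
    """Load the exported JSON and check that the best route visits every city once."""
    with open(path) as json_file:
        payload = json.load(json_file)
    route = payload["best_route"]["route"]
    expected_ids = payload["data_config"]["number_list"]
    if sorted(route) != sorted(expected_ids):
        raise ValueError("best_route.route is not a permutation of number_list")
    return payload


# Hypothetical stand-in payload with the same key names as data_to_convert
sample = {
    "data_config": {"number_list": [0, 1, 2]},
    "best_route": {"route": [2, 0, 1], "fitness": 123.4},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "output.json")
    with open(sample_path, "w") as json_file:
        json.dump(sample, json_file, indent=4)
    loaded = load_route_output(sample_path)

print(loaded["best_route"]["route"])
```

A check like this catches a truncated or hand-edited file early, before a malformed route reaches the map.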

<h1>Conclusions</h1>

