# IE517
## Homework #1 
##### (Due March 22)
## Abdullah Hanefi Önaldı
## Solving the TSP using Construction Heuristics and 2-Opt Improvement Heuristic

### Problem Description

In this homework, you are going to solve the TSP for three data sets. They are called eil51.dat, eil76.dat, and eil101.dat, and consist of 51, 76, and 101 customer locations, respectively. Each data set includes the x-coordinates and y-coordinates of customers. The distances between customer locations are measured via Euclidean distance rounded to two digits after the decimal point. You can also compute the optimal tour length by considering the sequence given in the xxxopt.dat files.
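
As a minimal sketch of the distance rule above, the snippet below computes the Euclidean distance between two points and rounds it to two decimal places. The name `rounded_distance` is a hypothetical helper introduced here for illustration only; it does not come from the assignment files.

```python
import math


def rounded_distance(p1, p2, ndigits=2):
    """Euclidean distance between two (x, y) points, rounded to `ndigits` decimals."""
    return round(math.dist(p1, p2), ndigits)


print(rounded_distance((0, 0), (3, 4)))
print(rounded_distance((0, 0), (1, 1)))
```

Rounding each edge before summing can give a slightly different tour length than rounding only the final total, so it is worth applying the same convention everywhere.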

1. Solve each instance using the one-sided nearest neighbor heuristic starting at cities 10, 20, and 30. This means that you will obtain nine tours. Provide the tour length of each one using the table below.
2. Solve each instance using the two-sided nearest neighbor heuristic starting at cities 10, 20, and 30. This means that you will obtain nine tours. Provide the tour length of each one using the table below.
3. Solve each instance using the nearest insertion heuristic starting at cities 10, 20, and 30. This means that you will obtain nine tours. Provide the tour length of each one using the table below.
4. Solve each instance using the farthest insertion heuristic starting at cities 10, 20, and 30. This means that you will obtain nine tours. Provide the tour length of each one using the table below.
5. For each tour obtained so far, apply the 2-opt improvement heuristic, and give the tour length using the table below.
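
To make item 5 concrete, here is a rough sketch of one common 2-opt variant (first improvement, repeated until no move helps). The assignment does not prescribe a particular variant, so this is one possible reading. The function names `tour_length` and `two_opt` and the toy coordinates are assumptions made for this example, and tours are assumed to be closed lists whose last element repeats the first.

```python
import math


def tour_length(coords, path):
    """Length of a closed tour given as a list whose last element repeats the first."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))


def two_opt(coords, path):
    """Reverse a segment of the tour whenever that shortens it, until no move helps."""
    path = list(path)
    improved = True
    while improved:
        improved = False
        # Edges (i-1, i) and (j, j+1) are replaced by (i-1, j) and (i, j+1).
        for i in range(1, len(path) - 2):
            for j in range(i + 1, len(path) - 1):
                a, b = coords[path[i - 1]], coords[path[i]]
                c, d = coords[path[j]], coords[path[j + 1]]
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-9:  # small tolerance against floating-point noise
                    path[i:j + 1] = reversed(path[i:j + 1])
                    improved = True
    return path


# Toy instance: a unit square visited in a self-crossing order.
coords = {1: (0, 0), 2: (1, 1), 3: (1, 0), 4: (0, 1)}
start_tour = [1, 2, 3, 4, 1]
improved_tour = two_opt(coords, start_tour)
print(tour_length(coords, start_tour), tour_length(coords, improved_tour))
```

On this toy instance the two crossing diagonals are replaced by two sides of the square, which is exactly the kind of move 2-opt is designed to find.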

I would like to remind you of the following points, which you should consider when you submit your homework. It will consist of two parts: your code and your report. First, your code must be clear, and you should document the following using comment lines in the code: variable names and their purpose, and function names and their purpose. For example, you should write "X is the location variable", "CompObj calculates the objective value", etc. Alternatively, you can use a self-explanatory function name, e.g., ApplyMove.

In the report, you have to state which solution representation and neighborhood structure you used, along with any other pertinent details worth pointing out. You can use the following table for the output of your solutions.

<table border="1" class="dataframe">
  <thead>
    <tr>
      <th></th>
      <th>method</th>
      <th colspan="2" halign="left">1-Sided_NN</th>
      <th colspan="2" halign="left">2-Sided_NN</th>
      <th colspan="2" halign="left">Nearest_Insert</th>
      <th colspan="2" halign="left">Furthest_Insert</th>
    </tr>
    <tr>
      <th></th>
      <th>stage</th>
      <th>Initial</th>
      <th>After_1-opt</th>
      <th>Initial</th>
      <th>After_1-opt</th>
      <th>Initial</th>
      <th>After_1-opt</th>
      <th>Initial</th>
      <th>After_1-opt</th>
    </tr>
    <tr>
      <th>dataset</th>
      <th>initial_customer</th>
      <th></th>
      <th></th>
      <th></th>
      <th></th>
      <th></th>
      <th></th>
      <th></th>
      <th></th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th rowspan="3" valign="top">eil76</th>
      <th>10</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th>20</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th>30</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th rowspan="3" valign="top">eil101</th>
      <th>10</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th>20</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th>30</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th rowspan="3" valign="top">eil51</th>
      <th>10</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th>20</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
    <tr>
      <th>30</th>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
      <td></td>
    </tr>
  </tbody>
</table>

Let's start by defining two helper functions:

- `l2_squared`: calculates the Euclidean distance between two coordinates (despite its name, it returns the distance itself, not its square)
- `calc_total_length`: calculates the total path length, given the coordinates of all customers and the order in which they are visited

In [1]:
import pandas as pd
import numpy as np


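def l2_squared(p1, p2):
    """Return the Euclidean distance between points p1 and p2 (not squared)."""
    x1, y1 = p1
    x2, y2 = p2
    return ((x1 - x2)**2 + (y1 - y2)**2)**0.5


def calc_total_length(coords, path):
    """Sum the distances between consecutive customers in `path`."""
    length = 0
    # Pair each customer with its successor along the path.
    for first, second in zip(path, path[1:]):
        length += l2_squared(coords[first], coords[second])
    return length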
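
Since NumPy is already imported, the same path-length computation could also be written with array operations. The following is only a sketch under that assumption; `tour_length_np` and the toy coordinates are hypothetical names and values that are not used elsewhere in this notebook.

```python
import numpy as np


def tour_length_np(coords, path):
    """Sum of Euclidean edge lengths along `path`, computed with NumPy arrays."""
    points = np.array([coords[i] for i in path], dtype=float)
    # Differences between consecutive points, one row per edge.
    steps = np.diff(points, axis=0)
    return float(np.hypot(steps[:, 0], steps[:, 1]).sum())


# A 3-4-5 right triangle, visited as a closed tour.
coords = {1: (0, 0), 2: (3, 0), 3: (3, 4)}
length = tour_length_np(coords, [1, 2, 3, 1])
print(length)
```

For instances of this size the plain loop is fast enough, so the vectorized form is mainly a matter of taste.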

The constants below store the problem instances, initial customer indices, methods, and stages given in the problem description:

In [2]:
INSTANCES = {
    'eil76': {
        'file': 'data/eil76.dat',
        'file_opt': 'data/eil76opt.dat'
    },
    'eil101': {
        'file': 'data/eil101.dat',
        'file_opt': 'data/eil101opt.dat'
    },
    'eil51': {
        'file': 'data/eil51.dat',
        'file_opt': 'data/eil51opt.dat'
    },
}
INITIAL_CUSTOMERS = [10, 20, 30]
METHODS = ['1-Sided_NN', '2-Sided_NN', 'Nearest_Insert', 'Furthest_Insert']
STAGES = ['Initial', 'After_1-opt']

Create the pandas DataFrame that will hold all the solutions, with one row per (dataset, initial customer) pair and one column per (method, stage) pair:

In [3]:
def create_df(instances=INSTANCES,
              initial_customers=INITIAL_CUSTOMERS,
              methods=METHODS,
              stages=STAGES):
    """Build an empty results table: one row per (dataset, initial customer),
    one column per (method, stage)."""
    row_index = pd.MultiIndex.from_product(
        [list(instances), initial_customers],
        names=['dataset', 'initial_customer'])

    column_index = pd.MultiIndex.from_product(
        [methods, stages], names=['method', 'stage'])

    return pd.DataFrame(index=row_index, columns=column_index)
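
As a small illustration of how a table with this two-level layout can be filled in, the sketch below builds a reduced version and writes a single cell by giving `.loc` a row tuple and a column tuple. The variable names `rows`, `cols`, and `table` are made up for this example, and the number written is a placeholder, not a result from the homework.

```python
import pandas as pd

rows = pd.MultiIndex.from_product(
    [['eil51'], [10, 20]], names=['dataset', 'initial_customer'])
cols = pd.MultiIndex.from_product(
    [['1-Sided_NN'], ['Initial', 'After_1-opt']], names=['method', 'stage'])
table = pd.DataFrame(index=rows, columns=cols, dtype=float)

# Placeholder value only; not a result from the homework.
table.loc[('eil51', 10), ('1-Sided_NN', 'Initial')] = 123.45
print(table)
```

Addressing cells by their full (dataset, customer) and (method, stage) keys avoids relying on the positional order of rows and columns.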

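Read the files of the given instances. Each line of a data file is parsed as a customer index followed by integer x- and y-coordinates; in the optimal-tour files, any line that is not a single integer is skipped.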

In [4]:
def read_files(instances=INSTANCES):
    """Load coordinates and the optimal tour for every instance, in place."""
    for instance in instances.values():
        # Each line of the data file holds: customer id, x, y.
        instance['coords'] = {}
        with open(instance['file']) as file:
            for line in file:
                i, x, y = map(int, line.split())
                instance['coords'][i] = (x, y)

        # Keep only the lines of the optimal-tour file that parse as an integer id.
        instance['optimal_path'] = []
        with open(instance['file_opt']) as file_opt:
            for line in file_opt:
                try:
                    instance['optimal_path'].append(int(line))
                except ValueError:
                    pass
        instance['optimal_length'] = calc_total_length(
            coords=instance['coords'], path=instance['optimal_path'])

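Find a tour using the one-sided nearest neighbor heuristic: starting from the initial customer, repeatedly move to the closest unvisited customer, then return to the start.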

In [5]:
def one_sided(coords, initial_customer=10):
    """Build a closed tour by always moving to the nearest unvisited customer."""
    residues = set(coords.keys())  # customers not yet visited
    current = initial_customer
    path = [current]
    residues.discard(current)

    while len(residues) > 0:
        # Scan the unvisited customers for the one closest to `current`.
        closest_distance = float('inf')
        closest = None
        for candidate in residues:
            distance = l2_squared(coords[current], coords[candidate])
            if distance < closest_distance:
                closest_distance = distance
                closest = candidate
        path.append(closest)
        residues.discard(closest)
        current = closest

    path.append(path[0])  # return to the starting customer
    return path
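
For comparison, the same idea can be sketched more compactly with `min` and a key function, and checked on a tiny made-up instance. The function `nearest_neighbor_tour`, the dictionary `toy`, and the tie-breaking rule (smaller customer id wins) are assumptions made for this example; the cell above does not define any particular tie-breaking order.

```python
import math


def nearest_neighbor_tour(coords, start):
    """One-sided nearest neighbor: always extend the tour from its current end."""
    unvisited = set(coords) - {start}
    path = [start]
    while unvisited:
        here = coords[path[-1]]
        # Ties are broken by the smaller customer id so the result is deterministic.
        nxt = min(unvisited, key=lambda c: (math.dist(here, coords[c]), c))
        path.append(nxt)
        unvisited.remove(nxt)
    path.append(start)  # close the tour
    return path


# Four customers on a line; starting at customer 2.
toy = {1: (0, 0), 2: (1, 0), 3: (3, 0), 4: (6, 0)}
tour = nearest_neighbor_tour(toy, start=2)
print(tour)
```

Starting at customer 2, the heuristic first steps back to customer 1 (distance 1) before sweeping right, which shows how the greedy choice can leave a long closing edge.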

Now read the files and create the DataFrame that will store our solutions:

In [6]:
read_files()
df = create_df()
df

| dataset | initial_customer | 1-Sided_NN (Initial) | 1-Sided_NN (After_1-opt) | 2-Sided_NN (Initial) | 2-Sided_NN (After_1-opt) | Nearest_Insert (Initial) | Nearest_Insert (After_1-opt) | Furthest_Insert (Initial) | Furthest_Insert (After_1-opt) |
|---|---|---|---|---|---|---|---|---|---|
| eil76 | 10 | | | | | | | | |
| eil76 | 20 | | | | | | | | |
| eil76 | 30 | | | | | | | | |
| eil101 | 10 | | | | | | | | |
| eil101 | 20 | | | | | | | | |
| eil101 | 30 | | | | | | | | |
| eil51 | 10 | | | | | | | | |
| eil51 | 20 | | | | | | | | |
| eil51 | 30 | | | | | | | | |
