Question 1

In this assignment we will revisit an old friend, the traveling salesman problem (TSP).  This week you will implement a heuristic for the TSP, rather than an exact algorithm, and as a result will be able to handle much larger problem sizes.
The data file is nn.txt.

The first line indicates the number of cities. Each city is a point in the plane, and each subsequent line indicates the x- and y-coordinates of a single city.

The distance between two cities is defined as the Euclidean distance: two cities at locations (x,y) and (z,w) have distance $\sqrt{(x-z)^2 + (y-w)^2}$ between them.
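As a quick illustration of the distance definition above, here is a minimal sketch. The function name `euclidean_distance` and the sample points are assumptions chosen for the example, not part of the assignment.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points given as (x, y) pairs."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)

# A 3-4-5 right triangle: the two points are 5 units apart.
print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))  # 5.0
```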

You should implement the nearest neighbor heuristic:

1. Start the tour at the first city.

2. Repeatedly visit the closest city that the tour hasn't visited yet.  In case of a tie, go to the tied city with the lowest index.  For example, if both the third and fifth cities have the same distance from the first city (and are closer than any other city), then the tour should begin by going from the first city to the third city.

3. Once every city has been visited exactly once, return to the first city to complete the tour.
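The three steps above can be sketched with a plain quadratic scan, which is easy to verify on a tiny instance. This is only an illustrative sketch: the name `nearest_neighbor_tour`, the 0-based indexing, and the four sample points are assumptions, and the input list is assumed to be non-empty.

```python
import math

def nearest_neighbor_tour(points):
    """Return (order, length) of the nearest neighbor tour over `points`.

    `points` is a non-empty list of (x, y) pairs; indices are 0-based here.
    """
    n = len(points)
    visited = [False] * n
    visited[0] = True          # step 1: start at the first city
    order = [0]
    length = 0.0
    current = 0
    for _ in range(n - 1):     # step 2: repeatedly go to the closest unvisited city
        best, best_d2 = None, math.inf
        for j in range(n):
            if visited[j]:
                continue
            d2 = (points[current][0] - points[j][0]) ** 2 \
               + (points[current][1] - points[j][1]) ** 2
            if d2 < best_d2:   # strict < keeps the lowest index on a tie
                best, best_d2 = j, d2
        visited[best] = True
        order.append(best)
        length += math.sqrt(best_d2)
        current = best
    # step 3: return to the first city to close the tour
    length += math.sqrt((points[current][0] - points[0][0]) ** 2
                        + (points[current][1] - points[0][1]) ** 2)
    return order, length

# Cities 1 and 2 are both at distance 2 from city 0, so the tie goes to city 1.
order, length = nearest_neighbor_tour([(0, 0), (0, 2), (2, 0), (5, 0)])
print(order)  # [0, 1, 2, 3]
```

Each step costs O(n) here, so the whole tour takes O(n^2) time. That is simple to reason about but slow on large inputs.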

In the box below, enter the cost of the traveling salesman tour computed by the nearest neighbor heuristic for this instance, rounded down to the nearest integer.

[Hint: when constructing the tour, you might find it simpler to work with squared Euclidean distances (i.e., the formula above but without the square root) than Euclidean distances.  But don't forget to report the length of the tour in terms of standard Euclidean distance.]
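The hint works because the square root is monotonic: whichever candidate has the smallest squared distance also has the smallest distance. A minimal sketch follows, assuming a hypothetical helper `squared_distance` and made-up sample values. The square root is taken only when a hop is added to the tour length, and `math.floor` rounds the final total down.

```python
import math

def squared_distance(p, q):
    """Squared Euclidean distance; enough for comparing candidates."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

origin = (0.0, 0.0)
candidates = [(3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]

# Choose the nearest candidate without computing any square roots.
closest = min(candidates, key=lambda c: squared_distance(origin, c))

# Take the square root only for the hop that actually joins the tour.
hop_length = math.sqrt(squared_distance(origin, closest))

# Round a (made-up) tour length down to the nearest integer for reporting.
reported = math.floor(12.83)
```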

In [1]:
import math
import time

def load(filename):
    """
        Load the data file into a list whose first element is the number of cities and whose
        remaining elements are the [x, y] coordinates of each city, so city i is stored at data[i].
    """
    data = []
    with open(filename) as file:
        f = file.readlines()
        data.append(int(f[0].strip()))
        
        for line in f[1:]:
            # each line holds three whitespace-separated fields: a city index (ignored), x, and y
            _, x, y = line.split()
            data.append([float(x), float(y)])

    return data


def cal_dist_sqr(point1, point2):
    """
        Calculate the squared Euclidean distance between two points
    """
    return (point1[0]-point2[0])**2 + (point1[1]-point2[1])**2


def cal_x_dist(point1, point2):
    """
        Calculate the squared difference between the x-coordinates of two points
    """
    return (point1[0]-point2[0])**2


def findMinDist(start, visited, data):
    """
        Find the closest unvisited city to `start`, scanning the lower indexes first and then
        the higher indexes. Since the data is sorted by x-coordinate, each scan can stop as soon
        as the squared x-distance alone exceeds the best squared distance found so far.
        Return the minimum squared distance and the corresponding end city.
    """
    dist = math.inf
    head = None
    
    # scan the cities with lower indexes
    end = start - 1
    while end > 0:
        # skip cities that have already been visited
        if end in visited:
            end -= 1
            
        else:
            # stop once the squared x-distance alone exceeds the best squared distance so far,
            # since every remaining city in this direction must be farther away; otherwise,
            # record the new minimum distance and end city
            if cal_x_dist(data[start], data[end]) > dist:
                break
            else:
                temp = cal_dist_sqr(data[start], data[end])
                # <= so that, on a tie, the lower index (reached later in this scan) wins
                if temp <= dist:
                    dist = temp
                    head = end
                end -= 1
    
    # scan the cities with higher indexes
    end = start + 1
    while end <= data[0]:
        if end in visited:
            end += 1
        else:
            if cal_x_dist(data[start], data[end]) > dist:
                break
            else:
                temp = cal_dist_sqr(data[start], data[end])
                # strict < so that, on a tie, the lower index found earlier is kept
                if temp < dist:
                    dist = temp
                    head = end
                end += 1
                
    # return the minimum squared distance and the corresponding end city
    return dist, head
        

def TSP_heuristic(filename):
    """
        Nearest neighbor heuristic for the TSP, as described in the question statement.
    """
    # load the data file and initialize the start city, the set of visited cities (O(1) lookup),
    # and min_dist, which accumulates the length of the tour
    data = load(filename)
    visited = {1}
    city = 1
    min_dist = 0
    
    # loop as long as not all cities have been visited
    while len(visited) != data[0]:
        dist, end = findMinDist(city, visited, data)
        visited.add(end)
        min_dist += math.sqrt(dist)
        city = end
    
    # add the final hop from the last city visited back to city 1
    min_dist += math.sqrt(cal_dist_sqr(data[1], data[city]))
    
    return round(min_dist, 2)
        

if __name__ == "__main__":
    start = time.time()
    minimum_distance = TSP_heuristic("nn.txt")
    end = time.time()
    
    print(f"The minimum cost of a traveling salesman tour for 33708 cities is {minimum_distance}")
    print(f"Run time of Nearest Neighbor Heuristic TSP Algorithm is {end-start} second(s)")
    

The minimum cost of a traveling salesman tour for 33708 cities is 1203406.5
Run time of Nearest Neighbor Heuristic TSP Algorithm is 5.180818319320679 second(s)
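The early exit in findMinDist relies on the cities being sorted by x-coordinate: once the squared x-gap alone exceeds the best squared distance found so far, no city further along that direction can be closer. The sketch below is one way to sanity-check that idea against a plain linear scan on small random data. The names `nearest_pruned` and `nearest_brute`, the 0-based indexing, and the integer grid (chosen to force ties) are all assumptions made for this illustration.

```python
import math
import random

def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def nearest_pruned(points, start, visited):
    """Nearest unvisited index to `start`; assumes `points` is sorted by x."""
    best, best_d2 = None, math.inf
    j = start - 1
    while j >= 0:                          # scan toward lower indexes
        if j not in visited:
            if (points[start][0] - points[j][0]) ** 2 > best_d2:
                break                      # x-gap alone is already too large
            d2 = sq_dist(points[start], points[j])
            if d2 <= best_d2:              # on a tie, the lower index wins
                best, best_d2 = j, d2
        j -= 1
    j = start + 1
    while j < len(points):                 # scan toward higher indexes
        if j not in visited:
            if (points[start][0] - points[j][0]) ** 2 > best_d2:
                break
            d2 = sq_dist(points[start], points[j])
            if d2 < best_d2:               # on a tie, keep the earlier, lower index
                best, best_d2 = j, d2
        j += 1
    return best

def nearest_brute(points, start, visited):
    """Reference answer: check every unvisited city, lowest index wins ties."""
    best, best_d2 = None, math.inf
    for j in range(len(points)):
        if j == start or j in visited:
            continue
        d2 = sq_dist(points[start], points[j])
        if d2 < best_d2:
            best, best_d2 = j, d2
    return best

random.seed(0)
points = sorted((random.randint(0, 9), random.randint(0, 9)) for _ in range(60))
visited = set(random.sample(range(60), 20))
agree = all(nearest_pruned(points, s, visited) == nearest_brute(points, s, visited)
            for s in range(60))
print(agree)
```

The pruning does not change the worst-case cost per step, which stays linear, but on data spread out along the x-axis each scan tends to stop after a short window.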
