# Nearest neighbour algorithm

The nearest neighbour algorithm was one of the first algorithms used to solve the travelling salesman problem approximately. In it, the salesman starts at a random city and repeatedly visits the nearest unvisited city until all have been visited. The algorithm quickly yields a short tour, but usually not the optimal one.

## Algorithm

These are the steps of the algorithm:

1. Select an arbitrary vertex and set it as the current vertex u. Mark u as visited.
1. Find the shortest edge connecting the current vertex u to an unvisited vertex v. Set v as the current vertex u and mark v as visited.
1. If all the vertices in the domain are visited, terminate. Otherwise, go to step 2.
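The three steps can be condensed into one short function. The following is a minimal sketch, not a reference implementation: the name `nearest_neighbour_tour`, the parameter `start`, and the 3-city matrix `dist` are made up for illustration, and the graph is assumed to be complete.

```python
def nearest_neighbour_tour(dist, start=0):
    """Return the order in which vertices are visited, beginning at `start`."""
    n = len(dist)
    route = [start]      # step 1: the start vertex is the current vertex
    visited = {start}
    current = start
    while len(route) < n:  # step 3: stop once every vertex is visited
        # Step 2: shortest edge from the current vertex to an unvisited one
        nxt = min((v for v in range(n) if v not in visited),
                  key=lambda v: dist[current][v])
        visited.add(nxt)
        route.append(nxt)
        current = nxt
    return route


# Hypothetical 3-city distance matrix, used only for this sketch
dist = [
    [0, 2, 9],
    [2, 0, 4],
    [9, 4, 0],
]
print(nearest_neighbour_tour(dist, start=0))  # [0, 1, 2]
```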

If the steps above are not yet clear, don't worry. Next, we will work through them with an example.

## Step by step

First, we import the Python dependencies required for this section.

In [None]:
import numpy as np

Next, we define a travel graph with four cities a, b, c and d. The distances between them are given by a distance matrix.

|  | a  | b  | c  | d  |
|--------|----|----|----|----|
| a      | 0  | 20 | 15 | 35 |
| b      | 20 | 0  | 10 | 25 |
| c      | 15 | 10 | 0  | 12 |
| d      | 35 | 25 | 12 | 0  |

Each element in the matrix is the distance between the city of its row and the city of its column. For example, the distance between a and c is 15. Because the distance between two cities is the same in both directions, the distance matrix is symmetric.
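These properties can be checked directly. Below is a small sketch using plain Python lists, with the values copied from the table above. The variable names `is_symmetric` and `has_zero_diagonal` are introduced here for illustration only.

```python
# Distance matrix copied from the table (rows and columns in the order a, b, c, d)
G = [
    [0, 20, 15, 35],
    [20, 0, 10, 25],
    [15, 10, 0, 12],
    [35, 25, 12, 0],
]
n = len(G)

# Symmetric: the distance from i to j equals the distance from j to i
is_symmetric = all(G[i][j] == G[j][i] for i in range(n) for j in range(n))
# Every city is at distance 0 from itself
has_zero_diagonal = all(G[i][i] == 0 for i in range(n))

print(is_symmetric, has_zero_diagonal)  # True True
print(G[0][2])                          # 15, the distance between a and c
```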

The code implementation is as follows:

In [None]:
label = ['a', 'b', 'c', 'd']
G = [
    [0,20,15,35],
    [20,0,10,25],
    [15,10,0,12],
    [35,25,12,0]
]

### Step 1. Choose any starting node

Select an arbitrary vertex and set it as the current vertex u. Mark u as visited.

In [None]:
city = np.random.randint(len(label))
current, route = city, [city]
print(f'Start node: {current}, route: {route}')

### Step 2. Consider the arcs which join the node just chosen to nodes as yet unchosen. Pick the one with minimum weight and add it to the cycle

Find the shortest edge connecting the current vertex u to an unvisited vertex v. Set v as the current vertex u and mark v as visited.

We define a function to implement this step.

In [None]:
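def find_closest_neighbor(city, G, label, route):
    """Return the index of the unvisited city nearest to `city`, or -1 if none is left."""
    w = float('inf')  # shortest distance found so far
    index = -1
    for i in range(len(label)):
        # Skip visited cities and the city itself
        if i not in route and i != city and G[city][i] < w:
            w = G[city][i]
            index = i
    return index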

closest_neighbor = find_closest_neighbor(current, G, label, route)
print(f'Node: {current}, Closest neighbor: {closest_neighbor}')

### Step 3. Repeat step 2 until all nodes have been chosen

If all the vertices in the domain are visited, terminate. Otherwise, repeat step 2.

In [None]:
while len(route) != len(label):
    index = find_closest_neighbor(current, G, label, route)
    current = index
    route.append(current)
print(f'Final route: {[label[i] for i in route]}')

The sequence of the visited vertices is the output of the algorithm.
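To judge a route, it helps to know its total length. The helper below is a hypothetical sketch (`route_length` and its `closed` flag are not part of the notebook, and this is not the function from permutation.ipynb). It assumes the salesman returns to the starting city unless `closed=False`. The example route is the one the algorithm produces when started at city a.

```python
G = [
    [0, 20, 15, 35],
    [20, 0, 10, 25],
    [15, 10, 0, 12],
    [35, 25, 12, 0],
]


def route_length(route, G, closed=True):
    """Sum the edge weights along `route`; if `closed`, include the return leg."""
    if closed:
        legs = zip(route, route[1:] + route[:1])
    else:
        legs = zip(route, route[1:])
    return sum(G[u][v] for u, v in legs)


route = [0, 2, 1, 3]  # a -> c -> b -> d, nearest neighbour started at a
print(route_length(route, G, closed=False))  # 50
print(route_length(route, G))                # 85
```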

## Note

The nearest neighbour algorithm is easy to implement and executes quickly, but it can sometimes miss shorter routes which are easily noticed with human insight, due to its "greedy" nature. As a general guide, if the last few stages of the route are comparable in length to the first stages, then the route is reasonable; if they are much greater, then it is likely that much better routes exist. Another check is to use an algorithm such as the lower bound algorithm to estimate if this route is good enough.
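The rule of thumb about early and late stages can be tried on the four-city example. The sketch below lists the leg lengths of the closed route that starts at a and compares the average of the first half with the average of the second half. Splitting the legs in half is an arbitrary choice made here for illustration, not part of the algorithm.

```python
G = [
    [0, 20, 15, 35],
    [20, 0, 10, 25],
    [15, 10, 0, 12],
    [35, 25, 12, 0],
]

route = [0, 2, 1, 3]  # a -> c -> b -> d, nearest neighbour started at a

# Length of every leg, including the return from the last city to the first
legs = [G[u][v] for u, v in zip(route, route[1:] + route[:1])]

half = len(legs) // 2
first_avg = sum(legs[:half]) / half
last_avg = sum(legs[half:]) / (len(legs) - half)

print(legs)                 # [15, 10, 25, 35]
print(first_avg, last_avg)  # 12.5 30.0
```

Here the late legs are more than twice as long as the early ones, which by the guide above suggests that a better route may exist.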

In the worst case, the algorithm results in a route that is much longer than the optimal route. To be precise, for every constant r there is an instance of the travelling salesman problem such that the length of the route computed by the nearest neighbour algorithm is greater than r times the length of the optimal route. Moreover, for each number of cities there is an assignment of distances between the cities for which the nearest neighbour heuristic produces the unique worst possible route. (If the algorithm is applied with every vertex as the starting vertex, the best path found will be better than at least N/2 - 1 other routes, where N is the number of vertices.)
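Running the heuristic once from every starting vertex and keeping the shortest result is cheap for small graphs. The following is a sketch under stated assumptions: the function names are hypothetical, the graph is complete, and route length is measured as a closed tour that returns to the start.

```python
G = [
    [0, 20, 15, 35],
    [20, 0, 10, 25],
    [15, 10, 0, 12],
    [35, 25, 12, 0],
]


def nearest_neighbour_route(G, start):
    """Greedy route from `start`, always moving to the nearest unvisited vertex."""
    route = [start]
    while len(route) < len(G):
        current = route[-1]
        unvisited = [v for v in range(len(G)) if v not in route]
        route.append(min(unvisited, key=lambda v: G[current][v]))
    return route


def closed_length(route, G):
    """Length of the tour that follows `route` and returns to its first vertex."""
    return sum(G[u][v] for u, v in zip(route, route[1:] + route[:1]))


# One route per starting vertex; keep the shortest
routes = [nearest_neighbour_route(G, s) for s in range(len(G))]
best_route = min(routes, key=lambda r: closed_length(r, G))
best_length = closed_length(best_route, G)

print([closed_length(r, G) for r in routes])  # [85, 77, 77, 77]
print(best_route, best_length)                # [1, 2, 3, 0] 77
```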

On a graph that is not complete, the nearest neighbour algorithm may not find a feasible route at all, even when one exists.
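A small example makes this failure concrete. The graph `H` below is hypothetical: it is the cycle 0-1-2-3-4-0 plus one cheap chord 0-2, with `math.inf` used as an assumed convention for a missing edge. The cycle itself is a feasible tour, yet the greedy choice leads into a dead end.

```python
import math

INF = math.inf  # assumed marker for "no edge between these vertices"

# Hypothetical graph: cycle 0-1-2-3-4-0 plus the chord 0-2
H = [
    [0,   3,   1,   INF, 3],
    [3,   0,   2,   INF, INF],
    [1,   2,   0,   3,   INF],
    [INF, INF, 3,   0,   3],
    [3,   INF, INF, 3,   0],
]


def nearest_neighbour_partial(G, start):
    """Follow nearest neighbour; stop early if no unvisited vertex is reachable."""
    route = [start]
    current = start
    while len(route) < len(G):
        candidates = [v for v in range(len(G))
                      if v not in route and G[current][v] < INF]
        if not candidates:
            break  # dead end: unvisited vertices remain but none is adjacent
        current = min(candidates, key=lambda v: G[current][v])
        route.append(current)
    return route


route = nearest_neighbour_partial(H, start=0)
print(route)                 # [0, 2, 1]
print(len(route) == len(H))  # False: vertices 3 and 4 were never reached
```

Starting at 0, the algorithm takes the chord to 2, moves on to 1, and is then stuck because both neighbours of 1 are already visited.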

You can find the implementation of the nearest neighbour algorithm in pyTSP at the following location:

`pyTSP/source/algorithms/tour_construction.py#L33`

## Exercises

 - Can this method find the best route? Verify your answer with code (refer to the function for calculating route distance in permutation.ipynb).