# Artificial Intelligence

# Genetic Algorithm

## An Overview of the Travelling Salesman Problem

In the travelling salesman problem, a salesperson wishes to find the shortest path that passes through every city they want to visit, given the coordinates of a set of cities. The salesperson should visit each city exactly once, so:

a. Each path consists of all cities in the set.

b. Each path visits each city exactly once, so no city is visited more than once.
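
The two conditions can be checked mechanically. The following is a minimal sketch, assuming that cities are represented by distinct hashable labels; the helper name `is_valid_route` is hypothetical and is not part of the notebook.

```python
def is_valid_route(route, cities):
    """Return True if route visits every city in cities exactly once.

    Assumes the entries of cities are distinct and hashable.
    """
    # Same length and same set of cities means no omissions and no repeats
    return len(route) == len(cities) and set(route) == set(cities)


cities = [0, 1, 2, 3]  # hypothetical city labels
print(is_valid_route([2, 0, 3, 1], cities))  # True
print(is_valid_route([2, 0, 3, 3], cities))  # False: city 3 repeated, city 1 missing
```

A check of this kind is useful after crossover and mutation, since both operators can break the two conditions if they are implemented carelessly.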

## Imports

In [6]:
%matplotlib inline

# Please add more imports if you need them 

import random
import time
import csv

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pprint import pprint as print 

## Concrete Classes

### City

The City class represents a city. It stores the coordinates of the city and provides a method for calculating the distance between the city and another city. Each path, represented by a chromosome, is an ordering of a set of cities.

In [7]:
class City:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        
    def distance(self, city):
        xDis = abs(self.x - city.x)
        yDis = abs(self.y - city.y)
        distance = np.sqrt((xDis ** 2) + (yDis ** 2))
        return distance
    
    def __repr__(self):
        return "(" + str(self.x) + "," + str(self.y) + ")"

### Fitness

The Fitness class represents the fitness function. It stores a path and provides methods for calculating the distance of the path and its fitness value, which is derived from that distance.

In [8]:
class Fitness:
    def __init__(self, route):
        self.route = route
        self.distance = None
        self.fitness = None
    
    def routeDistance(self):
        if self.distance is None:
            pathDistance = 0.0
            for i in range(0, len(self.route)):
                fromCity = self.route[i]
                # The route is a closed tour: the last city connects back to the first
                toCity = self.route[(i + 1) % len(self.route)]
                pathDistance += fromCity.distance(toCity)
            self.distance = pathDistance
        return self.distance
    
    def routeFitness(self):
        if self.fitness is None:
            # Fitness function: one divided by the distance of the path,
            # so that shorter paths receive higher fitness values
            # Note: this raises ZeroDivisionError if the path distance is zero
            self.fitness = 1 / float(self.routeDistance())
        return self.fitness
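
As a small worked example of the distance and fitness calculation, the sketch below evaluates two tours over the corners of a unit square. It is a standalone illustration that uses plain coordinate tuples instead of the City class; the helper name `tour_length` is hypothetical.

```python
import math


def tour_length(points):
    """Length of the closed tour that returns to the first point."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        # The modulo wraps the last point back to the first one
        x2, y2 = points[(i + 1) % len(points)]
        total += math.hypot(x2 - x1, y2 - y1)
    return total


square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # walks around the perimeter
crossed = [(0, 0), (1, 1), (1, 0), (0, 1)]  # uses both diagonals

length = tour_length(square)
print(length, 1 / length)                     # 4.0 0.25
print(tour_length(crossed) > tour_length(square))  # True
```

The perimeter tour is shorter than the crossed tour, so its reciprocal (the fitness) is larger, which is the ordering the genetic algorithm relies on.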


## Population Initialization  

The population initialization function (or method) performs random initialization. This creates an initial population with completely random chromosomes (or solutions). There are three functions related to population initialization. 

The first function is genCityList() which generates a set of cities from a file.  

In [95]:
def genCityList(filename):
    cityList = []
    
    # Read the city records (number, X, Y) from a space- or tab-separated text file
    df = pd.read_csv(filename, sep=" |\t", names=["number", "X", "Y"], engine="python")
    # A fixed seed makes every run pick the same cities, so that the parent
    # selection methods can be evaluated on identical input
    rng = np.random.default_rng(seed=1)
    #rng = np.random.default_rng(seed=None)
    # Sampling with replacement: the same city can be picked more than once
    random_index = rng.choice(len(df), replace=True, size=12)
    temp = df.iloc[random_index]
    for i in range(12):
        # Columns 1 and 2 hold the X and Y coordinates of the selected row
        cityList.append(City(x=temp.iloc[i, 1], y=temp.iloc[i, 2]))
    
    return cityList

The second function is createRoute() which generates a random route (chromosome) from a set of City instances.

In [10]:
def createRoute(cityList):
    route = random.sample(cityList, len(cityList))
    return route

The third function is initialPopulation() which calls the second function repeatedly to create an initial population (a list of routes).

In [11]:
def initialPopulation(popSize, cityList):
    population = []
    for i in range(0, popSize):
        population.append(createRoute(cityList))
    return population



Sample run 1 initializes 12 cities in cityList as follows:

cityList = genCityList('cities10.txt') 
print(cityList)

Sample run 2 initializes 12 cities in cityList and creates a population with three routes as follows:

cityList = genCityList('cities10.txt') 
population = initialPopulation(3, cityList) 
print(population)

## Selection

Parent selection selects chromosomes with high fitness values from a population to serve as parents. Survivor selection selects the chromosomes with the highest fitness values to form the population of the next generation. The population size is len(population), so there are len(population) chromosomes in the population.

### Parent Selection

There are three implementations of parent selection. The first, parentSelection(), performs random selection.

In [12]:
def parentSelection(population, poolSize=None):
    if poolSize == None:
        poolSize = len(population)
        
    matingPool = []
    
    for i in range(0, poolSize):
        # Random selection ignores fitness: every route is equally likely
        matingPool.append(random.choice(population))
      
    return matingPool

The second, parentSelectionT(), performs Tournament Selection.

In [13]:
def parentSelectionT(population, poolSize=None):
    
    # Tournament Selection: repeatedly draw a small random tournament and
    # add the fittest route of that tournament to the mating pool.
    
    if poolSize is None:
        poolSize = len(population)
        
    matingPool = []

    tournamentPool = population.copy()
    # Winners are removed from the pool, so at most len(population) can be picked
    for x in range(min(poolSize, len(population))):
        # Tournament size = 5; random.choices draws with replacement
        tournament = random.choices(tournamentPool, k=5)
        # The route with the highest fitness wins the tournament
        winner = max(tournament, key=lambda route: Fitness(route).routeFitness())
        matingPool.append(winner)
        # Remove the winner to prevent duplicate parent selection
        tournamentPool.remove(winner)
    
    return matingPool

The third, parentSelectionR(), performs Proportional Selection.

In [14]:
def parentSelectionR(population, poolSize=None):
    
    # Proportional Selection: a route is chosen with a probability
    # proportional to its fitness.
    
    if poolSize is None:
        poolSize = len(population)
        
    matingPool = []
    
    # Fitness is 1 / distance, so shorter routes already receive larger
    # probabilities and no extra conversion for minimization is needed
    fitnesses = [Fitness(route).routeFitness() for route in population]
    totalFitness = sum(fitnesses)
    prbs = [fitness / totalFitness for fitness in fitnesses]
    # Draw distinct indices (replace=False) so that no route is picked twice
    numParents = min(poolSize, len(population))
    winners = np.random.choice(len(population), size=numParents, replace=False, p=prbs)
    for index in winners:
        matingPool.append(population[index])
    
    return matingPool
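
To make the probabilities used by Proportional Selection concrete, here is a small worked example with three hypothetical route distances. It only illustrates the arithmetic and does not use the notebook's classes.

```python
distances = [10.0, 20.0, 40.0]  # hypothetical route lengths

# Fitness is the reciprocal of the distance
fitnesses = [1 / d for d in distances]
total = sum(fitnesses)

# Each route's selection probability is its share of the total fitness
probabilities = [f / total for f in fitnesses]
print([round(p, 3) for p in probabilities])  # [0.571, 0.286, 0.143]
```

Halving the distance doubles the selection probability, so the shortest route is chosen four times as often as the longest one in this example.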

### Survivor Selection

In [15]:
def survivorSelection(population, eliteSize):
    
    # Sort the routes by fitness (best first) and keep the top eliteSize
    ranked = sorted(population, key=lambda route: Fitness(route).routeFitness(), reverse=True)
    elites = ranked[:eliteSize]
    
    return elites



Sample run 1 initializes 12 cities in cityList, creates a population with four routes, and creates a pool of parents as follows:

population = initialPopulation(4, genCityList('cities10.txt'))
matingpool = parentSelection(population, 4) 
print('Initial population') 
print(population) 
print('Mating pool') 
print(matingpool)

Sample run 2 initializes 12 cities in cityList, creates a population with four routes, and selects an elite chromosome as follows:

population = initialPopulation(4, genCityList('cities10.txt'))
elites = survivorSelection(population, 1)
print('Initial population')
print(population)
print('Selected elites')
print(elites)

## Crossover


Crossover takes two parents, recombines their genetic material, and produces one or more children. In the Travelling Salesman Problem, each travelling path must be valid: each path consists of all cities in the set and visits each city exactly once. Exchanging parts of two chromosomes tends to produce invalid paths. As an example, Parent 1 is [2 1 0 7 3 5 4 6] and Parent 2 is [6 1 0 5 2 3 4 7]. One-point crossover at the midpoint generates Child 1 [2 1 0 7 2 3 4 7] and Child 2 [6 1 0 5 3 5 4 6]. Both children are invalid paths because they repeat some cities and omit others.
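
The example can be reproduced with a few lines of code. This is a minimal sketch of one-point crossover on integer city labels; the function name `one_point_crossover` is hypothetical and is shown only to illustrate why this operator is unsuitable for the problem.

```python
def one_point_crossover(parent1, parent2, point):
    """Swap the tails of two parents at a single cut point."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


parent1 = [2, 1, 0, 7, 3, 5, 4, 6]
parent2 = [6, 1, 0, 5, 2, 3, 4, 7]
child1, child2 = one_point_crossover(parent1, parent2, len(parent1) // 2)

print(child1)  # [2, 1, 0, 7, 2, 3, 4, 7]
print(child2)  # [6, 1, 0, 5, 3, 5, 4, 6]
# A valid path has no repeated cities
print(len(set(child1)) == len(child1))  # False: cities 2 and 7 appear twice
```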

In [36]:
def crossover(parent1, parent2):
    
    # Crossover based on the Partially Mapped Crossover approach: a segment of
    # one parent is copied into the child and the remaining genes follow the
    # order of the other parent, so no city is duplicated.
    
    # Option 1: fixed crossover points 3 and 9
    crossoverpoint1 = 3
    crossoverpoint2 = 9
    # Option 2: two distinct random crossover points in ascending order
    #rng = np.random.default_rng(seed=seed)
    #crossoverpoint1, crossoverpoint2 = np.sort(rng.choice(np.arange(len(parent1)-1), size=2, replace=False))
    
    #print (parent1) #debug purpose
    #print (parent2) #debug purpose
    
    # Build one child without duplicating any gene, as the TSP requires
    def PartialMappedCrossover(parent1,parent2):
        child = []
        count = 0
        for i in parent1:
            if(count == crossoverpoint1):
                break
            if(i not in parent2[crossoverpoint1:crossoverpoint2]):
                child.append(i)
                count= count+1

        #select the genes within the crossover points from parent2          
        child.extend(parent2[crossoverpoint1:crossoverpoint2])
        #fill in the remaining genes in order of parent1
        child.extend([x for x in parent1 if x not in child])
        return child
    child1 = PartialMappedCrossover(parent1,parent2)
    #print(child1) #debug purpose
    child2 = PartialMappedCrossover(parent2,parent1)
    #print(child2) #debug purpose
    
    return child1, child2
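
The crossover() function above copies a segment from one parent and fills the other positions in the order of the remaining parent. For comparison, the following is a minimal sketch of Partially Mapped Crossover as it is commonly described, in which genes displaced by the copied segment are relocated through the position mapping between the parents. It works on integer city labels, and the names `pmx`, `start`, and `end` are assumptions for illustration rather than part of the notebook.

```python
def pmx(parent1, parent2, start, end):
    """Sketch of PMX: copy parent1[start:end], fill the rest from parent2.

    Assumes both parents are permutations of the same distinct, hashable genes.
    """
    size = len(parent1)
    child = [None] * size
    child[start:end] = parent1[start:end]
    segment = set(parent1[start:end])
    position_in_p2 = {gene: i for i, gene in enumerate(parent2)}

    # Relocate the genes of parent2's segment that the child does not hold yet
    for i in range(start, end):
        gene = parent2[i]
        if gene in segment:
            continue
        pos = i
        # Follow the mapping until it points outside the copied segment
        while start <= pos < end:
            pos = position_in_p2[parent1[pos]]
        child[pos] = gene

    # Every position that is still empty takes parent2's gene directly
    for i in range(size):
        if child[i] is None:
            child[i] = parent2[i]
    return child


p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
print(pmx(p1, p2, 3, 7))  # [9, 3, 2, 4, 5, 6, 7, 1, 8]
```

The child keeps the segment 4 5 6 7 from the first parent at its original positions and remains a valid path, because every displaced gene is moved to a free position instead of being duplicated.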

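Breeding pairs up consecutive parents from the mating pool and applies crossover to each pair, so that two parents produce two children. The children therefore match the mating pool in number when the pool size is even; with an odd pool size, the last route is left unpaired.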

In [17]:
def breedPopulation(matingpool):
    children = []
    
    # Choosing parents in their order of presence in the mating pool. Choosing parents
    # in a random manner is possible. 
    
    for i in range(1, len(matingpool), 2):
        child1, child2 = crossover(matingpool[i-1], matingpool[i])
        children.append(child1)
        children.append(child2)
    
    return children

You can run the above functions using the sample run below. To do so, simply change the cell type from Markdown to Code. The sample run initializes 2 chromosomes in the population, and performs crossover among the two parents. 

population = initialPopulation(2, genCityList('cities10.txt'))
parent1, parent2 = population
child1, child2 = crossover(parent1, parent2)
print('Parents')
print(parent1)
print(parent2)
print('Children')
print(child1)
print(child2)

## Mutation

Mutation alters a single chromosome to produce a mutated chromosome, so that the genetic algorithm can explore new paths and converge to a shorter path. In the Travelling Salesman Problem, a mutated chromosome must be a valid path. As an example, the shift mutation shifts a single gene in the [1 2 3 4 5 6 7 8 9 10] chromosome to generate the [1 2 4 5 6 7 3 8 9 10] mutated chromosome. As another example, the shift mutation shifts two consecutive genes in the [1 2 3 4 5 6 7 8 9 10] chromosome to generate the [1 4 5 6 7 2 3 8 9 10] mutated chromosome.
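
The two shift mutation examples can be sketched as follows. The function name `shift_mutation` and its parameters are hypothetical; the positions are passed in explicitly here to keep the output reproducible, whereas a real mutation operator would draw them at random.

```python
def shift_mutation(route, start, length, new_start):
    """Move `length` consecutive genes that begin at index `start`.

    `new_start` is the index at which the moved genes are re-inserted,
    counted in the route that remains after the genes are taken out.
    """
    segment = route[start:start + length]
    rest = route[:start] + route[start + length:]
    return rest[:new_start] + segment + rest[new_start:]


route = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(shift_mutation(route, 2, 1, 6))  # [1, 2, 4, 5, 6, 7, 3, 8, 9, 10]
print(shift_mutation(route, 1, 2, 5))  # [1, 4, 5, 6, 7, 2, 3, 8, 9, 10]
```

Because the genes are only moved and never copied or dropped, the mutated chromosome is always a valid path.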

In [18]:
def mutate(route, mutationProbability):
    
    # Swap mutation: each gene is swapped with the gene before it with
    # probability mutationProbability (index -1 wraps around to the last gene).
    
    mutated_route = route[:]
    for i in range(len(route)):
        if (random.random() < mutationProbability):
            # mutationProbability is the probability of a gene undergoing mutation
            
            # Swap within mutated_route so that earlier swaps are preserved
            # and the result remains a valid path
            mutated_route[i], mutated_route[i-1] = mutated_route[i-1], mutated_route[i]
    return mutated_route

Mutation runs over the entire population and mutates each chromosome in the population with a small mutationProbability. 

In [19]:
def mutation(population, mutationProbability):
    mutatedPopulation = []
    for i in range(0, len(population)):
        mutatedIndividual = mutate(population[i], mutationProbability)
        mutatedPopulation.append(mutatedIndividual)
    return mutatedPopulation

You can run the above functions using the sample run below. To do so, simply change the cell type from Markdown to Code. The sample run initializes a route comprised of 12 cities in cityList, and then mutates it as follows:

route = genCityList('cities10.txt')
mutated = mutate(route, 1)  # Give a pretty high chance for mutation
print('Original route')
print(route)
print('Mutated route')
print(mutated)

## Running One Generation (or Iteration)

Here, we run one generation of the genetic algorithm.

In [78]:
def oneGeneration(population, eliteSize, mutationProbability):
    
    # First we preserve the elites
    elites = survivorSelection(population, eliteSize)
    
    # Then we calculate what our mating pool size should be and generate
    # the mating pool
    poolSize = len(population) - eliteSize
    matingpool = parentSelection(population, poolSize)
        
    # Then we perform crossover on the mating pool
    children = breedPopulation(matingpool)
    
    # We combine the elites and children into one population
    new_population = elites + children
    
    # We mutate the population
    mutated_population = mutation(new_population, mutationProbability)
        
    return mutated_population

You can run the above functions using the sample run below. To do so, simply change the cell type from Markdown to Code. The sample run initializes a population comprised of 5 chromosomes based on 12 cities in cityList, and then runs one generation (or iteration) of the genetic algorithm as follows:

population = initialPopulation(5, genCityList('cities10.txt'))
eliteSize = 1
mutationProbability = 0.01
new_population = oneGeneration(population, eliteSize, mutationProbability)
print('Initial population')
print(population)
print('New population')
print(new_population)

## Running Many Generations (or Iterations)

In [112]:
filename = 'cities10.txt'
popSize = 20
eliteSize = 5
mutationProbability = 0.01
iteration_limit = 100

cityList = genCityList(filename)

population = initialPopulation(popSize, cityList)
distances = [Fitness(p).routeDistance() for p in population]
min_dist = min(distances)
print("Best distance for initial population: " + str(min_dist))

for i in range(iteration_limit):
    population = oneGeneration(population, eliteSize, mutationProbability)
    distances = [Fitness(p).routeDistance() for p in population]
    index = np.argmin(distances)
    best_route = population[index]
    min_dist = min(distances)
    print("Best distance for population in iteration " + str(i) +
          ": " + str(min_dist))

print("Optimal path is " + str(best_route)) 

# Performance Evaluation. You will present the performance achieved
# by the different parent selection functions. We will compare the
# performance achieved by Random Selection, Tournament Selection, and Proportional Selection.

    
  

[(42033,82917),
 (42050,82583),
 (42100,83100),
 (42150,82967),
 (41800,82650),
 (41967,82533),
 (42133,82750),
 (42150,82967),
 (41983,82933),
 (42033,82750),
 (42133,82750),
 (42033,82917)]
'Best distance for initial population: 2331.9864929194'
'Best distance for population in iteration 0: 2331.9864929194'
'Best distance for population in iteration 1: 1998.2241006225393'
'Best distance for population in iteration 2: 1998.2241006225393'
'Best distance for population in iteration 3: 1998.2241006225393'
'Best distance for population in iteration 4: 1998.2241006225393'
'Best distance for population in iteration 5: 1998.2241006225393'
'Best distance for population in iteration 6: 1998.2241006225393'
'Best distance for population in iteration 7: 1998.2241006225393'
'Best distance for population in iteration 8: 1998.2241006225393'
'Best distance for population in iteration 9: 1998.2241006225393'
'Best distance for population in iteration 10: 1911.578454858661'
'Best distance for population