# Module 5 - Programming Assignment

In [2]:
from IPython.core.display import *
from StringIO import StringIO
from random import gauss, random
import copy

## Local Search - Genetic Algorithm

This program uses the Genetic Algorithm to find the solution to a shifted Sphere Function in 10 dimensions, $x$, where the range of $x_i$ in each dimension is (-5.12 to 5.12). Here a "solution" means the vector $x$ that minimizes the function. The Sphere Function is:

$$f(x)=\sum_{i=1}^{n} x_i^2$$

We are going to shift it over 0.5 in every dimension:

$$f(x) = \sum_{i=1}^{n} (x_i - 0.5)^2$$

where $n = 10$.

As this *is* a minimization problem, the program uses the trick described in the lecture to turn the shifted Sphere Function into an appropriate fitness function (which is always looking for a *maximum* value).

## Binary GA

The problem is solved in two different ways. The first uses the traditional (or "Canonical") Genetic Algorithm, which encodes numeric values as binary strings (represented as lists of only 0 or 1).

There are many different ways to effect this encoding. This program uses a 10 bit binary encoding for each $x_i$. This gives each $x_i$ an integer value from 0 to 1023, which is mapped to the range -5.12 to 5.11 by subtracting 512 and dividing by 100.
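The mapping can be sketched for a single value as follows. The helper names `encode_value` and `decode_bits` are hypothetical and used only for illustration, and rounding and clamping to the 10-bit range are assumptions about how edge cases should be handled. Note that 10 bits give integer levels 0 through 1023, so the largest representable value is 5.11.

```python
def encode_value(x):
    """Map a value in the search range to a list of 10 bits, most significant bit first."""
    # Round rather than truncate so float error (e.g. 0.29 * 100) does not shift the level.
    level = int(round(x * 100)) + 512
    # Assumption: clamp out-of-range levels (such as 5.12 -> 1024) into 0..1023.
    level = min(max(level, 0), 1023)
    # Zero-pad to exactly 10 bits so every variable occupies a fixed-width slice.
    return [int(bit) for bit in format(level, "010b")]


def decode_bits(bits):
    """Map a list of 10 bits back to a value by subtracting 512 and dividing by 100."""
    level = int("".join(str(bit) for bit in bits), 2)
    return (level - 512) / 100.0


bits = encode_value(0.5)
print(bits)               # [1, 0, 0, 0, 1, 1, 0, 0, 1, 0]
print(decode_bits(bits))  # 0.5
```

Fixed-width padding matters here: without it, small integer levels would produce fewer than 10 bits and the 100-bit genotype could no longer be split into equal slices.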

All the GA operators are as described in the lecture.

**Important**

There is a difference between the *genotype* and the *phenotype*. The GA operates on the *genotype* (the encoding) and does not respect the boundaries of the phenotype (the decoding). So, for example, do **not** use a List of Lists to represent an individual. It should be a *single* List of 10 x 10 or 100 bits. In general, crossover and mutation have no idea what those bits represent.
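To illustrate why a flat list is needed, here is a minimal sketch of two common operators working on a 100-bit genotype. It assumes one-point crossover and independent bit-flip mutation, which may differ in detail from the operators described in the lecture; the function names and the `rng` parameter are hypothetical.

```python
import random


def one_point_crossover(parent1, parent2, rng=random):
    """Swap the tails of two equal-length genotypes at a random cut point."""
    # The cut can land anywhere, including in the middle of a 10-bit variable.
    cut = rng.randint(1, len(parent1) - 1)
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2


def bit_flip_mutation(genotype, mutation_rate, rng=random):
    """Return a copy in which each bit is flipped with probability mutation_rate."""
    return [1 - bit if rng.random() < mutation_rate else bit for bit in genotype]


rng = random.Random(0)   # seeded generator so the run is repeatable
mom = [0] * 100
dad = [1] * 100
kid1, kid2 = one_point_crossover(mom, dad, rng)
mutated = bit_flip_mutation(kid1, 0.05, rng)
print(len(kid1), len(kid2), len(mutated))
```

Neither function knows where one $x_i$ ends and the next begins, which is exactly the genotype/phenotype separation described above.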

## Real Valued GA

For the real valued GA, each $x_i$ is represented as a float in the range (-5.12, 5.12) and the mutation operator applies Gaussian noise. Python's random number generator for the normal distribution is called `gauss` and is found in the `random` module:

```
from random import gauss, random
```

You may need to experiment a bit with the standard deviation of the noise but the mean will be 0.0.
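A minimal sketch of such a mutation operator is shown below. The function name `gaussian_mutation`, the per-gene mutation test, and the clamping back into the search range are all assumptions made for illustration; the value of `sigma` is only a starting point for experimentation.

```python
from random import gauss, random


def gaussian_mutation(genotype, mutation_rate, sigma, low=-5.12, high=5.12):
    """Return a mutated copy; each gene is perturbed with probability mutation_rate."""
    mutated = []
    for x in genotype:
        if random() < mutation_rate:
            x = x + gauss(0.0, sigma)    # zero-mean Gaussian noise
            x = min(max(x, low), high)   # assumption: clamp back into the search range
        mutated.append(x)
    return mutated


individual = [0.0] * 10
print(gaussian_mutation(individual, mutation_rate=0.2, sigma=0.5))
```

A larger `sigma` explores more widely but makes it harder to settle near the minimum, so it is worth trying a few values.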

## GA

The Genetic Algorithm itself will have the same basic structure in each case: create a population, evaluate it, select parents, apply crossover and mutation, and repeat until the desired number of generations has been produced. The easiest way to accomplish this in "Functional" Python would be to use Higher Order Functions.
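One way this shared structure might look is sketched below: a single loop that receives the representation-specific steps as functions. Everything here is an illustrative assumption rather than the required design, including the name `genetic_algorithm`, its argument list, the binary tournament selection, and the small 3-dimensional real valued usage example.

```python
import random


def genetic_algorithm(init, evaluate, select, crossover, mutate,
                      population_size, generations, crossover_rate, rng=random):
    """Generic GA loop; returns (fitness, individual) for the best individual evaluated."""
    population = [init() for _ in range(population_size)]
    best = None
    for generation in range(generations):
        # Evaluate every individual and remember the best one seen so far.
        scored = [(evaluate(individual), individual) for individual in population]
        generation_best = max(scored, key=lambda pair: pair[0])
        if best is None or generation_best[0] > best[0]:
            best = generation_best
        # Breed the next generation two children at a time.
        next_population = []
        while len(next_population) < population_size:
            parent1, parent2 = select(scored), select(scored)
            if rng.random() < crossover_rate:
                child1, child2 = crossover(parent1, parent2)
            else:
                child1, child2 = parent1[:], parent2[:]
            next_population.extend([mutate(child1), mutate(child2)])
        population = next_population[:population_size]
    return best


# Usage: a small real valued run on the shifted sphere in 3 dimensions.
rng = random.Random(42)
dims = 3
fitness = lambda xs: 1.0 / (1.0 + sum((x - 0.5) ** 2 for x in xs))

def uniform_init():
    return [rng.uniform(-5.12, 5.12) for _ in range(dims)]

def tournament(scored):
    # Assumption: binary tournament selection; returns the fitter of two random picks.
    a, b = rng.choice(scored), rng.choice(scored)
    return max(a, b, key=lambda pair: pair[0])[1]

def one_point(parent1, parent2):
    cut = rng.randint(1, dims - 1)
    return parent1[:cut] + parent2[cut:], parent2[:cut] + parent1[cut:]

def gaussian(individual):
    return [x + rng.gauss(0.0, 0.1) if rng.random() < 0.1 else x for x in individual]

best_fitness, best_individual = genetic_algorithm(
    uniform_init, fitness, tournament, one_point, gaussian,
    population_size=20, generations=30, crossover_rate=0.9, rng=rng)
print(best_fitness, best_individual)
```

Because the loop only calls the functions it is given, the binary and real valued versions could share it and differ only in the `init`, `crossover`, and `mutate` arguments.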



If passed a DEBUG=True flag, your code should print the best individual of each generation, including the generation number, the genotype (the representation), the phenotype (the actual value), the fitness (based on your fitness function transformation), and the function value (for the shifted sphere).

The GA has a lot of parameters: mutation rate, crossover rate, population size, dimensions (given for this problem), number of generations.  You can put all of those and your fitness function in a `Dict` in which case you need to implement:

```python
def binary_ga( parameters):
  pass
```

and

```python
def real_ga( parameters):
  pass
```

Remember that you need to transform the sphere function into a legit fitness function. Since you also need the sphere function, I would suggest that your parameters Dict includes something like:

```python
parameters = {
   "f": lambda xs: sphere( 0.5, xs),
   "minimization": True
   # put other parameters in here.
}
```

and then have your code check for "minimization" and create an entry for "fitness" that is appropriate.
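A minimal sketch of that check is shown below. The helper name `add_fitness` is hypothetical, and the transformation $1 / (1 + f(x))$ is the one this notebook uses in `minimization_fitness`; `sphere` is repeated here only so the example runs on its own.

```python
def sphere(shift, xs):
    return sum((x - shift) ** 2 for x in xs)


def add_fitness(parameters):
    """Add a "fitness" entry that is always maximized, based on "f" and "minimization"."""
    f = parameters["f"]
    if parameters.get("minimization", False):
        # Smaller function values map to larger fitness, in the range (0, 1].
        parameters["fitness"] = lambda xs: 1.0 / (1.0 + f(xs))
    else:
        parameters["fitness"] = f
    return parameters


parameters = add_fitness({
    "f": lambda xs: sphere(0.5, xs),
    "minimization": True,
})
print(parameters["fitness"]([0.5] * 10))  # 1.0 at the minimizer
```

Keeping both `"f"` and `"fitness"` in the Dict makes it easy to report the raw function value alongside the fitness.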

In [10]:
def sphere( shift, xs):
    return sum( [(x - shift)**2 for x in xs])

In [12]:
sphere( 0.5, [1.0, 2.0, -3.4, 5.0, -1.2, 3.23, 2.87, -4.23, 3.82, -4.61])

113.42720000000001


-----

&nbsp;

**Minimization fitness**

minimization_fitness transforms the value returned from the sphere function into an appropriate fitness value, since the shifted Sphere Function is a minimization problem and the GA is looking for a maximum.

In [8]:
def minimization_fitness(fitness_score):
    # fitness_score is the raw function value; 1.0 forces float division under Python 2
    return 1.0 / (1 + fitness_score)

&nbsp;

**Initialize phenotype**

initialize_phenotype initializes a random phenotype for an individual.  It takes the given dimensions (10 for this problem) and initializes a list of 10 random numbers between -5.12 and 5.12.  This function is used to initialize the entire population.

In [None]:
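def initialize_phenotype(dimensions):
    # `random` is imported above as the function random.random, so import uniform explicitly
    from random import uniform
    phenotype = []
    for i in range(dimensions):
        phenotype.append(round(uniform(-5.12, 5.12), 2))
    return phenotype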

&nbsp;

**Binary GA phenotype to genotype**

It is necessary to be able to convert from phenotype to genotype for the binary GA.  This conversion is used when the population is initialized: each genotype is created by converting the generated phenotype using the 10 bit binary encoding for each $x_i$. Each value is multiplied by 100 and offset by 512 to give an integer from 0 to 1023, which is written as 10 bits (the inverse of decoding by subtracting 512 and dividing by 100).  The genotype is represented as a list of 0s and 1s.

In [None]:
def binary_ga_phenotye_to_genotype(phenotype):
    genotype = []
    for decimal_representation in phenotype:
        # round (not truncate) to avoid float error, then clamp to the 10 bit range 0..1023
        encoded = min(max(int(round(decimal_representation * 100)) + 512, 0), 1023)
        # zero-pad to exactly 10 bits so every dimension occupies a fixed-width slice
        genotype.extend(int(bit) for bit in format(encoded, "010b"))
    return genotype

&nbsp;

**Binary GA genotype to phenotype**

It is necessary to be able to convert from genotype to phenotype for the binary GA.  The binary GA crossover function uses this conversion to store the corresponding phenotype each time crossover creates a new genotype.  binary_ga_genotype_to_phenotype converts the binary representation of each dimension to its decimal representation, then subtracts 512 and divides by 100 to undo the binary encoding described above.

In [None]:
def binary_ga_genotype_to_phenotype(genotype):
    split_genotype = [genotype[i:i + 10] for i in range(0, len(genotype), 10)]
    phenotype = []
    for binary_representation in split_genotype:
        decimal_representation = int("".join(str(x) for x in binary_representation), 2)
        decimal_representation = (decimal_representation - 512.0) / 100
        phenotype.append(decimal_representation)
    return phenotype

&nbsp;

**Binary GA initialize population**

This function initializes the population for the binary GA with N random individuals of M dimensions, where N and M are defined as "population_size" and "dimensions" in the parameters dict passed to the genetic algorithm.  This function uses initialize_phenotype for each individual's phenotype, and binary_ga_phenotye_to_genotype to create its genotype.  The population is a list of dictionaries representing individuals.  Each dictionary contains the keys "phenotype" and "genotype", and later also the key "fitness".

In [1]:
def binary_ga_initialize_population(population_size, dimensions):
    population = []
    for i in range(population_size):
        individual = {}
        phenotype = initialize_phenotype(dimensions)
        individual["phenotype"] = phenotype
        individual["genotype"] = binary_ga_phenotye_to_genotype(phenotype)
        population.append(individual)

    return population

&nbsp;

**Real GA initialize population**

This function initializes the population for the real GA with N random individuals of M dimensions, where N and M are defined as "population_size" and "dimensions" in the parameters dict passed to the genetic algorithm.  This function uses initialize_phenotype for each individual's phenotype, and copies this value to create its genotype.  The population is a list of dictionaries representing individuals.  Each dictionary contains the keys "phenotype" and "genotype", and later also the key "fitness".

In [None]:
def real_ga_initialize_population(population_size, dimensions):
    population = []
    for i in range(population_size):
        individual = {}
        phenotype = initialize_phenotype(dimensions)
        individual["phenotype"] = phenotype
        individual["genotype"] = copy.deepcopy(phenotype)
        population.append(individual)

    return population

&nbsp;

**Calculate fitness**

This function calculates the fitness of an individual and stores it under the individual's "fitness" key.  It takes the dict representing the individual, as described above, and a flag that is True if this is a minimization problem and False otherwise.  The calculation is done on the phenotype of the individual.  If this is not a minimization problem, the value of the sphere function is used directly as the fitness.  If it is a minimization problem, the minimization_fitness function is applied to the value produced by the sphere function.  calculate_fitness is used to evaluate the entire population.

In [2]:
def calculate_fitness(individual, minimization):
    phenotype = individual["phenotype"]
    fitness = sphere(0.5, phenotype)
    
    if minimization:
        fitness = minimization_fitness(fitness)
    
    individual["fitness"] = fitness


&nbsp;

**Evaluate population**

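This function calculates the fitness of every individual in the population, using calculate_fitness, and returns the individual with the highest fitness.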

In [None]:
# each individual should contain at least fields for its genome and fitness score
# evaluate applies the fitness function to each individual (make sure to transform fitness score)
def evaluate_population(population, minimization):
    best_fitness = float("-inf")
    best_individual = None
    for individual in population:
        calculate_fitness(individual, minimization)
        fitness = individual["fitness"]
        if fitness > best_fitness:
            best_individual = individual
            best_fitness = fitness

    return best_individual

-----

In [15]:
## Traditional GA
## binary_ga( parameters)

In [16]:
## Real Valued GA
## real_ga( parameters)