## Local Search - Genetic Algorithm

There are some key ideas in the Genetic Algorithm.

First, there is a problem of some kind that either *is* an optimization problem or the solution can be expressed in terms of an optimization problem.
For example, if we wanted to minimize the function

$$f(x) = \sum (x_i - 0.5)^2$$

where $n = 10$.
This *is* an optimization problem. Normally, optimization problems are much, much harder.

![Eggholder](http://www.sfu.ca/~ssurjano/egg.png)!

The function we wish to optimize is often called the **objective function**.
The objective function is closely related to the **fitness** function in the GA.
If we have a **maximization** problem, then we can use the objective function directly as a fitness function.
If we have a **minimization** problem, then we need to convert the objective function into a suitable fitness function, since fitness functions must always mean "more is better".

Second, we need to *encode* candidate solutions using an "alphabet" analogous to G, A, T, C in DNA.
This encoding can be quite abstract.
You saw this in the Self Check.
There a floating point number was encoded as bits, just as in a computer and a sophisticated decoding scheme was then required.

Sometimes, the encoding need not be very complicated at all.
For example, in the real-valued GA, discussed in the Lectures, we could represent 2.73 as....2.73.
This is similarly true for a string matching problem.
We *could* encode "a" as "a", 97, or '01100001'.
And then "hello" would be:

```
["h", "e", "l", "l", "o"]
```

or

```
[104, 101, 108, 108, 111]
```

or

```
0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1
```

In Genetics terminology, this is the **chromosome** of the individual. And if this individual had the **phenotype** "h" for the first character then they would have the **genotype** for "h" (either as "h", 104, or 01101000).

To keep it straight, think **geno**type is **genes** and **pheno**type is **phenomenon**, the actual thing that the genes express.
So while we might encode a number as 10110110 (genotype), the number itself, 182, is what goes into the fitness function.
The environment operates on zebras, not the genes for stripes.

## String Matching

You are going to write a Genetic Algorithm that will solve the problem of matching a target string.
Now, this is kind of silly because in order for this to work, you need to know the target string and if you know the target string, why are you trying to do it?
Well, the problem is *pedagogical*.
It's a fun way of visualizing the GA at work, because as the GA finds better and better candidates, they make more and more sense.

Now, string matching is not *directly* an optimization problem so this falls under the general category of "if we convert the problem into an optimization problem we can solve it with an optimization algorithm" approach to problem solving.
This happens all the time.
We have a problem.
We can't solve it.
We convert it to a problem we *can* solve.
In this case, we're using the GA to solve the optimization part.

And all we need is some sort of measure of the difference between two strings.
We can use that measure as a **loss function**.
A loss function gives us a score tells us how similar two strings are.
The loss function becomes our objective function and we use the GA to minimize it by converting the objective function to a fitness function.
So that's the first step, come up with the loss/objective function.
The only stipulation is that it must calculate the score based on element to element (character to character) comparisons with no global transformations of the candidate or target strings.

And since this is a GA, we need a **genotype**.
The genotype for this problem is a list of "characters" (individual letters aren't special in Python like they are in some other languages):

```
["h", "e", "l", "l", "o"]
```

and the **phenotype** is the resulting string:

```
"hello"
```

In addition to the generic code and problem specific loss function, you'll need to pick parameters for the run.
These parameters include:

1. population size
2. number of generations
3. probability of crossover
4. probability of mutation

You will also need to pick a selection algorithm, either roulette wheel or tournament selection.
In the later case, you will need a tournament size.
This is all part of the problem.

Every **ten** (10) generations, you should print out the fitness, genotype, and phenotype of the best individual in the population for the specific generation.
The function should return the best individual *of the entire run*, using the same format.

In [32]:
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

In [33]:
from random import randint, sample, random
from typing import List, Dict, Callable, Any
from pprint import pprint

<a id="genetic_algorithm"></a>
### genetic_algorithm

`genetic_algorithm` tries to find a target string by randomly generating populations of individual phenotypes. Generations of individuals progress by reproducing and potentially mutating until the target string is found. **Uses**: [output_print](#output_print), [generate_random_population](#generate_random_population), [evaluate_fitness1](#evaluate_fitness1), [evaluate_fitness2](#evaluate_fitness2), [evaluate_fitness3](#evaluate_fitness3), [pick_parents](#pick_parents), [reproduce](#reproduce)

* **population_size** int: number of individuals in the population
* **generation_size** int: max number of generations to iterate over
* **crossover_prob** float: probability of parents passing their genes to a child. Value: `0-1`
* **mutation_prob** float: probability of a mutation happeing in a child. Value: `0-1`
* **alphabet** str: string of allowable characters in a gene.
* **target** str: target for the genetic algorithm to find.
* **evaluate_fitness** Callable: which fitness funtion to use.

**return**: List[Any]: list of statements to print

In [34]:
def genetic_algorithm(
    population_size: int,
    generation_size: int,
    crossover_prob: float,
    mutation_prob: float,
    alphabet: str,
    target: str,
    evaluate_fitness: Callable,
) -> Dict[str, Any]:
    population = generate_random_population(population_size, alphabet, target)
    generation = 0

    while generation < generation_size:
        fitness = evaluate_fitness(population, alphabet, target)
        best_individual = min(fitness)

        if generation == 10:
            generation = 0
            print(output_print(population, fitness, best_individual))
        if best_individual == 0:
            print(output_print(population, fitness, best_individual))
            return {
                "fitness": best_individual,
                "genotype": population[fitness.index(best_individual)],
                "phenotype": "".join(population[fitness.index(best_individual)]),
            }

        next_population = []
        for n in range(0, len(population) // 2):
            parent1 = pick_parent(fitness, sample(range(0, len(population)), 10))
            parent2 = pick_parent(fitness, sample(range(0, len(population)), 10))
            child1, child2 = reproduce(
                population[parent1],
                population[parent2],
                alphabet,
                crossover_prob,
                random(),
                randint(0, (len(population[0]) - 1)),
                mutation_prob,
                randint(0, len(population[0]) - 1),
                randint(0, len(population[0]) - 1),
            )
            next_population.append(child1)
            next_population.append(child2)

        population = next_population
        generation += 1


<a id="output_print"></a>
### output_print

`output_print` prints the fitness score, genotype, and phenotype. **Used by**: [genetic_algorithm](#genetic_algorithm)

* **population** List[str]: the list of the population expressed as chormosomes
* **fitness** List[int]: fintess values of the population.
* **best_individual** int: fitness value of an individual.

**return**: List[Any]: list of statements to print

In [35]:
def output_print(population, fitness, best_individual):
    statements = [
        f"Fitness score: {best_individual}",
        f"Genotype: {population[fitness.index(best_individual)]}",
        "Phenotype: " + "".join(population[fitness.index(best_individual)]),
        "\n",
    ]
    return statements

In [36]:
p = output_print([["1", "2", "3"], ["4", "5", "6"], ["7", "8", "9"]], [7, 8, 9], 7)
assert p == ["Fitness score: 7", "Genotype: ['1', '2', '3']", "Phenotype: 123", "\n"]
p = output_print([["1", "2", "3"], ["4", "5", "6"], ["7", "8", "9"]], [7, 8, 9], 9)
assert p == ["Fitness score: 9", "Genotype: ['7', '8', '9']", "Phenotype: 789", "\n"]
p = output_print([["1", "2", "3"], ["4", "5", "6"], ["7", "8", "9"]], [7, 8, 9], 8)
assert p == ["Fitness score: 8", "Genotype: ['4', '5', '6']", "Phenotype: 456", "\n"]

<a id="generate_random_population"></a>
### generate_random_population

`generate_random_population` produces a list of random individuals and each individual is a list of random genes. **Used by**: [genetic_algorithm](#genetic_algorithm)

* **population_size** int: the number of individuals desired.
* **alphabet** str: string of allowable characters in a gene.
* **target** str: target for the genetic algorithm to find.

**returns** List[List[str]]: the list of genes of a list individuals 

In [37]:
def generate_random_population(
    population_size: int, alphabet: str, target: str
) -> List[List[str]]:
    chromosome_length = len([char for char in target])
    alphabet_chars = [char for char in alphabet]
    population = []

    for individual in range(0, population_size):
        new_chromosome = []
        for gene in range(0, chromosome_length):
            new_gene = randint(0, len(alphabet_chars) - 1)
            new_chromosome.append(alphabet_chars[new_gene])
        population.append(new_chromosome)
    return population


In [38]:
pop = generate_random_population(3, "abcd", "ab")
assert len(pop) == 3 and len(pop[0]) == 2
pop = generate_random_population(2, "111111", "11")
assert pop == [["1", "1"], ["1", "1"]]
pop = generate_random_population(3, "zzzzzz", "zzz")
assert pop == [["z", "z", "z"], ["z", "z", "z"], ["z", "z", "z"]]

<a id="evaluate_fitness1"></a>
### evaluate_fitness1

`evaluate_fitness` produces a list of fitness values for individuals from a population. The fitness value is determined by comparing a gene from an individual to the gene of the target and calculating how far apart they are in the alphabet. The lower the score, the closer the individual is to the target. This function helps determine if the target has been found or which individuals should be selected as parents. **Used by**: [genetic_algorithm](#genetic_algorithm)

* **population** List[str]: the list of the population expressed as chormosomes
* **alphabet** str: string of allowable characters in a gene.
* **target** str: target for the genetic algorithm to find.

**returns** List[int]: the list of fitness values for individuals 

In [39]:
def evaluate_fitness1(population: List[str], alphabet: str, target: str) -> List[int]:
    alphabet_chars = [char for char in alphabet]
    target_chromosome = [char for char in target]
    fitness = []

    for individual in population:
        fitness_score = 0
        for gene_index, gene in enumerate(individual):
            target_gene = target_chromosome[gene_index]
            individual_gene = individual[gene_index]
            fitness_score += abs(
                alphabet_chars.index(target_gene)
                - alphabet_chars.index(individual_gene)
            )

        fitness.append(fitness_score)

    return fitness

In [40]:
pop = [["c", "b"], ["a", "d"], ["c", "d"]]
f = evaluate_fitness1(pop, "abcd", "cd")
assert f == [2, 2, 0]
f = evaluate_fitness1(pop, "abcd", "dd")
assert f == [3, 3, 1]
f = evaluate_fitness1(pop, "abcd", "ad")
assert f == [4, 0, 2]

<a id="evaluate_fitness2"></a>
### evaluate_fitness2

`evaluate_fitness2` produces a list of fitness values for individuals from a population. Fitness is determined in reverse order of the target. The fitness value is determined by comparing a gene from an individual to the gene of the target and calculating how far apart they are in the alphabet. The lower the score, the closer the individual is to the target. This function helps determine if the target has been found or which individuals should be selected as parents. **Used by**: [genetic_algorithm](#genetic_algorithm)

* **population** List[str]: the list of the population expressed as chormosomes
* **alphabet** str: string of allowable characters in a gene.
* **target** str: target for the genetic algorithm to find.

**returns** List[int]: the list of fitness values for individuals 

In [41]:
def evaluate_fitness2(population: List[str], alphabet: str, target: str) -> List[int]:
    alphabet_chars = [char for char in alphabet]
    target_chromosome = [char for char in target]
    fitness = []

    for individual in population:
        fitness_score = 0
        for gene_index, gene in enumerate(individual):
            target_gene = target_chromosome[len(individual) - gene_index - 1]
            individual_gene = individual[gene_index]
            fitness_score += abs(
                alphabet_chars.index(target_gene)
                - alphabet_chars.index(individual_gene)
            )

        fitness.append(fitness_score)

    return fitness

In [42]:
pop = [["c", "b"], ["a", "d"], ["c", "d"]]
f = evaluate_fitness2(pop, "abcd", "cd")
assert f == [2, 4, 2]
f = evaluate_fitness2(pop, "abcd", "dd")
assert f == [3, 3, 1]
f = evaluate_fitness2(pop, "abcd", "ad")
assert f == [2, 6, 4]

<a id="evaluate_fitness3"></a>
### evaluate_fitness3

`evaluate_fitness3` produces a list of fitness values for individuals from a population. Fitness is determined for ROT13. The fitness value is determined by comparing a gene from an individual to the gene of the target and calculating how far apart they are in the alphabet. The lower the score, the closer the individual is to the target. This function helps determine if the target has been found or which individuals should be selected as parents. **Used by**: [genetic_algorithm](#genetic_algorithm)

* **population** List[str]: the list of the population expressed as chormosomes
* **alphabet** str: string of allowable characters in a gene.
* **target** str: target for the genetic algorithm to find.

**returns** List[int]: the list of fitness values for individuals 

In [43]:
def evaluate_fitness3(population: List[str], alphabet: str, target: str) -> List[int]:
    alphabet_chars = [char for char in alphabet]
    target_chromosome = [char for char in target]
    fitness = []

    for individual in population:
        fitness_score = 0
        for gene_index, gene in enumerate(individual):
            target_gene = target_chromosome[gene_index]
            individual_gene = individual[gene_index]
            fitness_score += abs(
                ((alphabet_chars.index(target_gene) + 13) % len(alphabet))
                - alphabet_chars.index(individual_gene)
            )

        fitness.append(fitness_score)

    return fitness

In [44]:
pop = [["c", "b"], ["a", "d"], ["c", "d"]]
f = evaluate_fitness3(pop, "abcd", "cd")
assert f == [2, 6, 4]
f = evaluate_fitness3(pop, "abcd", "dd")
assert f == [3, 3, 5]
f = evaluate_fitness3(pop, "abcd", "ad")
assert f == [2, 4, 4]

<a id="pick_parents"></a>
### pick_parents

`pick_parents` picks an individual from the population to be a parent to the next generation. Individual is identified using a tournament style system, where a list of random individuals of a specified length gets sorted to find the most fit candidate to be a parent. **Used by**: [genetic_algorithm](#genetic_algorithm)

* **fitness** List[int]: list of the fitness values for each individual.
* **rand_fn** List[int]: list of indecies for tournament selection.

**returns** int: index of the individual chosen

In [45]:
def pick_parent(fitness: List[int], rand_fn: List[int]) -> int:
    chosen = []
    for index in rand_fn:
        chosen.append(fitness[index])
    best = rand_fn[chosen.index(min(chosen))]
    return best

In [46]:
pop = [["c", "b"], ["a", "d"], ["c", "d"]]
f = evaluate_fitness1(pop, "abcd", "ad")
parent = pick_parent(f, [1, 2, 2])
assert parent == 1
parent = pick_parent(f, [0, 2, 2])
assert parent == 2
parent = pick_parent(f, [0, 2, 0])
assert parent == 2

<a id="apply_crossover"></a>
### apply_crossover

`apply_crossover` takes 2 parents performs gene crossover. Crossover happens at a random gene location. **Used by**: [reproduce](#reproduce)

* **parent1** List[int]: list of genes of parent1.
* **parent2** List[int]: list of genes of parent2.
* **rand_crossover_op** int: location of gene where crossover should happen

**returns** List[List[str],List[str]]: the 2 individuals that will be part of the next generation

In [47]:
def apply_crossover(
    parent1: List[str], parent2: List[str], rand_crossover_op: int
) -> [List[str], List[str]]:
    child1 = parent1[:rand_crossover_op] + parent2[rand_crossover_op:]
    child2 = parent2[:rand_crossover_op] + parent1[rand_crossover_op:]
    return child1, child2


In [48]:
c = apply_crossover(["1", "2", "3"], ["4", "5", "6"], 2)
assert c == (["1", "2", "6"], ["4", "5", "3"])
c = apply_crossover(["1", "2", "3"], ["4", "5", "6"], 3)
assert c == (["1", "2", "3"], ["4", "5", "6"])
c = apply_crossover(["1", "2", "3"], ["4", "5", "6"], 0)
assert c == (["4", "5", "6"], ["1", "2", "3"])

<a id="apply_mutation"></a>
### apply_mutation

`apply_mutation` applies a random mutation to a random gene. **Used by**: [reproduce](#reproduce)

* **child** List[int]: list of genes of an individual.
* **alphabet** str: string of allowable characters in a gene.
* **rand_gene** int: location of gene where mutation should happen

**returns** List[str]: the individual with a mutated gene

In [49]:
def apply_mutation(
    child: List[str], alphabet: str, rand_gene: int, rand_char: int
) -> List[str]:
    alphabet_chars = [char for char in alphabet]
    child[rand_gene] = alphabet_chars[rand_char]
    return child


In [50]:
c = apply_mutation(["1", "2", "3"], "456", 1, 1)
assert c == ["1", "5", "3"]
c = apply_mutation(["1", "2", "3"], "456", 0, 2)
assert c == ["6", "2", "3"]
c = apply_mutation(["1", "2", "3"], "456", 2, 0)
assert c == ["1", "2", "4"]

<a id="reproduce"></a>
### reproduce

`reproduce` takes 2 parents and determines if they will cross their genes or be returned back to the population. Also applies a mutation to a gene if required by the mutation probability. This is the way the algorithm changes the population to search for the target answer.  **Used by**: [genetic_algorithm](#genetic_algorithm) **Uses**: [apply_crossover](#apply_crossover), [apply_mutation](#apply_mutation)

* **parent1** List[int]: list of genes of parent1.
* **parent2** List[int]: list of genes of parent2.
* **alphabet** str: string of allowable characters in a gene.
* **crossover_prob** float: the chance parents will create children. Value: `0-1`
* **rand_cross_over_chance** float: the value assigned to the parents to be compared against for potential crossover. Value: `0-1`
* **rand_crossover_op** int: location of gene where crossover should happen
* **mutation_prob** float: probability of a mutation occuring after crossover. Value: `0-1`
* **rand_mutation1** float: random number assigned to child 1 to see if mutation will occur. Value: `0-1`
* **rand_mutation2** float: random number assigned to child 2 to see if mutation will occur. Value: `0-1`

**returns** List[List[str],List[str]]: the 2 individuals that will be part of the next generation

In [51]:
def reproduce(
    parent1: List[str],
    parent2: List[str],
    alphabet: str,
    crossover_prob: float,
    rand_crossover_chance: float,
    rand_crossover_op: int,
    mutation_prob: float,
    rand_mutation1: float,
    rand_mutation2: float,
) -> [List[str], List[str]]:
    if rand_crossover_chance > crossover_prob:
        return parent1, parent2

    child1, child2 = apply_crossover(parent1, parent2, rand_crossover_op)

    if rand_mutation1 < mutation_prob:
        child1 = apply_mutation(
            child1, alphabet, randint(0, len(child1) - 1), randint(0, len(alphabet) - 1)
        )
    if rand_mutation2 < mutation_prob:
        child2 = apply_mutation(
            child2, alphabet, randint(0, len(child2) - 1), randint(0, len(alphabet) - 1)
        )

    return child1, child2


In [52]:
c1, c2 = reproduce(["1", "2", "3"], ["4", "5", "6"], "123456789", 1, 1, 1, 1, 1, 1)
assert c1 == ["1", "5", "6"]
assert c2 == ["4", "2", "3"]
c1, c2 = reproduce(["1", "2", "3"], ["4", "5", "6"], "123456789", 1, 1, 1, 1, 1, 1)
assert c1 == ["1", "5", "6"]
assert c2 == ["4", "2", "3"]
c1, c2 = reproduce(["1", "2", "3"], ["4", "5", "6"], "123456789", 1, 1, 1, 1, 1, 1)
assert c1 == ["1", "5", "6"]
assert c2 == ["4", "2", "3"]

## Problem 1

The target is the string "this is so much fun".
The challenge, aside from implementing the basic algorithm, is deriving a fitness function based on "b" - "p" (for example).
The fitness function should come up with a fitness score based on element to element comparisons between target v. phenotype.

In [53]:
target1 = "this is so much fun"

In [54]:
result1 = genetic_algorithm(100, 10000, 0.8, 0.05, ALPHABET, target1, evaluate_fitness1)

['Fitness score: 48', "Genotype: ['t', 'c', 'n', 'r', 'm', 'j', 'r', 'z', 'o', 'm', 'z', 'l', 't', 'b', 'k', 'y', 'j', 'v', 'n']", 'Phenotype: tcnrmjrzomzltbkyjvn', '\n']
['Fitness score: 42', "Genotype: ['t', 'c', 'n', 'r', 'm', 'j', 'r', 'z', 'r', 'm', 'z', 'l', 't', 'b', 'h', 'y', 'j', 'v', 'n']", 'Phenotype: tcnrmjrzrmzltbhyjvn', '\n']
['Fitness score: 39', "Genotype: ['t', 'c', 'g', 'r', 'm', 'j', 'r', 'z', 'r', 'm', 'z', 'l', 't', 'b', 'h', 'y', 'j', 'v', 'n']", 'Phenotype: tcgrmjrzrmzltbhyjvn', '\n']
['Fitness score: 35', "Genotype: ['t', 'k', 'g', 'r', 'm', 'j', 'r', ' ', 'r', 'p', 'z', 'l', 't', 'b', 'h', 'y', 'j', 'v', 'n']", 'Phenotype: tkgrmjr rpzltbhyjvn', '\n']
['Fitness score: 29', "Genotype: ['t', 'h', 'g', 'r', 'm', 'j', 'r', ' ', 'r', 'p', ' ', 'l', 'u', 'b', 'h', 'y', 'j', 'u', 'n']", 'Phenotype: thgrmjr rp lubhyjun', '\n']
['Fitness score: 14', "Genotype: ['t', 'h', 'g', 'r', ' ', 'j', 'r', ' ', 'r', 'p', ' ', 'l', 'u', 'c', 'h', 'y', 'j', 'u', 'n']", 'Phenotype: th

['Fitness score: 1', "Genotype: ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'p', ' ', 'm', 'u', 'c', 'h', ' ', 'f', 'u', 'n']", 'Phenotype: this is sp much fun', '\n']
['Fitness score: 1', "Genotype: ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'p', ' ', 'm', 'u', 'c', 'h', ' ', 'f', 'u', 'n']", 'Phenotype: this is sp much fun', '\n']
['Fitness score: 1', "Genotype: ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'p', ' ', 'm', 'u', 'c', 'h', ' ', 'f', 'u', 'n']", 'Phenotype: this is sp much fun', '\n']
['Fitness score: 1', "Genotype: ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'p', ' ', 'm', 'u', 'c', 'h', ' ', 'f', 'u', 'n']", 'Phenotype: this is sp much fun', '\n']
['Fitness score: 1', "Genotype: ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'p', ' ', 'm', 'u', 'c', 'h', ' ', 'f', 'u', 'n']", 'Phenotype: this is sp much fun', '\n']
['Fitness score: 1', "Genotype: ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'p', ' ', 'm', 'u', 'c', 'h', ' ', 'f', 'u', 'n']", 'Phenotype: this is 

In [55]:
pprint(result1, compact=True)

{'fitness': 0,
 'genotype': ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'o', ' ', 'm', 'u',
              'c', 'h', ' ', 'f', 'u', 'n'],
 'phenotype': 'this is so much fun'}


## Problem 2

You should have working code now.
The goal here is to think a bit more about fitness functions.
The target string is, 'nuf hcum os si siht'.
This is obviously target #1 but reversed.
The goal is then to derive a "gene vs gene" fitness function (although I am not specifying which gene against which gene).
(You may not reverse the target or the candidate, either before fitness evaluation or afterwards).

<div style="background: lemonchiffon; margin:20px; padding: 20px;">
    <strong>Important</strong>
    <p>
        You may not reverse an entire string (either target or candidate) at any time.
        Everything must be a computation of one gene against one gene (one letter against one letter).
        Failure to follow these directions will result in 0 points for the problem.
    </p>
</div>

The best individual in the population is the one who expresses this string *forwards*.

In [56]:
target2 = "nuf hcum os si siht"

In [57]:
result2 = genetic_algorithm(100, 10000, 0.8, 0.05, ALPHABET, target2, evaluate_fitness2)

['Fitness score: 35', "Genotype: ['r', 'n', 'i', 's', 'z', 'j', 'r', 'z', 'r', 't', 'z', 'i', 'v', 'd', 'l', ' ', 'd', 's', 'p']", 'Phenotype: rniszjrzrtzivdl dsp', '\n']
['Fitness score: 28', "Genotype: ['r', 'f', 'i', 's', 'z', 'j', 'r', 'z', 'r', 't', 'z', 'k', 'v', 'd', 'k', ' ', 'd', 's', 'p']", 'Phenotype: rfiszjrzrtzkvdk dsp', '\n']
['Fitness score: 25', "Genotype: ['r', 'f', 'i', 's', 'z', 'j', 'r', 'z', 'r', 't', 'z', 'k', 'u', 'd', 'k', ' ', 'd', 'v', 'o']", 'Phenotype: rfiszjrzrtzkudk dvo', '\n']
['Fitness score: 21', "Genotype: ['r', 'f', 'i', 's', 'z', 'j', 'r', 'z', 'r', 'l', 'z', 'n', 'u', 'd', 'k', ' ', 'd', 'u', 'o']", 'Phenotype: rfiszjrzrlznudk duo', '\n']
['Fitness score: 20', "Genotype: ['r', 'f', 'i', 's', 'z', 'j', 'r', 'z', 'r', 'm', 'z', 'n', 'u', 'd', 'k', ' ', 'd', 'u', 'o']", 'Phenotype: rfiszjrzrmznudk duo', '\n']
['Fitness score: 20', "Genotype: ['r', 'f', 'i', 's', 'z', 'j', 'r', 'z', 'r', 'm', 'z', 'n', 'u', 'd', 'k', ' ', 'd', 'u', 'o']", 'Phenotype: rf

In [58]:
pprint(result2, compact=True)

{'fitness': 0,
 'genotype': ['t', 'h', 'i', 's', ' ', 'i', 's', ' ', 's', 'o', ' ', 'm', 'u',
              'c', 'h', ' ', 'f', 'u', 'n'],
 'phenotype': 'this is so much fun'}


## Problem 3

This is a variation on a theme.
The Caeser Cypher replaces each letter of a string with the letter 13 characters down alphabet (rotating from "z" back to "a" as needed).
This is also known as ROT13 (for "rotate 13").
Latin did not have spaces (and the space is not continguous with the letters a-z) so we'll remove them from our alphabet.
Again, the goal is to derive a "gene vs gene" fitness function, without global transformations.

<div style="background: lemonchiffon; margin:20px; padding: 20px;">
    <strong>Important</strong>
    <p>
        You may not apply ROT13 to an entire string (either target or candidate) at any time.
        Everything must be a computation of one gene against one gene.
        Failure to follow these directions will result in 0 points for the problem.
    </p>
</div>

The best individual will express the target *decoded*.

In [59]:
ALPHABET3 = "abcdefghijklmnopqrstuvwxyz"

In [60]:
target3 = "guvfvffbzhpusha"

In [61]:
result3 = genetic_algorithm(
    100, 10000, 0.8, 0.05, ALPHABET3, target3, evaluate_fitness3
)


['Fitness score: 21', "Genotype: ['r', 'k', 'i', 's', 'g', 'q', 'u', 'o', 'm', 'u', 'd', 'e', 'e', 'x', 'l']", 'Phenotype: rkisgquomudeexl', '\n']
['Fitness score: 17', "Genotype: ['r', 'g', 'i', 's', 'g', 'q', 'q', 'o', 'm', 'u', 'd', 'e', 'e', 't', 'l']", 'Phenotype: rgisgqqomudeetl', '\n']
['Fitness score: 10', "Genotype: ['s', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'd', 'g', 'e', 't', 'm']", 'Phenotype: sgisirqomudgetm', '\n']
['Fitness score: 8', "Genotype: ['s', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'd', 'h', 'e', 't', 'n']", 'Phenotype: sgisirqomudhetn', '\n']
['Fitness score: 6', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'e', 't', 'n']", 'Phenotype: tgisirqomuchetn', '\n']
['Fitness score: 6', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'e', 't', 'n']", 'Phenotype: tgisirqomuchetn', '\n']
['Fitness score: 5', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype:

['Fitness score: 5', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tgisirqomuchftn', '\n']
['Fitness score: 5', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tgisirqomuchftn', '\n']
['Fitness score: 5', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tgisirqomuchftn', '\n']
['Fitness score: 5', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'q', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tgisirqomuchftn', '\n']
['Fitness score: 4', "Genotype: ['t', 'g', 'i', 's', 'i', 'r', 'r', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tgisirromuchftn', '\n']
['Fitness score: 3', "Genotype: ['t', 'g', 'i', 's', 'i', 's', 'r', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tgisisromuchftn', '\n']
['Fitness score: 3', "Genotype: ['t', 'g', 'i', 's', 'i', 's', 'r', 'o', 'm', 'u', 'c', 'h', 'f', 't', 'n']", 'Phenotype: tg

In [62]:
pprint(result3, compact=True)

{'fitness': 0,
 'genotype': ['t', 'h', 'i', 's', 'i', 's', 's', 'o', 'm', 'u', 'c', 'h', 'f',
              'u', 'n'],
 'phenotype': 'thisissomuchfun'}
