# Weasel program

The weasel program or Dawkins' weasel is a thought experiment and a variety of computer simulations illustrating it. Their aim is to demonstrate that the process that drives evolutionary systems—random variation combined with non-random cumulative selection—is different from pure chance.

The thought experiment was formulated by Richard Dawkins, and the first simulation written by him; various other implementations of the program have been written by others.  
Source: https://en.wikipedia.org/wiki/Weasel_program  

Read the webpage.
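
To make the contrast with pure chance concrete, the short calculation below counts how many different 28-character strings exist over an alphabet of 26 uppercase letters plus the space. The names `ALPHABET_SIZE`, `PHRASE_LENGTH`, `possible_strings` and `chance_per_try` are chosen for this illustration only; they are not used elsewhere in the lesson.

```python
# Illustrative arithmetic: how unlikely is the target under pure chance?
ALPHABET_SIZE = 27   # 26 uppercase letters plus the space
PHRASE_LENGTH = 28   # length of "METHINKS IT IS LIKE A WEASEL"

# Every position can hold any of the 27 characters independently.
possible_strings = ALPHABET_SIZE ** PHRASE_LENGTH
chance_per_try = 1 / possible_strings

print(f"possible strings: {possible_strings:.2e}")
print(f"chance of drawing the target in one random try: {chance_per_try:.2e}")
```

The count is on the order of 10^40, so drawing complete random strings until one matches is hopeless in practice. The programs in this lesson reach the target in far fewer steps because each generation builds on the previous one.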

## Goal

The goal of this lesson is to write a Weasel program implementation in Python.  

A randomly generated sequence of 28 letters and spaces is changed gradually, generation by generation, until it matches the target phrase. For example:  

Generation 01:   WDLTMNLT DTJBKWIRZREZLMQCO P  
Generation 02:   WDLTMNLT DTJBSWIRZREZLMQCO P  
Generation 10:   MDLDMNLS ITJISWHRZREZ MECS P  
Generation 20:   MELDINLS IT ISWPRKE Z WECSEL  
Generation 30:   METHINGS IT ISWLIKE B WECSEL  
Generation 40:   METHINKS IT IS LIKE I WEASEL  
Generation 43:   METHINKS IT IS LIKE A WEASEL  

## The algorithm: simple version

The simplest algorithm keeps the matching positions and mutates only the non-matching positions each generation. However, this is not how evolution works: evolution works by random mutation, selection and amplification.  
Let's start with the simple version. Run the code block below to define the variables.

In [None]:
###DO NOT REMOVE###
import string
import random
random.seed(10) #you can comment this out once everything works
target = "METHINKS IT IS LIKE A WEASEL"
target = [i for i in target] #no need to understand this yet
letters =  string.ascii_uppercase + " "
start_seq = [random.choice(letters) for i in range(len(target))] #no need to understand this yet
print("target:", target)
print("start sequence: ", start_seq)

Now write the program below:

In [None]:
###YOUR CODE HERE###

###ANSWER###
def compare_strings(target, descendant):
    """Return the positions where descendant differs from target."""
    diff_pos = []
    for pos, letter in enumerate(target):
        if letter != descendant[pos]:
            diff_pos.append(pos)
    return diff_pos


def replace_letters(descendant, diff):
    """Put a new random letter at every position listed in diff."""
    for pos in diff:
        descendant[pos] = random.choice(letters)
    return descendant


def cumulative_select(target, start_seq):
    generation = 1
    descendant = list(start_seq)  # work on a copy so start_seq stays unchanged
    while descendant != target:
        diff = compare_strings(target, descendant)
        descendant = replace_letters(descendant, diff)
        print("".join(descendant), " generation = ", generation)
        generation += 1


cumulative_select(target, start_seq)
###END ANSWER###

## Spicy problem: complex version

There is something odd about the algorithm above. Evolution does not work like that. Have a look at the figure below:  
![fig1](pics/fig1.png)

Note that, in generation 8, the 25th character, which had been correct (A), becomes incorrect (I). The program written by Dawkins does not "lock" correct characters as we did; rather, at each iteration it measures how close the complete string is to the 'target' phrase.
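
Measuring closeness to the target can be as simple as counting the positions that already match. The sketch below is one possible way to do that; the names `target_phrase` and `score` are hypothetical and only serve this illustration.

```python
# A minimal sketch of a closeness measure: the number of matching positions.
target_phrase = "METHINKS IT IS LIKE A WEASEL"


def score(candidate, target=target_phrase):
    """Count the positions where candidate and target hold the same character."""
    return sum(1 for c, t in zip(candidate, target) if c == t)


print(score("METHINKS IT IS LIKE I WEASEL"))  # one wrong character -> 27
print(score(target_phrase))                   # perfect match -> 28
```

Because `zip` pairs the characters position by position, the same function works for strings and for lists of single characters.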

Although Dawkins did not provide the source code for his program, a "Weasel" style algorithm could run as follows:  

1. Start with a random string of 28 characters.
2. Make 100 copies of the string (reproduce).
3. For each character in each of the 100 copies, with a probability of 5%, replace (mutate) the character with a new random character.
4. Compare each new string with the target string "METHINKS IT IS LIKE A WEASEL", and give each a score (the number of letters in the string that are correct and in the correct position).
5. If any of the new strings has a perfect score (28), halt. Otherwise, take the highest scoring string, and go to step 2.
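
The mutation step (each character is replaced with a probability of 5%) could be sketched as follows. This is only an illustration of that single step, not the full program; the function name `mutate` and its parameter `mut_rate` are assumptions made for this example.

```python
import random
import string

LETTERS = string.ascii_uppercase + " "


def mutate(parent, mut_rate=0.05):
    """Return a copy of parent; each character is replaced with probability mut_rate."""
    child = list(parent)
    for i in range(len(child)):
        # random.random() returns a float in [0, 1), so this test succeeds
        # in roughly mut_rate of all cases.
        if random.random() < mut_rate:
            child[i] = random.choice(LETTERS)  # may pick the same letter again
    return child


parent = list("METHINKS IT IS LIKE I WEASEL")
offspring = [mutate(parent) for _ in range(100)]
changed = sum(child != parent for child in offspring)
print(changed, "of 100 copies differ from the parent")
```

With a 5% rate and 28 characters, roughly a quarter of the copies are expected to stay identical to the parent (0.95^28 is about 0.24), so the best score in a generation only rarely drops below that of the parent.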

Write the new version according to this algorithm below.  
First run the cell below:

In [1]:
###DO NOT REMOVE###
import string
import random
random.seed(10) #you can comment this out once everything works

TARGET = [i for i in "METHINKS IT IS LIKE A WEASEL"]
VOLUME = 100
MUT_RATE = 0.05
LETTERS =  string.ascii_uppercase + " "
SEED = [random.choice(LETTERS) for i in range(len(TARGET))] #no need to understand this yet
print("target:", TARGET)
print("seed:", SEED)

target: ['M', 'E', 'T', 'H', 'I', 'N', 'K', 'S', ' ', 'I', 'T', ' ', 'I', 'S', ' ', 'L', 'I', 'K', 'E', ' ', 'A', ' ', 'W', 'E', 'A', 'S', 'E', 'L']
seed: ['S', 'B', 'N', 'P', 'S', 'A', 'G', 'O', ' ', 'P', ' ', 'I', 'U', 'Z', 'F', 'B', 'Q', 'P', 'K', 'C', 'H', 'X', 'L', 'B', 'N', 'E', 'T', 'L']


In [7]:
###YOUR CODE HERE###

###ANSWER###
def sel_best_match(target, offspring):
    """Return the sequence in offspring that matches target in the most positions."""
    best_match_seq = None
    best_match_score = -1  # below any real score, so a sequence is always selected
    for seq in offspring:
        score = 0
        for pos, letter in enumerate(seq):
            if target[pos] == letter:
                score += 1
        if score > best_match_score:
            best_match_score = score
            best_match_seq = seq
    return best_match_seq


def generate_offspring(parent, volume, mut_rate):
    """Return a list of volume mutated copies of parent."""
    # Simplification: instead of testing every character with probability
    # mut_rate, each child gets a fixed number of mutations
    # (round(0.05 * 28) = 1 with the settings above).
    num_of_mut = int(round(mut_rate * len(parent)))
    offspring = []
    for _ in range(volume):
        positions = []
        for _ in range(num_of_mut):
            positions.append(random.randint(0, len(parent) - 1))
        child = parent.copy()
        for i in positions:
            child[i] = random.choice(LETTERS)
        offspring.append(child)
    return offspring


def cumulative_select(seed, target, volume, mut_rate):
    best_match = seed
    generation = 1
    while best_match != target:
        offspring = generate_offspring(best_match, volume, mut_rate)
        best_match = sel_best_match(target, offspring)
        print("".join(best_match), " generation", generation)
        generation += 1


cumulative_select(SEED, TARGET, VOLUME, MUT_RATE)
###END ANSWER###

SBNPSAGO P IUZFBQPKCHXLBNEEL  generation 1
SBNPSAGO P IUSFBQPKCHXLBNEEL  generation 2
SBNPSNGO P IUSFBQPKCHXLBNEEL  generation 3
SBNPSNGO P IUS BQPKCHXLBNEEL  generation 4
SBNPSNGS P IUS BQPKCHXLBNEEL  generation 5
SBNPSNGS PTIUS BQPKCHXLBNEEL  generation 6
SBNPSNGS PTIUS BQPKCHXWBNEEL  generation 7
SBNPSNGS PTIUS BQPKCH WBNEEL  generation 8
SBTPSNGS PTIUS BQPKCH WBNEEL  generation 9
SBTPSNGS PTIUS BQPECH WBNEEL  generation 10
SBTPINGS PTIUS BQPECH WBNEEL  generation 11
SBTPINGS PTIIS BQPECH WBNEEL  generation 12
SBTPINGS PTIIS BIPECH WBNEEL  generation 13
SBTPINGS PTIIS BIPECH WBAEEL  generation 14
SBTPINAS PTIIS BIPECH WBAEEL  generation 15
SETPINAS PTIIS BIPECH WBAEEL  generation 16
SETHINAS PTIIS BIPECH WBAEEL  generation 17
IETHINAS PTIIS BIPECH WBAEEL  generation 18
METHINAS PTIIS BIPECH WBAEEL  generation 19
METHINAS ITIIS BIPECH WBAEEL  generation 20
METHINAS ITIIS BIPE H WBAEEL  generation 21
METHINAS ITIIS LIPE H WBAEEL  generation 22
METHINAS ITIIS LIPE H WBAXEL  generation 

The end...