## Introduction to Genetic Algorithms

### Evolutionary Algorithms
Evolutionary algorithms (EAs) are a family of optimization methods, often grouped with machine learning, that borrow principles from natural evolution to evolve a solution. Unlike a traditional algorithm, which follows a fixed procedure, an EA is dynamic: its candidate solutions change over time.

The genetic algorithm (GA) is a classical, randomness-based evolutionary algorithm. "Random" here means that the GA finds a solution by applying random changes to the current solutions to generate new ones. The GA is sometimes called the Simple GA (SGA) because it is simpler than other EAs. It is based on Darwin's theory of evolution, a slow, gradual process that works through small changes. Likewise, the GA makes small changes to its solutions over many iterations until it arrives at the best solution it can find.

To summarize, the first step is to **generate a random population** of candidate solutions, with each solution being a collection of data. The specifics of how that data is represented depend on the problem. After **defining an initial population** of randomly generated candidate solutions, the next step is to **evaluate the fitness** of each candidate in the population. That evaluation is specific to the problem being solved, but it always uses the candidate's data and always produces a numeric fitness score for each candidate. Based on those fitness scores we **select parent solutions**, generally favoring higher fitness values, and then **cross over genetic information** from each parent to form child solutions. Those child solutions may undergo a **mutation**, and the children then form the next generation of the population. This process is repeated for a certain number of generations, after which the candidate with the best fitness is chosen as the final solution to the problem.

![ga.png](imgs/ga.png)
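
The steps summarized above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `run_ga`, its parameters, the use of tournament selection, and single-point crossover are all assumptions chosen for brevity, and the sketch assumes `length` is at least 2. The usage line maximizes the number of ones in a bit list.

```python
import random

def run_ga(fitness, gene_set, length, pop_size=50, generations=200,
           mutation_rate=0.05, seed=None):
    """Return the fittest candidate in the final generation (hypothetical helper)."""
    rng = random.Random(seed)

    # 1. Generate a random initial population of candidate solutions.
    population = [[rng.choice(gene_set) for _ in range(length)]
                  for _ in range(pop_size)]

    def select(scored):
        # Tournament selection (an assumed choice): the fitter of two
        # randomly drawn candidates becomes a parent.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    for _ in range(generations):
        # 2. Evaluate the fitness of every candidate.
        scored = [(fitness(c), c) for c in population]
        next_gen = []
        while len(next_gen) < pop_size:
            # 3. Select two parents, favoring higher fitness.
            mum, dad = select(scored), select(scored)
            # 4. Single-point crossover combines genes from both parents.
            cut = rng.randrange(1, length)
            child = mum[:cut] + dad[cut:]
            # 5. Mutation: each gene is occasionally replaced at random.
            child = [rng.choice(gene_set) if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        # The children form the next generation.
        population = next_gen

    # After the last generation, report the candidate with the best fitness.
    return max(population, key=fitness)


best = run_ga(fitness=sum, gene_set=[0, 1], length=10, seed=1)
print(best, sum(best))
```

Because the sketch keeps no elite from one generation to the next, the best candidate ever seen is not guaranteed to survive to the end; practical implementations often add elitism for that reason.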

One of the important ideas in evolutionary computing is the **solution space**: the set of all possible solutions to a given problem. **NP-hard** (**non-deterministic polynomial-time hard**) problems are a class of problems that are at least as hard as the hardest problems in NP, and for which no polynomial-time algorithm is known. "Non-deterministic polynomial time" refers to NP, the class of problems whose proposed solutions can be verified in polynomial time, that is, in a number of steps bounded by a polynomial in the input size. The term does not describe the randomness in evolutionary computing, although that randomness does mean that separate runs of an EA may end with different solutions. **Combinatorial optimization** is the task of finding an optimal combination from a finite set of objects, and it is of most interest here when an exhaustive search is not feasible.
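
To get a feel for why exhaustive search quickly becomes infeasible, the size of a solution space can be computed directly. The helper names `count_strings` and `count_orderings` below are made up for this illustration.

```python
import math

def count_strings(alphabet_size, length):
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length

def count_orderings(n_items):
    """Number of distinct orderings (permutations) of n_items objects."""
    return math.factorial(n_items)

print(count_strings(2, 10))   # 1024 bit strings of length 10
print(count_orderings(10))    # 3628800 orderings of 10 items
print(count_orderings(20))    # 2432902008176640000 orderings of 20 items
```

Doubling the number of items from 10 to 20 multiplies the number of orderings by a factor of roughly 670 billion, which is why ordering problems are usually attacked with heuristics rather than brute force.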

### Applications 
Genetic algorithms are effective at finding optimal or near-optimal solutions to problems such as finding an optimal subset of items to fit within a constrained area, optimally packing containers, allocating resources, shipping or dispatching, optimally ordering data (where the number of possible permutations makes a brute-force approach impractical), and time-optimal manufacturing or scheduling. Genetic programming can help find an equation to fit a set of data, control a process, control the movement of an object in space, select stocks or investments, generate keys for different types of cryptography, create a strategy for picking a good starting hand in Texas Hold'em Poker, and so on.
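
As one illustration of how such a problem might be handed to a GA, the subset-selection case can be encoded as a list of bits (1 means the item is included) with a fitness function that rewards value and rejects overfull selections. The function name `subset_fitness`, the zero-score penalty, and the item data are assumptions made up for this sketch.

```python
def subset_fitness(bits, weights, values, capacity):
    """Total value of the selected items, or 0 if they exceed the capacity."""
    total_weight = sum(w for bit, w in zip(bits, weights) if bit)
    total_value = sum(v for bit, v in zip(bits, values) if bit)
    return total_value if total_weight <= capacity else 0

# Made-up example data: five items and a capacity of 26.
weights = [12, 7, 11, 8, 9]
values = [24, 13, 23, 15, 16]
capacity = 26

print(subset_fitness([0, 1, 1, 1, 0], weights, values, capacity))  # weight 26, value 51
print(subset_fitness([1, 1, 1, 0, 0], weights, values, capacity))  # weight 30 > 26, so 0
```

Scoring infeasible candidates as zero is the simplest option; a graded penalty proportional to the excess weight is a common alternative because it still gives the search a direction to improve in.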

### Generating a target string starting from a random string of the same length

In [1]:
import random
import datetime

In [2]:
# Return a random string of the specified length.
# Note: reads the module-level geneSet, which is defined in a later cell.
def generate_parent(length):
    genes = []
    while len(genes) < length:
        # random.sample draws without replacement, so take at most len(geneSet) per pass
        sampleSize = min(length - len(genes), len(geneSet))
        genes.extend(random.sample(geneSet, sampleSize))
    return ''.join(genes)

# Return the fitness of a guess: the number of positions that match
# the module-level target string.
def get_fitness(guess):
    return sum(1 for expected, actual in zip(target, guess) if expected == actual)

# Replace the character at a random index of the parent with another gene from geneSet.
def mutate(parent):
    index = random.randrange(0, len(parent))
    childGenes = list(parent)
    # Sample two genes so that alternate can be used if newGene equals the current one
    newGene, alternate = random.sample(geneSet, 2)
    childGenes[index] = alternate if newGene == childGenes[index] else newGene
    return ''.join(childGenes)

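# Print a guess, its fitness score, and the time elapsed since startTime.
def display(guess, startTime):
    # total_seconds() covers the whole elapsed time; .microseconds alone is only the sub-second part
    timeDiff = int((datetime.datetime.now() - startTime).total_seconds() * 1_000_000)
    fitness = get_fitness(guess)
    print("Guess: {}\tFitness Score: {}\tTime Taken (µs): {}".format(guess, fitness, timeDiff))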

In [3]:
def predict_output(geneSet,target):
    
    # Initial Run
    startTime = datetime.datetime.now()
    bestParent = generate_parent(len(target))
    bestFitness = get_fitness(bestParent)
    display(bestParent,startTime)

    # Keep mutating the best parent, accepting a child only if it is strictly fitter,
    # and stop once the guess matches the target
    i = 0  # total iterations
    while True:
        i+=1
        child = mutate(bestParent)
        childFitness = get_fitness(child)
        if bestFitness >= childFitness:
            continue
        display(child,startTime)
        if childFitness >= len(bestParent):
            break
        bestFitness = childFitness
        bestParent = child

    possibilities=len(set(geneSet))**len(target)

    print(f'\nTotal Possibilities: {possibilities}')
    print(f'Total iterations: {i}')
    print(f'Optimization over large space state: {round(((possibilities-i)/possibilities)*100,2)}%')

In [4]:
geneSet = "01"
target = "1111111111"

predict_output(geneSet,target)

Guess: 1010100101	Fitness Score: 5	Time Taken (µs): 0
Guess: 1110100101	Fitness Score: 6	Time Taken (µs): 0
Guess: 1110101101	Fitness Score: 7	Time Taken (µs): 0
Guess: 1111101101	Fitness Score: 8	Time Taken (µs): 0
Guess: 1111101111	Fitness Score: 9	Time Taken (µs): 0
Guess: 1111111111	Fitness Score: 10	Time Taken (µs): 997

Total Possibilities: 1024
Total iterations: 48
Optimization over large space state: 95.31%
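
The helper functions above read `geneSet` and `target` from module-level globals, so they only work once the cell that defines those names has run. As a hedged alternative, the same mutate-and-keep-if-better loop can be written with both values passed explicitly. The name `hill_climb` and the `seed` parameter are assumptions for this sketch, and the initial guess is drawn with `choice` rather than `sample`, which differs slightly from `generate_parent`. It assumes `gene_set` has at least two distinct characters and contains every character of `target`; otherwise the loop would never finish.

```python
import random

def hill_climb(gene_set, target, seed=None):
    """Mutate a random string until it equals target; return (guess, iterations)."""
    rng = random.Random(seed)

    def fitness(guess):
        # Number of positions where guess and target agree.
        return sum(1 for expected, actual in zip(target, guess) if expected == actual)

    best = [rng.choice(gene_set) for _ in target]
    best_fitness = fitness(best)
    iterations = 0
    while best_fitness < len(target):
        iterations += 1
        child = list(best)
        index = rng.randrange(len(child))
        # Draw two genes so there is a fallback if the first equals the current gene.
        new_gene, alternate = rng.sample(gene_set, 2)
        child[index] = alternate if new_gene == child[index] else new_gene
        child_fitness = fitness(child)
        # Keep the child only if it is strictly fitter than the current best.
        if child_fitness > best_fitness:
            best, best_fitness = child, child_fitness
    return ''.join(best), iterations

guess, iterations = hill_climb("01", "1111111111", seed=0)
print(guess, iterations)
```

Note that this loop, like `predict_output`, keeps a single candidate and uses mutation only, so it omits the population, selection, and crossover steps of a full genetic algorithm.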


In [5]:
geneSet = " abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!."
target = "Genetic Algorithms"

predict_output(geneSet,target)

Guess: .k!ShsLyWvDfAHxzGZ	Fitness Score: 0	Time Taken (µs): 0
Guess: .k!ShsLyWvDfAHxzmZ	Fitness Score: 1	Time Taken (µs): 0
Guess: .k!ShsLyWvDfAHxhmZ	Fitness Score: 2	Time Taken (µs): 0
Guess: .k!ShsL WvDfAHxhmZ	Fitness Score: 3	Time Taken (µs): 0
Guess: .k!ehsL WvDfAHxhmZ	Fitness Score: 4	Time Taken (µs): 0
Guess: .e!ehsL WvDfAHxhmZ	Fitness Score: 5	Time Taken (µs): 997
Guess: .e!etsL WvDfAHxhmZ	Fitness Score: 6	Time Taken (µs): 997
Guess: .e!etsL WvDfAHthmZ	Fitness Score: 7	Time Taken (µs): 1994
Guess: .enetsL WvDfAHthmZ	Fitness Score: 8	Time Taken (µs): 1994
Guess: .enetiL WvDfAHthmZ	Fitness Score: 9	Time Taken (µs): 2992
Guess: .enetiL AvDfAHthmZ	Fitness Score: 10	Time Taken (µs): 3991
Guess: .enetiL AvgfAHthmZ	Fitness Score: 11	Time Taken (µs): 6019
Guess: .enetiL AlgfAHthmZ	Fitness Score: 12	Time Taken (µs): 6019
Guess: .enetic AlgfAHthmZ	Fitness Score: 13	Time Taken (µs): 7016
Guess: .enetic AlgfrHthmZ	Fitness Score: 14	Time Taken (µs): 7016
Guess: .enetic AlgorHthmZ	Fitness Sco