In [2]:
import sys
sys.path.append('..')

In [3]:
import random
from copy import deepcopy
from library.solution import Solution

## Genetic Algorithms

Genetic Algorithms (GAs) are a class of optimization algorithms inspired by **natural selection** and **evolutionary principles**. They are used to find near-optimal solutions to complex problems, especially when traditional methods struggle due to high-dimensional or non-differentiable search spaces.

GAs operate by evolving a population of candidate solutions over multiple iterations (called generations), using biologically inspired operations:
- **Selection**: Choosing individuals to reproduce, favoring those with better fitness according to a fitness function.
- **Crossover (Recombination)**: Combining two parent solutions to create new offspring.
- **Mutation**: Introducing small random changes to maintain diversity.
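
To make crossover and mutation concrete, here is a minimal sketch on bit-string solutions. One-point crossover and bit-flip mutation are just common illustrative choices, and the function names `one_point_crossover` and `bit_flip_mutation` are hypothetical; they are not necessarily the operators used in the library.

```python
import random

def one_point_crossover(parent1, parent2):
    """Swap the tails of two equal-length lists at a random cut point."""
    cut = random.randint(1, len(parent1) - 1)
    offspring1 = parent1[:cut] + parent2[cut:]
    offspring2 = parent2[:cut] + parent1[cut:]
    return offspring1, offspring2

def bit_flip_mutation(genes, mut_prob):
    """Flip each bit independently with probability mut_prob."""
    return [1 - g if random.random() < mut_prob else g for g in genes]

p1 = [0, 0, 0, 0, 0, 0]
p2 = [1, 1, 1, 1, 1, 1]
c1, c2 = one_point_crossover(p1, p2)
print(c1, c2)                              # tails swapped at a random point
print(bit_flip_mutation(c1, mut_prob=0.2)) # a few bits may be flipped
```

Both functions return new lists and leave their inputs untouched.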

### Pseudo-code

1. Initialize a population P of **N** individuals/solutions (usually at random)
2. Repeat until termination condition (**max number of generations**):
   1. Create an empty population P'
   2. If using elitism, insert the best individual from P into P'
   3. Repeat until P' contains N individuals:
      1. Choose 2 individuals from population P using a **selection algorithm**
      2. Choose an operator between crossover and replication with probabilities **$P_c$** and $1-P_c$, respectively
      3. Apply the operator to the individuals to generate the offspring
      4. Apply mutation to the offspring. The mutation operator has a hyperparameter **$P_m$** (we'll see what this means for different mutation operators later)
      5. Insert the mutated individuals into P'
   4. Replace P with P'
3. Return the best individual in P


### Algorithm Implementation

Let's implement the genetic algorithm function. These are the arguments this function will receive:
- `initial_population`: List of individuals (randomly generated solutions)
- `max_gen`: Maximum number of generations
- `selection_algorithm`: A function that receives a population, selects one individual based on fitness and returns it
- `maximization`: Boolean that indicates if we're solving a maximization or minimization problem
- `xo_prob`: Probability of crossover (usually high)
- `mut_prob`: Probability of mutation (usually low)
- `elitism`: A boolean that indicates if elitism should be used or not

In [None]:
# TODO: Implement Genetic Algorithm function

**NOTE:** There are many variations of genetic algorithms, and the implementation used in our practical classes and in the library makes some specific choices. For example, before inserting the second mutated individual into P', we check whether doing so would exceed the population size. Because offspring are inserted two at a time, this can happen with an even-sized population when elitism is used (one slot is already taken by the elite) or with an odd-sized population when it is not. An alternative approach would be to insert the individual regardless and, if the population exceeds the limit, remove the worst-performing individual at the end.

Our implementation of the algorithm also relies on the following assumptions:
- individuals have `fitness`, `crossover` and `mutation` methods
- `crossover` always returns two offspring
- both `crossover` and `mutation` methods return new individuals instead of modifying individuals in-place
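
The pseudo-code and the assumptions above can be put together as in the sketch below. This is one possible implementation, not necessarily the one in the library. The method signatures `crossover(other)` and `mutation(mut_prob)`, the default argument values, the helper `get_best_ind`, the toy `BitString` individual and the stand-in `binary_tournament` selection are all assumptions made so the example runs on its own.

```python
import random
from copy import deepcopy

def get_best_ind(population, maximization):
    """Hypothetical helper: return the fittest individual."""
    pick = max if maximization else min
    return pick(population, key=lambda ind: ind.fitness())

def genetic_algorithm(initial_population, max_gen, selection_algorithm,
                      maximization=False, xo_prob=0.9, mut_prob=0.2,
                      elitism=True):
    population = initial_population
    for _ in range(max_gen):
        new_population = []
        if elitism:
            # Carry over an unmodified copy of the best individual
            new_population.append(deepcopy(get_best_ind(population, maximization)))
        while len(new_population) < len(population):
            first = selection_algorithm(population, maximization)
            second = selection_algorithm(population, maximization)
            if random.random() < xo_prob:
                off1, off2 = first.crossover(second)            # crossover
            else:
                off1, off2 = deepcopy(first), deepcopy(second)  # replication
            new_population.append(off1.mutation(mut_prob))
            # Only insert the second offspring if there is still room
            if len(new_population) < len(population):
                new_population.append(off2.mutation(mut_prob))
        population = new_population
    return get_best_ind(population, maximization)

class BitString:
    """Toy individual for illustration only: fitness is the number of ones."""
    def __init__(self, genes):
        self.genes = genes
    def fitness(self):
        return sum(self.genes)
    def crossover(self, other):
        cut = random.randint(1, len(self.genes) - 1)
        return (BitString(self.genes[:cut] + other.genes[cut:]),
                BitString(other.genes[:cut] + self.genes[cut:]))
    def mutation(self, mut_prob):
        return BitString([1 - g if random.random() < mut_prob else g
                          for g in self.genes])

def binary_tournament(population, maximization):
    """Stand-in selection: draw two individuals (with replacement), keep the better."""
    a, b = random.choices(population, k=2)
    if maximization:
        return a if a.fitness() >= b.fitness() else b
    return a if a.fitness() <= b.fitness() else b

random.seed(0)
pop = [BitString([random.randint(0, 1) for _ in range(10)]) for _ in range(7)]
best = genetic_algorithm(pop, max_gen=20, selection_algorithm=binary_tournament,
                         maximization=True)
print(best.genes, best.fitness())
```

With elitism enabled, the best fitness in the population can never get worse from one generation to the next, because the elite is copied into P' without being mutated.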

### Selection algorithms

Selection is the first main step of a genetic algorithm. Selection algorithms have the following properties:
- are probabilistic
- for any pair of individuals A and B, if A is better than B, then the probability of selecting A must be greater than the probability of selecting B
- all individuals must have the chance of being selected, even the worst in the population
- when an individual is selected, it remains in population P and a copy is inserted in P'

In class we'll implement **Fitness Proportionate Selection** (or roulette wheel), but there are other techniques like Ranking or Tournament selection.

#### Fitness Proportionate Selection

Fitness proportionate selection is a probabilistic method used in GAs to choose individuals for reproduction. It mimics a roulette wheel, where individuals with higher fitness have a greater chance of being selected, but lower-fitness individuals still have some probability of selection.

Let $N$ be the number of individuals in population $P$ and $F = \{f_1, f_2, \ldots, f_N\}$ be the set of fitness values of the individuals in the population. For an individual $i$ in the population, the probability of selecting $i$ is:

$$P(selecting\ i) = \frac{f_i}{\sum_{j=1}^{N} f_j}$$

![Fitness Proportionate Selection Implementation](images/fps.png)
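
As a small worked example of the formula, take a population of four individuals with fitness values 1, 2, 3 and 4 (made-up numbers, assuming a maximization problem with positive fitness). Each individual's share of the wheel is its fitness divided by the total:

```python
fitnesses = [1, 2, 3, 4]   # made-up fitness values
total = sum(fitnesses)     # 10

# P(selecting i) = f_i / sum of all f_j
probabilities = [f / total for f in fitnesses]
print(probabilities)  # [0.1, 0.2, 0.3, 0.4]
```

The best individual is four times as likely to be picked as the worst one, but the worst one still has a 10% chance.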

Our implementation of this selection algorithm will be a function that receives two arguments:
- `population`: A list of individuals / solutions. These must have a `fitness()` method.
- `maximization`: Boolean that indicates if we're solving a maximization or minimization problem

In [None]:
# TODO: Implement Fitness Proportionate Selection function
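
One possible way to fill in this function is sketched below. For maximization it follows the formula above directly. For minimization it uses the inverse of the fitness as the weight, which is an assumption of this sketch (it requires strictly positive fitness values) and only one of several ways to handle that case. The `FixedFitness` class is a hypothetical stand-in for real solutions.

```python
import random

def fitness_proportionate_selection(population, maximization):
    if maximization:
        weights = [ind.fitness() for ind in population]
    else:
        # Assumption: invert fitness so lower values get a larger slice
        # (only valid when every fitness is strictly positive)
        weights = [1 / ind.fitness() for ind in population]

    total = sum(weights)
    spin = random.random() * total  # where the roulette wheel stops

    # Walk through the slices until we pass the stopping point
    cumulative = 0
    for ind, weight in zip(population, weights):
        cumulative += weight
        if spin < cumulative:
            return ind
    return population[-1]  # guard against floating-point round-off

class FixedFitness:
    """Hypothetical stand-in individual with a constant fitness."""
    def __init__(self, value):
        self.value = value
    def fitness(self):
        return self.value

random.seed(42)
population = [FixedFitness(v) for v in (1, 2, 3, 4)]
counts = {1: 0, 2: 0, 3: 0, 4: 0}
for _ in range(10_000):
    chosen = fitness_proportionate_selection(population, maximization=True)
    counts[chosen.fitness()] += 1
print(counts)  # counts should be roughly proportional to the fitness values
```

The function returns the selected individual itself rather than a copy, so the caller decides whether and when to copy it.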