# Value Maximisation with Genetic Algorithms 🧬

---

## Summary

- [1. Problem Introduction](#problem-introduction)
- [2. Genetic Algorithms Presentation](#genetic-algorithms-presentation)
- [3. Problem Implementation](#problem-implementation)
- [4. Resolution and Fitness Comparison](#resolution-and-fitness-comparison)
- [5. Is Genetic Algorithms Good for This Problem?](#is-genetic-algorithms-good-for-this-problem)
- [6. Conclusion](#conclusion)

---
<a name="problem-introduction"></a>
## 1. Problem Introduction

### Biscuit manufacturing factory problem 🍪
A biscuit manufacturing factory is gearing up for the festive season by producing a variety of biscuits. The challenge? Maximising biscuit production and profit from a single roll of dough. Here’s the breakdown:

### Key Information: 
- **Dough Roll Properties**:
  - A predefined rectangular length, denoted as **‘LENGTH’** (one-dimensional problem).
  - Contains **defects** at specific positions (**x**) and of various types (**a**, **b**, **c**, etc.).

- **Biscuits**:
  - Can be produced in infinite quantities.
  - Have specific **sizes**, **values**, and **defect thresholds** (maximum allowable defects of each class).

### Constraints: 
1. **Placement Rules**:
   - Biscuits must be placed at **integer positions**.
   - **No overlapping**: Positions occupied by one biscuit cannot be used by another.
2. **Defect Tolerance**:
   - A biscuit’s defect limits must not be exceeded for the positions it covers.
3. **Length Limitation**:
   - The total size of the biscuits cannot exceed the dough roll’s length.

### Solution Value: 
- The value of a solution = **sum of biscuit values** – **penalty for unused dough** (-1 per empty position).

### Project Benchmark:
- **Roll Length**: 500 units.
- **Defects**: 3 classes (a, b, c), details in `defects.csv`.
- **Biscuits**:
  - **Biscuit 0**: Length 4, Value 3, Defects {a: 4, b: 2, c: 3}.
  - **Biscuit 1**: Length 8, Value 12, Defects {a: 5, b: 4, c: 4}.
  - **Biscuit 2**: Length 2, Value 1, Defects {a: 1, b: 2, c: 1}.
  - **Biscuit 3**: Length 5, Value 8, Defects {a: 2, b: 3, c: 2}.

---
<a name="genetic-algorithms-presentation"></a>
## 2. Genetic Algorithms Presentation 🧬🔄

### What are Genetic Algorithms (GAs)?
Genetic Algorithms (GAs) are optimisation techniques inspired by **natural selection**. They simulate the process of evolution to find solutions to complex problems. GAs are particularly effective for large search spaces and problems with multiple constraints.

### Key Steps in GAs:
1. **Initialisation** 🔄:
   - Start with a randomly generated population of potential solutions.

2. **Evaluation** 💯:
   - Calculate the fitness of each solution (e.g., total value of biscuits – penalties).

3. **Selection** 🥇:
   - Choose the best solutions based on fitness to form the next generation.

4. **Crossover** 🔗:
   - Combine parts of two solutions (parents) to produce new solutions (children).

5. **Mutation** 🍀:
   - Introduce small changes to solutions to explore new possibilities.

6. **Termination** ⏳:
   - Stop when a satisfactory solution is found or after a predefined number of generations.

---
<a name="problem-implementation"></a>
## 3. Problem Implementation ⚙️


In [1]:
import numpy as np
import random
import pandas as pd

In [2]:
np.random.seed(2)
random.seed(2)

In [3]:
def init_population(pop_size, roll_size):
    
    return [np.random.randint(-1, 4, size=(roll_size,)) for _ in range(pop_size)]

In [4]:
def respect_defects(threshold, start, end, roll_defects):
    pos_defects = {"a": 0, "b": 0, "c": 0}
    for pos in range(start, end):
        for key in roll_defects[pos]:
            pos_defects[key] += (roll_defects[pos][key])
    return all(threshold[key] >= pos_defects[key] for key in pos_defects.keys())

In [5]:
def fitness(ind, biscuits_list, roll_defects):
    
    score = 0
    size = 0
    last_elem = -2  # So we don't have to check if we test the first value of the list each iteration doing ind[i-1]
    for i, elem in enumerate(ind):
        if elem == -1:  # If no biscuit value no need to test anymore
            score -= (1 + size)
            size = 0
        else:
            if elem != last_elem:
                score -= size
                size = 1
            else:
                size += 1
            if size == biscuits_list[elem]["size"]:  # Test the defects only if the biscuit reach its required size
                if respect_defects(biscuits_list[elem]["threshold"], i - size + 1, i + 1, roll_defects):
                    score += biscuits_list[elem]["value"]
                    size = 0
                else:
                    score -= 1
                    size -= 1
        last_elem = elem
    score -= size  # Don't forget to remove last biscuit pieces that are not whole
    return score

In [6]:
def get_slice_score(position, size, lb_size, lb_value, lb_threshold, roll_defects):  # lb stands for last biscuit
    score = 0
    unassigned = [i for i in range(position - size, position)]  # Ensure we give a value to every biscuit of the last biscuit type 
    last_size = min(lb_size, size)
    while unassigned and last_size != 0:  # While we don't have assigned a value for every last position
        j = 0
        assigned = []
        while j <= len(unassigned) - last_size:
            pos = unassigned[j]
            start = pos  # Start of the continuation of the previous value
            end = pos + last_size
            if all(elem in unassigned for elem in range(start, end)):
                if respect_defects(lb_threshold, start, end, roll_defects):
                    score += (last_size/lb_size)**2 * lb_value  # the closer the biscuit is to its full size, the more importance is given to it
                    for rem in range(start, end):
                        assigned.append(rem)
                    j += last_size - 1  # Put -1 because there is a j+=1 at the end of the loop anw 
            j+=1
        for assi in  assigned:
            unassigned.remove(assi) 
        last_size -= 1
    score -= len(unassigned)  # -1 For all element that do not respect defects even alone
    return score

In [7]:
# We can try to give more importants to biscuits that aren't full but almost
def fitness_2(ind, biscuits_list, roll_defects):
    
    score = 0
    size = 0
    last_elem = ind[0]  # So we don't have to check if we test the first value of the list each iteration doing ind[i-1]
    for i, elem in enumerate(ind):
        if elem == last_elem:
            size += 1
        else:
            if last_elem == -1:
                score -= size
            else:
                slice_score = get_slice_score(i, size, biscuits_list[last_elem]["size"], biscuits_list[last_elem]["value"], biscuits_list[last_elem]["threshold"], roll_defects)
                score += slice_score
            size = 1
        last_elem = elem
    score += get_slice_score(len(ind), size, biscuits_list[last_elem]["size"], biscuits_list[last_elem]["value"], biscuits_list[last_elem]["threshold"], roll_defects)
                
    return score

In [8]:
def evolve_pop(population, mutation_rate, elite_ratio, biscuits_list, roll_defects, fitness):
    fitness_values = [fitness(ind, biscuits_list, roll_defects) for ind in population]

    min_fitness = min(fitness_values)
    shifted_fitness = [f - min_fitness + 1 for f in fitness_values]  # Add 1 to avoid zero fitness

    fitness_sum = sum(shifted_fitness)
    probabilities = [f / fitness_sum for f in shifted_fitness]
    
    elite_idx = np.argsort(fitness_values)[int(-len(population) * elite_ratio):]
    
    elites = [population[i] for i in elite_idx]

    new_population = []
    while len(new_population) < len(population) - len(elites):
        
        # Cross over
        # Select two parents based on fitness probabilities
        parents_indices = np.random.choice(len(population), size=2, replace=False, p=probabilities)
        parent1 = population[parents_indices[0]]
        parent2 = population[parents_indices[1]]

        child = np.zeros(parent1.shape)
        break_points = np.random.choice(len(parent1), size=3, replace=False)
        break_points = np.insert(break_points, 0, [0, len(parent1)])
        break_points.sort()
        #slices = [(break_points[i], break_points[i+1]) for i in range(len(break_points)-1)]
        for i in range(len(break_points)-1):
            start = break_points[i]
            end = break_points[i+1]
            chosen_p = parent1 if random.random() < 0.5 else parent2
            child[start:end] = chosen_p[start:end]

        # Mutation
        muted_child = np.array([gene if random.random() > mutation_rate else random.randint(-1, 3) for gene in child] )

        new_population.append(muted_child)
        
    return new_population + elites

In [9]:
def genetic_algorithm(pop_size, mutation_rate, elite_ratio, biscuits_list, roll_defects, roll_size, max_iter, display, fitness, population=None, rtype="elite"):
    if population is None:
        population = init_population(pop_size, roll_size)
    
    for i in range(max_iter):
        population = evolve_pop(population, mutation_rate, elite_ratio, biscuits_list, roll_defects, fitness)
        
        if display and (i+1) % display == 0:
            # Metric computation, remove 4 lines below to go faster
            fitness_values = [fitness(ind, biscuits_list, roll_defects) for ind in population]
            elite_idx = np.argsort(fitness_values)[-1]
            print(f'Generation {i+1}: Best fitness {fitness_values[elite_idx]}')

    if rtype != "population":  # if rtype = "population" we want to return the entire population for further improves
        fitness_values = [fitness(ind, biscuits_list, roll_defects) for ind in population]
        elite_idx = np.argsort(fitness_values)[-1]
        population = [population[elite_idx]]  # returns only the best element
    
    return population
    

---
<a name="resolution-and-fitness-comparison"></a>
## 4. Resolution and Fitness Comparison 📊

In [10]:
df = pd.read_csv("defects.csv")
print(df.shape)
df = df.sort_values(by="x")
df.head(1)

(500, 2)


Unnamed: 0,x,class
479,0.700561,a


In [11]:
# dict format -> id : (value, size, defects_threshold)

biscuits_list = {
    -1: ({"value": -1, "size": 1, "threshold": {"a":9, "b":9, "c":9}}),
     0: ({"value":  3, "size": 4, "threshold": {"a":4, "b":2, "c":3}}),
     1: ({"value": 12, "size": 8, "threshold": {"a":5, "b":4, "c":4}}),
     2: ({"value":  1, "size": 2, "threshold": {"a":1, "b":2, "c":1}}),
     3: ({"value":  8, "size": 5, "threshold": {"a":4, "b":2, "c":3}}),
}

In [12]:
roll_size = 500
roll_defects = {i: {"a": 0, "b": 0, "c": 0} for i in range(roll_size)}
for _, row in df.iterrows():
    if int(row["x"]) >= roll_size:
        break
    roll_defects[int(row["x"])][row["class"]] += 1

In [13]:
pop_size = 1000
#mutation_rate = 0.01
elite_ratio = 0.1
max_iter = 1000
display = max_iter // 10

In [16]:
for mutation_rate in [0.01, 0.02, 0.03, 0.04, 0.1]:
    result = genetic_algorithm(pop_size, mutation_rate, elite_ratio, biscuits_list, roll_defects, roll_size, max_iter, display, fitness_2)
    print(f"m_rate: {mutation_rate} best fit score 1: {fitness(result, biscuits_list, roll_defects)} | best fit score 2: {fitness_2(result, biscuits_list, roll_defects)}")

m_rate: 0.01 best fit score 1: 420 | best fit score 2: 645.0124999999999
m_rate: 0.02 best fit score 1: 281 | best fit score 2: 560.6774999999999
m_rate: 0.03 best fit score 1: 64 | best fit score 2: 463.4299999999999
m_rate: 0.04 best fit score 1: -14 | best fit score 2: 419.1599999999997
m_rate: 0.1 best fit score 1: -301 | best fit score 2: 258.6799999999998


In [13]:
pop_size = 1000
mutation_rate = 0.01
elite_ratio = 0.1
max_iter = 1000
display = max_iter // 10
population = genetic_algorithm(pop_size, mutation_rate, elite_ratio, biscuits_list, roll_defects, roll_size, max_iter, display, fitness_2, rtype="population")

Generation 100: Best fitness 382.1949999999999
Generation 200: Best fitness 496.0299999999999
Generation 300: Best fitness 545.9199999999998
Generation 400: Best fitness 583.5975
Generation 500: Best fitness 610.8075
Generation 600: Best fitness 617.42
Generation 700: Best fitness 626.23
Generation 800: Best fitness 634.1075
Generation 900: Best fitness 640.4324999999999
Generation 1000: Best fitness 645.0124999999999


In [14]:
pop_size = len(population)
mutation_rate = 0.01
elite_ratio = 0.1
max_iter = 1000
display = max_iter // 10
elite = genetic_algorithm(pop_size, mutation_rate, elite_ratio, biscuits_list, roll_defects, roll_size, max_iter, display, fitness, population=population)

Generation 100: Best fitness 449
Generation 200: Best fitness 459
Generation 300: Best fitness 462
Generation 400: Best fitness 472
Generation 500: Best fitness 478
Generation 600: Best fitness 481
Generation 700: Best fitness 487
Generation 800: Best fitness 490
Generation 900: Best fitness 496
Generation 1000: Best fitness 499


In [15]:
elite

[array([ 1.,  3.,  0.,  0.,  0.,  0.,  2.,  2.,  1., -1.,  3.,  3.,  3.,
         3.,  3.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  3.,  3.,
         3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  2.,  2.,  3.,  3.,  3.,
        -1., -1.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,
         3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  0.,  0.,  0.,  0.,
         3.,  3.,  3.,  3.,  3.,  2.,  2.,  0.,  3.,  3.,  3.,  3.,  3.,
         1.,  1., -1.,  1.,  3.,  3.,  3.,  3.,  3.,  3.,  2.,  2.,  3.,
         3.,  3.,  3.,  3.,  2.,  2.,  1.,  3.,  3.,  3.,  3.,  3.,  3.,
         3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  2.,  3.,
         3.,  0.,  2.,  2.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,
         3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  3.,  1.,  1.,
         3.,  3.,  3.,  3.,  3.,  3.,  3.,  0.,  3.,  3.,  3.,  3.,  3.,
         0.,  1.,  2.,  0.,  0.,  0.,  0.,  1.,  1.,  3.,  1.,  1.,  1.,
         1.,  1.,  1.,  1.,  1.,  1.,  3.,  3.,  3.

In [None]:
# Example fitness_2
fitness_2([-1, 3, 3, 3, 3, 2, 0, 0, 0, -1], biscuits_list, roll_defects)

get_slice_score from 1 to 5 with size: 4
try pos : 1 for size: 4
assigned :  1  in a biscuit size:   4
assigned :  2  in a biscuit size:   4
assigned :  3  in a biscuit size:   4
assigned :  4  in a biscuit size:   4
get_slice_score from 5 to 6 with size: 1
try pos : 5 for size: 1
assigned :  5  in a biscuit size:   1
get_slice_score from 6 to 9 with size: 3
try pos : 6 for size: 3
assigned :  6  in a biscuit size:   3
assigned :  7  in a biscuit size:   3
assigned :  8  in a biscuit size:   3
get_slice_score from 9 to 10 with size: 1
try pos : 9 for size: 1
assigned :  9  in a biscuit size:   1


5.057500000000001

---
<a name="is-genetic-algorithms-good-for-this-problem"></a>
## 5. Is Genetic Algorithms Good for This Problem? ❔🕵️‍♂️

*(Content to be added.)*

---
<a name="conclusion"></a>
## 6. Conclusion 🎉

*(Content to be added.)*