Copyright **`(c)`** 2023 Giovanni Squillero `<giovanni.squillero@polito.it>`  
[`https://github.com/squillero/computational-intelligence`](https://github.com/squillero/computational-intelligence)  
Free for personal or classroom use; see [`LICENSE.md`](https://github.com/squillero/computational-intelligence/blob/master/LICENSE.md) for details.


# LAB9

Write a local-search algorithm (e.g., an EA) able to solve the _Problem_ instances 1, 2, 5, and 10 on 1000-loci genomes, using a minimum number of fitness calls. That's all.

### Deadlines:

- Submission: Sunday, December 3 ([CET](https://www.timeanddate.com/time/zones/cet))
- Reviews: Sunday, December 10 ([CET](https://www.timeanddate.com/time/zones/cet))

Notes:

- Reviews will be assigned on Monday, December 4
- You need to commit in order to be selected as a reviewer (i.e., better to commit an empty work than not to commit)


In [1]:
# import random
# from tqdm import tqdm
# import lab9_lib

In [2]:
# fitness = lab9_lib.make_problem(1)
# for n in range(10):
#     ind = random.choices([0, 1], k=10)
#     print(f"{''.join(str(g) for g in ind)}: {fitness(ind):.2%}")

# print(fitness.calls)

The Python code in the lib defines an abstract class `AbstractProblem` and a function `make_problem(a)`.

The `AbstractProblem` class has the following methods and properties:

- `__init__`: Initializes the instance variable `_calls` to 0.
- `x`: An abstract property. Subclasses of `AbstractProblem` are expected to implement this.
- `calls`: A property that returns the number of times the instance has been called.
- `onemax`: A static method that takes a genome (a sequence of genes) and returns the sum of its genes, treating each gene as a boolean value.
- `__call__`: A special method that allows instances of the class to be called like functions. It increments `_calls`, computes fitnesses of the genome by slicing it into segments of length `x` and applying `onemax` to each segment, and then computes a value based on these fitnesses.

The `make_problem(a)` function defines a subclass of `AbstractProblem` with `x` implemented as a property that always returns `a`, and returns an instance of this subclass (the returned object is called directly as `fitness(ind)` and exposes `fitness.calls`). This allows you to create problems with different values of `x` easily.

In summary, this code provides a framework for defining and working with optimization problems where the goal is to maximize the sum of genes in a genome, with some penalty for non-maximal segments. The `x` property determines the segment length for this computation. The `calls` property allows you to track how many times a problem instance has been called. The `make_problem(a)` function makes it easy to create problem instances with different segment lengths.
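
The call-counting pattern described above can be illustrated with a small stand-in. This is only a sketch, not the `lab9_lib` source: the names `CountingProblem` and `make_counting_problem` are hypothetical, and the scoring is a placeholder (plain fraction of ones) that does not reproduce the library's segment-based penalty.

```python
from abc import ABC, abstractmethod


class CountingProblem(ABC):
    """Hypothetical stand-in for AbstractProblem (NOT the lab9_lib code)."""

    def __init__(self):
        self._calls = 0

    @property
    @abstractmethod
    def x(self):
        """Segment length; concrete subclasses must provide it."""

    @property
    def calls(self):
        # Number of times the instance has been called so far
        return self._calls

    @staticmethod
    def onemax(genome):
        # Treat each gene as a boolean and count the true ones
        return sum(bool(g) for g in genome)

    def __call__(self, genome):
        self._calls += 1
        # Placeholder scoring (assumption): fraction of ones, no penalty term
        return self.onemax(genome) / len(genome)


def make_counting_problem(a):
    """Build a subclass whose x always returns a, and return an instance of it."""

    class Problem(CountingProblem):
        @property
        def x(self):
            return a

    return Problem()


problem = make_counting_problem(5)
score = problem([1, 1, 0, 0])
print(score, problem.calls, problem.x)  # 0.5 1 5
```

Because the counter lives on the instance, every evaluation made through the same object is tallied, which is what makes `calls` usable as a cost measure for a search algorithm.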

In [5]:
import random
import lab9_lib
from tqdm import tqdm


def mutate(ind, fitness):
    """Flip one random gene and keep the flip only if fitness strictly improves.

    Returns the (possibly unchanged) part and its fitness. A full GA could be used here.
    """
    f1 = fitness(ind)
    if f1 == 1.0:
        return ind, f1

    mutated = ind.copy()
    i = random.randrange(len(ind))
    mutated[i] = 1 - mutated[i]
    f2 = fitness(mutated)

    if f2 > f1:
        return mutated, f2

    return ind, f1


def split_progenitor(progenitor, genome_length, problem_instance):
    """Split the progenitor into consecutive parts of length problem_instance.

    If genome_length is not divisible by problem_instance, the last part is shorter.
    """
    return [
        progenitor[i : i + problem_instance]
        for i in range(0, genome_length, problem_instance)
    ]


def run(problem_instance, genome_length):
    """Run the algorithm:
    1. create a random progenitor
    2. split the progenitor into parts
    3. mutate each part until its fitness is 1.0
    4. join the parts into one individual
    5. return the number of fitness calls and whether the individual is all ones
    """

    fitness = lab9_lib.make_problem(problem_instance)

    progenitor = random.choices([0, 1], k=genome_length)
    parts = split_progenitor(progenitor, genome_length, problem_instance)

    evolved_parts = []
    with tqdm(total=len(parts)) as pbar:  # the context manager closes the bar on exit
        for part in parts:
            fit = 0
            while fit < 1.0:
                part, fit = mutate(part, fitness)
            evolved_parts.append(part)  # store the evolved list itself, not a generator
            pbar.update(1)

    individual = [gene for part in evolved_parts for gene in part]
    return fitness.calls, sum(individual) == genome_length


# ---------------------------------------------------

GENOME_LENGTH = 1000
instances = [1, 2, 5, 10]
instances = [1, 2, 3, 5, 7, 10, 20, 50, 100, 200, 500]  # overrides the list above: tests performance and non-divisible part lengths

for instance in instances:
    print(f"Problem instance: {instance}")
    print("Calls, isSol: ", run(instance, GENOME_LENGTH))
    print()

Problem instance: 1
part lengths:  1000


100%|██████████| 1000/1000 [00:00<00:00, 130773.67it/s]


Calls, isSol:  (1485, True)

Problem instance: 2
part lengths:  1000


100%|██████████| 500/500 [00:00<00:00, 52993.18it/s]


Calls, isSol:  (1841, True)

Problem instance: 3
part lengths:  1000


100%|██████████| 334/334 [00:00<00:00, 26411.60it/s]


Calls, isSol:  (2381, True)

Problem instance: 5
part lengths:  1000


100%|██████████| 200/200 [00:00<00:00, 10470.05it/s]


Calls, isSol:  (3000, True)

Problem instance: 7
part lengths:  1000


100%|██████████| 143/143 [00:00<00:00, 4689.27it/s]


Calls, isSol:  (3920, True)

Problem instance: 10
part lengths:  1000


100%|██████████| 100/100 [00:00<00:00, 2193.98it/s]


Calls, isSol:  (4644, True)

Problem instance: 20
part lengths:  1000


100%|██████████| 50/50 [00:00<00:00, 478.89it/s]


Calls, isSol:  (5658, True)

Problem instance: 50
part lengths:  1000


100%|██████████| 20/20 [00:00<00:00, 67.31it/s]


Calls, isSol:  (7862, True)

Problem instance: 100
part lengths:  1000


100%|██████████| 10/10 [00:00<00:00, 14.91it/s]


Calls, isSol:  (9264, True)

Problem instance: 200
part lengths:  1000


100%|██████████| 5/5 [00:01<00:00,  3.35it/s]


Calls, isSol:  (10596, True)

Problem instance: 500
part lengths:  1000


100%|██████████| 2/2 [00:05<00:00,  2.65s/it]

Calls, isSol:  (15362, True)
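
The progress-bar totals above (1000, 500, 334, 200, 143, ...) are the number of parts produced by the split, that is `ceil(genome_length / part_length)`. A minimal sketch to check this, where `split` is a hypothetical helper that mirrors `split_progenitor` and the genome content is irrelevant:

```python
import math


def split(genome, part_length):
    """Hypothetical helper: consecutive chunks, the last one may be shorter."""
    return [genome[i : i + part_length] for i in range(0, len(genome), part_length)]


genome = [0] * 1000
for n in [1, 2, 3, 5, 7, 10]:
    parts = split(genome, n)
    # The number of parts is the ceiling of 1000 / n
    assert len(parts) == math.ceil(len(genome) / n)
    print(n, len(parts), len(parts[-1]))
# 1 1000 1
# 2 500 2
# 3 334 1
# 5 200 5
# 7 143 6
# 10 100 10
```

For part lengths that do not divide 1000 (3 and 7 here), the last part is shorter than the others.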

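
In the run above, every `mutate` call after the first one for a given part re-evaluates a parent whose fitness is already known, and each of those evaluations is counted. One possible variation, which is not the approach used for the results above, carries the parent's fitness from one iteration to the next. The sketch below replaces `lab9_lib` with a hypothetical `CountingOneMax` stand-in (plain fraction of ones) so that it runs on its own; whether the saving carries over unchanged to the real problem depends on its scoring.

```python
import random


class CountingOneMax:
    """Hypothetical stand-in for a lab9_lib problem: fraction of ones, with a call counter."""

    def __init__(self):
        self.calls = 0

    def __call__(self, genome):
        self.calls += 1
        return sum(genome) / len(genome)


def hill_climb(part, fitness, rng):
    """Flip one random gene per step; keep the flip only if fitness strictly improves."""
    best_fit = fitness(part)  # evaluated once, then carried along
    while best_fit < 1.0:
        candidate = part.copy()
        i = rng.randrange(len(part))
        candidate[i] = 1 - candidate[i]
        cand_fit = fitness(candidate)  # the only evaluation in each iteration
        if cand_fit > best_fit:
            part, best_fit = candidate, cand_fit
    return part, best_fit


rng = random.Random(42)  # fixed seed so the run is repeatable
fitness = CountingOneMax()
start = [rng.randint(0, 1) for _ in range(10)]
solution, fit = hill_climb(start, fitness, rng)
print(solution, fit, fitness.calls)
```

With this structure the cost per part is one initial evaluation plus one evaluation per attempted flip.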



