## Halloween Challenge
Find the best solution with the fewest calls to the fitness function for:
* num_points = [100, 1_000, 5_000]
* num_sets = num_points
* density = [.3, .7]
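Since the challenge is scored on fitness calls, one way to measure them independently of each algorithm's own bookkeeping is to wrap the fitness function in a counter. The following is a minimal sketch; `CountingFitness` and `toy_fitness` are hypothetical names introduced here for illustration and are not part of the notebook.

```python
from typing import Callable, Tuple


class CountingFitness:
    """Wrap a fitness function and count how many times it is called."""

    def __init__(self, fn: Callable[..., Tuple[int, int]]):
        self.fn = fn
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.fn(*args, **kwargs)


def toy_fitness(state):
    # Hypothetical stand-in for a real fitness: (items taken, negated items taken)
    return sum(state), -sum(state)


counted = CountingFitness(toy_fitness)
for candidate in ([True, False], [True, True], [False, False]):
    counted(candidate)
print(counted.calls)  # 3
```

A wrapper like this could be passed wherever a fitness function is expected, so the call count does not depend on per-algorithm arithmetic.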

### Local Search - Set Covering
To tackle this challenge, I implemented the following **Single-State Methods**:
* Random-Mutation Hill Climbing
* Steepest Ascent Hill Climbing
* Steepest Ascent with Replacement Hill Climbing
* K population Random-Mutation Hill Climbing
* K population Steepest Ascent Hill Climbing
* Simulated Annealing

In [1]:
from random import random, randint, seed
from functools import reduce
import numpy as np
from copy import copy
from itertools import product
from scipy import sparse

In [2]:
def make_set_covering_problem(num_points, num_sets, density):
    """Returns a dense boolean array where rows are sets and columns are the covered items"""
    seed(num_points * 2654435761 + num_sets + density)
    sets = sparse.lil_array((num_sets, num_points), dtype=bool)
    for s, p in product(range(num_sets), range(num_points)):
        if random() < density:
            sets[s, p] = True
    for p in range(num_points):
        sets[randint(0, num_sets - 1), p] = True
    return sets.toarray()
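The element-by-element loop above can be slow for the larger sizes. As a rough sketch only, the same kind of instance could be drawn in bulk with NumPy. This assumes a different random stream, so it does not reproduce the instances generated by the seeded function above; `make_problem_vectorized` and `rng_seed` are hypothetical names.

```python
import numpy as np


def make_problem_vectorized(num_points, num_sets, density, rng_seed=0):
    """Hypothetical NumPy variant: rows are sets, columns are points."""
    rng = np.random.default_rng(rng_seed)
    # Each (set, point) cell is True with probability `density`
    sets = rng.random((num_sets, num_points)) < density
    # Guarantee solvability: every point is put into at least one random set
    rows = rng.integers(0, num_sets, size=num_points)
    sets[rows, np.arange(num_points)] = True
    return sets


sets = make_problem_vectorized(100, 100, 0.3)
print(sets.shape, sets.any(axis=0).all())
```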

In [3]:
PROBLEM_SIZE = [100, 1_000, 5_000]
DENSITY = [0.3, 0.7]

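I defined the fitness function to return a tuple: the number of covered points and the negated number of sets used. Python compares tuples lexicographically, so coverage is maximized first and the number of sets is minimized second.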

In [4]:
def fitness(state, sets):
    num_true = np.sum(
        reduce(
            np.logical_or,
            [sets[i] for i, t in enumerate(state) if t],
            False,
        )
    )
    return num_true, -sum(state)
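To illustrate how the tuple ordering ranks candidate states, here is a small worked example on a made-up instance with 3 sets over 4 points. The `fitness` definition is repeated (in condensed form) only so the snippet runs on its own; the toy `sets` array is an assumption for illustration.

```python
from functools import reduce
import numpy as np


def fitness(state, sets):
    # Same logic as the notebook's fitness, repeated so the example is standalone
    covered = reduce(np.logical_or, [sets[i] for i, t in enumerate(state) if t], False)
    return np.sum(covered), -sum(state)


# Toy instance (made up for illustration): 3 sets over 4 points
sets = np.array([
    [True, True, False, False],
    [False, False, True, True],
    [True, False, True, False],
])

full_two = fitness([True, True, False], sets)   # all 4 points covered with 2 sets
full_three = fitness([True, True, True], sets)  # all 4 points covered with 3 sets
partial = fitness([True, False, False], sets)   # only 2 points covered with 1 set

# Coverage is compared first; the number of sets only breaks ties
print(full_two > full_three > partial)  # True
```

A full cover always beats a partial one, regardless of how few sets the partial cover uses.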

In [5]:
def generate_initial_state(prob, problem_size):
    return [random() < prob for _ in range(problem_size)]

The following cell checks that every problem configuration is solvable and stores the sets, the initial state, and the configuration in the list `problems`.

In [6]:
problems = []
for p in PROBLEM_SIZE:
    for d in DENSITY:
        initial_state = generate_initial_state(0.3, p)
        sets = make_set_covering_problem(p, p, d)
        assert fitness([True for _ in range(p)], sets)[0] == p, "Problem not solvable"
        problems.append((sets, initial_state, (p, d)))

I defined a tweak function that randomly flips one set from taken to untaken, or from untaken to taken.

In [8]:
def tweak(state, problem_size):
    new_state = copy(state)
    index = randint(0, problem_size - 1)
    new_state[index] = not state[index]
    return new_state
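As a quick sanity check, the snippet below shows that a tweaked state differs from the original in exactly one position and that the original list is left untouched. `tweak` is repeated so the snippet is standalone; the seed value is arbitrary.

```python
from copy import copy
from random import randint, seed


def tweak(state, problem_size):
    # Same as the notebook's tweak: flip one randomly chosen position
    new_state = copy(state)
    index = randint(0, problem_size - 1)
    new_state[index] = not state[index]
    return new_state


seed(42)  # arbitrary seed, only to make the demo repeatable
state = [True, False, False, True, False]
neighbour = tweak(state, len(state))
changed = [i for i, (a, b) in enumerate(zip(state, neighbour)) if a != b]
print(len(changed))  # 1: exactly one position differs
```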

### Random Mutation Hill Climbing

In [18]:
def RMHC(sets, problem_size, state, max_iteration):
    """
    Random Mutation Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of calls to the fitness function;

    """
    count = 0  # iterations counted since the last improvement
    limit = problem_size // 2  # patience: stop once count reaches this value
    goodness = fitness(state, sets)
    evaluations = max_iteration + 1  # one initial call plus one per iteration
    for i in range(max_iteration):
        if count == limit:
            # Early stop: i iterations completed, plus the initial call
            evaluations = i + 1
            break
        new_state = tweak(state, problem_size)
        goodness_new_state = fitness(new_state, sets)
        # Move only if the neighbour is strictly better
        if goodness_new_state > goodness:
            state = new_state
            goodness = goodness_new_state
            count = 0
        count += 1
    return state, goodness, evaluations

In [17]:
max_iteration = 100_000
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = RMHC(sets, p, initial_state, max_iteration)
    print(
        f"Random Mutation Hill Climbing:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Random Mutation Hill Climbing:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 140
Goodness: (100, -10)

Random Mutation Hill Climbing:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 213
Goodness: (100, -4)

Random Mutation Hill Climbing:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 3721
Goodness: (1000, -16)

Random Mutation Hill Climbing:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 5299
Goodness: (1000, -6)

Random Mutation Hill Climbing:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 28354
Goodness: (5000, -18)

Random Mutation Hill Climbing:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 30497
Goodness: (5000, -7)



### Steepest-step Hill Climbing

I defined a new tweak function that flips the value at a specific index.

In [42]:
def tweak_index(state, index):
    new_state = copy(state)
    new_state[index] = not state[index]
    return new_state

In [41]:
def SAHC(sets, problem_size, state, max_iteration):
    """
    Steepest-step Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    goodness = fitness(state, sets)
    # Reported count: problem_size calls per improving sweep; the initial call
    # above and the final non-improving sweep are not included
    evaluations = max_iteration * problem_size
    for i in range(max_iteration):
        # Evaluate every single-flip neighbour and keep the best one
        new_state = tweak_index(state, 0)
        new_state_goodness = fitness(new_state, sets)
        for idx in range(1, problem_size):
            tmp = tweak_index(state, idx)
            tmp_goodness = fitness(tmp, sets)
            if tmp_goodness > new_state_goodness:
                new_state = tmp
                new_state_goodness = tmp_goodness
        if new_state_goodness > goodness:
            state = new_state
            goodness = new_state_goodness
        else:
            # No neighbour improves on the current state: local optimum reached
            evaluations = i * problem_size
            break
    return state, goodness, evaluations
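As a worked check on the counting, the sketch below computes how many fitness calls a run of `SAHC` makes when its loop breaks at a given iteration index. It assumes the loop structure shown in the function above: one call for the initial state, then one full sweep of `problem_size` neighbours per iteration, including the last, non-improving one. `sahc_fitness_calls` is a hypothetical helper, not part of the notebook.

```python
def sahc_fitness_calls(break_iteration, problem_size):
    """Fitness calls made by SAHC when its loop breaks at index `break_iteration`."""
    sweeps = break_iteration + 1      # iterations 0..break_iteration each scan all neighbours
    return 1 + sweeps * problem_size  # +1 for the evaluation of the initial state


# Example: a run that reports i * problem_size = 2,100 with problem_size = 100
problem_size = 100
reported = 2_100
break_iteration = reported // problem_size  # 21
print(sahc_fitness_calls(break_iteration, problem_size))  # 2201
```

Under these assumptions, the actual number of calls exceeds the reported value by `problem_size + 1`.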

In [43]:
max_iteration = 100_000
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = SAHC(sets, p, initial_state, max_iteration)
    print(
        f"Steepest-step Hill Climbing:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Steepest-step Hill Climbing:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 2,100
Goodness: (100, -10)

Steepest-step Hill Climbing:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 1,800
Goodness: (100, -4)

Steepest-step Hill Climbing:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 312,000
Goodness: (1000, -15)

Steepest-step Hill Climbing:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 296,000
Goodness: (1000, -5)



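The run was interrupted (`KeyboardInterrupt`) before the two 5,000-point problems finished, so no results are reported for them.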

### Steepest Ascent w/Replacement 

In [30]:
def SAHC_with_replacement(sets, problem_size, state, max_iteration, n):
    """
    Steepest Ascent w/Replacement Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;
        n: number of tweaks for each iteration

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    count = 0  # iterations counted since the last improvement
    limit = problem_size // 25  # patience: stop once count reaches this value
    goodness = fitness(state, sets)
    # Reported count assumes n calls per iteration; each iteration actually
    # makes n + 1 calls (one initial tweak plus n further candidates)
    evaluations = (max_iteration * n) + 1
    for i in range(max_iteration):
        if count == limit:
            evaluations = (i * n) + 1
            break
        # Sample random neighbours and keep the best of them
        new_state = tweak(state, problem_size)
        new_state_goodness = fitness(new_state, sets)
        for _ in range(n):
            tmp = tweak(state, problem_size)
            tmp_goodness = fitness(tmp, sets)
            if tmp_goodness > new_state_goodness:
                new_state = tmp
                new_state_goodness = tmp_goodness
        # Move only if the best sampled neighbour is strictly better
        if new_state_goodness > goodness:
            state = new_state
            goodness = new_state_goodness
            count = 0
        count += 1
    return state, goodness, evaluations

In [31]:
max_iteration = 100_000
n = 20
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = SAHC_with_replacement(sets, p, initial_state, max_iteration, n)
    print(
        f"Steepest Ascent with replacement Hill Climbing with number of tweaks {n}:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Steepest Ascent with replacement Hill Climbing with number of tweaks 20:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 561
Goodness: (100, -9)

Steepest Ascent with replacement Hill Climbing with number of tweaks 20:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 421
Goodness: (100, -4)

Steepest Ascent with replacement Hill Climbing with number of tweaks 20:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 9,361
Goodness: (1000, -15)

Steepest Ascent with replacement Hill Climbing with number of tweaks 20:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 8,161
Goodness: (1000, -6)

Steepest Ascent with replacement Hill Climbing with number of tweaks 20:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 49,001
Goodness: (5000, -19)

Steepest Ascent with replacement Hill Climbing with number of tweaks 20:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 49,021
Goodness: (5000, -7)



### K Population Random Mutation Hill Climbing

In [33]:
def k_population_RMHC(sets, problem_size, max_iteration, k):
    """
    Random Mutation with K population Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        max_iteration: the maximum number of iterations for each run;
        k: number of random initial states (population size)

    Returns:
        state: best solution state among the k runs;
        goodness: the result of the fitness function on that state;
        evaluations: total number of evaluations over the k runs;

    """
    number_of_evaluations = 0
    solution_states = []
    goodness_states = []
    for _ in range(k):
        state = generate_initial_state(0.3, problem_size)
        solution_state, goodness, evaluations = RMHC(sets, problem_size, state, max_iteration)
        solution_states.append(solution_state)
        goodness_states.append(goodness)
        number_of_evaluations += evaluations

    solution_goodness = max(goodness_states)
    return solution_states[goodness_states.index(solution_goodness)], solution_goodness, number_of_evaluations

In [35]:
max_iteration = 100_000
k = 5
for sets, _, (p, d) in problems:
    solution_state, goodness, evaluations = k_population_RMHC(sets, p, max_iteration, k)
    print(
        f"Random Mutation Hill Climbing with {k} random population:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Random Mutation Hill Climbing with 5 random population:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 877
Goodness: (100, -9)

Random Mutation Hill Climbing with 5 random population:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 1002
Goodness: (100, -4)

Random Mutation Hill Climbing with 5 random population:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 19700
Goodness: (1000, -14)

Random Mutation Hill Climbing with 5 random population:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 23151
Goodness: (1000, -5)

Random Mutation Hill Climbing with 5 random population:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 129277
Goodness: (5000, -19)

Random Mutation Hill Climbing with 5 random population:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 145526
Goodness: (5000, -6)



### K Population Steepest Ascent Hill Climbing

In [39]:
def k_population_SAHC(sets, problem_size, max_iteration, n, k):
    """
    Steepest Ascent w/Replacement with K population Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        max_iteration: the maximum number of iterations for each run;
        n: number of tweaks for each iteration;
        k: number of random initial states (population size);

    Returns:
        state: best solution state among the k runs;
        goodness: the result of the fitness function on that state;
        evaluations: total number of evaluations over the k runs;

    """
    number_of_evaluations = 0
    solution_states = []
    goodness_states = []
    for _ in range(k):
        # Each run starts from a fresh random state
        state = generate_initial_state(0.3, problem_size)
        # Run the sampled-neighbour variant with n tweaks per iteration
        solution_state, goodness, evaluations = SAHC_with_replacement(sets, problem_size, state, max_iteration, n)
        solution_states.append(solution_state)
        goodness_states.append(goodness)
        number_of_evaluations += evaluations

    # Keep the best of the k runs
    solution_goodness = max(goodness_states)
    return solution_states[goodness_states.index(solution_goodness)], solution_goodness, number_of_evaluations

In [40]:
max_iteration = 100_000
k = 5
n = 10
for sets, _, (p, d) in problems:
    solution_state, goodness, evaluations = k_population_SAHC(sets, p, max_iteration, n, k)
    print(
        f"Steepest-step Hill Climbing with number of tweaks {n} and {k} random population:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Steepest-step Hill Climbing with number of tweaks 10 and 5 random population:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 1,165
Goodness: (100, -7)

Steepest-step Hill Climbing with number of tweaks 10 and 5 random population:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 1,875
Goodness: (100, -4)

Steepest-step Hill Climbing with number of tweaks 10 and 5 random population:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 25,945
Goodness: (1000, -14)

Steepest-step Hill Climbing with number of tweaks 10 and 5 random population:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 27,055
Goodness: (1000, -5)

Steepest-step Hill Climbing with number of tweaks 10 and 5 random population:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 157,435
Goodness: (5000, -18)

Steepest-step Hill Climbing with number of tweaks 10 and 5 random population:
Density: 0.7
Problem size: 5000
Number of calls to 

### Simulated Annealing

In [36]:
def simulated_annealing(sets, problem_size, state, max_iteration):
    """
    Simulated Annealing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    count = 0  # iterations counted since the last accepted move
    limit = problem_size // 2  # patience: stop once count reaches this value
    goodness = fitness(state, sets)
    evaluations = max_iteration + 1  # one initial call plus one per iteration
    for i in range(max_iteration):
        # Temperature decreases linearly from just under 1 down to 0
        t = 1 - ((i + 1) / max_iteration)
        if t == 0 or count == limit:
            evaluations = i + 1
            break
        new_state = tweak(state, problem_size)
        goodness_new_state = fitness(new_state, sets)
        # Accept any non-worsening move; otherwise accept with probability
        # max(0, 1 - exp(d / t)), where d = goodness[1] - goodness_new_state[1]
        # compares only the number of sets used
        if goodness_new_state >= goodness or random() > np.exp((goodness[1] - goodness_new_state[1]) / t):
            state = new_state
            goodness = goodness_new_state
            count = 0
        count += 1

    return state, goodness, evaluations
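For comparison, simulated annealing is conventionally described with the Metropolis acceptance rule: an improving move is always accepted, and a worsening move is accepted with probability `exp(-delta / t)`, which shrinks as the temperature `t` drops. The sketch below illustrates that rule on a scalar cost. It is not the acceptance test used in the function above, and it assumes the tuple fitness has been reduced to a single number to minimize; `accept` and `delta` are hypothetical names.

```python
import math
from random import random, seed


def accept(delta, t):
    """Metropolis rule: delta is new_cost - old_cost (lower cost is better)."""
    if delta <= 0:
        return True   # never reject an improvement or a tie
    if t <= 0:
        return False  # frozen: behave like plain hill climbing
    # math.exp underflows to 0.0 for very small t instead of overflowing
    return random() < math.exp(-delta / t)


seed(0)  # arbitrary seed for a repeatable demo
trials = 10_000
hot = sum(accept(1, 1.0) for _ in range(trials)) / trials    # close to exp(-1)
cold = sum(accept(1, 0.05) for _ in range(trials)) / trials  # close to exp(-20)
print(round(hot, 2), cold)
```

Because the exponent is never positive in this form, the expression cannot overflow as the temperature approaches zero.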

In [37]:
max_iteration = 100_000
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = simulated_annealing(sets, p, initial_state, max_iteration)
    print(
        f"Simulated Annealing:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Simulated Annealing:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 253
Goodness: (100, -10)

Simulated Annealing:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 311
Goodness: (100, -5)

Simulated Annealing:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 11456
Goodness: (1000, -14)

Simulated Annealing:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 4036
Goodness: (1000, -6)



Simulated Annealing:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 100000
Goodness: (5000, -20)

Simulated Annealing:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 46129
Goodness: (5000, -8)

