## Halloween Challenge
Find the best solution with the fewest calls to the fitness functions for:
* num_points = [100, 1_000, 5_000]
* num_sets = num_points
* density = [.3, .7]

### Local Search - Set Covering
To solve this challenge I implemented the following **Single-State Methods**:
* Random-Mutation Hill Climbing
* Steepest Ascent Hill Climbing
* Steepest Ascent with Replacement Hill Climbing
* Tabu Search
* K population Random-Mutation Hill Climbing
* K population Steepest Ascent with Replacement Hill Climbing
* Simulated Annealing


In [2]:
from random import random, randint, seed
from functools import reduce
import numpy as np
from copy import copy
from itertools import product
from scipy import sparse
from collections import deque

In [5]:
def make_set_covering_problem(num_points, num_sets, density):
    """Returns a sparse array where rows are sets and columns are the covered items"""
    seed(num_points * 2654435761 + num_sets + density)
    sets = sparse.lil_array((num_sets, num_points), dtype=bool)
    for s, p in product(range(num_sets), range(num_points)):
        if random() < density:
            sets[s, p] = True
    for p in range(num_points):
        sets[randint(0, num_sets - 1), p] = True
    return sets.toarray()

In [6]:
PROBLEM_SIZE = [100, 1_000, 5_000]
DENSITY = [0.3, 0.7]
PROBABILITY = [0.01]

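I defined the fitness function as a function that returns a tuple with the number of covered points and the negative of the number of sets used. Python compares tuples lexicographically, so coverage is maximized first and the number of sets is minimized second.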

In [8]:
def fitness(state, sets):
    """
    Fitness function implementation.

    Args:
        state: the state to evaluate;
        sets: 2D boolean array;

    Returns:
        A tuple with the number of covered points and the negative of the number of sets used to cover them
    """
    num_true = np.sum(np.sum(sets[state], axis=0) > 0)
    cost = -sum(state)
    return num_true, cost
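
Because Python compares tuples lexicographically, a fitness of this shape ranks states by coverage first and by number of sets second. The snippet below is a minimal sketch on a hypothetical 3-set, 4-point instance (the `toy_sets` array and the three states are made up for illustration); it repeats the `fitness` definition so that it runs on its own.

```python
import numpy as np


def fitness(state, sets):
    """Same definition as above: (covered points, -number of sets used)."""
    num_true = np.sum(np.sum(sets[state], axis=0) > 0)
    cost = -sum(state)
    return num_true, cost


# Hypothetical toy instance: 3 sets (rows) over 4 points (columns)
toy_sets = np.array(
    [
        [True, True, False, False],
        [False, False, True, True],
        [True, True, True, True],
    ]
)

two_sets = fitness([True, True, False], toy_sets)  # full coverage with 2 sets
one_set = fitness([False, False, True], toy_sets)  # full coverage with 1 set
partial = fitness([True, False, False], toy_sets)  # 2 points covered with 1 set

# Coverage is compared first; the number of sets only breaks ties
print(one_set > two_sets > partial)  # True
```

Any state with full coverage therefore beats every state with partial coverage, no matter how many sets it uses.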

In [9]:
def generate_initial_state(prob, problem_size):
    """
    Random generation of an initial state, useful for the K population algorithms.

    Args:
        prob: the probability to take a Set in the state;
        problem_size: the number of sets and points;

    Returns:
        Initial state of length problem_size where each entry is True with probability prob
    """
    return [random() < prob for _ in range(problem_size)]

The following cell checks that every problem configuration is solvable and stores the sets, the initial state (all False), and the configuration in the list `problems`.

In [10]:
problems = []
for idx_prob, p in enumerate(PROBLEM_SIZE):
    for d in DENSITY:
        # initial_state = generate_initial_state(idx_prob, p)
        initial_state = [False for _ in range(p)]
        sets = make_set_covering_problem(p, p, d)
        assert fitness([True for _ in range(p)], sets)[0] == p, "Problem not solvable"
        problems.append((sets, initial_state, (p, d)))

I defined a tweak function that randomly flips one set from taken to untaken, or from untaken to taken.

In [13]:
def tweak(state, problem_size):
    """
    Tweak function that implements a random change in the state.

    Args:
        state: the state to tweak;
        problem_size: the number of sets and points;

    Returns:
        new_state: the new state after the tweak;
    """
    new_state = copy(state)
    index = randint(0, problem_size - 1)
    new_state[index] = not state[index]
    return new_state
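
As a quick sanity check, a tweaked state should differ from its parent in exactly one position, and the parent should be left untouched. The sketch below repeats the `tweak` definition so it runs on its own; the seed value and the 10-element state are arbitrary choices for illustration.

```python
from copy import copy
from random import randint, seed


def tweak(state, problem_size):
    """Same definition as above: flip one randomly chosen entry."""
    new_state = copy(state)
    index = randint(0, problem_size - 1)
    new_state[index] = not state[index]
    return new_state


seed(42)  # arbitrary seed, only to make the run repeatable
state = [False] * 10
neighbor = tweak(state, len(state))

# Count the positions where the two states differ (Hamming distance)
hamming = sum(a != b for a, b in zip(state, neighbor))
print(hamming)  # 1
```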

### Random Mutation Hill Climbing

In [36]:
def RMHC(sets, problem_size, state, max_iteration, limit):
    """
    Random Mutation Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;
        limit: max number of fitness calls without an improvement;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    no_improvment_count = 0
    goodness = fitness(state, sets)
    evaluations = max_iteration + 1
    for i in range(max_iteration):
        if no_improvment_count == limit:
            evaluations = i + 1
            break
        new_state = tweak(state, problem_size)
        goodness_new_state = fitness(new_state, sets)
        if goodness_new_state > goodness:
            state = new_state
            goodness = goodness_new_state
            no_improvment_count = 0
        no_improvment_count += 1
    return state, goodness, evaluations

In [54]:
max_iteration = 100_000
limit = 50
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = RMHC(sets, p, initial_state, max_iteration, limit)
    print(
        f"Random Mutation Hill Climbing:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Random Mutation Hill Climbing:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 95
Goodness: (100, -11)

Random Mutation Hill Climbing:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 54
Goodness: (100, -4)

Random Mutation Hill Climbing:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 74
Goodness: (1000, -15)

Random Mutation Hill Climbing:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 55
Goodness: (1000, -5)

Random Mutation Hill Climbing:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 77
Goodness: (5000, -25)

Random Mutation Hill Climbing:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 58
Goodness: (5000, -8)



### Steepest Ascent Hill Climbing

I defined a new tweak function that flips the value at a specific index.

In [55]:
def tweak_index(state, index):
    new_state = copy(state)
    new_state[index] = not state[index]
    return new_state

In [56]:
def SAHC(sets, problem_size, state, max_iteration):
    """
    Steepest Ascent Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    goodness = fitness(state, sets)
    evaluations = (max_iteration * problem_size) + 1
    for i in range(max_iteration):
        new_state = tweak_index(state, 0)
        new_state_goodness = fitness(new_state, sets)
        for idx in range(1, problem_size):
            tmp = tweak_index(state, idx)
            tmp_goodness = fitness(tmp, sets)
            if tmp_goodness > new_state_goodness:
                new_state = tmp
                new_state_goodness = tmp_goodness
        if new_state_goodness > goodness:
            state = new_state
            goodness = new_state_goodness
        else:
            # The final, non-improving sweep (problem_size more calls) is not included in this count
            evaluations = (i * problem_size) + 1
            break
    return state, goodness, evaluations
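
The evaluation counts in these functions are derived from loop indices. An alternative is to count the calls where they actually happen. The sketch below wraps a fitness function in a small callable class; `CountingFitness` and the toy 2x2 instance are hypothetical names introduced for illustration and are not part of the notebook.

```python
import numpy as np


class CountingFitness:
    """Hypothetical helper: wraps a fitness function and counts its calls."""

    def __init__(self, fitness_fn):
        self.fitness_fn = fitness_fn
        self.calls = 0

    def __call__(self, state, sets):
        self.calls += 1
        return self.fitness_fn(state, sets)


def fitness(state, sets):
    """Same definition as above."""
    num_true = np.sum(np.sum(sets[state], axis=0) > 0)
    return num_true, -sum(state)


toy_sets = np.array([[True, False], [False, True]])  # hypothetical 2x2 instance
counted = CountingFitness(fitness)

# One full sweep from the empty state: 1 initial call + 1 call per set
state = [False, False]
best = counted(state, toy_sets)
for idx in range(len(state)):
    neighbor = list(state)
    neighbor[idx] = not neighbor[idx]
    best = max(best, counted(neighbor, toy_sets))

print(counted.calls)  # 3
```

With a wrapper like this, every sweep is counted automatically, including one that ends without an improvement.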

In [17]:
max_iteration = 100_000
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = SAHC(sets, p, initial_state, max_iteration)
    print(
        f"Steepest Ascent Hill Climbing:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Steepest Ascent Hill Climbing:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 601
Goodness: (100, -6)

Steepest Ascent Hill Climbing:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 301
Goodness: (100, -3)

Steepest Ascent Hill Climbing:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 10,001
Goodness: (1000, -10)

Steepest Ascent Hill Climbing:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 4,001
Goodness: (1000, -4)

Steepest Ascent Hill Climbing:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 65,001
Goodness: (5000, -13)

Steepest Ascent Hill Climbing:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 25,001
Goodness: (5000, -5)



### Steepest Ascent with Replacement Hill Climbing

In [57]:
def SAHC_with_replacement(sets, problem_size, state, max_iteration, n):
    """
    Steepest Ascent with Replacement Hill Climbing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;
        n: number of tweaks for each iteration;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    goodness = fitness(state, sets)
    evaluations = (max_iteration * n) + 1
    for i in range(max_iteration):
        new_state = tweak(state, problem_size)
        new_state_goodness = fitness(new_state, sets)
        for _ in range(1, n):
            tmp = tweak(state, problem_size)
            tmp_goodness = fitness(tmp, sets)
            if tmp_goodness > new_state_goodness:
                new_state = tmp
                new_state_goodness = tmp_goodness
        if new_state_goodness > goodness:
            state = new_state
            goodness = new_state_goodness
        else:
            # The final, non-improving iteration (n more calls) is not included in this count
            evaluations = (i * n) + 1
            break
    return state, goodness, evaluations

In [58]:
max_iteration = 100_000
n = 50
for i, (sets, initial_state, (p, d)) in enumerate(problems):
    solution_state, goodness, evaluations = SAHC_with_replacement(sets, p, initial_state, max_iteration, n)
    print(
        f"Steepest Ascent with replacement Hill Climbing with number of tweaks {n}:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Steepest Ascent with replacement Hill Climbing with number of tweaks 50:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 301
Goodness: (100, -6)

Steepest Ascent with replacement Hill Climbing with number of tweaks 50:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 151
Goodness: (100, -3)

Steepest Ascent with replacement Hill Climbing with number of tweaks 50:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 551
Goodness: (1000, -11)

Steepest Ascent with replacement Hill Climbing with number of tweaks 50:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 251
Goodness: (1000, -5)

Steepest Ascent with replacement Hill Climbing with number of tweaks 50:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 801
Goodness: (5000, -16)

Steepest Ascent with replacement Hill Climbing with number of tweaks 50:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 301
Goodness: (5000,

### Tabu Search

In [59]:
def tabu_search(sets, problem_size, state, max_iteration, n, max_tabu_length):
    """
    Tabu Search implementation as a variant of Steepest Ascent with Replacement.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;
        n: number of tweaks for each iteration;
        max_tabu_length: max length of the tabu list;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    tabu_list = deque()
    goodness = fitness(state, sets)
    evaluations = 1
    for _ in range(max_iteration):
        new_state = tweak(state, problem_size)
        new_state_goodness = fitness(new_state, sets)
        evaluations += 1
        for _ in range(1, n):
            tmp = tweak(state, problem_size)
            if tmp not in tabu_list:
                tmp_goodness = fitness(tmp, sets)
                evaluations += 1
                if tmp_goodness > new_state_goodness:
                    new_state = tmp
                    new_state_goodness = tmp_goodness
        if new_state_goodness > goodness:
            state = new_state
            goodness = new_state_goodness
        else:
            break
        tabu_list.append(state)
        if len(tabu_list) > max_tabu_length:
            tabu_list.popleft()

    return state, goodness, evaluations
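
The manual `append` / `popleft` bookkeeping can also be delegated to the deque itself: `collections.deque` accepts a `maxlen` argument and silently drops the oldest entry when a new one is appended to a full deque. The sketch below is only an illustration; the tiny `max_tabu_length` value and the visited states are hypothetical.

```python
from collections import deque

max_tabu_length = 3  # hypothetical small value for illustration
tabu_list = deque(maxlen=max_tabu_length)

# Append five visited states; only the three most recent are kept
visited_states = [
    [True, False],
    [False, True],
    [True, True],
    [False, False],
    [True, False],
]
for visited in visited_states:
    tabu_list.append(visited)

print(len(tabu_list))  # 3
print([False, True] in tabu_list)  # False, it was evicted
print([True, True] in tabu_list)  # True, it is still in the list
```

Membership tests on a deque of lists are linear in the length of the deque, which is acceptable for short tabu lists.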

In [60]:
max_iteration = 100_000
n = 50
max_tabu_length = 100
for i, (sets, initial_state, (p, d)) in enumerate(problems):
    solution_state, goodness, evaluations = tabu_search(sets, p, initial_state, max_iteration, n, max_tabu_length)
    print(
        f"Tabu Search as Steepest Ascent with replacement variant with number of tweaks,and max tabu list length {n}:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Tabu Search as Steepest Ascent with replacement variant with number of tweaks,and max tabu list length 50:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 399
Goodness: (100, -7)

Tabu Search as Steepest Ascent with replacement variant with number of tweaks,and max tabu list length 50:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 201
Goodness: (100, -3)

Tabu Search as Steepest Ascent with replacement variant with number of tweaks,and max tabu list length 50:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 651
Goodness: (1000, -12)

Tabu Search as Steepest Ascent with replacement variant with number of tweaks,and max tabu list length 50:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 251
Goodness: (1000, -4)

Tabu Search as Steepest Ascent with replacement variant with number of tweaks,and max tabu list length 50:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 851
Goodness: (5000

### K Population Random Mutation Hill Climbing

In [62]:
def k_population_RMHC(sets, problem_size, max_iteration, probability, limit, k):
    """
    Random Mutation Hill Climbing with K population implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        max_iteration: the maximum number of iterations;
        probability: probability of a True value in each of the k generated initial states;
        limit: max number of fitness calls without an improvement;
        k: number of random initial states;

    Returns:
        state: best solution state among the k runs;
        goodness: the result of the fitness function on the solution state;
        evaluations: total number of evaluations over the k runs;

    """
    number_of_evaluations = 0
    solution_states = []
    goodness_states = []
    for _ in range(k):
        state = generate_initial_state(probability, problem_size)
        solution_state, goodness, evaluations = RMHC(sets, problem_size, state, max_iteration, limit)
        solution_states.append(solution_state)
        goodness_states.append(goodness)
        number_of_evaluations += evaluations

    solution_goodness = max(goodness_states)
    return solution_states[goodness_states.index(solution_goodness)], solution_goodness, number_of_evaluations
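
The best of the k runs can also be selected in a single pass by keeping each run's result together and using `max` with a `key`. The sketch below uses hypothetical results for three runs (the states, goodness tuples, and evaluation counts are made up for illustration).

```python
# Hypothetical results of three independent runs: (state, goodness, evaluations)
runs = [
    ([True, False, True], (3, -2), 40),
    ([False, True, False], (3, -1), 55),
    ([True, False, False], (2, -1), 30),
]

# Pick the run with the highest goodness tuple and sum the evaluations of all runs
best_state, best_goodness, _ = max(runs, key=lambda run: run[1])
total_evaluations = sum(run[2] for run in runs)

print(best_state, best_goodness, total_evaluations)
# [False, True, False] (3, -1) 125
```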

In [63]:
max_iteration = 100_000
limit = 20
k = 5
for i, (sets, _, (p, d)) in enumerate(problems):
    solution_state, goodness, evaluations = k_population_RMHC(sets, p, max_iteration, PROBABILITY[0], limit, k)
    print(
        f"Random Mutation Hill Climbing with {k} random population:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Random Mutation Hill Climbing with 5 random population:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 175
Goodness: (100, -9)

Random Mutation Hill Climbing with 5 random population:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 116
Goodness: (100, -4)

Random Mutation Hill Climbing with 5 random population:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 156
Goodness: (1000, -18)

Random Mutation Hill Climbing with 5 random population:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 105
Goodness: (1000, -7)

Random Mutation Hill Climbing with 5 random population:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 105
Goodness: (5000, -41)

Random Mutation Hill Climbing with 5 random population:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 105
Goodness: (5000, -49)



### K Population Steepest Ascent with Replacement Hill Climbing

In [64]:
def k_population_SAHC_with_replacement(sets, problem_size, max_iteration, n, probability, k):
    """
    Steepest Ascent with Replacement Hill Climbing with K random population implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        max_iteration: the maximum number of iterations;
        n: number of tweaks for each iteration;
        probability: probability of a True value in each of the k generated initial states;
        k: number of random initial states;

    Returns:
        state: best solution state among the k runs;
        goodness: the result of the fitness function on the solution state;
        evaluations: total number of evaluations over the k runs;

    """
    number_of_evaluations = 0
    solution_states = []
    goodness_states = []
    for _ in range(k):
        # Use the problem_size argument rather than relying on a global variable
        state = generate_initial_state(probability, problem_size)
        solution_state, goodness, evaluations = SAHC_with_replacement(sets, problem_size, state, max_iteration, n)
        solution_states.append(solution_state)
        goodness_states.append(goodness)
        number_of_evaluations += evaluations

    solution_goodness = max(goodness_states)
    return solution_states[goodness_states.index(solution_goodness)], solution_goodness, number_of_evaluations

In [65]:
max_iteration = 100_000
n = 20
k = 5
for sets, _, (p, d) in problems:
    solution_state, goodness, evaluations = k_population_SAHC_with_replacement(
        sets, p, max_iteration, n, PROBABILITY[0], k
    )
    print(
        f"Steepest Ascent with Replacement Hill Climbing with {k} random population:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Steepest Ascent with Replacement Hill Climbing with 5 random population:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 565
Goodness: (100, -7)

Steepest Ascent with Replacement Hill Climbing with 5 random population:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 165
Goodness: (100, -3)

Steepest Ascent with Replacement Hill Climbing with 5 random population:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 665
Goodness: (1000, -13)

Steepest Ascent with Replacement Hill Climbing with 5 random population:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 25
Goodness: (1000, -7)

Steepest Ascent with Replacement Hill Climbing with 5 random population:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 45
Goodness: (5000, -41)

Steepest Ascent with Replacement Hill Climbing with 5 random population:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 45
Goodness: (5000, -31)



### Simulated Annealing

In [19]:
def simulated_annealing(sets, problem_size, state, max_iteration, limit):
    """
    Simulated Annealing implementation.

    Args:
        sets: 2D boolean array;
        problem_size: the number of sets and points;
        state: initial state;
        max_iteration: the maximum number of iterations;
        limit: max number of iterations without an accepted move;

    Returns:
        state: solution state;
        goodness: the result of the fitness function on the solution state;
        evaluations: number of evaluations;

    """
    no_improvment_count = 0
    goodness = fitness(state, sets)
    evaluations = max_iteration + 1
    for i in range(max_iteration):
        t = 1 - ((i + 1) / max_iteration)
        if t == 0 or no_improvment_count == limit:
            evaluations = i + 1
            break
        new_state = tweak(state, problem_size)
        goodness_new_state = fitness(new_state, sets)
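        # Always accept a state that is at least as good; otherwise accept with a probability that
        # depends only on the difference in the number of sets. Note that t divides the exponential
        # here, exp(-delta) / t, whereas the textbook rule is exp(-delta / t).
        if goodness_new_state >= goodness or random() < (np.exp(-(goodness[1] - goodness_new_state[1])) / t):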
            state = new_state
            goodness = goodness_new_state
            no_improvment_count = 0
        no_improvment_count += 1

    return state, goodness, evaluations
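
For comparison, the textbook Metropolis rule divides the quality difference by the temperature inside the exponential, so that worse moves become less likely as `t` shrinks. The sketch below is a hypothetical helper (`acceptance_probability` is not part of the notebook) that works on a scalar quality and assumes `t > 0`; it is a reference point only and was not used to produce the results shown here.

```python
import math


def acceptance_probability(current_quality, candidate_quality, t):
    """Hypothetical helper: Metropolis acceptance probability for maximization (t > 0)."""
    if candidate_quality >= current_quality:
        return 1.0
    # Worse candidate: the probability shrinks with the size of the loss and with lower t
    return math.exp((candidate_quality - current_quality) / t)


# A candidate that uses one more set (quality -6 instead of -5)
p_hot = acceptance_probability(-5, -6, t=1.0)
p_cold = acceptance_probability(-5, -6, t=0.1)

print(round(p_hot, 3))  # 0.368
print(p_cold < p_hot)  # True
```

At `t = 1.0` a one-set loss is accepted roughly 37% of the time, while at `t = 0.1` it is almost never accepted, which is the cooling behavior the schedule is meant to produce.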

In [35]:
max_iteration = 5_000
limit = 50
for sets, initial_state, (p, d) in problems:
    solution_state, goodness, evaluations = simulated_annealing(sets, p, initial_state, max_iteration, limit)
    print(
        f"Simulated Annealing:\nDensity: {d}\nProblem size: {p}\nNumber of calls to fitness function: {evaluations:,}\nGoodness: {goodness}\n"
    )

Simulated Annealing:
Density: 0.3
Problem size: 100
Number of calls to fitness function: 5,000
Goodness: (100, -52)

Simulated Annealing:
Density: 0.7
Problem size: 100
Number of calls to fitness function: 5,000
Goodness: (100, -52)

Simulated Annealing:
Density: 0.3
Problem size: 1000
Number of calls to fitness function: 5,000
Goodness: (1000, -494)

Simulated Annealing:
Density: 0.7
Problem size: 1000
Number of calls to fitness function: 5,000
Goodness: (1000, -482)

Simulated Annealing:
Density: 0.3
Problem size: 5000
Number of calls to fitness function: 5,000
Goodness: (5000, -1888)

Simulated Annealing:
Density: 0.7
Problem size: 5000
Number of calls to fitness function: 5,000
Goodness: (5000, -1875)

