# Set Covering
---
## Single State Local Search

### Description of the problem
Given ``NUM_SETS`` sets, each a subset of a universe of ``PROBLEM_SIZE`` elements, find, if one exists, a collection of sets whose union contains every element.

A state consists of the sets that are taken and the sets that are not taken.
* State=({1,3,5}, {0,2,4,6,7}) -> the sets at indices 1, 3 and 5 of ``SETS`` are taken, i.e. the second, fourth and sixth arrays.
* The quality of a solution (the chosen state) is measured by the number of taken sets: among the states that cover all the elements, the fewer sets taken, the better.
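
As a minimal sketch of this representation, the snippet below checks whether the taken sets of a state cover every element. The instance is hand-written for illustration: ``UNIVERSE``, ``TOY_SETS`` and ``covers_everything`` are assumed names that do not appear in the notebook, and the sets are stored as plain Python sets of element indices rather than boolean arrays.

```python
from collections import namedtuple

State = namedtuple("State", ["taken", "not_taken"])

# Hypothetical toy instance: 4 sets over the elements 0..4.
UNIVERSE = set(range(5))
TOY_SETS = [{0, 1}, {1, 2}, {2, 3, 4}, {0, 4}]


def covers_everything(state):
    """Return True if the union of the taken sets equals the universe."""
    covered = set()
    for i in state.taken:
        covered |= TOY_SETS[i]
    return covered == UNIVERSE


full = State(taken={0, 2}, not_taken={1, 3})     # {0, 1} | {2, 3, 4} covers all
partial = State(taken={0, 1}, not_taken={2, 3})  # elements 3 and 4 are missing

print(covers_everything(full), covers_everything(partial))  # True False
```

Both states take two sets, but only the first one is a solution, which is why validity has to be checked before the number of taken sets is compared.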

# Code
---

### Imported Libraries

In [118]:
from random import random, choice, randint
from functools import reduce
from math import ceil
from collections import namedtuple
from queue import PriorityQueue, SimpleQueue, LifoQueue
import numpy as np
from tqdm.auto import tqdm
from copy import copy

### Problem instance
We implement the sets as a tuple of boolean NumPy arrays in which each entry has a 20% chance of being true. Each array indicates which elements the set contains: an entry set to true means the corresponding element is in the set, false means it is not.

In [119]:
PROBLEM_SIZE = 5
NUM_SETS = 10
SETS = tuple(np.array([random() < .2 for _ in range(PROBLEM_SIZE)]) for _ in range(NUM_SETS))
State = namedtuple('State', ['taken', 'not_taken'])

In [120]:
def goal_check(state):
    return np.all(reduce(np.logical_or, [SETS[i] for i in state.taken], np.array([False for _ in range(PROBLEM_SIZE)])))
assert goal_check(State(set(range(NUM_SETS)), set())), "Problem not solvable"
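
The assertion can fail on a freshly generated instance, because an element may end up in none of the random sets. As a rough back-of-the-envelope check, assuming every entry is drawn independently with probability 0.2 as in the instance cell, the sketch below estimates how often an instance is solvable at all. The lowercase variable names are illustrative and only mirror the notebook's constants.

```python
# Assumed parameters, copied from the instance cell.
p_true = 0.2
num_sets = 10
problem_size = 5

# A given element is missed only if it is absent from every set.
p_element_missed = (1 - p_true) ** num_sets

# The instance is solvable when no element is missed by all sets.
p_solvable = (1 - p_element_missed) ** problem_size

print(f"P(element missed by all sets) = {p_element_missed:.4f}")
print(f"P(instance solvable)          = {p_solvable:.4f}")
```

With these parameters the probability comes out a little under 0.57, so re-running the instance cell a few times may be needed before the assertion passes.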

# Hill Climbing
The ``current_state`` is a list of booleans, one per set: true means the state takes that set, false means it does not. We initialize it to a random candidate solution, which may or may not cover all the elements.

- The ``tweak`` function flips one randomly chosen set: if it was taken it becomes not taken, and vice versa.
- The ``fitness1`` function returns a tuple: a boolean indicating whether the input state is a solution, and the negative cost, i.e. the number of taken sets with the sign flipped. Tuples are compared element by element, so when deciding whether to replace ``current_state`` we first compare validity (False < True) and then prefer the state with fewer taken sets. The problem with this function is that if we start from a ``current_state`` that is not a solution, the algorithm moves to other invalid states that simply take fewer sets. The more sets it drops, the harder it becomes to move towards a solution.
- The ``fitness2`` function solves this problem by using the number of covered elements as the first item of the tuple, instead of the boolean that indicates whether the state is a solution.

More generally, the ``fitness`` function ranks states, and the ``tweak`` function is what lets us move from one state to a neighboring one.
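
The difference between the two fitness functions can be illustrated on a tiny hand-written instance. This is only a sketch: ``TOY_MASKS``, ``covered_mask``, ``toy_fitness1`` and ``toy_fitness2`` are assumed names, written to mirror the notebook's functions without depending on the random ``SETS``.

```python
import numpy as np

# Hypothetical toy instance: 3 sets (rows) over 4 elements (columns).
TOY_MASKS = np.array([
    [True,  True,  False, False],
    [False, False, True,  False],
    [False, False, False, True],
])


def covered_mask(state):
    """Boolean mask of the elements covered by the taken sets."""
    taken = np.array(state, dtype=bool)
    return TOY_MASKS[taken].any(axis=0)


def toy_fitness1(state):
    # (is the state a full cover?, negative number of taken sets)
    return bool(covered_mask(state).all()), -sum(state)


def toy_fitness2(state):
    # (number of covered elements, negative number of taken sets)
    return int(covered_mask(state).sum()), -sum(state)


a = [True, False, False]  # covers 2 elements with 1 set
b = [True, True, False]   # covers 3 elements with 2 sets

print(toy_fitness1(a), toy_fitness1(b))   # (False, -1) (False, -2)
print(toy_fitness2(a), toy_fitness2(b))   # (2, -1) (3, -2)
print(toy_fitness1(b) > toy_fitness1(a))  # False: the step towards a cover is rejected
print(toy_fitness2(b) > toy_fitness2(a))  # True: covering one more element is rewarded
```

Neither state is a solution, so ``toy_fitness1`` only sees that ``b`` takes more sets and ranks it lower, while ``toy_fitness2`` recognizes that ``b`` is closer to covering everything.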

In [121]:
def fitness1(state):
    cost = sum(state)
    valid = np.all(
        reduce(
            np.logical_or,
            [SETS[i] for i, t in enumerate(state) if t],
            np.array([False for _ in range(PROBLEM_SIZE)]),
        )
    )
    return valid, -cost


def fitness2(state):
    cost = sum(state)
    valid = np.sum(
        reduce(
            np.logical_or,
            [SETS[i] for i, t in enumerate(state) if t],
            np.array([False for _ in range(PROBLEM_SIZE)]),
        )
    )
    return valid, -cost

def tweak(state):
    new_state = copy(state)
    index = randint(0, NUM_SETS - 1) # pick a random set index (the state has one entry per set)
    new_state[index] = not new_state[index] # swap
    return new_state

fitness = fitness2

In [None]:
current_state = [choice([True, False]) for _ in range(NUM_SETS)]
print(fitness(current_state))

for step in range(100):
    new_state = tweak(current_state)
    if fitness(new_state) > fitness(current_state): # with fitness2 we have to use >, with fitness1 >=
        current_state = new_state
        print(fitness(current_state))
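
For reproducible experiments, the loop above can be wrapped in a function that takes the sets and a random seed as arguments. The following is a sketch under stated assumptions rather than the notebook's own implementation: ``hill_climb``, its parameters and ``DEMO_SETS`` are hypothetical names, and the fitness used is the ``fitness2`` idea (covered elements first, then negative cost).

```python
import random

import numpy as np


def hill_climb(sets, steps=1000, seed=0):
    """First-improvement hill climbing over boolean 'taken' lists."""
    rng = random.Random(seed)
    sets = np.asarray(sets, dtype=bool)
    num_sets = len(sets)

    def fitness(state):
        # (number of covered elements, negative number of taken sets)
        covered = sets[np.array(state, dtype=bool)].any(axis=0).sum()
        return int(covered), -sum(state)

    current = [rng.choice([True, False]) for _ in range(num_sets)]
    for _ in range(steps):
        candidate = current.copy()
        index = rng.randrange(num_sets)           # pick a random set
        candidate[index] = not candidate[index]   # flip taken / not taken
        if fitness(candidate) > fitness(current):
            current = candidate
    return current, fitness(current)


# Hypothetical instance: 6 sets (rows) over 5 elements (columns).
DEMO_SETS = [
    [True,  False, False, False, True],
    [False, True,  False, False, False],
    [False, False, True,  True,  False],
    [True,  True,  False, False, False],
    [False, False, False, True,  True],
    [False, False, True,  False, False],
]

best_state, best_fitness = hill_climb(DEMO_SETS)
print(best_state, best_fitness)
```

Because only strictly better neighbors are accepted, the number of covered elements never decreases during the run. The result is a local optimum with respect to single flips, so it is not guaranteed to be a cover with the minimum number of sets.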