### Set cover problem 

Given a set of elements {1, 2, …, n} (called the universe) and a collection S of m subsets whose union equals the universe, the set cover problem is to identify the smallest sub-collection of S whose union equals the universe.
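To make the definition concrete, here is a minimal sketch on a hand-made toy instance. The function name `smallest_cover`, the universe, and the subsets are illustrative assumptions and not part of the notebook. It tries sub-collections in order of increasing size, so the first cover it finds is a smallest one; this is only practical for very small instances.

```python
from itertools import combinations


def smallest_cover(universe, subsets):
    """Return the indices of a smallest sub-collection covering the universe.

    Brute force: sub-collections are tried in order of increasing size,
    so the first cover found has minimum size. Returns None if no cover exists.
    """
    for k in range(1, len(subsets) + 1):
        for combo in combinations(range(len(subsets)), k):
            if set().union(*(subsets[i] for i in combo)) >= universe:
                return combo
    return None


# Toy instance (assumed for illustration only)
universe = {1, 2, 3, 4, 5}
subsets = [{1, 2, 3}, {2, 4}, {3, 4}, {4, 5}]

print(smallest_cover(universe, subsets))  # (0, 3)
```

No single subset covers the universe, and `{1, 2, 3}` together with `{4, 5}` is the first pair that does.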

In [31]:
from random import random, choice, randint
from functools import reduce
from collections import namedtuple
from queue import PriorityQueue, SimpleQueue, LifoQueue
from copy import copy

import numpy as np
from tqdm.auto import tqdm

### Setting problem parameters

In [32]:
PROBLEM_SIZE = 80  # number of elements to cover
NUM_SETS = 100
# Randomly create NUM_SETS sets over PROBLEM_SIZE elements.
# Each set is a boolean array: True means the set contains that element.
SETS = tuple(np.array([random() < 0.3 for _ in range(PROBLEM_SIZE)]) for _ in range(NUM_SETS))

#print('Problem size:', SETS)
State = namedtuple('State', ['taken', 'not_taken'])

In [33]:
# Check whether the taken sets cover all the elements

def goal_check(state):
    return np.all(covered(state))

def covered(state):
    return reduce(
        np.logical_or,
        [SETS[i] for i in state.taken],
        np.array([False for _ in range(PROBLEM_SIZE)]),
    )

assert goal_check(  # sanity check: taking every set must cover the universe
    State(set(range(NUM_SETS)), set())
), "Problem not solvable"

## A* algorithm
For A* to return an optimal solution, the heuristic must be admissible: it has to be optimistic, meaning it never overestimates the cost still needed to reach the goal.
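As an illustration of what an optimistic estimate can look like for set cover, the sketch below divides the number of still-missing elements by the largest number of them that any single set could add. Each extra set adds at most that many new elements, so the result never exceeds the true number of sets still required. The helper name `optimistic_estimate` and the toy arrays are assumptions made for this example; this is not the heuristic used in the notebook.

```python
from math import ceil

import numpy as np


def optimistic_estimate(covered_mask, candidate_sets):
    """Lower bound on how many more sets are needed (hypothetical helper)."""
    missing = np.logical_not(covered_mask)
    n_missing = int(missing.sum())
    if n_missing == 0:
        return 0
    # Largest number of still-missing elements that any single set can add.
    best_gain = max(
        (int(np.logical_and(s, missing).sum()) for s in candidate_sets),
        default=0,
    )
    if best_gain == 0:
        return float("inf")  # the missing elements cannot be covered at all
    return ceil(n_missing / best_gain)


# Toy data (assumed): 5 elements, the first two already covered
covered_mask = np.array([True, True, False, False, False])
candidate_sets = (
    np.array([False, False, True, True, False]),
    np.array([False, False, False, False, True]),
    np.array([True, False, True, False, False]),
)

estimate = optimistic_estimate(covered_mask, candidate_sets)
print(estimate)  # 2
```

Here three elements are missing and the best set adds two of them, so at least two more sets are needed, which matches the true minimum for this toy instance.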

In [34]:
# A first heuristic, intended as a distance from the goal:
# PROBLEM_SIZE minus the number of sets already taken
def heuristic1(state):
    return PROBLEM_SIZE - len(state.taken)

# This heuristic is not admissible, because it can overestimate the number of
# sets still needed, so it does not always yield the best solution.
# Over many runs, I compared it against the minimum number of sets found by
# breadth-first search, and it did not always return an optimal cover.

In [35]:
#define the heuristic to use
heuristic = heuristic1


def actual_cost(state): 
    return len(state.taken)

def a_star(state): 
    return actual_cost(state) + heuristic(state)
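A side effect worth checking: with `heuristic1`, the two terms cancel, since `len(state.taken) + PROBLEM_SIZE - len(state.taken)` is always `PROBLEM_SIZE`. The short, self-contained sketch below repeats the definitions above and evaluates the priority on states with 0 to `NUM_SETS` taken sets. The `priorities` variable is introduced only for this check.

```python
from collections import namedtuple

State = namedtuple("State", ["taken", "not_taken"])
PROBLEM_SIZE = 80
NUM_SETS = 100


def heuristic1(state):
    return PROBLEM_SIZE - len(state.taken)


heuristic = heuristic1


def actual_cost(state):
    return len(state.taken)


def a_star(state):
    return actual_cost(state) + heuristic(state)


# Collect the distinct priorities over states with k = 0 .. NUM_SETS taken sets.
priorities = {
    a_star(State(set(range(k)), set(range(k, NUM_SETS))))
    for k in range(NUM_SETS + 1)
}
print(priorities)  # {80}
```

Because every state gets the same priority, the order in which states leave the queue is decided by how the queue breaks ties, not by the heuristic.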

In [41]:
frontier = PriorityQueue()
state = State(set(), set(range(NUM_SETS)))

frontier.put((a_star(state), state))

steps = 0
weight , current_state = frontier.get()

with tqdm(total=None) as pbar: 
    while not goal_check(current_state): 
        steps += 1
        for action in current_state.not_taken:
            new_state = State(
                current_state.taken ^ {action},
                current_state.not_taken ^ {action},
            )
            frontier.put((a_star(new_state), new_state))
        weight, current_state = frontier.get()
        pbar.update(1)

print(f'Solution found in {steps} steps and {len(current_state.taken)} tiles')
print(f'Final state: {current_state.taken}')


93it [00:00, 816.25it/s]

Solution found in 93 steps and 9 tiles
Final state: {96, 0, 98, 99, 97, 50, 93, 94, 95}
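The search loop above stores `(priority, state)` tuples, so entries with equal priority are ordered by comparing the `State` tuples themselves, and Python compares sets with `<` as a proper-subset test, which is not a total order. One common pattern, sketched here under the assumption that first-in, first-out tie-breaking is acceptable, is to add an increasing counter as a second tuple field. The names `tie_breaker` and `order`, and the toy payloads, are illustrative and not taken from the notebook.

```python
from itertools import count
from queue import PriorityQueue

frontier = PriorityQueue()
tie_breaker = count()  # 0, 1, 2, ... gives each entry a unique second field

# Toy entries (assumed): priorities with sets as payloads
for priority, payload in [(2, {"b"}), (1, {"a"}), (2, {"a", "c"}), (1, {"z"})]:
    frontier.put((priority, next(tie_breaker), payload))

order = []
while not frontier.empty():
    priority, _, payload = frontier.get()
    order.append((priority, payload))

print(order)
```

Entries with the same priority come out in insertion order, and the payloads are never compared because the counter values are all distinct.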



