One state is composed by a set of 'taken tiles' and a set 'not-taken tiles' as {T} {N}
The cost of a state is defined as the number of sets it consideres as taken.
The heuristic measures how many tiles are needed to cover the missing points of a state based on the number of points covered by each of the not taken tiles.
The A* algorithm searches for the optimal set coverage with the minimum number of tiles taken: the function f(n) which computes the priority of a state in this implementation sums its cost and an optimist heuristic. The search stops when a solution is reached (which corresponds to the optimal one). My implementation is slower than the one implemented by the professor but provides the same optimal solutions.

Credits: https://github.com/squillero/computational-intelligence/blob/master/2023-24/set-covering_path-search.ipynb

In [281]:
from random import random
from functools import reduce
from collections import namedtuple
from queue import PriorityQueue

import numpy as np
from tqdm.auto import tqdm

In [282]:
PROBLEM_SIZE = 20
NUM_SETS = 40
CHANCE = 0.2
SETS = tuple(
    np.array([random() < CHANCE for _ in range(PROBLEM_SIZE)])
    for _ in range(NUM_SETS)
)
State = namedtuple("State", ["taken", "not_taken"])

In [291]:
def covered(state):
    return reduce(
        np.logical_or,
        [SETS[i] for i in state.taken],
        np.array([False for _ in range(PROBLEM_SIZE)]),
    )

def distance(state):
    return PROBLEM_SIZE - sum(covered(state))

def goal_check(state):
    return np.all(covered(state))

def h(state):
    already_covered = covered(state)
    if np.all(already_covered):
        return 0
    missing_size = PROBLEM_SIZE - sum(already_covered)
    candidates = sorted([distance(State({i}, set(range(NUM_SETS)) ^ {i})) for i in state.not_taken], reverse=True)
    taken = 1
    while sum(candidates[:taken]) < missing_size:
        taken += 1
    return taken
    

In [284]:
assert goal_check(
    State(set(range(NUM_SETS)), set())
), "Problem not solvable"

In [289]:
def f(state):
    return len(state.taken) + h(state)

frontier = PriorityQueue()
state = State(set(), set(range(NUM_SETS)))
frontier.put((f(state), state))    

counter = 0
_, current_state = frontier.get()
with tqdm(total=None) as pbar:
  while not goal_check(current_state):
      counter += 1
      for action in current_state[1]:
          new_state = State(
              current_state.taken ^ {action},
              current_state.not_taken ^ {action},
          )
          frontier.put((f(new_state), new_state))
      _, current_state = frontier.get()
      pbar.update(1)

print(f"Solved in {counter} steps ({len(current_state.taken)} tiles)", current_state)

0it [00:00, ?it/s]

Solved in 6397 steps (4 tiles) State(taken={2, 34, 14, 31}, not_taken={0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 32, 33, 35, 36, 37, 38, 39})
