### Problem

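First, we need a problem instance.
The generator below, kindly provided by the Professor, builds a random collection of lists of integers in the range 0 to N - 1; the goal is the full set of numbers from 0 to N - 1.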

In [1]:
import random

def problem(N, seed=None):
    random.seed(seed)
    return [
        list(set(random.randint(0, N - 1) for n in range(random.randint(N // 5, N // 2))))
        for n in range(random.randint(N, N * 5))
    ]

def remove_duplicates(problem):
    # Turn each list into a frozenset so that identical subsets collapse into one entry.
    unique = set()
    for el in problem:
        unique.add(frozenset(el))
    return unique

def goal(N):
    return set(range(N))
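
To get a feel for what the generator produces, the sketch below builds one small instance and checks a few of its properties. The three functions are repeated from the cell above so the snippet runs on its own; the names `raw`, `unique` and `covered` are only illustrative and do not appear elsewhere in the notebook.

```python
import random

def problem(N, seed=None):
    random.seed(seed)
    return [
        list(set(random.randint(0, N - 1) for n in range(random.randint(N // 5, N // 2))))
        for n in range(random.randint(N, N * 5))
    ]

def remove_duplicates(problem):
    return {frozenset(el) for el in problem}

def goal(N):
    return set(range(N))

N = 10
raw = problem(N, seed=42)          # list of lists, possibly with repeated subsets
unique = remove_duplicates(raw)    # set of frozensets, one per distinct subset
covered = set().union(*unique)     # every number that appears in at least one subset

print(len(raw), "lists generated,", len(unique), "distinct subsets")
print("all numbers reachable:", covered == goal(N))
```

If the second line prints `False`, no combination of subsets can reach the goal for that instance.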

### Cost

The cost function assigns a cost to a candidate choice, given the current state.
The cost grows by one for every element of the choice that is already in the state, and the total is then divided by the number of elements in the state plus one (the `+ 1` avoids a division by zero on the empty initial state). The division is meant (in theory) to compensate for the fact that early choices, made while the state is still small, are less likely to add a duplicate.

In [2]:
def cost_fun(new, current):
    cost = 0
    cset = set(current)
    for e in new:
        if e in cset:
            cost += 1
    return cost / (len(current) + 1)
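
A few hand-computed values may help to check the formula. The function is repeated here (in a compact but equivalent form) so the snippet runs on its own; the example sets are made up for illustration.

```python
def cost_fun(new, current):
    # Same computation as the cell above: overlaps / (size of state + 1).
    cset = set(current)
    cost = sum(1 for e in new if e in cset)
    return cost / (len(current) + 1)

state = frozenset({2, 3, 4})

print(cost_fun({1, 2, 3}, state))       # 2 overlaps / (3 + 1) -> 0.5
print(cost_fun({0, 1}, state))          # no overlap           -> 0.0
print(cost_fun({2, 3}, frozenset()))    # empty state          -> 0.0

# The same two overlaps weigh less on a larger state.
print(cost_fun({2, 3}, frozenset({2, 3, 4, 5, 6, 7, 8})))  # 2 / (7 + 1) -> 0.25
```

Because the number of overlaps can never exceed the size of the state, the result always stays below 1.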

### Goal

The goal test function verifies whether the problem's constraints are satisfied (in this case, since we're working with sets, it just checks whether the current state and the goal coincide).

In [3]:
def goal_test(state, goal):
    return state == goal
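
The states built during the search are `frozenset` objects while the goal is a plain `set`. The short check below shows that the equality test still works across the two types; the name `target` is only illustrative.

```python
def goal_test(state, goal):
    # Repeated from the cell above so the snippet runs on its own.
    return state == goal

target = set(range(3))

print(goal_test(frozenset({0, 1, 2}), target))  # True: same elements, different set types
print(goal_test(frozenset({0, 1}), target))     # False: 2 is still missing
```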

### Generator

The generator function provides a set containing every possible choice given a state.

In [4]:
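def possible_actions(state, problem):
    # Note: `s not in state` tests whether the frozenset s is itself an element
    # of state. States hold integers, so the test is always true and every
    # subset in the problem is returned.
    return set(s for s in problem if s not in state)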
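
The membership test `s not in state` compares a whole subset against the individual integers of the state, so it never filters anything out. The sketch below shows this on a tiny hand-made instance and contrasts it with a stricter variant, `useful_actions`, which is a hypothetical alternative and not part of the notebook: it keeps only the subsets that would add at least one new element.

```python
def possible_actions(state, problem):
    # Repeated from the cell above so the snippet runs on its own.
    return set(s for s in problem if s not in state)

def useful_actions(state, problem):
    # Hypothetical variant: drop subsets that are entirely contained in the state.
    return {s for s in problem if not s <= state}

subsets = {frozenset({0, 1}), frozenset({1, 2}), frozenset({3})}
state = frozenset({0, 1, 2})

print(len(possible_actions(state, subsets)))  # 3: nothing is filtered out
print(useful_actions(state, subsets))         # {frozenset({3})}
```

With the original generator, choices that add nothing are still harmless in the search, because the resulting state has already been recorded and is skipped.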

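### Search

The search function starts from the empty state and repeatedly expands the state taken from a priority queue. Every choice produces a new state (the union of the current state and the chosen subset); states that have not been recorded yet are pushed on the frontier with a priority equal to the universe size minus the cost of the choice, truncated to an integer. The function returns the total number of elements in the chosen subsets (counting repeats) and the number of states recorded.

In [5]: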
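from queue import PriorityQueue

def search(goal, problem):
    frontier = PriorityQueue()
    queued = set()  # states currently waiting in the frontier
    parent_state = {}
    state_cost = {}
    state_elems = {}
    state = frozenset()
    parent_state[state] = None
    state_cost[state] = 0
    state_elems[state] = 0
    universe_size = len(goal)  # same value as N, without relying on a global

    while state is not None and not goal_test(state, goal):
        for a in possible_actions(state, problem):
            new_state = state.union(a)
            cost = cost_fun(a, state)
            if new_state not in state_cost:
                parent_state[new_state] = state
                state_cost[new_state] = cost
                state_elems[new_state] = state_elems[state] + len(a)
                # PriorityQueue pops the smallest (priority, state) tuple first.
                frontier.put((int(universe_size - cost), new_state))
                queued.add(new_state)
            elif new_state in queued and cost < state_cost[new_state]:
                # frontier.queue holds (priority, state) tuples, so membership
                # is tracked in a separate set of states instead.
                parent_state[new_state] = state
                state_cost[new_state] = cost
        # A PriorityQueue object is always truthy, so test emptiness explicitly.
        if not frontier.empty():
            state = frontier.get()[1]
            queued.discard(state)
        else:
            state = None
    if state is None:
        # Frontier exhausted without reaching the goal.
        return None, len(state_cost)
    return state_elems[state], len(state_cost)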
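
Two details of `queue.PriorityQueue` matter for a loop like the one above: what the `.queue` attribute contains and how to tell whether the queue is empty. The sketch below illustrates both on a toy frontier. Note that `.queue` is the underlying list used by the CPython implementation rather than a documented part of the public interface, and the priorities and states here are made up.

```python
from queue import PriorityQueue

frontier = PriorityQueue()
frontier.put((3, frozenset({0, 1})))
frontier.put((1, frozenset({2})))

# .queue stores (priority, state) tuples, so a bare state is never "in" it.
print(frozenset({2}) in frontier.queue)                      # False
print(any(s == frozenset({2}) for _, s in frontier.queue))   # True

# The smallest tuple comes out first.
first = frontier.get()
print(first)                                                 # (1, frozenset({2}))

frontier.get()  # the queue is now empty

# The object itself is always truthy; use .empty() to test for emptiness.
print(bool(frontier), frontier.empty())                      # True True
```

Calling `get()` on an empty queue blocks by default, which is why the emptiness check has to happen before the call.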

In [6]:
for N in [5, 10, 20, 100, 500, 1000]:
    go = goal(N)
    pr = remove_duplicates(problem(N, 42))
    solution, n = search(go, pr)
    print(f"Solution for N = {N}:\n{solution} elements\n{n} nodes visited")

Solution for N = 5:
7 elements
28 nodes visited
Solution for N = 10:
11 elements
706 nodes visited
Solution for N = 20:
46 elements
751 nodes visited
