# DeduplicatedGenerator Xopt Wrapper
This notebook demonstrates the `DeduplicatedGenerator` wrapper for Xopt generator objects. The wrapper works with any Xopt generator and guarantees that `generate(...)` only returns sets of decision variables it has never returned before. It does this by keeping an array of all previously returned decision variables and using it to filter the generator's output. This also solves the "poisoning" problem caused by duplicate individuals in the CNSGA generator.
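
The filtering idea can be illustrated with a minimal sketch. This is not the actual `DeduplicatedGenerator` implementation: `ToyGenerator` and `DedupSketch` are hypothetical names, the toy generator deliberately emits every point twice, and the retry loop (asking the inner generator again until enough new points are collected) is an assumption about one reasonable way to fill a request.

```python
class ToyGenerator:
    """Hypothetical stand-in for a generator; it emits every point twice."""

    def __init__(self):
        self.calls = 0

    def generate(self, n_candidates):
        out = []
        for _ in range(n_candidates):
            # Integer division by 2 makes consecutive candidates identical
            out.append({"x1": float(self.calls // 2), "x2": 0.0})
            self.calls += 1
        return out


class DedupSketch:
    """Illustrative wrapper that only returns never-before-seen candidates."""

    def __init__(self, generator, variable_names, max_tries=100):
        self.generator = generator
        self.variable_names = variable_names
        self.max_tries = max_tries  # assumed safeguard against looping forever
        self.seen = set()  # decision-variable tuples returned so far

    def generate(self, n_candidates):
        fresh = []
        tries = 0
        while len(fresh) < n_candidates and tries < self.max_tries:
            tries += 1
            # Ask the wrapped generator only for the candidates still missing
            for cand in self.generator.generate(n_candidates - len(fresh)):
                key = tuple(cand[name] for name in self.variable_names)
                if key not in self.seen:
                    self.seen.add(key)
                    fresh.append(cand)
        return fresh


raw = ToyGenerator().generate(4)
print([c["x1"] for c in raw])  # [0.0, 0.0, 1.0, 1.0]

wrapped = DedupSketch(ToyGenerator(), variable_names=["x1", "x2"])
print([c["x1"] for c in wrapped.generate(3)])  # [0.0, 1.0, 2.0]
print([c["x1"] for c in wrapped.generate(2)])  # [3.0, 4.0]
```

Because the set of seen points persists across calls, the second call to `generate` cannot repeat anything returned by the first.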

The notebook first runs an optimization that produces duplicate individuals. It then repeats the optimization with the `DeduplicatedGenerator` wrapper and confirms that the duplicates vanish.

In [1]:
from electronsandstuff.xopt.deduplicated import DeduplicatedGenerator
from electronsandstuff.xopt.paretobench import XoptProblemWrapper
from paretobench import Problem
from xopt import Xopt, Evaluator
from xopt.generators.ga.cnsga import CNSGAGenerator

### Optimization Without Deduplication
This cell runs an optimization on the CF1 test problem and prints the number of detected duplicates in each generation.

In [2]:
# Our test problem
prob = XoptProblemWrapper(Problem.from_line_fmt("CF1"))

# Set up Xopt to solve the problem with the CNSGA generator
population_size = 50
ev = Evaluator(function=prob, vectorized=True, max_workers=population_size)
X = Xopt(
    generator=CNSGAGenerator(vocs=prob.vocs, population_size=population_size),
    evaluator=ev,
    vocs=prob.vocs,
)
X.strict = False

print("Using CNSGA Generator:")
for gen in range(8):
    X.step()

    # Calculate the number of non-unique elements
    pop = X.generator.population[X.generator.vocs.variable_names]
    n_dups = len(pop) - len(pop.drop_duplicates())
    print(f"[{gen+1}] Duplicate individuals: {n_dups}")

Using CNSGA Generator:
[1] Duplicate individuals: 0
[2] Duplicate individuals: 0
[3] Duplicate individuals: 1
[4] Duplicate individuals: 2
[5] Duplicate individuals: 3
[6] Duplicate individuals: 4
[7] Duplicate individuals: 5
[8] Duplicate individuals: 7
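
The duplicate count used in the loop above can be checked on a small hand-made table. The sketch below assumes only pandas; the column names and values are made up for illustration and are not taken from the CF1 run.

```python
import pandas as pd

# Toy population: rows 0, 1 and 3 hold identical decision variables
pop = pd.DataFrame(
    {
        "x1": [0.1, 0.1, 0.3, 0.1],
        "x2": [0.5, 0.5, 0.7, 0.5],
    }
)

# Rows removed by drop_duplicates() are the repeats of an earlier row
n_dups = len(pop) - len(pop.drop_duplicates())
print(n_dups)  # 2

# Equivalent count: duplicated() flags every repeat after the first occurrence
print(int(pop.duplicated().sum()))  # 2
```

Note that the count is the number of redundant rows, not the number of distinct points that appear more than once.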


### Optimization With Wrapper
This cell performs the same optimization, but uses `DeduplicatedGenerator` to force unique output from the generator. The number of duplicates is printed at each generation and stays at zero, which confirms that the filtering works.

In [3]:
# Our test problem
prob = XoptProblemWrapper(Problem.from_line_fmt("CF1"))

# Set up Xopt to solve the problem with the CNSGA generator
population_size = 50
ev = Evaluator(function=prob, vectorized=True, max_workers=population_size)
X = Xopt(
    generator=CNSGAGenerator(vocs=prob.vocs, population_size=population_size),
    evaluator=ev,
    vocs=prob.vocs,
)
X.strict = False

# Inject the deduplicator
DeduplicatedGenerator.inject(X)

print("Using DeduplicatedGenerator Wrapper:")
for gen in range(8):
    X.step()

    # Calculate the number of non-unique elements
    pop = X.generator.generator.population[X.generator.vocs.variable_names]
    n_dups = len(pop) - len(pop.drop_duplicates())
    print(f"[{gen+1}] Duplicate individuals: {n_dups}")

Using DeduplicatedGenerator Wrapper:
[1] Duplicate individuals: 0
[2] Duplicate individuals: 0
[3] Duplicate individuals: 0
[4] Duplicate individuals: 0
[5] Duplicate individuals: 0
[6] Duplicate individuals: 0
[7] Duplicate individuals: 0
[8] Duplicate individuals: 0
