## Evolutionary search for creative artifacts: concepts-first walkthrough

This notebook explains the evolutionary ideas driving Evo-Luminate, using short code snippets only to illustrate the concepts. We’ll focus on:

- Representation: what is an “artifact,” genotype vs. phenotype
- Initialization: creating diverse seeds
- Variation: generating children (mutation/crossover) as idea proposals
- Evaluation: embedding artifacts and measuring diversity
- Selection: k-NN novelty as the fitness signal
- Iteration: logging, reproducibility, and parallelism

We’ll use a tiny dummy artifact so the conceptual steps run fast and are easy to interpret.


In [1]:
import os, json, random
import torch, numpy as np

from src import run_evolution_experiment as ree
from src.population import Population
from src.utils import get_device

print("Device:", get_device())



Loaded 7 creativity strategies
Device: mps


### The evolutionary loop, conceptually
- We maintain a population of artifacts (genotypes as text/code, phenotypes as images or renderings).
- Each generation:
  1) Select parents and propose children (variation via mutation/crossover prompts or operators)
  2) Evaluate artifacts by embedding them in a vector space (text/image encoders)
  3) Select the next generation by novelty: keep the artifacts with the largest average distance to their k nearest neighbors in embedding space
- Repeat for a fixed number of generations, logging intermediate metrics.

In code, all of this is orchestrated by `run_evolution_experiment`, which this notebook calls as `ree.run_evolution_experiment(out_dir, cfg)`. The key idea is that we reward diversity (novelty) rather than progress toward a fixed objective.
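The loop above can be sketched in a few lines of plain Python. This is a toy illustration, not the Evo-Luminate implementation: artifacts are bare vectors that serve as their own embeddings, variation is Gaussian noise instead of prompt-based proposals, and the names `cosine_distance`, `novelty`, `mutate`, and `evolve` are hypothetical helpers introduced only for this sketch.

```python
import math
import random

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def novelty(i, vectors, k):
    """Mean distance from vectors[i] to its k nearest neighbors."""
    dists = sorted(
        cosine_distance(vectors[i], v) for j, v in enumerate(vectors) if j != i
    )
    return sum(dists[:k]) / k

def mutate(parent, rng, sigma=0.3):
    """Toy variation operator (assumption): add Gaussian noise to a parent."""
    return [x + rng.gauss(0.0, sigma) for x in parent]

def evolve(population, generations, children, pop_size, k, seed=0):
    rng = random.Random(seed)
    for _ in range(generations):
        # 1) Variation: pick parents at random and propose children.
        parents = [rng.choice(population) for _ in range(children)]
        candidates = population + [mutate(p, rng) for p in parents]
        # 2) Evaluation: score every candidate by k-NN novelty.
        scores = [novelty(i, candidates, k) for i in range(len(candidates))]
        # 3) Selection: keep the pop_size most novel candidates.
        ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        population = [candidates[i] for i in ranked[:pop_size]]
    return population

seeds = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
final = evolve(seeds, generations=3, children=2, pop_size=4, k=2, seed=7)
print(len(final))
```

Note that no candidate has a fitness of its own here: its score depends entirely on where the rest of the population sits, which is what distinguishes novelty search from objective-driven selection.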


In [2]:
class DummyArtifact:
    name = "dummy"

    def __init__(self, vec):
        self.id = f"d{hash(tuple(vec)) % 100000}"
        self.genome = "GENOME"
        self.phenome = None
        self.prompt = None
        self.embedding = torch.tensor(vec, dtype=torch.float32)
        self.metadata = {}

    @classmethod
    def create_from_prompt(cls, prompt: str, output_dir: str, **kwargs):
        # Pick one of three fixed orthogonal basis vectors at random.
        # The prompt and output_dir arguments are ignored by this dummy.
        bases = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
        return cls(random.choice(bases))

    def compute_embedding(self) -> torch.Tensor:
        return self.embedding

# Monkeypatch the artifact class factory used by the algorithm.
# The original is imported under another name so it can be restored later.
from src.artifacts.load_artifacts import get_artifact_class as real_get_artifact_class

def fake_get_artifact_class(config):
    return DummyArtifact

ree.get_artifact_class = fake_get_artifact_class
print("Patched get_artifact_class -> DummyArtifact")


Patched get_artifact_class -> DummyArtifact


### Minimal experiment for illustration
We’ll run a minimal configuration to illustrate each concept with intermediate outputs. Small sizes keep the focus on ideas, not performance.


In [3]:
cfg = {
    "random_seed": 7,
    "prompt": "",
    "initial_population_size": 6,
    "population_size": 6,
    "children_per_generation": 3,
    "num_generations": 1,
    "k_neighbors": 2,
    "max_workers": 2,
    "artifact_class": "DummyArtifact",
    "evolution_mode": "variation",
    "reasoning_effort": "low",
    "use_creative_strategies": False,
    "use_summary": False,
    "crossover_rate": 0.0,
}
out_dir = os.path.join("results", "walkthrough_demo")
if os.path.exists(out_dir):
    import shutil
    shutil.rmtree(out_dir)
os.makedirs(out_dir, exist_ok=True)
print(json.dumps(cfg, indent=2))


{
  "random_seed": 7,
  "prompt": "",
  "initial_population_size": 6,
  "population_size": 6,
  "children_per_generation": 3,
  "num_generations": 1,
  "k_neighbors": 2,
  "max_workers": 2,
  "artifact_class": "DummyArtifact",
  "evolution_mode": "variation",
  "reasoning_effort": "low",
  "use_creative_strategies": false,
  "use_summary": false,
  "crossover_rate": 0.0
}


### Initialization: diverse seeds
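We start by generating a diverse initial set of artifacts. Conceptually, these seeds determine where in the search space exploration begins. In practice, artifacts are created in parallel to populate the initial generation quickly. Cell `In [4]` runs the whole experiment (initialization followed by one generation), so the metrics log it inspects ends with the entry for generation 1.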
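As a rough sketch of parallel seeding, the snippet below builds an initial population with a thread pool from the standard library. It is an illustration under assumptions, not the project's code: `make_seed` and `init_population` are hypothetical names, and giving each task its own seeded `random.Random` is just one way to keep parallel creation reproducible.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Same three basis vectors the dummy artifact draws from.
BASES = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def make_seed(seed):
    """Create one seed vector using a private, seeded RNG."""
    rng = random.Random(seed)
    return rng.choice(BASES)

def init_population(n, base_seed, max_workers=2):
    """Create n seeds in parallel; pool.map preserves submission order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(make_seed, range(base_seed, base_seed + n)))

seeds = init_population(6, base_seed=7)
print(len(seeds))
```

Because each task derives its randomness from its own seed rather than from the shared global generator, the result does not depend on how the threads happen to be scheduled.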


In [4]:
pop = ree.run_evolution_experiment(out_dir, cfg)
print("Population size (post-gen 1):", len(pop.get_all()))

# Inspect saved metrics
with open(os.path.join(out_dir, "novelty_metrics.jsonl")) as f:
    lines = f.readlines()
print("Novelty metrics entries:", len(lines))
print("Last metrics:", lines[-1].strip())


Population size (post-gen 1): 6
Novelty metrics entries: 2
Last metrics: {"generation": 1, "timestamp": "2025-09-30T23:01:08.727566", "avg_distance_to_neighbors": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5], "mean_novelty": 0.5, "mean_genome_length": 6.0, "strategy_metrics": {"None": {"count": 5, "avg_novelty": 0.5, "std_novelty": 0.0}, "null": {"count": 1, "avg_novelty": 0.5, "std_novelty": 0.0}}}
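The metrics log is in JSON Lines form: one JSON object per line, with keys such as `generation` and `mean_novelty` visible in the output above. A minimal sketch for pulling the novelty trajectory out of such a file follows; `load_mean_novelty` is a hypothetical helper, not part of the project.

```python
import json

def load_mean_novelty(path):
    """Return (generation, mean_novelty) pairs from a JSON Lines metrics log."""
    history = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            history.append((record["generation"], record["mean_novelty"]))
    return history
```

In this notebook it could be called as `load_mean_novelty(os.path.join(out_dir, "novelty_metrics.jsonl"))` to see how mean novelty changes from one logged generation to the next.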


### Variation and selection
- Variation proposes new artifacts by recombining or transforming parents. Here, proposals are generated as new “ideas” rather than by strict gene-level operators.
- Evaluation uses embeddings (text/image encoders) to place artifacts into a shared vector space.
- Selection applies novelty search: rank artifacts by their average distance to their k nearest neighbors and keep the most distant ones, which encourages exploration and diversity.


In [5]:
from src.run_evolution_experiment import get_embeddings
from src.population import Population

# Create a small hand-crafted population
A = DummyArtifact([1,0,0])
B = DummyArtifact([0,1,0])
C = DummyArtifact([0,0,1])
D = DummyArtifact([1,1,0])
pop2 = Population()
pop2.add_all([A,B,C,D])

emb = get_embeddings(pop2.get_all())
print("Embeddings shape:", tuple(emb.shape), ", device:", emb.device, ", dtype:", emb.dtype)

# Compute novelty ordering and inspect distances
order, knn_mean = pop2.select_by_novelty(emb, k_neighbors=2, return_distances=True)
print("Novelty order (indices):", order)
print("Avg distance to k-nearest neighbors:", knn_mean.tolist())

Embeddings shape: (4, 3) , device: mps:0 , dtype: torch.float32
Novelty order (indices): [2, 0, 1, 3]
Avg distance to k-nearest neighbors: [0.6464465856552124, 0.6464465856552124, 1.0, 0.2928932309150696]
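The printed values can be reproduced by hand. The sketch below assumes novelty is the mean cosine distance to the k nearest neighbors; that is an inference from the numbers above, since the body of `select_by_novelty` is not shown here. `cosine_distance` and `knn_novelty` are hypothetical helpers written in plain Python.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def knn_novelty(vectors, k):
    """Mean distance from each vector to its k nearest neighbors."""
    scores = []
    for i, u in enumerate(vectors):
        dists = sorted(cosine_distance(u, v) for j, v in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

vectors = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]  # A, B, C, D
scores = knn_novelty(vectors, k=2)
# Stable sort, so ties (A and B) keep their original order.
order = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)

print([round(s, 4) for s in scores])  # [0.6464, 0.6464, 1.0, 0.2929]
print(order)                          # [2, 0, 1, 3]
```

C is orthogonal to everything else, so both of its nearest neighbors sit at distance 1 and it ranks first. D lies between A and B (distance 1 - 1/√2 ≈ 0.2929 to each), so it is the least novel. A and B each have D as one near neighbor and an orthogonal vector as the other, giving (0.2929 + 1) / 2 ≈ 0.6464.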


### Iterate and observe
After initializing once, we repeat vary → evaluate → select for a few generations (two in the cell below). Along the way we log population snapshots and novelty metrics to understand how diversity evolves.


In [6]:
cfg2 = dict(cfg)
cfg2["num_generations"] = 2
out_dir2 = os.path.join("results", "walkthrough_demo2")
if os.path.exists(out_dir2):
    import shutil
    shutil.rmtree(out_dir2)
os.makedirs(out_dir2, exist_ok=True)

# Use a fresh name so the hand-crafted pop2 from In [5] is not overwritten
final_pop = ree.run_evolution_experiment(out_dir2, cfg2)
print("Final population:", len(final_pop.get_all()))

print("Files in output dir:")
print(sorted(os.listdir(out_dir2))[:10])

print("Artifacts dir exists:", os.path.exists(os.path.join(out_dir2, "artifacts")))
print("Population log:", os.path.exists(os.path.join(out_dir2, "population_data.jsonl")))
print("Novelty log:", os.path.exists(os.path.join(out_dir2, "novelty_metrics.jsonl")))


Final population: 6
Files in output dir:
['artifacts', 'config.json', 'novelty_metrics.jsonl', 'population_data.jsonl']
Artifacts dir exists: True
Population log: True
Novelty log: True
