# Experiment Builder — Examples

This notebook shows how to build experiment *plans* (CSV grids) separately from running generation.

Each plan is the Cartesian product of the swept axes, so you can inspect every planned run before spending compute.
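As a rough illustration of what such a grid looks like, the sketch below expands a few axes with `itertools.product`. The `plan_rows` helper and its axis values are hypothetical stand-ins, not the internals of `build_and_write`; the real builder may also deduplicate or annotate rows.

```python
from itertools import product


def plan_rows(kinds, directions, weight_ranges, seeds, prompt_ids):
    """Expand the axes into one dict per planned run (hypothetical helper)."""
    rows = []
    for kind, direction, (w_min, w_max), seed, prompt_id in product(
        kinds, directions, weight_ranges, seeds, prompt_ids
    ):
        rows.append(
            {
                "kind": kind,
                "direction": direction,
                "w_min": w_min,
                "w_max": w_max,
                "seed": seed,
                "prompt_id": prompt_id,
            }
        )
    return rows


rows = plan_rows(
    kinds=["linear", "cosine"],
    directions=["increasing", "decreasing"],
    weight_ranges=[(1.0, 7.5)],
    seeds=[42, 43],
    prompt_ids=["P_DETAIL"],
)
# 2 kinds * 2 directions * 1 range * 2 seeds * 1 prompt = 8 planned runs
print(len(rows))
```

Counting rows this way before building a plan gives a quick upper bound on how many generations a sweep will request.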


In [4]:


from experiments.build.experiment_builder import ExperimentSpec, build_and_write
from guidance_registry import SCHEDULER_PROPERTIES

list(SCHEDULER_PROPERTIES.keys())


['linear',
 'cosine',
 'exponential',
 'logarithmic',
 'sigmoid',
 'triangular',
 'parabolic',
 'constant']

## Example A — Core sweep (baseline)

- Compare multiple scheduler kinds × directions × prompts with a standard guidance range.
- Intended for broad coverage.


In [8]:
EXPERIMENT = "EXP_A_CORE_SWEEP"

kinds = list(SCHEDULER_PROPERTIES.keys())
directions = ["increasing", "decreasing"]
weight_ranges = [(1.0, 7.5)]
seeds = [42, 43]
num_steps = [25]

prompts = {
    "P_DETAIL": "A close-up portrait of a cyberpunk robot with intricate clockwork gears and neon eyes, macro photography",
    "P_COMP":   "A symmetrical wide shot of a lone tree in a snowy field, Wes Anderson style, pastel colors",
    "P_TEXT":   "A wooden sign hanging on a door that says 'CLOSED' clearly carved into the wood"
}

spec = ExperimentSpec(
    experiment_group=EXPERIMENT,
    kinds=kinds,
    directions=directions,
    weight_ranges=weight_ranges,
    prompts=prompts,
    seeds=seeds,
    num_steps=num_steps,
    metadata={"notes": "baseline sweep", "model": "SD1.5"},
    num_workers=3
)

df_a = build_and_write(spec, f"{EXPERIMENT}_plan.csv")
df_a

Unnamed: 0,experiment_group,experiment_id,kind,direction,w_min,w_max,num_steps,seed,prompt_id,prompt_text,params,notes,model,worker_id
0,EXP_A_CORE_SWEEP,50831c599181,linear,increasing,1.0,7.5,25,42,P_DETAIL,A close-up portrait of a cyberpunk robot with ...,{},baseline sweep,SD1.5,0
1,EXP_A_CORE_SWEEP,03b11ac95b8b,linear,increasing,1.0,7.5,25,43,P_DETAIL,A close-up portrait of a cyberpunk robot with ...,{},baseline sweep,SD1.5,1
2,EXP_A_CORE_SWEEP,05d61aad41da,linear,increasing,1.0,7.5,25,42,P_COMP,A symmetrical wide shot of a lone tree in a sn...,{},baseline sweep,SD1.5,2
3,EXP_A_CORE_SWEEP,f7cab3859543,linear,increasing,1.0,7.5,25,43,P_COMP,A symmetrical wide shot of a lone tree in a sn...,{},baseline sweep,SD1.5,0
4,EXP_A_CORE_SWEEP,51af3ffbf734,linear,increasing,1.0,7.5,25,42,P_TEXT,A wooden sign hanging on a door that says 'CLO...,{},baseline sweep,SD1.5,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
85,EXP_A_CORE_SWEEP,86124dc3e3dd,constant,increasing,1.0,7.5,25,43,P_DETAIL,A close-up portrait of a cyberpunk robot with ...,{},baseline sweep,SD1.5,1
86,EXP_A_CORE_SWEEP,7c59c2f5856a,constant,increasing,1.0,7.5,25,42,P_COMP,A symmetrical wide shot of a lone tree in a sn...,{},baseline sweep,SD1.5,2
87,EXP_A_CORE_SWEEP,7153f3c9da54,constant,increasing,1.0,7.5,25,43,P_COMP,A symmetrical wide shot of a lone tree in a sn...,{},baseline sweep,SD1.5,0
88,EXP_A_CORE_SWEEP,c518073a58b2,constant,increasing,1.0,7.5,25,42,P_TEXT,A wooden sign hanging on a door that says 'CLO...,{},baseline sweep,SD1.5,1
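The `worker_id` column in the output above cycles 0, 1, 2, 0, 1 with `num_workers=3`, which is consistent with round-robin assignment by row index. The sketch below shows that pattern; `assign_workers` is a hypothetical name, and the builder's actual assignment logic is not shown in this notebook.

```python
def assign_workers(num_rows, num_workers):
    """Assign each row index to a worker in round-robin order (assumed scheme)."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return [i % num_workers for i in range(num_rows)]


# First five rows of a plan split across three workers.
print(assign_workers(5, 3))  # [0, 1, 2, 0, 1]
```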


## Example B — Make the alignment ↔ fidelity tradeoff obvious (extreme ranges + “checkable” prompts)

Idea:
- Use **very low** and **very high** guidance ranges.
- Use prompts with *verifiable* details (counting, text, specific attributes).
- The goal is to expose the failure modes clearly: guidance that is too low ignores the prompt, while guidance that is too high over-constrains the image and produces artifacts.
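To make the guidance ranges concrete, here is a minimal sketch of how a `(w_min, w_max)` range and a direction could turn into per-step guidance weights. The `linear_schedule` function is an illustrative assumption; the schedulers registered in `guidance_registry` may be defined differently.

```python
def linear_schedule(w_min, w_max, num_steps, direction="increasing"):
    """Linearly interpolate guidance weights across steps (illustrative only)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if direction not in ("increasing", "decreasing"):
        raise ValueError("direction must be 'increasing' or 'decreasing'")
    weights = [
        w_min + (w_max - w_min) * i / (num_steps - 1) for i in range(num_steps)
    ]
    # A decreasing schedule is the same ramp traversed from w_max down to w_min.
    return weights if direction == "increasing" else weights[::-1]


weak = linear_schedule(1.0, 2.0, 5)
strong = linear_schedule(1.0, 12.0, 5, direction="decreasing")
print(weak)    # [1.0, 1.25, 1.5, 1.75, 2.0]
print(strong)  # [12.0, 9.25, 6.5, 3.75, 1.0]
```

The weak band never leaves the low-guidance regime, while the strong range spends part of the run at weights far above the baseline ceiling of 7.5 used in Example A.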


In [10]:
EXPERIMENT = "EXP_B_EXTREME_TRADEOFF"

kinds = ["constant", "linear", "cosine", "sigmoid", "exponential"]
directions = ["increasing", "decreasing"]
weight_ranges = [
    (1.0, 2.0),   # very weak guidance band
    (1.0, 12.0),  # very strong ceiling (often causes collapse/artifacts)
]
seeds = [0, 1, 2, 3]
num_steps = [25]

prompts = {
    "P_COUNT": "A photo of exactly THREE red apples on a white plate, shot from above, sharp focus",
    "P_TEXT":  "A storefront window sign that clearly reads 'OPEN 24 HOURS' in bold black letters",
    "P_ATTR":  "A blue sports car with a yellow racing stripe, parked on wet asphalt at night, reflections visible",
}

spec = ExperimentSpec(
    experiment_group=EXPERIMENT,
    kinds=kinds,
    directions=directions,
    weight_ranges=weight_ranges,
    prompts=prompts,
    seeds=seeds,
    num_steps=num_steps,
    metadata={"notes": "tradeoff demo", "bucket": "checkable-details"},
    num_workers=4
)

df_b = build_and_write(spec, f"{EXPERIMENT}_plan.csv")
df_b


Unnamed: 0,experiment_group,experiment_id,kind,direction,w_min,w_max,num_steps,seed,prompt_id
151,EXP_B_EXTREME_TRADEOFF,0142f920bbbe,sigmoid,decreasing,1.0,2.0,25,3,P_TEXT
56,EXP_B_EXTREME_TRADEOFF,36bbbc1c4a2a,linear,decreasing,1.0,2.0,25,0,P_ATTR
123,EXP_B_EXTREME_TRADEOFF,3a7cd8406c58,sigmoid,increasing,1.0,2.0,25,3,P_COUNT
60,EXP_B_EXTREME_TRADEOFF,f7dd9d2e8c0c,linear,decreasing,1.0,12.0,25,0,P_COUNT
100,EXP_B_EXTREME_TRADEOFF,9ede8a02bd6a,cosine,decreasing,1.0,2.0,25,0,P_TEXT
51,EXP_B_EXTREME_TRADEOFF,9df665471bb7,linear,decreasing,1.0,2.0,25,3,P_COUNT
7,EXP_B_EXTREME_TRADEOFF,267286caca94,constant,increasing,1.0,2.0,25,...
5,EXP_B_EXTREME_TRADEOFF,ce0e6b9fc523,constant,increasing,1.0,2.0,...

## Example C — “Out-of-distribution” steering hypothesis (increasing guidance)

Hypothesis:
- For prompts that are *hard / out-of-distribution*, starting with low guidance may keep samples on-manifold early,
  and raising guidance later may steer fine details toward the prompt.

We encode that by focusing on **increasing** schedules, optionally with stronger ceilings.
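One way to picture "low early, high late" is an increasing schedule whose curvature is controlled by a single parameter. The sketch below uses an exponential ramp and names the parameter `alpha` to match the `params_list` sweep later in this notebook. Both the formula and the `exp_increasing` name are assumptions for illustration, not the registry's definition.

```python
import math


def exp_increasing(w_min, w_max, num_steps, alpha=2.0):
    """Increasing guidance ramp that stays low early and rises late (assumed form)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    denom = math.expm1(alpha)  # normalizes the ramp so it ends exactly at w_max
    return [
        w_min + (w_max - w_min) * math.expm1(alpha * i / (num_steps - 1)) / denom
        for i in range(num_steps)
    ]


gentle = exp_increasing(1.0, 7.5, 25, alpha=2.0)
sharp = exp_increasing(1.0, 7.5, 25, alpha=4.0)
# A larger alpha keeps the midpoint weight lower, deferring strong guidance.
print(round(gentle[12], 3), round(sharp[12], 3))
```

Under this form, a larger `alpha` spends more of the run near `w_min` and concentrates the strong guidance in the final steps.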


In [11]:
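# NOTE: this cell reuses the Example B group name, so build_and_write receives the
# same EXP_B_EXTREME_TRADEOFF_plan.csv path as in Example B and df_b is rebound below.
EXPERIMENT = "EXP_B_EXTREME_TRADEOFF"

kinds = ["constant", "linear", "exponential", "triangular"]
directions = ["increasing", "decreasing"]
weight_ranges = [
    (1.0, 2.0),    # very weak guidance band
    (1.0, 12.0),   # very strong ceiling (often causes collapse/artifacts)
    (7.0, 50.0),   # extreme ceilings
    (7.0, 100.0),
]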
seeds = [42, 1337]
num_steps = [40]

prompts = {
    "P_APPLE_COUNT_DETAILED": "masterpiece, best quality, 3 red apples on a wooden table, studio lighting, sharp focus",
    "P_VASE_COUNT_DETAILED":  "masterpiece, best quality, 5 ceramic vases on a minimalist shelf, distinct shapes, clean shadows"
}

spec = ExperimentSpec(
    experiment_group=EXPERIMENT,
    kinds=kinds,
    directions=directions,
    weight_ranges=weight_ranges,
    prompts=prompts,
    seeds=seeds,
    num_steps=num_steps,
    metadata={"notes": "tradeoff demo", "bucket": "checkable-details"},
    num_workers=4
)

df_b = build_and_write(spec, f"{EXPERIMENT}_plan.csv")
df_b


Unnamed: 0,experiment_group,experiment_id,kind,direction,w_min,w_max,num_steps,seed,prompt_id,prompt_text,params,notes,bucket,worker_id
0,EXP_B_EXTREME_TRADEOFF,162bc693207e,constant,increasing,1.0,2.0,40,42,P_APPLE_COUNT_DETAILED,"masterpiece, best quality, 3 red apples on a w...",{},tradeoff demo,checkable-details,0
1,EXP_B_EXTREME_TRADEOFF,d3378258bcd1,constant,increasing,1.0,2.0,40,1337,P_APPLE_COUNT_DETAILED,"masterpiece, best quality, 3 red apples on a w...",{},tradeoff demo,checkable-details,1
2,EXP_B_EXTREME_TRADEOFF,4b9789b4dda0,constant,increasing,1.0,2.0,40,42,P_VASE_COUNT_DETAILED,"masterpiece, best quality, 5 ceramic vases on ...",{},tradeoff demo,checkable-details,2
3,EXP_B_EXTREME_TRADEOFF,3fba8cdefd65,constant,increasing,1.0,2.0,40,1337,P_VASE_COUNT_DETAILED,"masterpiece, best quality, 5 ceramic vases on ...",{},tradeoff demo,checkable-details,3
4,EXP_B_EXTREME_TRADEOFF,64eb4c98180a,constant,increasing,1.0,12.0,40,42,P_APPLE_COUNT_DETAILED,"masterpiece, best quality, 3 red apples on a w...",{},tradeoff demo,checkable-details,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
139,EXP_B_EXTREME_TRADEOFF,853ad4a67b17,exponential,decreasing,7.0,50.0,40,1337,P_VASE_COUNT_DETAILED,"masterpiece, best quality, 5 ceramic vases on ...",{},tradeoff demo,checkable-details,3
140,EXP_B_EXTREME_TRADEOFF,9b659c65502e,exponential,decreasing,7.0,100.0,40,42,P_APPLE_COUNT_DETAILED,"masterpiece, best quality, 3 red apples on a w...",{},tradeoff demo,checkable-details,0
141,EXP_B_EXTREME_TRADEOFF,7073f65d93c8,exponential,decreasing,7.0,100.0,40,1337,P_APPLE_COUNT_DETAILED,"masterpiece, best quality, 3 red apples on a w...",{},tradeoff demo,checkable-details,1
142,EXP_B_EXTREME_TRADEOFF,c4827042ad0f,exponential,decreasing,7.0,100.0,40,42,P_VASE_COUNT_DETAILED,"masterpiece, best quality, 5 ceramic vases on ...",{},tradeoff demo,checkable-details,2


In [None]:
EXPERIMENT = "EXP_C_OOD_STEERING"

kinds = ["linear", "cosine", "exponential"]
directions = ["increasing"]
weight_ranges = [(1.0, 7.5), (1.0, 10.0)]
seeds = [10, 11, 12]
num_steps = [25]

prompts = {
    "P_OOD1": "A medieval oil painting of a modern smartphone made of stained glass, highly detailed",
    "P_OOD2": "A realistic photo of a transparent wooden chair, studio lighting, clean background",
}

# Example: sweep exponential alpha to change curvature (optional)
params_list = [
    {},                 # default params from registry functions
    {"alpha": 2.0},
    {"alpha": 4.0},
]

spec = ExperimentSpec(
    experiment_group=EXPERIMENT,
    kinds=kinds,
    directions=directions,
    weight_ranges=weight_ranges,
    prompts=prompts,
    seeds=seeds,
    num_steps=num_steps,
    params_list=params_list,
    metadata={"notes": "ood steering", "bucket": "ood"},
    num_workers=3
)

df_c = build_and_write(spec, f"{EXPERIMENT}_plan.csv")
df_c.head(), len(df_c)


## Quick sanity checks
- Any duplicate experiment_ids?
- How many runs per prompt?


In [None]:
def sanity(df):
    """Summarize a plan DataFrame: row count, ID uniqueness, and axis coverage."""
    n_unique = df["experiment_id"].nunique()
    return {
        "rows": len(df),
        "unique_ids": n_unique,
        "dupes": int(len(df) - n_unique),  # 0 means every experiment_id is unique
        "runs_per_prompt": df.groupby("prompt_id").size().to_dict(),
        "kinds": sorted(df["kind"].unique().tolist()),
        "directions": sorted(df["direction"].unique().tolist()),
        "ranges": sorted({(float(a), float(b)) for a, b in zip(df["w_min"], df["w_max"])}),
    }

# df_b refers to whichever cell assigned it most recently.
sanity(df_a), sanity(df_b), sanity(df_c)
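The same two questions (duplicate IDs, runs per prompt) can also be answered directly from a plan CSV using only the standard library, which is handy on a worker without pandas. The sketch below uses a small in-memory CSV with made-up rows; `check_plan` and the sample IDs are hypothetical, and only the column names are taken from the plans above.

```python
import csv
import io
from collections import Counter

# Hypothetical plan contents; the last row deliberately repeats an experiment_id.
PLAN_CSV = """experiment_id,prompt_id,seed
aaa111,P_DETAIL,42
bbb222,P_DETAIL,43
ccc333,P_TEXT,42
aaa111,P_TEXT,43
"""


def check_plan(csv_text):
    """Report row count, repeated experiment_ids, and runs per prompt."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    id_counts = Counter(row["experiment_id"] for row in rows)
    return {
        "rows": len(rows),
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "runs_per_prompt": dict(Counter(row["prompt_id"] for row in rows)),
    }


report = check_plan(PLAN_CSV)
print(report)
```

For a plan on disk, the same function works if the file contents are read into a string first.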
