GuardianAI1/closure-repro

Closure Repro

Minimal reproduction artifact for the trajectory-state phenomena described in:

This package is intentionally small. It is not a framework, SDK, or product surface. It exists so researchers can reproduce the core open-vs-closed trajectory signatures without needing GuardianAI access.

Public links:

What this package shows

Two profiles are included:

  • lab2_recursive_split: Raw reinjection preserves a strict downstream contract violation, while sanitized reinjection reopens the path and stays contract-valid.
  • lab5_closure_boundary: The same perturbation arrives after closure has already formed in the RAW branch, but while the SANITIZED branch is still open. Only the closed branch absorbs the perturbation.

These profiles map to the paper's core distinction:

  • localized error: still revisable
  • distributed error: absorbed into a non-reopenable trajectory

Quick start

Reference mode reproduces the packaged signatures immediately and does not require an API key:

cd closure-repro
python3 run.py --profile lab5_closure_boundary
python3 analysis.py --input outputs/reference_lab5_closure_boundary.json

You can also reproduce the recursive split:

python3 run.py --profile lab2_recursive_split
python3 analysis.py --input outputs/reference_lab2_recursive_split.json

By default, reference runs write profile-specific output files under outputs/ using the pattern reference_<profile_id>.json.
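
As a minimal sketch of that naming pattern, the helper below builds the default path for a profile and loads the file as generic JSON. The function names are hypothetical and are not part of the package, and nothing is assumed about the schema of the output file beyond it being valid JSON.

```python
import json
from pathlib import Path


def reference_output_path(profile_id: str, outputs_dir: str = "outputs") -> Path:
    """Build the default reference output path: outputs/reference_<profile_id>.json."""
    return Path(outputs_dir) / f"reference_{profile_id}.json"


def load_reference_output(profile_id: str, outputs_dir: str = "outputs"):
    """Load a reference output file as plain JSON (no schema assumed)."""
    path = reference_output_path(profile_id, outputs_dir)
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # Print the expected location for both packaged profiles.
    for profile_id in ("lab2_recursive_split", "lab5_closure_boundary"):
        print(reference_output_path(profile_id))
```

After a reference run, `load_reference_output("lab5_closure_boundary")` should return the parsed contents of the file that `analysis.py` takes via `--input`.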

Live model mode

An optional OpenAI-compatible backend is included for researchers who want to probe the same protocols on live models:

pip install -r requirements.txt
export OPENAI_API_KEY=your_key_here
python3 run.py --profile lab5_closure_boundary --backend openai-compatible --model gpt-4o-mini

Optional environment variables:

  • OPENAI_API_KEY
  • CLOSURE_REPRO_API_KEY
  • CLOSURE_REPRO_BASE_URL
  • CLOSURE_REPRO_MODEL
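
The sketch below shows one way these variables could be resolved into live-backend settings. The precedence (package-specific key first, then OPENAI_API_KEY) and the function name are assumptions made for illustration; the package's own resolution logic in run.py or experiment.py may differ.

```python
import os
from typing import Mapping, Optional


def resolve_live_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect live-backend settings from environment variables.

    Assumption: CLOSURE_REPRO_API_KEY takes precedence over OPENAI_API_KEY.
    Unset values come back as None so the caller can apply its own defaults.
    """
    if env is None:
        env = os.environ
    api_key = env.get("CLOSURE_REPRO_API_KEY") or env.get("OPENAI_API_KEY")
    return {
        "api_key": api_key,
        "base_url": env.get("CLOSURE_REPRO_BASE_URL"),
        "model": env.get("CLOSURE_REPRO_MODEL"),
    }


if __name__ == "__main__":
    settings = resolve_live_settings()
    # Avoid printing the key itself; report only whether one was found.
    print("api key set:", settings["api_key"] is not None)
    print("base url:", settings["base_url"])
    print("model:", settings["model"])
```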

Important note on reference vs live

The packaged reference backend is the stable reproduction artifact. It encodes the canonical split and boundary signatures directly so that anyone can inspect the phenomenon quickly.

The openai-compatible backend is exploratory. It uses the same profile definitions and analysis logic, but live results will vary by model family and endpoint.

Package layout

  • profiles/: Experiment definitions.
  • run.py: Entry point for running a profile.
  • experiment.py: Reference and live backends.
  • analysis.py: Minimal summary / interpretation pass.
  • outputs/: Included sample outputs.
  • docs/paper_link.md: Paper and evidence links.

What to look for

For lab2_recursive_split:

  • split observed: YES
  • raw first failure turn: 1
  • sanitized first failure turn: N/A
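
One plausible reading of these three fields, sketched under stated assumptions: turns are numbered from 1, each branch is reduced to a per-turn list of contract-valid flags, and a split means the raw branch fails at some turn while the sanitized branch never does. The function names and the sample data are hypothetical; analysis.py may define the split differently.

```python
from typing import Optional, Sequence


def first_failure_turn(contract_valid: Sequence[bool], first_turn: int = 1) -> Optional[int]:
    """Return the first turn whose output violates the contract, or None."""
    for offset, ok in enumerate(contract_valid):
        if not ok:
            return first_turn + offset
    return None


def split_observed(raw_valid: Sequence[bool], sanitized_valid: Sequence[bool]) -> bool:
    """Assumed definition: raw fails somewhere, sanitized never fails."""
    return (
        first_failure_turn(raw_valid) is not None
        and first_failure_turn(sanitized_valid) is None
    )


if __name__ == "__main__":
    # Made-up flags for illustration only, not packaged results.
    raw = [False, False, False]
    sanitized = [True, True, True]
    print("split observed:", "YES" if split_observed(raw, sanitized) else "NO")
    print("raw first failure turn:", first_failure_turn(raw))
    print("sanitized first failure turn:", first_failure_turn(sanitized) or "N/A")
```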

For lab5_closure_boundary:

  • raw closure proxy turn: 5
  • perturbation turn: 18
  • raw amplification persistence turn: 21
  • sanitized closure proxy turn: N/A
  • raw basin state: stabilized
  • sanitized basin state: open

Live model mode — closure detection note

In the reference backend, closure is detected directly: the closure_proxy flag is set by a hardcoded turn threshold (turn ≥ 5 in the LAB5 profile), so the signal is always present and unambiguous.

In the openai-compatible live backend, closure_proxy is set to False on every trace. Closure is instead inferred post-hoc by compute_lab5_metrics using a streak heuristic: if the model holds the wrong claim with confidence ≥ 0.80 for closure_window consecutive turns (default: 3), a closure_proxy_turn is recorded.

This means live runs can miss or delay closure detection in two situations:

  • The model drifts to the wrong claim but reports low confidence. The streak counter resets, and closure_proxy_turn comes back None even though the trajectory has effectively closed.
  • The model self-corrects mid-streak. The window resets and closure is not flagged, even if the model re-closes later.
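
The streak heuristic and both miss cases can be sketched as follows. This is an illustration and not the package's compute_lab5_metrics: the per-turn layout (wrong-claim flag plus confidence), the 1-based turn numbering, and the choice to report the turn that completes the streak are all assumptions. The 0.80 threshold and the default closure_window of 3 come from the description above.

```python
from typing import Optional, Sequence, Tuple

# One entry per turn: (holds_wrong_claim, confidence). Hypothetical layout.
Turn = Tuple[bool, float]


def closure_proxy_turn(
    turns: Sequence[Turn],
    min_confidence: float = 0.80,
    closure_window: int = 3,
    first_turn: int = 1,
) -> Optional[int]:
    """Return the turn at which a confident wrong-claim streak reaches
    closure_window consecutive turns, or None if it never does."""
    streak = 0
    for offset, (wrong, confidence) in enumerate(turns):
        if wrong and confidence >= min_confidence:
            streak += 1
            if streak >= closure_window:
                # Assumption: report the turn that completes the streak.
                return first_turn + offset
        else:
            # A correct claim or a low-confidence turn resets the streak.
            streak = 0
    return None


if __name__ == "__main__":
    confident = [(True, 0.9)] * 4
    low_confidence = [(True, 0.6)] * 6
    interrupted = [(True, 0.9), (True, 0.9), (False, 0.9), (True, 0.9), (True, 0.9)]

    print(closure_proxy_turn(confident))       # 3
    print(closure_proxy_turn(low_confidence))  # None
    print(closure_proxy_turn(interrupted))     # None
```

In the interrupted case the wrong claim is held on four of five turns, yet no streak of three completes before the run ends. In this sketch a longer run would flag closure at a later turn, which corresponds to the delayed-detection case.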

What to look for with live models: focus on whether the raw branch holds the wrong claim persistently after the perturbation turn, while the sanitized branch does not. The drift_flag column in the traces is the per-turn indicator; basin_state and belief_basin_strength in the summary reflect the tail of the run. Closure proxy turn is a convenience signal, not the ground truth.
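
A minimal sketch of that comparison: compute the share of post-perturbation turns on which drift_flag is set, separately for each branch. The trace layout (a list of dicts with a "turn" key) and the function name are assumptions; only drift_flag and the perturbation turn of 18 come from this document. The sample traces are synthetic.

```python
from typing import Mapping, Sequence


def post_perturbation_drift_rate(
    traces: Sequence[Mapping[str, object]],
    perturbation_turn: int,
) -> float:
    """Fraction of turns strictly after perturbation_turn with drift_flag set."""
    tail = [bool(t["drift_flag"]) for t in traces if t["turn"] > perturbation_turn]
    return sum(tail) / len(tail) if tail else 0.0


if __name__ == "__main__":
    # Synthetic traces for illustration only, not packaged results.
    raw = [{"turn": t, "drift_flag": t >= 19} for t in range(1, 25)]
    sanitized = [{"turn": t, "drift_flag": False} for t in range(1, 25)]

    print("raw:", post_perturbation_drift_rate(raw, perturbation_turn=18))
    print("sanitized:", post_perturbation_drift_rate(sanitized, perturbation_turn=18))
```

A raw rate near 1.0 next to a sanitized rate near 0.0 matches the pattern described above, whether or not a closure proxy turn was recorded.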

Relation to GuardianAI

This package demonstrates the phenomenon without GuardianAI.

GuardianAI adds a structured observability and control layer on top of related dynamics in live systems:

Citation

If you use this artifact, cite:
