Measure how meaning decays across a chain of LLM agents and evaluate error-correction strategies that preserve fidelity.
```
FinalProject/
├── main.py                 # CLI entry point
├── config.py               # Configuration dataclasses
├── requirements.txt        # Python dependencies
├── Knowledge/
│   └── instructions.md     # Experiment design document
├── signal_relay/
│   ├── __init__.py
│   ├── schema.py           # Message, HopRecord data models
│   ├── prompts.py          # Relay prompt templates
│   ├── relay.py            # RelayChain engine (LLM calls)
│   ├── metrics.py          # Scoring & fidelity metrics
│   ├── experiment.py       # Experiment runner & batch matrix
│   ├── tasks.py            # Pre-built baseline task messages
│   └── visualize.py        # Decay curves & comparison plots
└── experiments/            # Auto-created output directory
    └── <run_id>/
        ├── original.yaml
        ├── hop_01.yaml … hop_NN.yaml
        ├── metrics.csv
        ├── decay_curve.png
        └── run_meta.json
```
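The per-hop YAML files record the message state after each relay step. A minimal sketch of what the data models in `signal_relay/schema.py` might look like (field names are assumptions inferred from the outputs listed above, not the actual API):

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the schema.py data models; field names are
# assumptions, not the project's actual definitions.
@dataclass
class Message:
    text: str
    constraints: list[str] = field(default_factory=list)  # hard requirements to preserve
    keywords: list[str] = field(default_factory=list)     # checksum keywords

@dataclass
class HopRecord:
    hop: int          # 1-based hop index in the relay chain
    message: Message  # message as rewritten at this hop
    fidelity: float   # overall fidelity vs. the original message
```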
```
pip install -r requirements.txt
```

```
export OPENAI_API_KEY="sk-..."
# or
export ANTHROPIC_API_KEY="sk-ant-..."
```

List the pre-built tasks:

```
python main.py tasks
```

```
# Baseline relay, 5 hops, solar flare task
python main.py run --task solar_flare --mode baseline --hops 5

# Error-corrected relay, 7 hops
python main.py run --task solar_flare --mode error_corrected --hops 7

# With periodic repair prompts every 3 hops
python main.py run --task recipe --mode error_corrected --hops 10 --repair
```

```
# All tasks × both modes × depths 3,5,7,10
python main.py matrix

# Custom subset
python main.py matrix --tasks solar_flare,recipe --modes baseline,error_corrected --depths 3,5,7
```

```
# Plot a single run
python main.py plot --run-dir experiments/<run_id>

# Compare multiple runs
python main.py compare --run-dirs experiments/run1,experiments/run2 --labels "Baseline,Error-Corrected"
```

| Flag | Default | Description |
|---|---|---|
| `--provider` | `openai` | LLM provider (`openai`, `anthropic`) |
| `--model` | `gpt-4o-mini` | Model name |
| `--temperature` | `0.0` | Sampling temperature (0 = deterministic) |
| `--seed` | `42` | Random seed for reproducibility |
| `--output-dir` | `experiments` | Base output directory |
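These flags likely map onto the dataclasses in `config.py`. A minimal sketch under that assumption (field names mirror the flags above; the actual class name and layout may differ):

```python
from dataclasses import dataclass

# Hypothetical configuration dataclass mirroring the CLI flags above;
# the real config.py may be structured differently.
@dataclass
class RelayConfig:
    provider: str = "openai"        # "openai" or "anthropic"
    model: str = "gpt-4o-mini"
    temperature: float = 0.0        # 0 = deterministic sampling
    seed: int = 42                  # random seed for reproducibility
    output_dir: str = "experiments" # base output directory
```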
Each hop is scored against the original message:
| Metric | Weight | Description |
|---|---|---|
| Constraint Fidelity | 0.4 | % of constraints preserved exactly |
| Keyword Retention | 0.3 | % of checksum keywords still present |
| Item Retention | 0.2 | % of content items retained (fuzzy match ≥ 0.7) |
| Order Preservation | 0.1 | Longest common subsequence of item IDs |
| Overall Fidelity | — | Weighted aggregate of the above |
Additional tracked metrics: edit distance ratio, hallucination count, number retention.
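A minimal sketch of how the weighted aggregate and the LCS-based order score could be computed, using the weights from the table above (function names are illustrative, not the actual `metrics.py` API):

```python
# Component weights from the metrics table above.
WEIGHTS = {
    "constraint_fidelity": 0.4,
    "keyword_retention": 0.3,
    "item_retention": 0.2,
    "order_preservation": 0.1,
}

def overall_fidelity(scores: dict) -> float:
    """Weighted aggregate of the four component scores (each in [0, 1])."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

def order_preservation(original_ids: list, relayed_ids: list) -> float:
    """Longest common subsequence of item IDs, normalized by original length."""
    m, n = len(original_ids), len(relayed_ids)
    # Classic O(m*n) LCS dynamic program.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            if original_ids[i] == relayed_ids[j]:
                dp[i + 1][j + 1] = dp[i][j] + 1
            else:
                dp[i + 1][j + 1] = max(dp[i][j + 1], dp[i + 1][j])
    return dp[m][n] / max(m, 1)
```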
- Baseline — Simple "rewrite for the next agent" instruction
- Error-Corrected — Strict fidelity rules + self-check verification
- Repair (optional) — Periodic drift-correction prompt every N hops
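To make the three modes concrete, here is an illustrative sketch of what the templates in `signal_relay/prompts.py` might look like; the wording is invented for illustration, not the project's actual prompts:

```python
# Illustrative prompt templates for the three relay modes.
# The actual templates in signal_relay/prompts.py may differ.
BASELINE_PROMPT = (
    "Rewrite the following message in your own words for the next agent:\n\n"
    "{message}"
)

ERROR_CORRECTED_PROMPT = (
    "Relay the following message to the next agent.\n"
    "Rules: preserve every constraint, number, and keyword exactly; "
    "do not add, drop, or reorder items.\n"
    "Before answering, self-check your draft against these rules and fix "
    "any deviation.\n\n"
    "{message}"
)

REPAIR_PROMPT = (
    "The message below may have drifted after several relays. Restore it "
    "to its most precise, self-consistent form without inventing new "
    "content:\n\n"
    "{message}"
)
```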