SENTINEL-GRAPH is a learned sidecar controller for influence topology in multi-agent LLM systems.
It does not fully orchestrate reasoning. Instead, it observes a time-varying communication and influence graph over agents or reasoning branches, then predicts topology actions that:
- damp harmful branches
- amplify useful branches
- rebalance unhealthy concentration
- preserve graph health when intervention is unnecessary
The public-facing repository is organized around the main topology-control path. Internally, this corresponds to the validated V5.1 topology-aligned checkpoint, but public users do not need the earlier phase numbering to run the main result.
This public branch is designed to be:
- cloneable on ordinary GitHub limits
- runnable on another machine without bundled checkpoints
- focused on the validated paper path plus clearly labeled exploratory work
At each control step, SENTINEL-GRAPH featurizes the current multi-agent interaction graph and predicts a small topology action:
- `preserve`
- `amplify_productive`
- `dampen_harmful`
- `rebalance`
- `quarantine_outlier` (when supported)
The learned controller acts as a sidecar: it modifies influence, but it does not directly own memory policy, stop policy, or full reasoning orchestration.
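The sidecar loop above can be sketched in a few lines. This is an illustrative toy, not the package's API: the function names (`featurize`, `apply_action`), the feature choices, and the action semantics are all assumptions made for exposition.

```python
import numpy as np

# The five topology actions named in this README.
ACTIONS = ["preserve", "amplify_productive", "dampen_harmful",
           "rebalance", "quarantine_outlier"]

def featurize(influence: np.ndarray) -> np.ndarray:
    """Toy featurizer: per-agent in/out influence plus a concentration score."""
    out_w = influence.sum(axis=1)
    in_w = influence.sum(axis=0)
    concentration = in_w.max() / max(in_w.sum(), 1e-9)
    return np.concatenate([out_w, in_w, [concentration]])

def apply_action(influence: np.ndarray, action: str, node: int) -> np.ndarray:
    """Toy topology edits; the real controller's semantics may differ."""
    A = influence.copy()
    if action == "dampen_harmful":
        A[node, :] *= 0.5            # halve the node's outgoing influence
    elif action == "amplify_productive":
        A[node, :] *= 1.5
    elif action == "rebalance":
        A = A / max(A.sum(), 1e-9)   # renormalize total influence mass
    elif action == "quarantine_outlier":
        A[node, :] = 0.0             # cut the node off in both directions
        A[:, node] = 0.0
    # "preserve" leaves A untouched
    return A

influence = np.array([[0.0, 0.8, 0.1],
                      [0.2, 0.0, 0.7],
                      [0.6, 0.3, 0.0]])
feats = featurize(influence)
damped = apply_action(influence, "dampen_harmful", node=0)
print(feats.shape, damped[0].sum())
```

Note that the sidecar only rewrites edge weights; memory, stopping, and orchestration stay with the host system, exactly as described above.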
Earlier, broader controller formulations were harder to interpret and harder to validate in closed loop. The narrowed topology-control formulation is more research-friendly:
- it asks a smaller, clearer graph-control question
- its action space is interpretable
- it supports deterministic non-learning baselines
- it makes it easier to separate useful local graph signals from noisy global graph summaries
In the current prototype:
- local edge-relation features are useful
- non-graph failure-pressure and task-state features are strong
- global connectivity and raw influence-weight summaries are better treated as diagnostics than as primary predictive features
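The local-versus-global split above can be made concrete with a small sketch. The function names and the particular features are hypothetical, chosen only to illustrate the distinction between per-edge relational features and whole-graph summaries.

```python
import numpy as np

def local_edge_features(A: np.ndarray, i: int, j: int) -> dict:
    """Illustrative edge-relation features for a directed edge i -> j."""
    return {
        "weight": A[i, j],
        "reciprocity": A[j, i],
        "src_out_share": A[i, j] / max(A[i].sum(), 1e-9),
        "dst_in_share": A[i, j] / max(A[:, j].sum(), 1e-9),
    }

def global_summaries(A: np.ndarray) -> dict:
    """Whole-graph summaries; per the prototype, better used as diagnostics."""
    sym = (A + A.T) / 2
    # Laplacian spectrum of the symmetrized influence graph.
    lap = np.diag(sym.sum(axis=1)) - sym
    eigvals = np.sort(np.linalg.eigvalsh(lap))
    return {
        "total_mass": float(A.sum()),
        "algebraic_connectivity": float(eigvals[1]),
    }

A = np.array([[0.0, 0.9, 0.0],
              [0.1, 0.0, 0.5],
              [0.0, 0.4, 0.0]])
edge_feats = local_edge_features(A, 0, 1)
summary = global_summaries(A)
print(edge_feats, summary)
```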
Validated closed-loop headline (final-answer accuracy):
- learned MLP topology controller: 0.597
- rule-oracle topology controller: 0.580
- no_controller: 0.333
- static_equal_weights: 0.333
- classical topology controller: 0.333
- group-dropout MLP refinement: 0.557
- ablation-informed MLP refinement: 0.487
Interpretation:
- the learned MLP topology controller clearly beats the non-intervening baselines
- it slightly beats the current conservative rule-oracle topology controller
- `dampen_harmful` remains the most important beneficial learned behavior
- exact values may vary slightly with seeds and environment details
This repository also includes several offline and exploratory control extensions that are useful for research, but are not the validated headline system result.
Graph-native offline probes:
- pure-PyTorch mean-aggregation GNN: 0.497 / 0.355 / 0.468
- custom edge-aware GAT: 0.532 / 0.444 / 0.539
- hybrid GAT + engineered features: 0.756 / 0.785 / 0.759
Runtime-safe control probes:
- runtime-safe edge `ΔA_t` prediction is feasible offline
- conservative and threshold-swept edge-update controllers were safe but did not improve closed-loop task outcomes
- edge-to-node aggregation and hierarchical local-global controllers remained exploratory and did not beat the validated topology MLP online
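What "conservative edge-update" might mean in practice can be sketched as follows. The safeguard policy here (per-edge step clamping, non-negativity, mass preservation) is an assumption for illustration; the repository's actual runtime-safe `ΔA_t` controllers may use different constraints.

```python
import numpy as np

def apply_delta_safely(A: np.ndarray, delta: np.ndarray,
                       max_step: float = 0.05) -> np.ndarray:
    """Apply a predicted edge-weight delta with conservative safeguards:
    clamp the per-edge step, forbid negative weights, and keep the total
    influence mass constant (illustrative policy, not the repo's)."""
    step = np.clip(delta, -max_step, max_step)   # bound each edge update
    A_next = np.clip(A + step, 0.0, None)        # no negative influence
    mass = A.sum()
    if A_next.sum() > 0:
        A_next *= mass / A_next.sum()            # preserve total mass
    return A_next

A = np.full((3, 3), 0.2)
delta = np.array([[0.5, -0.5, 0.0],             # raw predicted ΔA_t
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
A_next = apply_delta_safely(A, delta)
print(round(float(A_next.sum()), 6))
```

Clamping like this makes the controller safe almost by construction, which is consistent with the observation above that such updates did not hurt but also did not improve closed-loop outcomes.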
Main takeaway:
- stronger graph-native representations can look very good offline
- offline performance alone did not guarantee stronger closed-loop control
- the validated public claim therefore remains the engineered-feature MLP topology controller
The repository ships as a small Python package with optional development and modeling extras.
```shell
git clone https://github.com/nbashyal/SENTINAL_GRAPH.git
cd SENTINAL_GRAPH
python3 -m pip install -e ".[dev,modeling]"
```

If you only want the non-modeling core, omit the `modeling` extra.
If you want the shortest path from clone to a runnable experiment:
```shell
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q
python3 scripts/build_topology_rows.py
python3 scripts/train_topology_mlp.py --class-weighted-loss
python3 scripts/run_topology_benchmark.py
```

This public path rebuilds the needed artifacts locally instead of relying on large tracked outputs.
You can use either the versioned validated scripts or the public wrapper
aliases in scripts/.
```shell
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q
```

Validated command:

```shell
python3 scripts/build_v5_1_topology_rows.py \
    --output data/training_rows_real_v5_1_topology_aligned.jsonl
```

Public wrapper alias:

```shell
python3 scripts/build_topology_rows.py
```

Expected output:
- validated path: `data/training_rows_real_v5_1_topology_aligned.jsonl`
- public alias path: `data/topology_rows_aligned.jsonl`
Validated command:

```shell
python3 scripts/train_v5_topology_mlp.py \
    --input data/training_rows_real_v5_1_topology_aligned.jsonl \
    --class-weighted-loss \
    --output-dir results/topology_mlp
```

Public wrapper alias:

```shell
python3 scripts/train_topology_mlp.py --class-weighted-loss
```

Expected output directory: `results/topology_mlp/`
Validated command:

```shell
python3 scripts/train_v5_topology_sequence_gru.py \
    --input data/training_rows_real_v5_1_topology_aligned.jsonl \
    --class-weighted-loss \
    --output-dir results/topology_gru
```

Public wrapper alias:

```shell
python3 scripts/train_topology_gru.py --class-weighted-loss
```

Expected output directory: `results/topology_gru/`
The GRU path is included as a secondary sequence baseline, but it is not the main validated paper controller.
Validated command:

```shell
python3 scripts/run_v5_topology_closed_loop.py \
    --output-dir results/topology_closed_loop \
    --include-v5-1-models
```

Public wrapper alias:

```shell
python3 scripts/run_topology_benchmark.py
```

Expected output directory: `results/topology_closed_loop/`

Expected benchmark artifacts include:
- `aggregate_metrics.csv`
- `controller_comparison.csv`
- `by_failure_mode_metrics.csv`
- `action_distribution.csv`
- `per_episode_traces.jsonl`
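Once a benchmark run finishes, the CSV artifacts can be inspected with plain-stdlib tooling. The column names below (`controller`, `final_answer_accuracy`) are assumed for illustration and may not match the repo's actual schema; check the generated files before relying on them.

```python
import csv
import io

# Hypothetical excerpt of controller_comparison.csv; in a real run you
# would open results/topology_closed_loop/controller_comparison.csv.
raw = """controller,final_answer_accuracy
learned_mlp,0.597
rule_oracle,0.580
no_controller,0.333
"""

rows = list(csv.DictReader(io.StringIO(raw)))
best = max(rows, key=lambda r: float(r["final_answer_accuracy"]))
print(best["controller"], best["final_answer_accuracy"])
```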
For the validated internal paths, see docs/REPRODUCIBILITY.md.
Beyond the validated topology MLP benchmark, the repo includes additional research paths:
- graph-feature ablation:
python3 scripts/run_v5_2_graph_feature_ablation.py
- group-regularized topology MLP variants:
python3 scripts/run_v5_3_group_regularized_mlp.py
- pure-PyTorch graph-native probes:
  python3 scripts/run_v5_4_graph_native_probe.py
  python3 scripts/train_v5_4_torch_gnn.py
  python3 scripts/train_v5_4_custom_gat.py
  python3 scripts/train_v5_4_hybrid_gat.py
- runtime-safe `ΔA_t` and local-global control:
  python3 scripts/build_runtime_safe_graph_dataset.py
  python3 scripts/train_runtime_safe_delta_a.py
  python3 scripts/build_runtime_safe_local_global_control_dataset.py
  python3 scripts/train_runtime_safe_local_global_model.py
These paths are included so other researchers can extend the work locally, but they should be interpreted through the accompanying docs rather than treated as headline validated results.
```
sentinel_graph/
  v5_topology.py
  v5_topology_closed_loop.py
  v5_classical_topology_controller.py
  v5_feature_ablation.py
  v5_group_regularization.py
  v5_4_torch_gnn.py
  v5_4_custom_gat.py
  v5_4_hybrid_gat.py
  runtime_safe_graph_dataset.py
  runtime_safe_delta_model.py
  runtime_safe_local_global_model.py
  hierarchical_outcome_controller.py
scripts/
  build_topology_rows.py
  train_topology_mlp.py
  train_topology_gru.py
  run_topology_benchmark.py
  build_v5_1_topology_rows.py
  train_v5_topology_mlp.py
  train_v5_topology_sequence_gru.py
  run_v5_topology_closed_loop.py
  run_v5_2_graph_feature_ablation.py
  run_v5_3_group_regularized_mlp.py
  run_v5_4_graph_native_probe.py
  train_v5_4_torch_gnn.py
  train_v5_4_custom_gat.py
  train_v5_4_hybrid_gat.py
  build_runtime_safe_graph_dataset.py
  train_runtime_safe_delta_a.py
  build_runtime_safe_local_global_control_dataset.py
  train_runtime_safe_local_global_model.py
docs/
  SENTINEL_GRAPH_REFOCUSED_THESIS.md
  SENTINEL_GRAPH_V5_TOPOLOGY_RESULTS.md
  SENTINEL_GRAPH_CLASSICAL_TOPOLOGY_BASELINE.md
  SENTINEL_GRAPH_V5_2_GRAPH_FEATURE_ABLATION.md
  SENTINEL_GRAPH_V5_3_GROUP_REGULARIZED_MLP.md
  SENTINEL_GRAPH_V5_4_GRAPH_NATIVE_PROBE.md
  SENTINEL_GRAPH_RUNTIME_SAFE_DELTA_A_TRAINING.md
  SENTINEL_GRAPH_RUNTIME_SAFE_LOCAL_GLOBAL_TRAINING.md
  SENTINEL_GRAPH_HIERARCHICAL_OUTCOME_CONTROLLER.md
  REPRODUCIBILITY.md
archive/
  README.md
```
This public repository is intentionally paper-focused. Large generated datasets, checkpoints, and experiment outputs are omitted from Git history so the repository stays inspectable and pushable on standard GitHub limits.
Main aligned dataset:
- validated path: `data/training_rows_real_v5_1_topology_aligned.jsonl`
- public alias path: `data/topology_rows_aligned.jsonl`

Main offline result paths:
- validated: `results/v5_1_topology_offline/`
- public-friendly examples: `results/topology_mlp/`, `results/topology_gru/`

Exploratory result families discussed in the docs:
- graph-native probe outputs under `results/v5_4_graph_native_probe/`
- runtime-safe `ΔA_t` outputs under `results/runtime_safe_delta_a_training/`
- local-global graph-control outputs under `results/runtime_safe_local_global_control_training/`
- hierarchical outcome-aware controller outputs under `results/hierarchical_outcome_aware_controller/`

Main closed-loop result paths:
- validated: `results/v5_1_topology_closed_loop/`
- public-friendly example: `results/topology_closed_loop/`
The active interpretation of those outputs is summarized in:
A compact paper-facing subset of validated summary tables is included under:
Generated artifacts that are intentionally not tracked in this public branch include:
- large derived `data/*.jsonl` training rows
- runtime-safe graph datasets
- model checkpoints such as `model.pt`
- bulk `results/` directories and per-episode traces
Those files can be recreated from the checked-in code and commands in docs/REPRODUCIBILITY.md.
Repository-wide validation:

```shell
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q
```

Targeted topology smoke path:
- build aligned topology rows
- train topology MLP for 1 epoch
- train topology GRU for 1 epoch
- run a closed-loop benchmark with 1 episode per dataset on a small failure-mode subset
Exact smoke commands are listed in docs/REPRODUCIBILITY.md.
- Topology-only control does not solve `liveness_loop`.
- Local edge-relation features matter more than current global connectivity or raw influence-weight summaries.
- Spectral and connectivity features are currently better treated as diagnostics than as primary learning inputs.
- Strong graph-native and local-global offline results did not reliably transfer into stronger closed-loop control.
- This repository does not claim a formal Byzantine fault-tolerance guarantee.
- The current rule-oracle topology controller is heuristic, not a true upper bound.
Broad V2/V3/V4 controller experiments and earlier validation artifacts are preserved under archive/.
Those experiments motivated the narrowed topology-control thesis, but they are not required to reproduce the current public result.
Citation placeholder:
@misc{sentinel_graph,
title = {SENTINEL-GRAPH},
author = {SENTINEL-GRAPH Contributors},
year = {2026},
note = {Topology-control sidecar for multi-agent LLM systems}
}