
SENTINEL-GRAPH

Project Overview

SENTINEL-GRAPH is a learned sidecar controller for influence topology in multi-agent LLM systems.

It does not fully orchestrate reasoning. Instead, it observes a time-varying communication and influence graph over agents or reasoning branches, then predicts topology actions that:

  • damp harmful branches
  • amplify useful branches
  • rebalance unhealthy concentration
  • preserve graph health when intervention is unnecessary

The public-facing repository is organized around the main topology-control path. Internally, this corresponds to the validated V5.1 topology-aligned checkpoint, but public users do not need the earlier phase numbering to run the main result.

This public branch is designed to be:

  • cloneable on ordinary GitHub limits
  • runnable on another machine without bundled checkpoints
  • focused on the validated paper path plus clearly labeled exploratory work

Core Idea

At each control step, SENTINEL-GRAPH featurizes the current multi-agent interaction graph and predicts a small topology action:

  • preserve
  • amplify_productive
  • dampen_harmful
  • rebalance
  • quarantine_outlier when supported

The learned controller acts as a sidecar: it modifies influence, but it does not directly own memory policy, stop policy, or full reasoning orchestration.
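The control step described above can be sketched as follows. This is an illustrative, hypothetical interface only: the names (`TopologyAction`, `InteractionGraph`, `control_step`, and the callables passed into it) are assumptions for exposition, not the actual `sentinel_graph` API.

```python
# Hypothetical sketch of the sidecar control loop: featurize the current
# interaction graph, predict one of the topology actions listed above, and
# apply it only when it is not "preserve". Names are illustrative.
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Tuple


class TopologyAction(Enum):
    PRESERVE = "preserve"
    AMPLIFY_PRODUCTIVE = "amplify_productive"
    DAMPEN_HARMFUL = "dampen_harmful"
    REBALANCE = "rebalance"
    QUARANTINE_OUTLIER = "quarantine_outlier"


@dataclass
class InteractionGraph:
    # influence weights keyed by (source_agent, target_agent)
    edges: Dict[Tuple[str, str], float]


def control_step(graph, featurize, predict, apply_action):
    """One sidecar step: featurize -> predict action -> modify influence."""
    features = featurize(graph)
    action = predict(features)
    if action is not TopologyAction.PRESERVE:
        graph = apply_action(graph, action)
    return graph, action
```

Note the sidecar shape: the controller only touches influence weights through `apply_action`; memory, stopping, and reasoning orchestration stay outside it.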

Why Topology Control?

Earlier, broader controller formulations were harder to interpret and harder to validate in closed loop. The narrowed topology-control formulation is more research-friendly:

  • it asks a smaller, clearer graph-control question
  • its action space is interpretable
  • it supports deterministic non-learning baselines
  • it makes it easier to separate useful local graph signals from noisy global graph summaries

In the current prototype:

  • local edge-relation features are useful
  • non-graph failure-pressure and task-state features are strong
  • global connectivity and raw influence-weight summaries are better treated as diagnostics than as primary predictive features

Main Result

Validated closed-loop headline:

  • learned MLP topology controller: 0.597 final-answer accuracy
  • rule-oracle topology controller: 0.580
  • no_controller: 0.333
  • static_equal_weights: 0.333
  • classical topology controller: 0.333
  • group-dropout MLP refinement: 0.557
  • ablation-informed MLP refinement: 0.487

Interpretation:

  • the learned MLP topology controller clearly beats the non-intervening baselines
  • it slightly beats the current conservative rule-oracle topology controller
  • dampen_harmful remains the most important beneficial learned behavior
  • exact values may vary slightly with seeds and environment details
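The margins behind "clearly beats" and "slightly beats" can be read directly off the headline numbers; the snippet below just restates the accuracies reported above (no new measurements):

```python
# Absolute margins implied by the headline numbers in this README.
results = {
    "learned_mlp": 0.597,
    "rule_oracle": 0.580,
    "no_controller": 0.333,
    "static_equal_weights": 0.333,
    "classical_topology": 0.333,
}

mlp = results["learned_mlp"]
margin_over_oracle = mlp - results["rule_oracle"]      # ~0.017 absolute
margin_over_baseline = mlp - results["no_controller"]  # ~0.264 absolute
print(f"+{margin_over_oracle:.3f} over rule oracle, "
      f"+{margin_over_baseline:.3f} over non-intervening baselines")
```

So the gap to the rule oracle is small relative to the gap over non-intervening baselines, which is why the claim is hedged as "slightly beats".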

Exploratory Extensions

This repository also includes several offline and exploratory control extensions that are useful for research, but are not the validated headline system result.

Graph-native offline probes:

  • pure-PyTorch mean-aggregation GNN: 0.497 / 0.355 / 0.468
  • custom edge-aware GAT: 0.532 / 0.444 / 0.539
  • hybrid GAT + engineered features: 0.756 / 0.785 / 0.759

Runtime-safe control probes:

  • runtime-safe edge ΔA_t prediction is feasible offline
  • conservative and threshold-swept edge-update controllers were safe but did not improve closed-loop task outcomes
  • edge-to-node aggregation and hierarchical local-global controllers remained exploratory and did not beat the validated topology MLP online
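For intuition, a "conservative" edge-update step of the kind probed above might look like the sketch below. This is an assumption-laden illustration, not the repo's actual `runtime_safe_delta_model` implementation: the trust-region size `eps` and the row-normalization convention (outgoing weights sum to 1) are both assumed.

```python
# Illustrative conservative edge update: clip each predicted per-edge change
# to a small trust region, then renormalize each row of the influence matrix.
# Not the repository's actual implementation; eps is an assumed hyperparameter.
def conservative_edge_update(A, delta_A, eps=0.05):
    """A, delta_A: row-major lists of lists of floats; returns updated A."""
    updated = []
    for row, drow in zip(A, delta_A):
        # clip the change, then floor at zero so weights stay non-negative
        new_row = [max(0.0, w + max(-eps, min(eps, d)))
                   for w, d in zip(row, drow)]
        total = sum(new_row)
        # fall back to the old row if every weight was clipped to zero
        updated.append([w / total for w in new_row] if total > 0 else list(row))
    return updated
```

The point of such a controller is safety (bounded, normalized updates), which matches the finding above: it stayed safe but did not by itself improve closed-loop task outcomes.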

Main takeaway:

  • stronger graph-native representations can look very good offline
  • offline performance alone did not guarantee stronger closed-loop control
  • the validated public claim therefore remains the engineered-feature MLP topology controller

Installation

The repository ships as a small Python package with optional development and modeling extras.

git clone https://github.com/nbashyal/SENTINEL_GRAPH.git
cd SENTINEL_GRAPH
python3 -m pip install -e ".[dev,modeling]"

If you only want the non-modeling core, omit the modeling extra.

Quickstart

If you want the shortest path from clone to a runnable experiment:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q
python3 scripts/build_topology_rows.py
python3 scripts/train_topology_mlp.py --class-weighted-loss
python3 scripts/run_topology_benchmark.py

This public path rebuilds the needed artifacts locally instead of relying on large tracked outputs.

Reproduce the Main Experiment

You can use either the versioned validated scripts or the public wrapper aliases in scripts/.

1. Run tests

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q

2. Build aligned topology rows

Validated command:

python3 scripts/build_v5_1_topology_rows.py \
  --output data/training_rows_real_v5_1_topology_aligned.jsonl

Public wrapper alias:

python3 scripts/build_topology_rows.py

Expected output:

  • validated path: data/training_rows_real_v5_1_topology_aligned.jsonl
  • public alias path: data/topology_rows_aligned.jsonl

3. Train the topology MLP

Validated command:

python3 scripts/train_v5_topology_mlp.py \
  --input data/training_rows_real_v5_1_topology_aligned.jsonl \
  --class-weighted-loss \
  --output-dir results/topology_mlp

Public wrapper alias:

python3 scripts/train_topology_mlp.py --class-weighted-loss

Expected output directory:

  • results/topology_mlp/

4. Train the topology GRU baseline

Validated command:

python3 scripts/train_v5_topology_sequence_gru.py \
  --input data/training_rows_real_v5_1_topology_aligned.jsonl \
  --class-weighted-loss \
  --output-dir results/topology_gru

Public wrapper alias:

python3 scripts/train_topology_gru.py --class-weighted-loss

Expected output directory:

  • results/topology_gru/

The GRU path is included as a secondary sequence baseline, but it is not the main validated paper controller.

5. Run the closed-loop topology benchmark

Validated command:

python3 scripts/run_v5_topology_closed_loop.py \
  --output-dir results/topology_closed_loop \
  --include-v5-1-models

Public wrapper alias:

python3 scripts/run_topology_benchmark.py

Expected output directory:

  • results/topology_closed_loop/

Expected benchmark artifacts include:

  • aggregate_metrics.csv
  • controller_comparison.csv
  • by_failure_mode_metrics.csv
  • action_distribution.csv
  • per_episode_traces.jsonl
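A quick way to inspect the controller comparison once the benchmark has run is sketched below. The column names (`controller`, `final_answer_accuracy`) are assumptions for illustration; check the header of the generated `controller_comparison.csv` for the actual schema. The inline sample simply mirrors the headline numbers quoted earlier in this README.

```python
# Sketch of ranking controllers from a comparison CSV. Column names are
# assumed; the sample rows restate accuracies already quoted in this README.
import csv
import io

sample = io.StringIO(
    "controller,final_answer_accuracy\n"
    "learned_mlp,0.597\n"
    "rule_oracle,0.580\n"
    "no_controller,0.333\n"
)

rows = sorted(
    csv.DictReader(sample),
    key=lambda r: float(r["final_answer_accuracy"]),
    reverse=True,
)
for r in rows:
    print(f'{r["controller"]:>15}  {r["final_answer_accuracy"]}')
```

To read the real artifact, replace the `io.StringIO` sample with `open("results/topology_closed_loop/controller_comparison.csv")`.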

For the validated internal paths, see docs/REPRODUCIBILITY.md.

Exploratory Reproduction Paths

Beyond the validated topology MLP benchmark, the repo includes additional research paths:

  • graph-feature ablation:
    • python3 scripts/run_v5_2_graph_feature_ablation.py
  • group-regularized topology MLP variants:
    • python3 scripts/run_v5_3_group_regularized_mlp.py
  • pure-PyTorch graph-native probes:
    • python3 scripts/run_v5_4_graph_native_probe.py
    • python3 scripts/train_v5_4_torch_gnn.py
    • python3 scripts/train_v5_4_custom_gat.py
    • python3 scripts/train_v5_4_hybrid_gat.py
  • runtime-safe ΔA_t and local-global control:
    • python3 scripts/build_runtime_safe_graph_dataset.py
    • python3 scripts/train_runtime_safe_delta_a.py
    • python3 scripts/build_runtime_safe_local_global_control_dataset.py
    • python3 scripts/train_runtime_safe_local_global_model.py

These paths are included so other researchers can extend the work locally, but they should be interpreted through the accompanying docs rather than treated as headline validated results.

Repository Structure

sentinel_graph/
  v5_topology.py
  v5_topology_closed_loop.py
  v5_classical_topology_controller.py
  v5_feature_ablation.py
  v5_group_regularization.py
  v5_4_torch_gnn.py
  v5_4_custom_gat.py
  v5_4_hybrid_gat.py
  runtime_safe_graph_dataset.py
  runtime_safe_delta_model.py
  runtime_safe_local_global_model.py
  hierarchical_outcome_controller.py

scripts/
  build_topology_rows.py
  train_topology_mlp.py
  train_topology_gru.py
  run_topology_benchmark.py
  build_v5_1_topology_rows.py
  train_v5_topology_mlp.py
  train_v5_topology_sequence_gru.py
  run_v5_topology_closed_loop.py
  run_v5_2_graph_feature_ablation.py
  run_v5_3_group_regularized_mlp.py
  run_v5_4_graph_native_probe.py
  train_v5_4_torch_gnn.py
  train_v5_4_custom_gat.py
  train_v5_4_hybrid_gat.py
  build_runtime_safe_graph_dataset.py
  train_runtime_safe_delta_a.py
  build_runtime_safe_local_global_control_dataset.py
  train_runtime_safe_local_global_model.py

docs/
  SENTINEL_GRAPH_REFOCUSED_THESIS.md
  SENTINEL_GRAPH_V5_TOPOLOGY_RESULTS.md
  SENTINEL_GRAPH_CLASSICAL_TOPOLOGY_BASELINE.md
  SENTINEL_GRAPH_V5_2_GRAPH_FEATURE_ABLATION.md
  SENTINEL_GRAPH_V5_3_GROUP_REGULARIZED_MLP.md
  SENTINEL_GRAPH_V5_4_GRAPH_NATIVE_PROBE.md
  SENTINEL_GRAPH_RUNTIME_SAFE_DELTA_A_TRAINING.md
  SENTINEL_GRAPH_RUNTIME_SAFE_LOCAL_GLOBAL_TRAINING.md
  SENTINEL_GRAPH_HIERARCHICAL_OUTCOME_CONTROLLER.md
  REPRODUCIBILITY.md

archive/
  README.md

Data and Generated Artifacts

This public repository is intentionally paper-focused. Large generated datasets, checkpoints, and experiment outputs are omitted from Git history so the repository stays inspectable and pushable on standard GitHub limits.

Main aligned dataset:

  • validated path: data/training_rows_real_v5_1_topology_aligned.jsonl
  • public alias path: data/topology_rows_aligned.jsonl

Main offline result paths:

  • validated: results/v5_1_topology_offline/
  • public-friendly examples: results/topology_mlp/, results/topology_gru/

Exploratory result families discussed in the docs:

  • graph-native probe outputs under results/v5_4_graph_native_probe/
  • runtime-safe ΔA_t outputs under results/runtime_safe_delta_a_training/
  • local-global graph-control outputs under results/runtime_safe_local_global_control_training/
  • hierarchical outcome-aware controller outputs under results/hierarchical_outcome_aware_controller/

Main closed-loop result paths:

  • validated: results/v5_1_topology_closed_loop/
  • public-friendly example: results/topology_closed_loop/

The active interpretation of those outputs, together with a compact paper-facing subset of validated summary tables, is summarized in the accompanying docs/ files.

Generated artifacts that are intentionally not tracked in this public branch include:

  • large derived data/*.jsonl training rows
  • runtime-safe graph datasets
  • model checkpoints such as model.pt
  • bulk results/ directories and per-episode traces

Those files can be recreated from the checked-in code and commands in docs/REPRODUCIBILITY.md.

Tests

Repository-wide validation:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 pytest -q

Targeted topology smoke path:

  • build aligned topology rows
  • train topology MLP for 1 epoch
  • train topology GRU for 1 epoch
  • run a closed-loop benchmark with 1 episode per dataset on a small failure-mode subset

Exact smoke commands are listed in docs/REPRODUCIBILITY.md.

Limitations

  • Topology-only control does not solve liveness_loop.
  • Local edge-relation features matter more than current global connectivity or raw influence-weight summaries.
  • Spectral and connectivity features are currently better treated as diagnostics than as primary learning inputs.
  • Strong graph-native and local-global offline results did not reliably transfer into stronger closed-loop control.
  • This repository does not claim a formal Byzantine fault-tolerance guarantee.
  • The current rule-oracle topology controller is heuristic, not a true upper bound.

Archived Experiments

Broad V2/V3/V4 controller experiments and earlier validation artifacts are preserved under archive/.

Those experiments motivated the narrowed topology-control thesis, but they are not required to reproduce the current public result.

Citation / References

Citation placeholder:

@misc{sentinel_graph,
  title  = {SENTINEL-GRAPH},
  author = {SENTINEL-GRAPH Contributors},
  year   = {2026},
  note   = {Topology-control sidecar for multi-agent LLM systems}
}
