CrossCheck is a Python package for repairing and validating inconsistent network telemetry data using Democratic Trust Propagation (DTP) and demand-invariant checking. This work was published at NSDI 2026.
@inproceedings{krentsel2026crosscheck,
title={CrossCheck: Input Validation for WAN Control Systems},
author={Krentsel, Alexander and Iyer, Rishabh and Keslassy, Isaac and Ratnasamy, Sylvia and Modhipalli, Bharath and Shaikh, Anees and Shakir, Rob},
booktitle={20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 26)},
year={2026},
address={Renton, WA, USA},
publisher={USENIX Association}
}

Network telemetry data, such as interface counter values and traffic demands, is often inconsistent due to measurement errors, packet loss, or misconfiguration. CrossCheck addresses this challenge through:
- Repair: Democratic Trust Propagation (DTP) uses a voting mechanism where each router's flow conservation constraints provide independent opinions about correct values. The algorithm iteratively locks high-confidence values and propagates trust to resolve inconsistencies.
- Validation: Checks repaired data against demand invariants and, optionally, path-based predictions to ensure data consistency and accuracy.
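To make the flow conservation idea concrete, here is a minimal sketch. It is not the package's DTP implementation: it only shows the kind of per-router check that could produce an "opinion". The function names, the `in`/`out` layout of the snapshot, and the 5% tolerance are all illustrative assumptions.

```python
from typing import Dict, List


def conservation_imbalance(inbound: List[float], outbound: List[float]) -> float:
    """Relative gap between total inbound and total outbound traffic at one router."""
    total_in = sum(inbound)
    total_out = sum(outbound)
    largest = max(abs(total_in), abs(total_out))
    if largest == 0.0:
        return 0.0  # no traffic at all: trivially balanced
    return abs(total_in - total_out) / largest


def inconsistent_routers(
    routers: Dict[str, Dict[str, List[float]]], tolerance: float = 0.05
) -> List[str]:
    """Names of routers whose counters violate flow conservation beyond the tolerance."""
    return sorted(
        name
        for name, counters in routers.items()
        if conservation_imbalance(counters["in"], counters["out"]) > tolerance
    )


# Hypothetical snapshot: NodeB's outbound total is well below its inbound total.
snapshot = {
    "NodeA": {"in": [100.0], "out": [60.0, 40.0]},
    "NodeB": {"in": [60.0, 40.0], "out": [70.0]},
}
flagged = inconsistent_routers(snapshot)
print(flagged)
```

A repair step would then have to decide which of the flagged router's counters to adjust. That is the part DTP handles through voting and trust propagation, and it is not modelled here.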
The system achieves high repair accuracy (>95%) even with significant measurement errors (10-20%), making it suitable for network monitoring, troubleshooting, and analysis pipelines.
This implementation accompanies the paper "CrossCheck: Input Validation for WAN Control Systems" presented at NSDI 2026.
We recommend using uv for Python package management:
git clone <repository-url>
cd crosscheck
uv pip install -e .

Or with pip:
pip install -e .

import pandas as pd
import json
from crosscheck import CrossCheck, CrossCheckConfig
# Load topology and data
with open('topology.json') as f:
    topology = json.load(f)
df = pd.read_pickle('telemetry.pkl')
# Initialize CrossCheck
cc = CrossCheck(topology, CrossCheckConfig(disable_cache=True))
# Run repair and validation
result = cc.process_df(df)
# Inspect results
print(f"Repaired {len(result)} snapshots")
print(f"Avg repair confidence: {result['repair_confidence'].mean():.2f}")
print(f"Avg validation confidence: {result['validation_confidence'].mean():.2f}")

Two complete working examples are provided in the examples/ directory:
- simple_example.py: Basic 3-node synthetic network demonstrating core functionality.
  cd examples
  uv run python3 simple_example.py
  Shows 36-49% error reduction on a small linear topology (A-B-C).
- abilene_example.py: Real-world Abilene network (12 nodes, 15 links).
  cd examples
  uv run python3 abilene_example.py
  Shows 40%+ error reduction on real network telemetry with 98%+ validation pass rate.
See examples/README.md for detailed documentation.
The topology is a JSON document of the following form:
{
  "nodes": [
    {"id": 0, "name": "NodeA"},
    {"id": 1, "name": "NodeB"}
  ],
  "links": [
    {"source": 0, "target": 1}
  ],
  "external_nodes": ["NodeA", "NodeB"]
}

Each row of the telemetry DataFrame represents a network snapshot with columns:
- Metadata: timestamp, telemetry_perturbed_type, input_perturbed_type, true_detect_inconsistent
- Demands (high_*): Traffic demands between node pairs as float values
- Counters (low_*): Interface counter values as dicts with:
  - ground_truth: Original correct value (for evaluation)
  - perturbed: Measured value with errors
  - corrected: Repaired value (added by algorithm)
  - confidence: Repair confidence score (added by algorithm)
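As a rough illustration of the column naming convention above, the sketch below splits one snapshot into demands and counters by prefix. The row is modelled as a plain dict rather than a pandas row. The column names `high_NodeA_NodeB` and `low_NodeA_NodeB`, the timestamp value, and the helper name `split_snapshot` are hypothetical: only the `high_` and `low_` prefixes come from the description above.

```python
from typing import Any, Dict, Tuple


def split_snapshot(row: Dict[str, Any]) -> Tuple[Dict[str, float], Dict[str, dict]]:
    """Separate demand (high_*) and counter (low_*) columns of one snapshot."""
    demands = {k: v for k, v in row.items() if k.startswith("high_")}
    counters = {k: v for k, v in row.items() if k.startswith("low_")}
    return demands, counters


# Hypothetical snapshot with one metadata, one demand, and one counter column.
row = {
    "timestamp": "2026-01-01T00:00:00",
    "high_NodeA_NodeB": 100.0,
    "low_NodeA_NodeB": {"ground_truth": 100.0, "perturbed": 92.5},
}
demands, counters = split_snapshot(row)
print(sorted(demands), sorted(counters))
```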
Example counter format:
{
    'ground_truth': 100.0,
    'perturbed': 92.5,
    'corrected': 98.7,
    'confidence': 0.85
}

See examples/sample_data/README.md for detailed format specification.
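For evaluation, the ground_truth, perturbed, and corrected fields of a counter can be compared directly. The sketch below computes a per-counter error reduction for the example counter above. The formula (one minus the ratio of corrected error to perturbed error) and the function name `error_reduction` are assumptions made for illustration, and the example scripts may report error reduction differently.

```python
def error_reduction(counter: dict) -> float:
    """Fraction of the measurement error removed by repair (1.0 means fully repaired)."""
    truth = counter["ground_truth"]
    error_before = abs(counter["perturbed"] - truth)
    error_after = abs(counter["corrected"] - truth)
    if error_before == 0.0:
        return 0.0  # the measurement was already exact, so there is nothing to repair
    return 1.0 - error_after / error_before


# Same values as the example counter format above.
counter = {
    "ground_truth": 100.0,
    "perturbed": 92.5,
    "corrected": 98.7,
    "confidence": 0.85,
}
reduction = error_reduction(counter)
print(f"{reduction:.1%}")
```

A negative result would mean the repaired value is further from the ground truth than the original measurement.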
CrossCheck: Main pipeline class combining repair and validation
cc = CrossCheck(topology, config=CrossCheckConfig())
result_df = cc.process_df(df, paths_path=None)

CrossCheckConfig: Configuration for the pipeline
config = CrossCheckConfig(
    repair_config=RepairConfig(...),
    validator_config=ValidatorConfig(...),
    disable_cache=False
)

RepairConfig: Configuration for DTP repair algorithm
repair_config = RepairConfig(
    num_trials=30,              # Number of repair trials
    similarity_threshold=0.05,  # Fuzzy matching tolerance
    seed=42,                    # Random seed for reproducibility
    disable_cache=False         # Enable/disable result caching
)

ValidatorConfig: Configuration for validation
validator_config = ValidatorConfig(
    confidence_cutoff=0.0,          # Confidence threshold for validation
    threshold=0.03,                 # Percentage equality tolerance
    counter_bias_correction=0.024,  # Bias correction factor
    disable_cache=False             # Enable/disable result caching
)

- CrossCheck.process_df(df, paths_path): Process complete DataFrame through repair and validation
- CrossCheck.repair_row(row, paths): Repair a single row
- CrossCheck.validate_row(row, paths): Validate a single row
- DemocraticTrustPropagationRepair.repair_df(df, paths_path): Repair DataFrame (standalone)
- Validator.validate_df(df, paths_path): Validate DataFrame (standalone)
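The `threshold` parameter of ValidatorConfig is described above as a percentage equality tolerance. One plausible reading is a relative-difference test like the sketch below. This is an assumption and not confirmed by the package source, and the helper name `approx_equal` is hypothetical.

```python
def approx_equal(a: float, b: float, threshold: float = 0.03) -> bool:
    """True when a and b differ by at most `threshold` relative to the larger magnitude."""
    largest = max(abs(a), abs(b))
    if largest == 0.0:
        return True  # both values are zero
    return abs(a - b) / largest <= threshold


close = approx_equal(100.0, 98.7)  # relative difference of 1.3%
far = approx_equal(100.0, 92.5)    # relative difference of 7.5%
print(close, far)
```

Under this reading, a larger `threshold` makes validation more permissive, while a value of 0.0 demands exact equality.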
crosscheck/
├── crosscheck.py # Main CrossCheck pipeline
├── dtp.py # Democratic Trust Propagation repair
├── validate.py # Validation logic
├── common_utils.py # Shared utilities (NetworkTopology, etc.)
└── snapshot_cache.py # Performance caching
- Python >= 3.8
- pandas >= 1.3.0
- numpy >= 1.20.0
- tqdm >= 4.60.0
If you use CrossCheck in your research, please cite:
@inproceedings{krentsel2026crosscheck,
title={CrossCheck: Input Validation for WAN Control Systems},
author={Krentsel, Alexander and Iyer, Rishabh and Keslassy, Isaac and Ratnasamy, Sylvia and Modhipalli, Bharath and Shaikh, Anees and Shakir, Rob},
booktitle={20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 26)},
year={2026},
address={Renton, WA, USA},
publisher={USENIX Association}
}

MIT License. See LICENSE file for details.
Contributions are welcome! Please open an issue or pull request on GitHub.
For questions or issues, please open a GitHub issue or contact the authors:
- Alexander Krentsel (akrentsel@berkeley.edu)
- Paper: "CrossCheck: Input Validation for WAN Control Systems" (NSDI 2026)