This repository contains the simulation framework for studying Gossipsub protocol parameter optimization under churn, as presented in the paper "Lean Gossip: Big Gains From Small Talk."
The framework consists of three main components:
- meshsub.py - Core Gossipsub protocol simulator
- pareto_sweep.py - Parameter sweep for generating experimental data
- churn_regime_plots.py & dominance_plots.py - Visualization and analysis tools
meshsub.py implements a complete discrete-event simulator for the Gossipsub v1.0 protocol with Ethereum-realistic parameters. It models:
- Eager-push mesh overlay with configurable degree D
- Lazy-pull gossip with configurable degree D_lazy
- Bimodal churn model (stable + churny peers, based on Kiffer et al.)
- Geographic regions with realistic inter-region latencies
- Token-bucket bandwidth constraints (25 Mbps default per peer)
- Message production following Ethereum's attestation workload
- IHAVE/IWANT protocol for gossip-based recovery
- Mesh maintenance (GRAFT/PRUNE) via periodic heartbeats
- Fine-grained timing: 100ms tick resolution (configurable)
- Heartbeat interval: 1 second (10 ticks at default resolution)
- Message cache (mcache): Maintains last 3 heartbeat rounds for gossip
- Persistent delivery tracking: Accurately measures delivery even when peers rejoin
- Comprehensive metrics: Delivery rate, latency (p50/p90/p99), bandwidth breakdown
from meshsub import MeshSubComplete
# Create simulator with Ethereum defaults
sim = MeshSubComplete(
num_peers=1000, # Network size
tick_sec=0.1, # 100ms ticks
heartbeat_interval_sec=1.0, # Heartbeat every 1 second
peer_table_min=40, # Min peer connections
peer_table_max=80, # Max peer connections
D=8, # Eager mesh degree
D_lazy=6, # Gossip degree
validators_per_node=1, # Validators per peer
global_bandwidth_bps=25_000_000, # 25 Mbps per peer
msg_size_default=1536, # 1.5 KiB message size
churny_fraction=0.2, # 20% churny peers
churny_leave_rate=0.01, # Leave probability per tick
churny_rejoin_rate=0.05, # Rejoin probability per tick
)
# Run simulation: 10,000 ticks measurement + 2,000 ticks warmup
sim.run(n_ticks=10000, warmup_ticks=2000, verbose=True)
# Print comprehensive report
sim.print_report()
# Get detailed statistics
stats = sim.get_stats()
print(f"Delivery rate: {stats['delivery_mean']:.1%}")
print(f"Total bandwidth: {stats['total_bytes']/1024/1024:.2f} MB")
# Get latency statistics
latency = sim.get_latency_statistics()
print(f"p99 latency: {latency['p99_latency_sec']:.2f}s")num_peers: Total network size (default: 200)peer_table_min/max: Peer connection bounds (40-80, Kiffer et al.)D: Target eager mesh degree (default: 8, Ethereum)D_lazy: Gossip degree for lazy pull (default: 6)
tick_sec: Simulation tick duration (default: 0.1s = 100ms)heartbeat_interval_sec: Gossip heartbeat interval (default: 1.0s)
validators_per_node: Validators per peer (default: 1)- Determines message production rate: 1 attestation per 384 seconds per validator
msg_size_default: Message payload size (default: 1536 bytes)
global_bandwidth_bps: Per-peer upload capacity (default: 25 Mbps)
churny_fraction: Fraction of churny peers (default: 0.2 = 20%)stable_leave_rate: Leave probability for stable peers (default: 0.0)churny_leave_rate: Leave probability per tick for churny peers (default: 0.01)churny_rejoin_rate: Rejoin probability per tick (default: 0.05)
region_names: List of region labels (default: ['EU', 'US', 'Asia'])region_fractions: Distribution across regions (default: [0.3, 0.4, 0.3])
The simulator tracks:
- Delivery Rate: Fraction of online peers who received each message
- Latency Statistics: p50, p90, p99 message propagation delay
- Bandwidth Breakdown:
- Payload traffic (application messages)
- Control traffic (IHAVE/IWANT/GRAFT/PRUNE)
- Mesh maintenance vs. gossip overhead
- Churn Events: Leaves and joins by peer type
- Drop Statistics: Messages dropped due to bandwidth limits or offline peers
pareto_sweep.py performs exhaustive parameter sweeps over D (mesh degree) and D_lazy (gossip degree) across multiple churn regimes. It runs multiple random seeds per configuration to ensure statistical confidence.
# Full parameter sweep (default: 525 configurations × 3 seeds = 1,575 runs)
python pareto_sweep.py --output pareto_results.csv --workers 8# Reduced parameter space for testing
python pareto_sweep.py --quick --output test_results.csv# Custom number of seeds and parallel workers
python pareto_sweep.py --seeds 5 --workers 12 --output results.csv--quick: Use reduced parameter space (faster testing)--seeds N: Number of random seeds per configuration (default: 3)--output FILE: Output CSV filename (default: pareto_results.csv)--workers N: Number of parallel workers (default: 3)
- D: [2, 4, 6, 8, 10, 12, 14] (7 values)
- D_lazy: [3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 33] (15 values)
- Churn: [0.0, 0.1, 0.2, 0.3, 0.4] (5 values)
- Total: 7 × 15 × 5 = 525 configurations
- D: [6, 8, 10]
- D_lazy: [6, 12]
- Churn: [0.0, 0.2]
- Total: 3 × 2 × 2 = 12 configurations
The sweep generates a CSV file (pareto_results.csv) with columns:
D,D_lazy,churny_fraction,num_seeds
delivery_rate_mean/std: Fraction of messages deliveredp99_latency_sec_mean/std: 99th percentile latencybytes_per_delivery_mean/std: Total bandwidth cost per successful delivery
p50_latency_sec_mean/std,p90_latency_sec_mean/std: Additional latency percentilespayload_bytes_MB_mean/std: Application payload trafficcontrol_bytes_MB_mean/std: Protocol overhead (IHAVE/IWANT/GRAFT/PRUNE)meshmnt_bytes_MB_mean/std: Mesh maintenance trafficgossip_bytes_MB_mean/std: Gossip trafficdrops_peer_mean/std: Drops due to bandwidth limitsdrops_offline_mean/std: Drops due to offline peers
The script supports resuming interrupted sweeps:
- If
pareto_results.csvexists, it reads completed configurations - Skips already-completed (D, D_lazy, churn) tuples
- Appends new results to the existing file
# Initial run (interrupted after 100 configs)
python pareto_sweep.py --output pareto_results.csv --workers 8
# Resume from where it left off
python pareto_sweep.py --output pareto_results.csv --workers 8
# Will skip the 100 completed configs and continue- Single config runtime: ~30-60 seconds (1000 peers, 10,000 ticks)
- Full sweep: ~4-8 hours with 8 workers
- Memory: ~500 MB per worker
- Recommended workers: Number of CPU cores - 1
churn_regime_plots.py generates performance plots showing how mesh degree (D) and gossip degree (D_lazy) affect delivery, bandwidth cost, and latency across different churn regimes.
# Generate all plots from pareto_results.csv
python churn_regime_plots.py
# Use custom input/output paths
INPUT_CSV=my_results.csv OUTPUT_DIR=figures python churn_regime_plots.pyThree subplots showing D (x-axis) vs. metrics (y-axis), with separate lines per churn level:
- (a) Delivery Rate: Shows diminishing returns of increasing D
- (b) Normalized Cost: Bandwidth cost (β/P) vs. D, log-scale y-axis
- (c) Tail Latency (p99): Latency decreases with higher D
Error bars: Show range (min/max) across all D_lazy values at each D
FloodSub reference: If D=60, D_lazy=0 data exists, shown as separate reference point with visual break
Three subplots showing D_lazy (x-axis) vs. metrics:
- (a) Delivery Rate: Gossip improves delivery under churn
- (b) Normalized Cost: Cost increases sharply with D_lazy
- (c) Tail Latency (p99): Modest latency reduction with more gossip
Error bars: Show range across all D values at each D_lazy
Shows D sweep for a single D_lazy value (default: 9, Ethereum-like):
- Cleaner view without aggregation across D_lazy
- Ethernet's D=8 marked with vertical dashed line
Shows D_lazy sweep for a single D value (default: 8, Ethereum):
- Illustrates gossip's effectiveness at a realistic mesh degree
- Dynamic axis limits: Automatically adjusts y-axis ranges based on data
- Delivery rate formatting: Y-axis capped at 1.00 even if visual padding extends beyond
- Log-scale axes: Cost and latency use log scale for clarity
- Churn color scheme: Consistent colors across all plots
- 0% churn: Green
- 10%: Blue
- 20%: Orange
- 30%: Red
- 40%: Purple
- Graceful handling of incomplete data: Skips plots if insufficient data available
Environment variables:
# Custom input CSV
INPUT_CSV=custom_results.csv python churn_regime_plots.py
# Custom output directory
OUTPUT_DIR=my_figures python churn_regime_plots.py
# Both
INPUT_CSV=data.csv OUTPUT_DIR=plots python churn_regime_plots.pydominance_plots.py generates the Pareto dominance visualization showing that Gossipsub dominates Floodsub across delivery-cost tradeoffs.
# Generate Pareto dominance plot
python dominance_plots.pyfig_pareto_dominance.png: Scatter plot showing:
- X-axis: Bytes per delivery (log scale)
- Y-axis: Delivery rate
- Marker size: Scaled by D_lazy (smaller = less gossip)
- Marker opacity: D_lazy ≤ 1 shown at full opacity, higher values de-emphasized
Key elements:
- GossipSub Pareto frontier: Black line connecting non-dominated configurations
- FloodSub feasible region: Red vertical band at D=60, D_lazy=0
- Annotations:
- "Minimal gossip suffices" pointing to D_lazy=1 threshold
- FloodSub characteristics (low latency, high cost, lower delivery)
The plot demonstrates:
- GossipSub (D=2, D_lazy=1) achieves equivalent delivery to FloodSub at 29× lower cost
- Pareto frontier shows configurations that are not dominated on both delivery and cost
- Minimal gossip (D_lazy=1) captures most reliability benefits
- FloodSub lies outside the efficient frontier despite maximal connectivity
Saved to figs/fig_pareto_dominance.png (directory created automatically)
# Full experimental sweep (takes 4-8 hours with 8 workers)
python pareto_sweep.py --seeds 3 --workers 8 --output pareto_results.csv# Generate churn regime analysis
python churn_regime_plots.py
# Generate Pareto dominance plot
python dominance_plots.pyAfter execution, you'll have:
pareto_results.csv: Raw experimental data (645 rows for full sweep)plots/fig_D_by_churn.png: D impact analysisplots/fig_Dlazy_by_churn.png: D_lazy impact analysisplots/fig_D_fixed_Dlazy9.png: D sweep for Ethereum's D_lazyplots/fig_Dlazy_fixed_D8.png: D_lazy sweep for Ethereum's Dfigs/fig_pareto_dominance.png: Pareto frontier visualization
pip install numpy pandas matplotlibnumpy: Numerical operations and random samplingpandas: Data manipulation and CSV I/Omatplotlib: Plotting and visualization
- Requires Python 3.7+ (uses dataclasses, type hints)
- Cause: Some configurations may not stabilize within 10,000 ticks
- Solution: Increase
n_ticksinpareto_sweep.py(line 23) or warmup period
- Cause: Too many workers for available RAM
- Solution: Reduce
--workersargument (each worker uses ~500 MB)
- Cause: Incomplete experimental data for those parameters
- Solution: Scripts gracefully skip missing data; check CSV for available configurations
- Cause: No D=60, D_lazy=0 configuration in results
- Solution: Add FloodSub to parameter sweep by including D=60, D_lazy=0 in ranges
This simulation framework is released as open-source software accompanying the "Lean Gossip" paper. See repository for license details.
For questions or issues, please open an issue on the GitHub repository or contact the authors.