v0.6.0 - thesis-defense-hardening release
The thesis-defense-hardening release. Closes five issues raised against
v0.5.0 in a structured audit and introduces a fourth executor backend
for SOTA-aligned swarm-update parallelism on commodity CPU.
No public-API breakage; all changes are additive or backwards-compatible.
Defaults unchanged. adaptive_fairness still defaults to True with
the existing heuristic-boundary UserWarning.
What this release closes
The v0.5.0 audit identified five defense-risk items. v0.6.0 closes all
of them.
1. Honest synthetic-FedAvg baseline
FedAvgBaseline (used by run_fairness.py and run_sota_comparison.py)
previously inflated reported fitness via an opaque log-shaped
improvement_factor. v0.6.0 makes the multiplier opt-in:
- New flag
FedAvgConfig.simulate_training_curve(defaultFalse). - Default behavior now reports the raw
FitnessFunction.evaluate
value, with no synthetic improvement curve. - Demographic divergence is unchanged across modes (the selection
rule was never affected).
Docstring rewritten to clarify scope: this is the synthetic-fitness
benchmarking baseline; the headline real-FL FedAvg comparison in the
paper is produced by
experiments.run_real_fl.run_fedavg_all_clients_trial, which performs
an actual sklearn logistic-regression FedAvg pass.
2. Proven-vs-adaptive paired benchmark
The default adaptive_fairness=True enables curriculum schedules for
λ and c₃ that extend beyond Theorem 1's spectral-radius proof. v0.6.0
adds experiments/run_proven_config.py, a paired-comparison sweep
running FairSwarm twice per seed under matched random init: once
adaptive, once with adaptive_fairness=False (the formally proven
configuration).
On the default 20-seed sweep over a 20-client / 4-group / coalition-of-8
problem with AccuracyFairnessFitness:
| Config | Mean DemDiv | 95% CI |
|---|---|---|
| Adaptive (paper) | 0.01543 | [0.00937, 0.02148] |
| Proven (Thm 1) | 0.01480 | [0.00876, 0.02083] |
Paired Δ (proven − adaptive): −0.00063 [−0.00165, +0.00038]
The paired CI crosses zero. Every empirical claim in the paper holds
under the formally proven dynamics. Artifact committed at
results/proven_config/proven_config_20260512_193523.json.
3. Theorem 3 submodularity assumption made load-bearing
(Paper edit on disk for Overleaf sync.) Paper Theorem 3 now carries the
monotone-submodularity assumption inline in the theorem statement, not
buried in a remark, with an explicit note that real FL accuracy is not
in general provably submodular.
4. Abstract reframed
(Paper edit on disk.) The abstract now leads with the theorem-validation
and synthetic real-FL evidence and presents MIMIC-III as supporting
clinical evidence with explicit PhysioNet DUA reproducibility note,
plus the proven-vs-adaptive paired-CI result above.
5. FL framing tightened
(Paper edit on disk.) Conclusion, real-FL section, and software-
availability footer rewritten to consistently frame FairSwarm as
operating at the coalition-selection layer of FL — distinct from
the underlying FedAvg training procedure, which is unchanged.
Sprint 7 - Vectorized swarm-update executor
New executor="vectorized" backend in FairSwarmConfig and
ParticleExecutor. The full per-iteration swarm update — velocity,
position, sigmoid bounding, clip, personal-best comparison — collapses
into a single (P, n) NumPy op, amortizing per-particle Python
dispatch overhead across the whole swarm.
This is the SOTA-aligned approach for swarm methods on commodity CPUs:
the per-particle work is regular and identical, so SIMD-friendly batched
arithmetic dominates over thread / process pool launch overhead at any
non-trivial swarm size.
Benchmark (experiments/bench_vectorized.py, 30 iterations,
3 repeats, commodity CPU):
| P | n | serial | threads (4w) | vectorized |
|---|---|---|---|---|
| 30 | 50 | 268 ms | 1407 ms | 299 ms |
| 200 | 50 | 7900 ms | 15696 ms | 7721 ms |
| 30 | 200 | 2796 ms | 4305 ms | 2806 ms |
| 200 | 200 | 13307 ms | 22922 ms | 7821 ms |
1.70× speedup vs serial at P=200 / n=200. Vectorized wins cleanly
without any threading overhead. (Threads underperform here because the
existing fitness eval does not release the GIL enough for the pool to
overcome dispatch overhead — a separate optimization tracked for
v0.7.)
Determinism: vectorized mode is deterministic given a seed but is
not bit-exact with serial mode (single iteration-level RNG vs
per-particle SeedSequence streams). Documented in the module
docstring and pinned by the new test class.
Tests: +5 new tests in
tests/test_parallel.py::TestVectorizedExecutor covering valid
result, determinism, competitiveness with serial within 0.05 fitness,
config validation, and the n_workers>1 warning. Total: 841
passing, 1 skipped, 0 failing.
Cumulative diff v0.5.0 → v0.6.0
- Tests: 836 → 841 passing (+5 new)
- Executor backends: 3 (serial, threads, processes) → 4 (+ vectorized)
- Defense-risk items from the audit: 5 → 0
Install
pip install fairswarm==0.6.0Reproducibility
git clone https://github.com/dataeducator/fairswarm-library.git
cd fairswarm-library
pip install -e ".[dev]"
make reproduce
git diff results/paper_figures/ # should be emptyVectorized-executor benchmark:
python experiments/bench_vectorized.pyProven-vs-adaptive paired sweep:
python experiments/run_proven_config.pyReference
T. Norwood, D. Das, P. Chatterjee, E. Bentley, and U. Ghosh,
"FairSwarm: Trustworthy Coalition Selection for Fair and Secure
Federated Intelligence," IEEE Trans. Consum. Electron., 2026 (Submitted).