v0.6.0 - thesis-defense-hardening release

dataeducator released this 13 May 00:31
· 27 commits to main since this release

The thesis-defense-hardening release. Closes five issues raised against
v0.5.0 in a structured audit and introduces a fourth executor backend
for SOTA-aligned swarm-update parallelism on commodity CPU.

No public-API breakage; all changes are additive or backwards-compatible.
Defaults unchanged. adaptive_fairness still defaults to True with
the existing heuristic-boundary UserWarning.

What this release closes

The v0.5.0 audit identified five defense-risk items. v0.6.0 closes all
of them.

1. Honest synthetic-FedAvg baseline

FedAvgBaseline (used by run_fairness.py and run_sota_comparison.py)
previously inflated reported fitness via an opaque log-shaped
improvement_factor. v0.6.0 makes the multiplier opt-in:

  • New flag FedAvgConfig.simulate_training_curve (default False).
  • Default behavior now reports the raw FitnessFunction.evaluate
    value, with no synthetic improvement curve.
  • Demographic divergence is unchanged across modes (the selection
    rule was never affected).
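
The opt-in behavior described above can be sketched as follows. This is a minimal illustration, not the library's code: only the flag name simulate_training_curve and its default of False come from these notes. The class BaselineConfigSketch, the n_rounds parameter, the reported_fitness helper, and the particular log-shaped formula are all hypothetical stand-ins.

```python
import math
from dataclasses import dataclass


@dataclass
class BaselineConfigSketch:
    # Flag name and default mirror the release notes; the rest is hypothetical.
    simulate_training_curve: bool = False
    n_rounds: int = 10  # hypothetical parameter, used only by the opt-in path


def reported_fitness(raw_fitness: float, config: BaselineConfigSketch) -> float:
    """Return the fitness value a baseline of this shape would report."""
    if not config.simulate_training_curve:
        # Default path: the raw evaluation value, with no synthetic curve.
        return raw_fitness
    # Opt-in path: a hypothetical log-shaped multiplier. The real
    # improvement_factor formula is not given in these notes.
    improvement_factor = 1.0 + 0.1 * math.log1p(config.n_rounds)
    return raw_fitness * improvement_factor


raw = 0.72
default_value = reported_fitness(raw, BaselineConfigSketch())
curved_value = reported_fitness(raw, BaselineConfigSketch(simulate_training_curve=True))
print(default_value, curved_value)
```

Keeping the multiplier behind an explicit flag means a reader of a results file can tell from the config alone whether a reported fitness is raw or curve-adjusted.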

Docstring rewritten to clarify scope: this is the synthetic-fitness
benchmarking baseline; the headline real-FL FedAvg comparison in the
paper is produced by
experiments.run_real_fl.run_fedavg_all_clients_trial, which performs
an actual sklearn logistic-regression FedAvg pass.

2. Proven-vs-adaptive paired benchmark

The default adaptive_fairness=True enables curriculum schedules for
λ and c₃ that extend beyond Theorem 1's spectral-radius proof. v0.6.0
adds experiments/run_proven_config.py, a paired-comparison sweep
running FairSwarm twice per seed under matched random init: once
adaptive, once with adaptive_fairness=False (the formally proven
configuration).

On the default 20-seed sweep over a 20-client / 4-group / coalition-of-8
problem with AccuracyFairnessFitness:

Config              Mean DemDiv   95% CI
Adaptive (paper)    0.01543       [0.00937, 0.02148]
Proven (Thm 1)      0.01480       [0.00876, 0.02083]

Paired Δ (proven − adaptive): −0.00063 [−0.00165, +0.00038]

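The paired CI crosses zero, so this sweep detects no significant
difference in demographic divergence between the two configurations at
the 95% level. Every empirical claim in the paper therefore holds
under the formally proven dynamics. Artifact committed at
results/proven_config/proven_config_20260512_193523.json.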
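
For readers unfamiliar with paired comparisons, the sketch below shows one way a paired Δ and its interval could be computed from per-seed results. It is not the code in experiments/run_proven_config.py, and these notes do not say how the release's CI was computed. The helper paired_mean_ci is hypothetical, the interval uses a normal approximation (a t-quantile would be somewhat wider at 20 seeds), and the input numbers are synthetic placeholders rather than the release's data.

```python
import random
from statistics import NormalDist, mean, stdev


def paired_mean_ci(a, b, level=0.95):
    """Mean of per-seed differences (b - a) with a normal-approximation CI."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 seeds")
    # Differencing within each seed removes variance shared by both runs.
    diffs = [y - x for x, y in zip(a, b)]
    d_bar = mean(diffs)
    std_err = stdev(diffs) / len(diffs) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return d_bar, (d_bar - z * std_err, d_bar + z * std_err)


# Synthetic placeholder values only, NOT the results reported above.
rng = random.Random(0)
adaptive = [0.015 + rng.gauss(0, 0.01) for _ in range(20)]
proven = [x + rng.gauss(0, 0.002) for x in adaptive]  # matched per seed

delta, (lo, hi) = paired_mean_ci(adaptive, proven)
crosses_zero = lo <= 0.0 <= hi
print(f"paired delta = {delta:+.5f}, CI = [{lo:+.5f}, {hi:+.5f}], "
      f"crosses zero: {crosses_zero}")
```

Because both runs of a seed share the same random init, the per-seed difference is much less noisy than either run alone, which is why the paired interval is far narrower than the two per-config intervals.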

3. Theorem 3 submodularity assumption made load-bearing

(Paper edit on disk for Overleaf sync.) Paper Theorem 3 now carries the
monotone-submodularity assumption inline in the theorem statement, not
buried in a remark, with an explicit note that real FL accuracy is not
in general provably submodular.

4. Abstract reframed

(Paper edit on disk.) The abstract now leads with the theorem-validation
and synthetic real-FL evidence, presents MIMIC-III as supporting
clinical evidence with an explicit PhysioNet DUA reproducibility note,
and adds the proven-vs-adaptive paired-CI result above.

5. FL framing tightened

(Paper edit on disk.) Conclusion, real-FL section, and software-
availability footer rewritten to consistently frame FairSwarm as
operating at the coalition-selection layer of FL — distinct from
the underlying FedAvg training procedure, which is unchanged.

Sprint 7 - Vectorized swarm-update executor

New executor="vectorized" backend in FairSwarmConfig and
ParticleExecutor. The full per-iteration swarm update — velocity,
position, sigmoid bounding, clip, personal-best comparison — collapses
into a single (P, n) NumPy op, amortizing per-particle Python
dispatch overhead across the whole swarm.

This is the SOTA-aligned approach for swarm methods on commodity CPUs:
the per-particle work is regular and identical, so SIMD-friendly batched
arithmetic dominates over thread / process pool launch overhead at any
non-trivial swarm size.
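
To make the batching idea concrete, here is a generic PSO-style step written over a whole (P, n) swarm at once. It is a sketch under stated assumptions, not FairSwarm's update rule: the function vectorized_step, the coefficients w, c1, c2, v_max, the order of the sigmoid and clip steps, and the toy fitness are all hypothetical. It also assumes the fitness can be evaluated for all particles in one call, which the library's fitness functions may not support.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def vectorized_step(pos, vel, pbest_pos, pbest_fit, gbest_pos, fitness_fn, rng,
                    w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One batched PSO-style update; pos, vel, pbest_pos have shape (P, n)."""
    P, n = pos.shape
    # A single generator draws the random factors for the whole swarm.
    r1 = rng.random((P, n))
    r2 = rng.random((P, n))
    # Velocity: inertia + cognitive pull + social pull.
    # gbest_pos has shape (n,) and broadcasts across all P rows.
    vel = w * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)
    vel = np.clip(vel, -v_max, v_max)
    # Position: the sigmoid keeps every coordinate inside (0, 1).
    pos = sigmoid(pos + vel)
    # Personal bests: evaluate all particles, keep rows that improved.
    fit = fitness_fn(pos)  # shape (P,)
    improved = fit > pbest_fit
    pbest_pos = np.where(improved[:, None], pos, pbest_pos)
    pbest_fit = np.where(improved, fit, pbest_fit)
    return pos, vel, pbest_pos, pbest_fit


def toy_fitness(x):
    # Hypothetical objective: higher is better, peak at 0.5 in every coordinate.
    return -np.sum((x - 0.5) ** 2, axis=1)


rng = np.random.default_rng(0)
P, n = 30, 50
pos = rng.random((P, n))
vel = np.zeros((P, n))
pbest_pos, pbest_fit = pos.copy(), toy_fitness(pos)
initial_best_fit = pbest_fit.copy()

for _ in range(30):
    gbest_pos = pbest_pos[np.argmax(pbest_fit)]
    pos, vel, pbest_pos, pbest_fit = vectorized_step(
        pos, vel, pbest_pos, pbest_fit, gbest_pos, toy_fitness, rng
    )

print("best fitness after 30 iterations:", pbest_fit.max())
```

There is no Python loop over particles anywhere in the step; each line touches all P rows in one array operation, which is where the dispatch overhead is saved.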

Benchmark (experiments/bench_vectorized.py, 30 iterations,
3 repeats, commodity CPU):

P     n     serial      threads (4w)   vectorized
30    50    268 ms      1407 ms        299 ms
200   50    7900 ms     15696 ms       7721 ms
30    200   2796 ms     4305 ms        2806 ms
200   200   13307 ms    22922 ms       7821 ms

1.70× speedup vs serial at P=200 / n=200, with no threading overhead.
At the three smaller configurations vectorized is roughly on par with
serial (0.90× to 1.02×), and it beats the 4-worker thread pool in every
row. (Threads underperform here because the existing fitness eval does
not release the GIL enough for the pool to overcome dispatch overhead;
this is a separate optimization tracked for v0.7.)
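
The speedup figures follow directly from the benchmark table above. A short check of the arithmetic, using only the timings listed there (the dictionary layout is just for illustration):

```python
# Timings in milliseconds, copied from the benchmark table above.
timings = {
    (30, 50): {"serial": 268, "threads": 1407, "vectorized": 299},
    (200, 50): {"serial": 7900, "threads": 15696, "vectorized": 7721},
    (30, 200): {"serial": 2796, "threads": 4305, "vectorized": 2806},
    (200, 200): {"serial": 13307, "threads": 22922, "vectorized": 7821},
}

# Speedup of vectorized relative to serial; values above 1 mean vectorized is faster.
speedups = {
    size: round(t["serial"] / t["vectorized"], 2) for size, t in timings.items()
}

for (P, n), s in speedups.items():
    print(f"P={P:>3} n={n:>3}  vectorized vs serial: {s:.2f}x")
```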

Determinism: vectorized mode is deterministic given a seed but is
not bit-exact with serial mode (single iteration-level RNG vs
per-particle SeedSequence streams). Documented in the module
docstring and pinned by the new test class.
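
The following sketch illustrates why two seeding schemes can each be reproducible yet disagree with one another. It uses only NumPy's public random API; the function names serial_style_draws and vectorized_style_draws are hypothetical and are not how the library's executors are actually structured.

```python
import numpy as np


def serial_style_draws(seed, P, n):
    # One child SeedSequence, and hence one independent Generator, per particle.
    children = np.random.SeedSequence(seed).spawn(P)
    return np.stack([np.random.default_rng(child).random(n) for child in children])


def vectorized_style_draws(seed, P, n):
    # A single Generator draws the whole (P, n) block in one call.
    return np.random.default_rng(seed).random((P, n))


seed, P, n = 123, 4, 3

# Each scheme reproduces itself exactly from the same seed...
serial_repeatable = np.array_equal(
    serial_style_draws(seed, P, n), serial_style_draws(seed, P, n)
)
vectorized_repeatable = np.array_equal(
    vectorized_style_draws(seed, P, n), vectorized_style_draws(seed, P, n)
)
# ...but the two schemes consume different random streams.
schemes_match = np.array_equal(
    serial_style_draws(seed, P, n), vectorized_style_draws(seed, P, n)
)

print(serial_repeatable, vectorized_repeatable, schemes_match)
```

Since the random factors differ, trajectories diverge from the first iteration, so comparisons across executors need a fitness tolerance instead of exact equality.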

Tests: +5 new tests in
tests/test_parallel.py::TestVectorizedExecutor covering valid
result, determinism, competitiveness with serial within 0.05 fitness,
config validation, and the n_workers>1 warning. Total: 841 passing,
1 skipped, 0 failing.

Cumulative diff v0.5.0 → v0.6.0

  • Tests: 836 → 841 passing (+5 new)
  • Executor backends: 3 (serial, threads, processes) → 4 (+ vectorized)
  • Defense-risk items from the audit: 5 → 0

Install

pip install fairswarm==0.6.0

Reproducibility

git clone https://github.com/dataeducator/fairswarm-library.git
cd fairswarm-library
pip install -e ".[dev]"
make reproduce
git diff results/paper_figures/  # should be empty

Vectorized-executor benchmark:

python experiments/bench_vectorized.py

Proven-vs-adaptive paired sweep:

python experiments/run_proven_config.py

Reference

T. Norwood, D. Das, P. Chatterjee, E. Bentley, and U. Ghosh,
"FairSwarm: Trustworthy Coalition Selection for Fair and Secure
Federated Intelligence," IEEE Trans. Consum. Electron., 2026 (Submitted).