Add hello-numpy-robust example with optional median aggregation and poisoned-client simulation by rwilliamspbg-ops · Pull Request #4373 · NVIDIA/NVFlare

rwilliamspbg-ops · 2026-03-27T15:22:07Z

Summary

Add hello-world NumPy robust aggregation example
Compare default FedAvg vs median aggregation under poisoned-client simulation
Add focused custom-aggregator tests
Add docs and register example in hello-world README

Scope

Example-level contribution only
No core NVFlare API changes

Validation

Unit tests pass for custom median aggregator
Simulator runs complete for both default and median modes
README includes output snapshot showing robustness behavior

greptile-apps · 2026-03-27T15:26:52Z

Greptile Summary

This PR adds a hello-numpy-robust example that extends the existing NumPy hello-world recipe with a pluggable MedianAggregator and a simple Byzantine-poisoning simulation, demonstrating the robustness advantage of element-wise median aggregation over vanilla FedAvg. The change is entirely example-level with no modifications to NVFlare core APIs.

Key observations:

The MedianAggregator implementation and its tests are clean and correctly demonstrate the expected robustness pattern.
Three issues flagged in prior review threads remain unaddressed in the current code: (1) aggregate_model() silently returns an empty FLModel instead of raising when no client models are present; (2) aggregate_model() does not reset internal state per the base-class docstring contract; (3) passing --poison_client_name \"\" to job.py drops the empty-string token from train_args, causing argparse in client.py to mis-parse --poison_scale as the value for --poison_client_name — which breaks the documented "disable poisoning" use case from the README.
requirements.txt lists tensorboard as a hard dependency while the README describes TensorBoard tracking as optional (see inline comment).

Confidence Score: 4/5

Merging is blocked by the previously-flagged but still-unresolved P1 defect where --poison_client_name "" breaks argument parsing in the documented disable poisoning example.

Three P1 issues from prior review rounds remain unaddressed in the current code (silent empty return in aggregate_model, missing state reset in aggregate_model, and broken empty-string argument passing in job.py). These are real defects in the example paths documented by the README. Because they are unresolved, a score of 4/5 is appropriate rather than the default 5/5 for P2-only findings.

examples/hello-world/hello-numpy-robust/job.py (empty poison_client_name argument handling) and examples/hello-world/hello-numpy-robust/custom_aggregators.py (aggregate_model silent return and missing state reset).

Important Files Changed

Filename	Overview
examples/hello-world/hello-numpy-robust/custom_aggregators.py	Implements MedianAggregator correctly for the happy path, but aggregate_model does not reset internal state per the base-class docstring contract, and the empty-input path silently returns an empty FLModel rather than raising — both previously flagged in review threads and still unaddressed.
examples/hello-world/hello-numpy-robust/job.py	Orchestrates the simulation cleanly; however, the f-string construction of train_args loses the empty-string token when --poison_client_name "" is passed, breaking the documented "disable poisoning" use case (previously flagged, still unaddressed in current code).
examples/hello-world/hello-numpy-robust/client.py	Clean FL client with mock training, optional poisoning injection, and SummaryWriter metrics; logic is straightforward and self-consistent.
examples/hello-world/hello-numpy-robust/tests/test_custom_aggregators.py	Covers the three key scenarios (outlier reduction, params_type mismatch, reset); test_reset_stats_clears_state validates the silent-empty-return behaviour that was previously flagged as problematic.
examples/hello-world/hello-numpy-robust/requirements.txt	tensorboard listed as a hard dependency while the README describes it as optional; minor inconsistency that forces unnecessary installation.
examples/hello-world/hello-numpy-robust/README.md	Good demo documentation with expected output snapshot; the Disable poisoning code sample relies on the empty-string argument-passing behaviour that is currently broken.
examples/hello-world/README.md	Adds a concise entry for the new hello-numpy-robust example in the correct section; no issues.

Sequence Diagram

sequenceDiagram
    participant U as User (job.py)
    participant R as NumpyFedAvgRecipe
    participant S as Server (ScatterAndGather)
    participant A as Aggregator (Default or MedianAggregator)
    participant C as Clients (client.py x N)

    U->>R: NumpyFedAvgRecipe(aggregator=...)
    U->>R: recipe.execute(SimEnv)
    loop Each round
        S->>C: Broadcast global model
        C->>C: train(params) → maybe_poison_update()
        C->>S: Send FLModel (FULL or DIFF)
        S->>A: accept_model(client_model) x N
        S->>A: aggregate_model()
        A-->>S: FLModel (median or weighted-avg)
        S->>S: reset() → reset_stats()
    end
    S-->>U: run.get_result() / run.get_status()

_{Reviews (2): Last reviewed commit: "Merge branch 'main' into feat/hello-nump..." | Re-trigger Greptile}

greptile-apps · 2026-03-27T15:26:56Z

+    def aggregate_model(self) -> FLModel:
+        if not self._client_models:
+            self.error("No client models to aggregate.")
+            return FLModel(params={})


Silent empty return masks aggregation data preparation failures

When _client_models is empty, the method logs an error and returns FLModel(params={}). This silent empty return can propagate downstream — the server persistor or controller receives an empty model dict and may overwrite a previously-good checkpoint with empty params, or silently skip aggregation without surfacing the root cause.

Per the project's error-handling convention, an empty aggregation input is an exceptional state that should fail loudly with a descriptive ValueError:

Suggested change

def aggregate_model(self) -> FLModel:

if not self._client_models:

self.error("No client models to aggregate.")

return FLModel(params={})

def aggregate_model(self) -> FLModel:

if not self._client_models:

raise ValueError(

"MedianAggregator.aggregate_model called with no accepted client models. "

"Ensure at least one client submitted a model before aggregation."

)

This surfaces the problem immediately at the aggregation step rather than letting an empty model silently propagate.

Rule Used: When data_loader might be empty during loss comput... (source)

Learnt From
NVIDIA/NVFlare#4001

greptile-apps · 2026-03-27T15:26:57Z

+    def aggregate_model(self) -> FLModel:
+        if not self._client_models:
+            self.error("No client models to aggregate.")
+            return FLModel(params={})
+
+        aggregated_params = {}
+        for key in self._client_models[0].keys():
+            stacked = np.stack([m[key] for m in self._client_models], axis=0)
+            aggregated_params[key] = np.median(stacked, axis=0)
+
+        return FLModel(params=aggregated_params, params_type=self._params_type)


aggregate_model() does not reset internal state

The ModelAggregator base class docstring for aggregate_model states it "needs to implement aggregation logic and reset any internal stats". The current implementation aggregates but leaves _client_models populated, so a second call to aggregate_model() (without an intervening reset_stats()) would re-aggregate the same client updates.

While the framework's ScatterAndGather controller calls reset() → reset_stats() between rounds, custom orchestration or direct unit-test usage could hit a double-aggregate bug. Consider clearing state at the end of aggregate_model():

def aggregate_model(self) -> FLModel: ... result = FLModel(params=aggregated_params, params_type=self._params_type) self.reset_stats() # clear accumulated state per base-class contract return result

greptile-apps · 2026-03-27T15:26:58Z

+    train_args = (
+        f"--update_type {args.update_type} "
+        f"--poison_client_name {args.poison_client_name} "
+        f"--poison_scale {args.poison_scale}"
+    )


Empty poison_client_name breaks client argument parsing

When --poison_client_name "" is passed to job.py (the documented way to disable poisoning), args.poison_client_name is an empty string. The f-string produces a train_args string where --poison_client_name is followed by nothing before --poison_scale. When the script runner tokenizes that string, --poison_scale becomes the next positional token and argparse in client.py will raise an error because it expects a value for --poison_client_name, not another flag.

The documented "disable poisoning" invocation from the README therefore fails at runtime.

To properly pass an empty string as an argument token, wrap the value with shlex.quote:

import shlex train_args = ( f"--update_type {args.update_type} " f"--poison_client_name {shlex.quote(args.poison_client_name)} " f"--poison_scale {args.poison_scale}" )

shlex.quote("") produces '', which the shell/tokenizer passes correctly as an empty-string argument to the client.

Copilot

Pull request overview

Adds a new hello-world NumPy example demonstrating robustness to poisoned/outlier clients by optionally switching from default FedAvg aggregation to an element-wise median aggregator, with accompanying docs and a small unit test.

Changes:

Introduce hello-numpy-robust example (job launcher, client script, custom median aggregator) with optional poisoned-client simulation.
Add example README and register the example in examples/hello-world/README.md.
Add targeted unit tests for the custom median aggregator.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
examples/hello-world/hello-numpy-robust/custom_aggregators.py	Adds `MedianAggregator` implementation for robust aggregation.
examples/hello-world/hello-numpy-robust/job.py	Adds simulator job entrypoint with `--aggregator` and poisoning controls.
examples/hello-world/hello-numpy-robust/client.py	Adds client training loop with optional poisoned-update scaling and metrics logging.
examples/hello-world/hello-numpy-robust/requirements.txt	Declares dependencies for running the new example.
examples/hello-world/hello-numpy-robust/README.md	Documents running baseline vs median aggregation and poisoning toggles.
examples/hello-world/hello-numpy-robust/tests/test_custom_aggregators.py	Adds unit tests for median aggregation behavior and error handling.
examples/hello-world/README.md	Registers and documents the new example in the hello-world index.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-27T15:27:45Z

+from custom_aggregators import MedianAggregator
+
+from nvflare.app_common.abstract.fl_model import FLModel, ParamsType
+from nvflare.app_common.np.constants import NPConstants


The test imports MedianAggregator via from custom_aggregators import MedianAggregator, but when running pytest from the repo root (default), the example directory isn’t on sys.path, so this will raise ModuleNotFoundError. Add a small sys.path adjustment in the test (e.g., insert the example root directory), or package the example so the module can be imported reliably under the repo’s test runner.

Copilot · 2026-03-27T15:27:45Z

+Disable poisoning:
+
+```bash
+python job.py --aggregator median --poison_client_name ""


The “Disable poisoning” command uses --poison_client_name "", but job.py forwards args to the client by building a whitespace-split string; an empty value will be dropped, causing --poison_client_name to have no argument and the client script to fail argparse parsing. Update the instructions (and corresponding code) to use a safe disable mechanism (e.g., --poison_client_name none or --no-poison).

Suggested change

python job.py --aggregator median --poison_client_name ""

python job.py --aggregator median --poison_client_name none

Copilot · 2026-03-27T15:27:45Z

+    train_args = (
+        f"--update_type {args.update_type} "
+        f"--poison_client_name {args.poison_client_name} "
+        f"--poison_scale {args.poison_scale}"
+    )


--poison_client_name is documented as “use empty string to disable”, but train_args is built via string concatenation. When poison_client_name is empty, TaskScriptRunner later splits args on whitespace, producing --poison_client_name --poison_scale ... (missing value) and the client argparse will fail. Use a boolean flag (e.g., --poison/--no-poison) or a sentinel value (e.g., none) and only include --poison_client_name <value> in train_args when a non-empty value is intended; update client defaults accordingly.

Suggested change

train_args = (

f"--update_type {args.update_type} "

f"--poison_client_name {args.poison_client_name} "

f"--poison_scale {args.poison_scale}"

)

train_args_parts = [

f"--update_type {args.update_type}",

f"--poison_scale {args.poison_scale}",

]

# Only include poison_client_name if non-empty; empty string disables poisoning.

if args.poison_client_name != "":

train_args_parts.insert(1, f"--poison_client_name {args.poison_client_name}")

train_args = " ".join(train_args_parts)

chesterxgchen · 2026-03-29T04:35:40Z

@@ -0,0 +1,67 @@
+# Hello NumPy Robust Aggregation


Please follow the readme.md from hello-pt for readme structure (https://github.com/NVIDIA/NVFlare/blob/main/examples/hello-world/hello-pt/README.md)

installation
data
code structure
client code
server side
Job Recipe
Run Job
Result
etc.

For the code structure,

please follow
client.py
model.py ( if exists)
job.py

format when possible

Added fixes

chesterxgchen · 2026-03-29T15:18:35Z

@rwilliamspbg-ops Numpy Recipe was a misstep we made, We did not have time to fix in 2.7.2 as it was discovered late close to the release date. As the FedAvg recipe itself should cover Numpy, there should no need for a separate numpyfedavg recipe. We were plan to remove it and fold into FedAvg or simply remove NumPy Example all togather.

If you think your example is important to expand the FedAvg algorithms, please consider add it to the PyTorch FedAvg example such as hello-pt or hello-lightning for pytorch lighnining, or in the examples/advanced examples

rwilliamspbg-ops · 2026-03-29T15:27:11Z

sounds good

Add robust NumPy hello-world example with median aggregation

1fc7fd2

Copilot AI review requested due to automatic review settings March 27, 2026 15:22

Copilot started reviewing on behalf of rwilliamspbg-ops March 27, 2026 15:22 View session

greptile-apps Bot reviewed Mar 27, 2026

View reviewed changes

Copilot AI reviewed Mar 27, 2026

View reviewed changes

Merge branch 'main' into feat/hello-numpy-robust-aggregation

672cd06

chesterxgchen reviewed Mar 29, 2026

View reviewed changes

rwilliamspbg-ops closed this Mar 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add hello-numpy-robust example with optional median aggregation and poisoned-client simulation#4373

Add hello-numpy-robust example with optional median aggregation and poisoned-client simulation#4373
rwilliamspbg-ops wants to merge 2 commits intoNVIDIA:mainfrom
rwilliamspbg-ops:feat/hello-numpy-robust-aggregation

rwilliamspbg-ops commented Mar 27, 2026

Uh oh!

greptile-apps Bot commented Mar 27, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot Mar 27, 2026

Uh oh!

greptile-apps Bot Mar 27, 2026

Uh oh!

greptile-apps Bot Mar 27, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

Copilot AI Mar 27, 2026

Uh oh!

chesterxgchen Mar 29, 2026 •

edited

Loading

Uh oh!

rwilliamspbg-ops Mar 29, 2026

Uh oh!

chesterxgchen commented Mar 29, 2026

Uh oh!

rwilliamspbg-ops commented Mar 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	python job.py --aggregator median --poison_client_name ""
	python job.py --aggregator median --poison_client_name none

-    train_args = (
-        f"--update_type {args.update_type} "
-        f"--poison_client_name {args.poison_client_name} "
-        f"--poison_scale {args.poison_scale}"
-    )
+    train_args_parts = [
+        f"--update_type {args.update_type}",
+        f"--poison_scale {args.poison_scale}",
+    ]
+    # Only include poison_client_name if non-empty; empty string disables poisoning.
+    if args.poison_client_name != "":
+        train_args_parts.insert(1, f"--poison_client_name {args.poison_client_name}")
+    train_args = " ".join(train_args_parts)

Conversation

rwilliamspbg-ops commented Mar 27, 2026

Summary

Scope

Validation

Uh oh!

greptile-apps Bot commented Mar 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 27, 2026

Choose a reason for hiding this comment

Uh oh!

chesterxgchen Mar 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rwilliamspbg-ops Mar 29, 2026

Choose a reason for hiding this comment

Uh oh!

chesterxgchen commented Mar 29, 2026

Uh oh!

rwilliamspbg-ops commented Mar 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

greptile-apps Bot commented Mar 27, 2026 •

edited

Loading

chesterxgchen Mar 29, 2026 •

edited

Loading