-
Notifications
You must be signed in to change notification settings - Fork 229
Add cross-site evaluation utility and examples #3923
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Greptile SummaryThis PR adds cross-site evaluation (CSE) utility and examples to NVFlare, replacing the previous approach from PR #3895 and integrating with the Recipe API from PR #3695. Key changes:
Architecture: Minor issue: Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant User
participant Recipe
participant Server
participant ModelLocator
participant CSEController as CrossSiteModelEval
participant Client1
participant Client2
participant Validator
User->>Recipe: Create recipe (FedAvg or CSE-only)
User->>Recipe: add_cross_site_evaluation()
Note over Recipe: Adds ModelLocator, CrossSiteModelEval,<br/>ValidationJsonGenerator to server
User->>Recipe: execute(env)
Recipe->>Server: Initialize job
Note over Server,Client2: Training Phase (if FedAvg mode)
Server->>Client1: Training tasks
Server->>Client2: Training tasks
Client1-->>Server: Updated models
Client2-->>Server: Updated models
Note over Server,Client2: Cross-Site Evaluation Phase
Server->>CSEController: Start CSE workflow
CSEController->>ModelLocator: Get models to evaluate
ModelLocator-->>CSEController: Return model references
CSEController->>Client1: Submit models for validation
CSEController->>Client2: Submit models for validation
Client1->>Validator: Validate all models on local data
Client2->>Validator: Validate all models on local data
Validator-->>Client1: Evaluation metrics
Validator-->>Client2: Evaluation metrics
Client1-->>CSEController: Return validation results
Client2-->>CSEController: Return validation results
CSEController->>Server: Aggregate cross-site results
Server->>Server: ValidationJsonGenerator saves results
Server-->>User: Job complete with CSE matrix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (3)
-
examples/hello-world/hello-numpy-cross-val/job.py, line 90 (link)logic:
ValidationJsonGeneratoris NOT added automatically when using plainFedJob. It's only added automatically byBaseFedJob. You need to explicitly add it here. -
examples/hello-world/hello-pt/README.md, line 32 (link)syntax: Typo: "traiing" should be "training"
-
examples/hello-world/hello-pt/README.md, line 48 (link)syntax: Misplaced
#character after comma - should be a space
9 files reviewed, 3 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a cross-site evaluation utility function and examples to NVFlare's Recipe API, enabling users to easily evaluate models across different client sites without sharing data.
Key Changes:
- Added
add_cross_site_evaluation()utility function tonvflare/recipe/utils.pyfor programmatically enabling cross-site model evaluation - Added cross-site evaluation support to the PyTorch hello-world example with a
--cross_site_evalcommand-line flag - Unified the NumPy cross-validation example into a single
job.pywith two modes: standalone CSE and training+CSE
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 24 comments.
Show a summary per file
| File | Description |
|---|---|
| nvflare/recipe/utils.py | Adds add_cross_site_evaluation() utility and MODEL_LOCATOR_REGISTRY for PyTorch and NumPy |
| examples/hello-world/hello-pt/job.py | Adds cross-site evaluation support via --cross_site_eval flag and --train_script parameter |
| examples/hello-world/hello-pt/README.md | Updates documentation with cross-site evaluation instructions and usage examples |
| examples/hello-world/hello-numpy-cross-val/job.py | Replaces multiple job scripts with unified implementation supporting pretrained and training modes |
| examples/hello-world/hello-numpy-cross-val/client.py | Adds NumPy training client script with training and evaluation functions |
| examples/hello-world/hello-numpy-cross-val/generate_pretrain_models.py | Updates comment to clarify cross-site evaluation terminology |
| examples/hello-world/hello-numpy-cross-val/README.md | Comprehensive rewrite documenting both CSE modes and Recipe API usage patterns |
| examples/hello-world/hello-numpy-cross-val/job_train_and_cse.py | Deleted - functionality consolidated into unified job.py |
| examples/hello-world/hello-numpy-cross-val/job_cse.py | Deleted - functionality consolidated into unified job.py |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (1)
-
nvflare/recipe/utils.py, line 107-108 (link)logic:
comp_idsdoesn't exist on plainFedJobobjects - only onBaseFedJob.NumpyFedAvgRecipeuses plainFedJobwhich will causeAttributeErrorat runtime when callingadd_cross_site_evaluationwithpersistor_id=None.
9 files reviewed, 1 comment
|
/build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (2)
-
nvflare/recipe/utils.py, line 39-50 (link)style:
CLIENT_MODEL_DIRconstant is defined but never used in the PR. Therun_pretrained_cse()function in job.py referencesCLIENT_MODEL_DIRdirectly without using this registry value. -
examples/hello-world/hello-numpy-cross-val/job.py, line 38-39 (link)style:
CLIENT_MODEL_DIRis defined but never used anywhere in the code.
9 files reviewed, 2 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (2)
-
nvflare/recipe/utils.py, line 79 (link)style: parameter should have type annotation for consistency with other parameters
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
-
examples/hello-world/hello-numpy-cross-val/README.md, line 140 (link)syntax: Missing import statement for SimEnv
9 files reviewed, 2 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (1)
-
examples/hello-world/hello-numpy-cross-val/job.py, line 39 (link)style:
CLIENT_MODEL_DIRis defined but never used in the code
9 files reviewed, 1 comment
|
/build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (5)
-
nvflare/app_common/np/recipes/__init__.py, line 15 (link)syntax:
NumpyCrossSiteEvalRecipeis imported but the implementation filecross_site_eval.pydoesn't exist in this PR, which will causeImportErrorat runtime. -
nvflare/app_common/np/recipes/__init__.py, line 18 (link)syntax:
NumpyCrossSiteEvalRecipeis exported but doesn't exist, causing runtime errors. -
examples/hello-world/hello-numpy-cross-val/job.py, line 29 (link)syntax:
NumpyCrossSiteEvalRecipeis imported but doesn't exist innvflare/app_common/np/recipes/, causingImportError.Either add the missing
cross_site_eval.pyimplementation file or use manual job construction as shown in the deletedjob_cse.pyfile. -
examples/hello-world/hello-numpy-cross-val/README.md, line 107-109 (link)style: Documentation states to "add cross-site validation with one line" but the code example requires 3 additional lines (importing NPValidator, creating it, and adding it to clients) before calling
add_cross_site_evaluation(). Update to clarify the NumPy-specific setup required. -
nvflare/recipe/utils.py, line 166 (link)style:
formatter_idparameter missing fromCrossSiteModelEval. The old implementation (deletedjob_cse.py) includedNPFormatter. Verify if formatter is needed for numpy models.
10 files reviewed, 5 comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (3)
-
nvflare/app_common/np/recipes/__init__.py, line 15 (link)syntax: the file
cross_site_eval.pydoesn't exist in thenvflare/app_common/np/recipes/directory, causing import failure -
nvflare/app_common/np/recipes/__init__.py, line 18 (link)syntax: remove
NumpyCrossSiteEvalRecipesince the class doesn't exist -
examples/hello-world/hello-numpy-cross-val/job.py, line 29 (link)logic:
NumpyCrossSiteEvalRecipedoesn't exist, will cause runtime error when using--mode pretrained. Is this recipe defined in PR #3695 or should it be added in this PR?
10 files reviewed, 3 comments
|
/build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional Comments (1)
-
examples/hello-world/hello-numpy-cross-val/client.py, line 1 (link)syntax: copyright year should be 2025, not 2026
13 files reviewed, 1 comment
Errors in /home/runner/work/NVFlare/NVFlare/examples/hello-world/hello-numpy-cross-val/README.md
|
|
thanks for catching that, should be fixed now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14 files reviewed, 1 comment
|
/build |
holgerroth
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR adds cross-site evaluation (CSE) utilities and examples to NVFlare, building on previous work in PR #3695. The implementation introduces a new NumpyCrossSiteEvalRecipe for standalone CSE workflows and an add_cross_site_evaluation() utility function to add CSE to existing training recipes.
Key Changes:
- Added
NumpyCrossSiteEvalRecipeclass for standalone cross-site evaluation with pre-trained NumPy models - Added
add_cross_site_evaluation()utility function innvflare/recipe/utils.pyto augment any recipe with CSE capabilities - Created comprehensive examples for both NumPy and PyTorch showing standalone CSE and training+CSE workflows
- Updated hello-pt example to support CSE via
--cross_site_evalflag - PyTorch client now properly handles evaluation-only tasks with
flare.is_evaluate()branch - Updated documentation and Jupyter notebooks with CSE usage examples
Minor Issues Found:
- Copyright year inconsistency in
client.py(2026 vs 2025 in other files) - API consistency:
add_cross_site_evaluationnot exported fromnvflare.recipe.__init__.pywhileadd_experiment_trackingis
The implementation follows good architectural patterns with proper separation of concerns, clear documentation, and safe model loading (allow_pickle=False). The utility function design is extensible with a registry pattern for different model locator types.
Confidence Score: 4/5
- This PR is safe to merge with only minor style improvements recommended
- The code is well-architected with proper error handling, security practices (no pickle), and comprehensive documentation. The only issues found are minor style inconsistencies (copyright year typo and import export pattern). Core functionality is sound with proper separation between training and validation workflows, safe fallback logic for missing attributes, and clear API patterns.
examples/hello-world/hello-numpy-cross-val/client.pyneeds copyright year correction from 2026 to 2025. Consider exportingadd_cross_site_evaluationfromnvflare/recipe/__init__.pyfor consistency.
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| nvflare/app_common/np/recipes/cross_site_eval.py | 5/5 | New recipe class for standalone cross-site evaluation with NumPy models. Well-structured with clear docstrings and proper component initialization. |
| nvflare/recipe/utils.py | 4/5 | Added add_cross_site_evaluation utility function with model locator registry. Good abstraction, but could improve API consistency by exporting from init.py. |
| examples/hello-world/hello-numpy-cross-val/client.py | 4/5 | Training script for NumPy cross-site eval example. Has copyright year typo (2026 should be 2025). |
| examples/hello-world/hello-numpy-cross-val/job.py | 5/5 | Unified script supporting both standalone CSE and training+CSE modes. Clean implementation following recipe API patterns. |
| examples/hello-world/hello-pt/job.py | 5/5 | Added cross-site evaluation support via command-line flag. Clean integration with existing recipe. |
| examples/hello-world/hello-pt/client.py | 5/5 | Added is_evaluate() branch for cross-site evaluation tasks. Proper separation of training and evaluation logic. |
Sequence Diagram
sequenceDiagram
participant User
participant Recipe
participant Server
participant ModelLocator
participant Client1
participant Client2
Note over User,Client2: Cross-Site Evaluation Workflow
User->>Recipe: add_cross_site_evaluation(recipe, model_locator_type)
Recipe->>Server: Add ModelLocator component
Recipe->>Server: Add CrossSiteModelEval controller
Recipe->>Server: Add ValidationJsonGenerator
Note over User,Client2: Job Execution Starts
Server->>ModelLocator: get_model_names()
ModelLocator-->>Server: List of model names
loop For each model
Server->>ModelLocator: locate_model(model_name)
ModelLocator-->>Server: Model DXO
Server->>Server: Save model to cross_val_models_dir
end
Note over Server,Client2: Distribute models for validation
Server->>Client1: Send TASK_VALIDATION (model_1)
Server->>Client2: Send TASK_VALIDATION (model_1)
Client1->>Client1: NPValidator.execute() or is_evaluate()
Client1->>Client1: Validate model on local data
Client1-->>Server: Return validation metrics
Client2->>Client2: NPValidator.execute() or is_evaluate()
Client2->>Client2: Validate model on local data
Client2-->>Server: Return validation metrics
Server->>Server: Collect results in validation matrix
Note over Server,Client2: Repeat for all models
Server->>Server: ValidationJsonGenerator saves cross_val_results.json
Server-->>User: Job complete with validation matrix
| @@ -0,0 +1,93 @@ | |||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copyright year should be 2025, not 2026 (inconsistent with all other files in this PR which use 2025)
| def add_cross_site_evaluation( | ||
| recipe: Recipe, | ||
| model_locator_type: str = "pytorch", | ||
| model_locator_config: Optional[dict] = None, | ||
| persistor_id: Optional[str] = None, | ||
| submit_model_timeout: int = 600, | ||
| validation_timeout: int = 6000, | ||
| ): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider exporting add_cross_site_evaluation from nvflare.recipe.__init__.py for consistency with add_experiment_tracking (which is exported). This would make the API more consistent and allow users to import both utilities from the same location: from nvflare.recipe import add_experiment_tracking, add_cross_site_evaluation
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
|
/build |
Add cross-site evaluation utility and examples.
Description
Instead of #3895, this takes into account #3695 for adding cross-site evaluation utility and examples.
Types of changes
./runtest.sh.