Add heterogeneous-rank HLoRA enhancement to MedGemma example#4424
holgerroth merged 19 commits into NVIDIA:main from
Conversation
@greptileai review this PR
Greptile Summary

This PR extends the MedGemma federated fine-tuning example with heterogeneous-rank HLoRA support.

Confidence Score: 5/5. Safe to merge: all remaining findings are P2 style/cleanup suggestions with no correctness or data-integrity impact. The HLoRA math (compact QR+SVD factorization), the rank-truncation slicing, and the aggregator's zero-weight guard are all correct. Prior P0/P1 concerns (missing `os` import, silent empty-model broadcast, unreachable `RuntimeError`) have been addressed in earlier commits. The three remaining comments are minor inconsistencies that do not affect runtime behavior. No files require special attention; `data_utils.py` and `run_evaluation.py` have minor P2 style notes.
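The zero-weight guard the review mentions can be sketched as follows. This is an illustrative fragment under assumed names (`weighted_average` is not necessarily the example's actual function), showing why an aggregator should fail loudly rather than divide by a zero total weight:

```python
import numpy as np

def weighted_average(tensors, weights):
    """Weighted average of client tensors, guarding against a zero total weight.

    `weights` are typically per-client example counts; if every client reports
    zero examples, dividing by the total would yield NaNs or raise obscurely.
    """
    total = sum(weights)
    if total == 0:
        # Zero-weight guard: refuse to divide by zero instead of silently
        # broadcasting a corrupted model back to the clients.
        raise ValueError("all aggregation weights are zero")
    return sum(w * t for w, t in zip(weights, tensors)) / total

avg = weighted_average([np.ones(3), np.full(3, 3.0)], [1, 3])
print(avg)  # [2.5 2.5 2.5]
```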
Sequence Diagram

```mermaid
sequenceDiagram
    participant S as Server (FedAvgRecipe)
    participant A as NaiveMaxRankAggregator / HLoRAMaxRankAggregator
    participant C1 as site-1 (rank=4)
    participant C2 as site-2 (rank=8)
    participant C3 as site-3 (rank=16)
    S->>C1: Global LoRA bank (rank=16)
    S->>C2: Global LoRA bank (rank=16)
    S->>C3: Global LoRA bank (rank=16)
    Note over C1: truncate_global_bank_for_site → rank=4
    Note over C2: truncate_global_bank_for_site → rank=8
    Note over C3: no truncation needed
    C1->>C1: SFTTrainer (local rank=4)
    C2->>C2: SFTTrainer (local rank=8)
    C3->>C3: SFTTrainer (local rank=16)
    C1->>A: FLModel FULL (rank-4 A/B tensors, num_examples weight)
    C2->>A: FLModel FULL (rank-8 A/B tensors, num_examples weight)
    C3->>A: FLModel FULL (rank-16 A/B tensors, num_examples weight)
    Note over A: Naive: weighted factor avg into rank-16 bank
    Note over A: HLoRA: cat → QR → SVD → project to rank-16
    A->>S: Aggregated FLModel (rank=16)
    S->>C1: Next round global bank (rank=16)
    S->>C2: Next round global bank (rank=16)
    S->>C3: Next round global bank (rank=16)
```
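The `truncate_global_bank_for_site` step in the diagram amounts to slicing each LoRA factor pair down to the site's local rank. A minimal NumPy sketch, assuming the PEFT shape convention (A: `(rank, in_features)`, B: `(out_features, rank)`); the key naming and toy bank are illustrative, not the example's actual code:

```python
import numpy as np

def truncate_global_bank_for_site(global_bank, site_rank):
    """Slice the global LoRA factors down to a site's local rank.

    Assumes the PEFT convention: lora_A is (rank, in_features) and
    lora_B is (out_features, rank), so truncation keeps the leading
    `site_rank` rows of A and the leading `site_rank` columns of B.
    """
    local = {}
    for name, w in global_bank.items():
        if "lora_A" in name:
            local[name] = w[:site_rank, :].copy()
        elif "lora_B" in name:
            local[name] = w[:, :site_rank].copy()
        else:
            local[name] = w.copy()  # non-LoRA tensors pass through unchanged
    return local

# Toy bank: one layer at the global rank of 16
bank = {
    "layer0.lora_A.weight": np.random.randn(16, 64),
    "layer0.lora_B.weight": np.random.randn(64, 16),
}
local = truncate_global_bank_for_site(bank, site_rank=4)
print(local["layer0.lora_A.weight"].shape)  # (4, 64)
print(local["layer0.lora_B.weight"].shape)  # (64, 4)
```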
Reviews (4): Last reviewed commit: "Merge branch 'main' into codex/medgemma-..."
@greptileai, review again to see if the issues were addressed.

@greptileai, review the latest version.
/build |
Summary
This draft PR extends the advanced MedGemma example with a heterogeneous-rank HLoRA workflow on top of the merged federated fine-tuning example.
Changes introduced:
- Two aggregation strategies, naive (`naive`) and HLoRA (`hlora`), configured in `job.py`, with distinct ranks used by default for the 3-client example
- A `--finetune_only` flag added to `run_evaluation.py` for faster repeated checkpoint comparisons

Observed comparison
Using the heterogeneous-rank 3-client layout (ranks 4, 8, 16) on CRC-VAL-HE-7K:

| Accuracy before | Accuracy after | Δ |
| --- | --- | --- |
| 0.8955 (6430/7180) | 0.9414 (6759/7180) | +0.0458 |
| 0.8961 (6434/7180) | 0.9366 (6725/7180) | +0.0405 |

Notes
The `naive` aggregator remains necessary in the heterogeneous-rank setting because plain built-in FedAvg tensor averaging is not shape-safe once sites use different LoRA ranks.
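The HLoRA path in the diagram (cat → QR → SVD → project to rank-16) can be sketched like this. It is a hedged NumPy illustration of the compact QR+SVD factorization the review describes; the function name, weighting scheme, and shapes are assumptions, not the example's actual implementation:

```python
import numpy as np

def hlora_aggregate(As, Bs, weights, r_max):
    """Fuse heterogeneous-rank LoRA updates into one rank-`r_max` pair.

    Client i contributes delta_W_i = B_i @ A_i with A_i: (r_i, d_in) and
    B_i: (d_out, r_i).  The weighted sum of all client updates is
    re-factored to rank r_max via a compact QR + SVD on the stacked factors.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize per-client example counts
    # Fold weights into B so that B_cat @ A_cat == sum_i w_i * B_i @ A_i
    B_cat = np.concatenate([wi * Bi for wi, Bi in zip(w, Bs)], axis=1)
    A_cat = np.concatenate(As, axis=0)
    # cat -> QR -> SVD: decompose the small core R @ A_cat rather than
    # running an SVD on the dense (d_out, d_in) update matrix.
    Q, R = np.linalg.qr(B_cat)
    U, S, Vt = np.linalg.svd(R @ A_cat, full_matrices=False)
    r = min(r_max, len(S))
    sqrt_s = np.sqrt(S[:r])
    B_new = (Q @ U[:, :r]) * sqrt_s   # (d_out, r)
    A_new = sqrt_s[:, None] * Vt[:r]  # (r, d_in)
    return A_new, B_new

# Three sites with local ranks 4, 8, 16, weighted by example counts
rng = np.random.default_rng(0)
d_in, d_out = 32, 20
ranks, counts = (4, 8, 16), (100, 200, 300)
As = [rng.standard_normal((r, d_in)) for r in ranks]
Bs = [rng.standard_normal((d_out, r)) for r in ranks]
A_new, B_new = hlora_aggregate(As, Bs, counts, r_max=16)
print(A_new.shape, B_new.shape)  # (16, 32) (20, 16)
```

Because `Q` has orthonormal columns, `Q @ U @ diag(S) @ Vt` is an SVD of the weighted sum `B_cat @ A_cat`, so keeping the top `r_max` singular values yields the best rank-`r_max` approximation of the aggregate update.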