@jun-simons
Issue

Description

The PyTorch quickstart accumulates running_loss across all local epochs but normalizes it only by len(trainloader). As a result, when local-epochs > 1 the reported train loss metric scales approximately linearly with the number of local epochs, which is misleading.

Related issues/PRs

Fixes #6333

Proposal

Explanation

This PR fixes the normalization by dividing running_loss by (epochs * len(trainloader)), yielding an average loss per batch over the full local training run.

This is a single-line change to the train function, and the same fix is applied in the docs that reproduce this code directly:

  • examples/quickstart-pytorch/pytorchexample/task.py
  • framework/docs/source/tutorial-quickstart-pytorch.rst
  • framework/docs/source/tutorial-series-get-started-with-flower-pytorch.rst

Behavior for local-epochs=1 is unchanged, but the train loss for local-epochs>1 is corrected. This was verified by running one server round with each of local-epochs={1,2,4}: the reported loss no longer scales by 2x/4x and instead remains in the same range.
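The effect of the two normalizations can be illustrated with a minimal, framework-free sketch. It is not the quickstart's actual train function: the helper name `average_train_loss` and the constant per-batch loss of 2.0 are assumptions made purely for illustration, and the nested lists stand in for the batches yielded by trainloader in each local epoch.

```python
def average_train_loss(batch_losses_per_epoch):
    """Return (old_metric, fixed_metric) for per-batch losses grouped by epoch.

    Hypothetical helper for illustration only; each inner list plays the
    role of one pass over trainloader.
    """
    epochs = len(batch_losses_per_epoch)
    num_batches = len(batch_losses_per_epoch[0])  # stands in for len(trainloader)

    # Accumulate the loss over every batch of every local epoch.
    running_loss = 0.0
    for epoch_losses in batch_losses_per_epoch:
        for loss in epoch_losses:
            running_loss += loss

    old_metric = running_loss / num_batches               # normalization before this PR
    fixed_metric = running_loss / (epochs * num_batches)  # normalization after this PR
    return old_metric, fixed_metric


# Assumed toy data: 5 batches per epoch, each with a loss of exactly 2.0.
for epochs in (1, 2, 4):
    losses = [[2.0] * 5 for _ in range(epochs)]
    old, fixed = average_train_loss(losses)
    print(f"local-epochs={epochs}: old={old}, fixed={fixed}")
```

With this toy input the old metric grows with the epoch count while the fixed metric stays at the per-batch average, which mirrors the 2x/4x scaling described above.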

Note that examples/quickstart-pennylane/... already normalizes loss as running_loss / (epochs * len(trainloader)).

Checklist

  • Implement proposed change
  • Write tests
  • Update documentation
  • Make CI checks pass
  • Ping maintainers on Slack (channel #contributions)

Any other comments?

./framework/dev/format.sh currently fails on upstream main due to an unrelated missing copyright notice in framework/py/flwr/app/message_type.py. The failure is reproducible on main at 92b9de10b.

@github-actions bot added the Contributor label (Used to determine what PRs (mainly) come from external contributors.) Jan 7, 2026
@panh99 changed the title from "fix(*:skip) Normalize avg_trainloss in PyTorch quickstart" to "fix(*:skip): Normalize avg_trainloss in PyTorch quickstart" Jan 7, 2026
@jun-simons force-pushed the fix-quickstart-avg-trainloss branch from 2a1198c to 10ab8b3 January 9, 2026 03:19
Development

Successfully merging this pull request may close these issues.

[Bug]: PyTorch Quickstart - avg_trainloss scales with local-epochs > 1