
Conversation

@patrocinio
Contributor

The tutorial was comparing loss on different batches:

  • Pre-training: evaluated on first 64 instances (batch 0)
  • Post-training: evaluated on last batch from training loop

This made the comparison misleading as it wasn't measuring improvement on the same data.

Changes:

  • Save the initial batch (xb_initial, yb_initial) after the first evaluation
  • Use the saved initial batch for the post-training evaluation
  • Add a clarifying comment about the fair comparison
  • Both evaluations now use the same data (the first 64 training instances)

This provides an accurate before/after comparison showing the model's improvement on the same batch of data.
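The corrected pattern can be sketched as below. This is a minimal, self-contained illustration, not the tutorial's actual code: the toy data and linear model stand in for the tutorial's MNIST setup, and only the names xb_initial, yb_initial, loss_func, and bs are taken from the PR description; everything else is assumed for the sketch.

```python
import torch
import torch.nn.functional as F
from torch import nn, optim

torch.manual_seed(0)

# Toy stand-in for the tutorial's MNIST data: 784-dim inputs whose label is
# the argmax of the first 10 features, so a linear model can learn it.
bs = 64  # batch size, as in the tutorial
x_train = torch.randn(256, 784)
y_train = x_train[:, :10].argmax(dim=1)

model = nn.Linear(784, 10)
loss_func = F.cross_entropy
opt = optim.SGD(model.parameters(), lr=0.1)

# Save the first batch so the before/after comparison uses the same data.
xb_initial, yb_initial = x_train[0:bs], y_train[0:bs]
loss_before = loss_func(model(xb_initial), yb_initial).item()

# Minimal training loop over all batches.
for epoch in range(3):
    for i in range(0, len(x_train), bs):
        xb, yb = x_train[i : i + bs], y_train[i : i + bs]
        loss = loss_func(model(xb), yb)
        loss.backward()
        opt.step()
        opt.zero_grad()

# Evaluate on the *saved* initial batch, not on the last batch of the loop.
loss_after = loss_func(model(xb_initial), yb_initial).item()
print(loss_before, loss_after)
```

Evaluating `loss_after` on the last `(xb, yb)` left over from the loop, as the tutorial previously did, would compare losses on different data; reusing the saved batch makes the before/after numbers directly comparable.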

Fixes #3666

Description

Checklist

  • The issue being fixed is referenced in the description (see "Fixes #ISSUE_NUMBER" above)
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request.

@pytorch-bot

pytorch-bot bot commented Nov 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/3667


@meta-cla bot added the cla signed label Nov 27, 2025


Successfully merging this pull request may close these issues.

Feedback about What is torch.nn really?
