Doing the loss reduction in foundry instead of in the loss functions. #1079

ShashankMosaicML · 2024-04-01T22:19:59Z

This PR moves loss reduction out of the loss functions and into foundry, which gives us more flexibility to reduce losses in custom ways.
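As a rough illustration of the idea (not the code from this PR), the sketch below computes unreduced per-token cross-entropy with PyTorch and then applies the reduction by hand. The helper name `reduce_loss_outside`, the tensor shapes, and the `-100` ignore index are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # assumption: PyTorch's default ignore index for masked targets


def reduce_loss_outside(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: per-token losses first, reduction afterwards."""
    batch_size, seq_len, vocab_size = logits.shape
    # reduction="none" returns one loss per token; ignored positions get 0.
    per_token = F.cross_entropy(
        logits.reshape(-1, vocab_size),
        targets.reshape(-1),
        ignore_index=IGNORE_INDEX,
        reduction="none",
    ).reshape(batch_size, seq_len)  # keep the batch dimension for custom reductions
    # Custom reduction: average over the tokens that are not ignored.
    num_valid = (targets != IGNORE_INDEX).sum()
    return per_token.sum() / num_valid


torch.manual_seed(0)
logits = torch.randn(2, 4, 10)          # (batch, sequence, vocab)
targets = torch.randint(0, 10, (2, 4))  # (batch, sequence)
targets[0, -1] = IGNORE_INDEX           # mask one position

manual = reduce_loss_outside(logits, targets)
builtin = F.cross_entropy(
    logits.reshape(-1, 10), targets.reshape(-1), ignore_index=IGNORE_INDEX
)  # default reduction="mean" for comparison
```

With this particular reduction, `manual` should agree with the built-in mean. Keeping the per-token tensor around means the final step can be swapped for something else, such as a per-sample average.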

We see that the changes do not affect the MFU or the convergence for 125M and 7B models.

Memory consumption is also similar.

…mosaicml#1079)

* setting loss_fn reduction to None
* fixing a unit test
* add error message
* adding test to check reduction
* adding test to check reduction
* Update llmfoundry/models/mpt/modeling_mpt.py
  Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
* preserving batch dimension of targets
* minor change

---------

Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>

ShashankMosaicML added 4 commits April 1, 2024 21:52

setting loss_fn reduction to None

d8e4061

fixing a unit test

039caec

add error message

16b5b0d

adding test to check reduction

351434a

ShashankMosaicML marked this pull request as ready for review April 1, 2024 23:40

ShashankMosaicML requested review from vchiley and dakinggg and removed request for vchiley April 1, 2024 23:40

adding test to check reduction

8888689

vchiley reviewed Apr 1, 2024

llmfoundry/models/mpt/modeling_mpt.py (outdated, resolved)

vchiley reviewed Apr 1, 2024

llmfoundry/models/mpt/modeling_mpt.py (outdated, resolved)

ShashankMosaicML and others added 5 commits April 1, 2024 17:20

Update llmfoundry/models/mpt/modeling_mpt.py

882b745

Co-authored-by: Vitaliy Chiley <6439018+vchiley@users.noreply.github.com>

Merge branch 'main' into reduce_loss_in_foundry

c27f444

preserving batch dimension of targets

f651cff

merging

be9a2ca

Merge branch 'main' into reduce_loss_in_foundry

a83d2f8

vchiley approved these changes Apr 2, 2024

minor change

718c358

vchiley approved these changes Apr 2, 2024

ShashankMosaicML enabled auto-merge (squash) April 2, 2024 16:51

vchiley reviewed Apr 2, 2024

llmfoundry/models/mpt/modeling_mpt.py (resolved)

vchiley reviewed Apr 2, 2024

llmfoundry/models/mpt/modeling_mpt.py (resolved)

ShashankMosaicML merged commit 632cb73 into mosaicml:main Apr 2, 2024
9 checks passed
