fix perplexity logging #622

Merged
sichu2023 merged 8 commits into main from sichu/torchmetric-ppl on Jan 22, 2025

Contributor

@sichu2023 sichu2023 commented Jan 17, 2025

Replace the callback method for logging perplexity with torchmetrics.
[Images: two W&B charts, captured 1/17/2025 at 1:46:04 PM and 1:46:09 PM]
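
For readers unfamiliar with the metric-object pattern that torchmetrics uses (`update` per batch, `compute` when logging, `reset` afterwards), here is a minimal pure-Python sketch of a running perplexity. The class `RunningPerplexity` and its inputs are hypothetical and for illustration only; this is not the torchmetrics implementation and not the code in this PR.

```python
import math


class RunningPerplexity:
    """Hypothetical stand-in for a stateful metric (not the torchmetrics class)."""

    def __init__(self) -> None:
        self.total_nll = 0.0  # summed negative log-likelihood over all tokens
        self.num_tokens = 0   # number of tokens counted so far

    def update(self, target_probs: list[float]) -> None:
        """Accumulate one batch, given the model's probability for each target token."""
        self.total_nll += sum(-math.log(p) for p in target_probs)
        self.num_tokens += len(target_probs)

    def compute(self) -> float:
        """Perplexity = exp(mean negative log-likelihood per token)."""
        if self.num_tokens == 0:
            raise ValueError("compute() called before any update()")
        return math.exp(self.total_nll / self.num_tokens)

    def reset(self) -> None:
        """Clear the accumulated state, e.g. at the start of a new logging interval."""
        self.total_nll = 0.0
        self.num_tokens = 0


metric = RunningPerplexity()
metric.update([0.5, 0.25])   # batch 1: two tokens
metric.update([0.125, 0.5])  # batch 2: two tokens
perplexity = metric.compute()
print(perplexity)
```

Because the state lives in the metric object, the value is aggregated over every token seen since the last reset, rather than being derived from whatever a single forward pass happened to return.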

The loss curve is shifted by a constant when TP=2, but the gradient is not affected, and the perplexity values match the loss curve without TP. This is a previously unidentified problem and will be resolved in another PR.
[Images: three W&B charts, captured 1/17/2025 at 1:46:13 PM, 1:46:17 PM, and 1:46:22 PM]
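
A short worked equation may help explain why a constant offset in the loss leaves the gradient untouched. The symbols below ($c$ for the offset, $\tilde{L}$ for the shifted loss) are introduced here for illustration; the PR does not state the value or the origin of the offset.

```latex
\tilde{L}(\theta) = L(\theta) + c
\quad\Longrightarrow\quad
\nabla_\theta \tilde{L}(\theta) = \nabla_\theta L(\theta) + \nabla_\theta c = \nabla_\theta L(\theta),
\qquad
\exp\bigl(\tilde{L}(\theta)\bigr) = e^{c}\,\exp\bigl(L(\theta)\bigr)
```

Under this assumption, optimization is identical for both curves, while a perplexity obtained by exponentiating the shifted loss would differ from one based on the unshifted loss by the factor $e^{c}$.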

codecov-commenter commented Jan 17, 2025

Codecov Report

❌ Patch coverage is 91.17647% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.58%. Comparing base (42d76aa) to head (f6b893a).
⚠️ Report is 565 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...-packages/bionemo-llm/src/bionemo/llm/lightning.py | 88.46% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #622      +/-   ##
==========================================
- Coverage   86.62%   86.58%   -0.05%     
==========================================
  Files         116      116              
  Lines        6961     6931      -30     
==========================================
- Hits         6030     6001      -29     
+ Misses        931      930       -1     

| Files with missing lines | Coverage Δ |
|---|---|
| ...ionemo-esm2/src/bionemo/esm2/scripts/train_esm2.py | 93.80% <100.00%> (+0.11%) ⬆️ |
| ...s/bionemo-geneformer/src/bionemo/geneformer/api.py | 100.00% <100.00%> (ø) |
| ...onemo/geneformer/model/finetune_token_regressor.py | 60.31% <100.00%> (-0.63%) ⬇️ |
| ...packages/bionemo-llm/src/bionemo/llm/model/loss.py | 58.53% <100.00%> (-4.11%) ⬇️ |
| ...s/bionemo-testing/src/bionemo/testing/lightning.py | 100.00% <100.00%> (ø) |
| ...-packages/bionemo-llm/src/bionemo/llm/lightning.py | 91.97% <88.46%> (-0.34%) ⬇️ |

Collaborator

@farhadrgh farhadrgh left a comment

Left some minor comments, LGTM

Comment thread sub-packages/bionemo-llm/src/bionemo/llm/lightning.py Outdated
Comment thread sub-packages/bionemo-testing/src/bionemo/testing/lightning.py Outdated
@sichu2023 sichu2023 force-pushed the sichu/torchmetric-ppl branch 2 times, most recently from 25286db to 0c97e81 Compare January 21, 2025 20:16
@sichu2023 sichu2023 enabled auto-merge January 21, 2025 20:17
Comment thread sub-packages/bionemo-geneformer/src/bionemo/geneformer/api.py Outdated
Collaborator

@skothenhill-nv skothenhill-nv left a comment

looks like there are some major diffs that seem unrelated to the change, is there a leaky merge happening here?

Signed-off-by: sichu <sichu@nvidia.com>
@sichu2023 sichu2023 force-pushed the sichu/torchmetric-ppl branch from 0c97e81 to f6b893a Compare January 21, 2025 21:15
Contributor Author

> looks like there are some major diffs that seem unrelated to the change, is there a leaky merge happening here?

These changes are intentional: they drop the forward output that was used for logging, since we are moving to torchmetrics now.

@sichu2023 sichu2023 added this pull request to the merge queue Jan 21, 2025
Merged via the queue into main with commit e553389 Jan 22, 2025
@sichu2023 sichu2023 deleted the sichu/torchmetric-ppl branch January 22, 2025 00:46
polinabinder1 pushed a commit that referenced this pull request Jan 22, 2025
Replace callback method in logging perplexity with torchmetrics.
![W B Chart 1_17_2025, 1_46_04
PM](https://github.com/user-attachments/assets/64c417f6-771c-49f8-ab68-0134fb2a8ef8)
![W B Chart 1_17_2025, 1_46_09
PM](https://github.com/user-attachments/assets/4893fac0-fa42-44ed-97ed-11625f80b5e0)

The loss curve is shifted by a constant when TP=2, but the gradient is
not affected, and the perplexity values match the loss curve without TP.
This is a previously unidentified problem and will be resolved in
another PR.
![W B Chart 1_17_2025, 1_46_13
PM](https://github.com/user-attachments/assets/365fc6e1-3cd7-43fa-9264-25e46d54933c)
![W B Chart 1_17_2025, 1_46_17
PM](https://github.com/user-attachments/assets/d79a6fc2-2fc3-48d5-ba9c-50922ed2a3f8)
![W B Chart 1_17_2025, 1_46_22
PM](https://github.com/user-attachments/assets/3f5b1bdc-ff39-4e0a-8758-c24bb8c574d2)

---------

Signed-off-by: sichu <sichu@nvidia.com>
Signed-off-by: Polina Binder <pbinder@nvidia.com>