
Fix mup for the layers with AttentionLayerMup #494

Merged
8 commits merged into main on Dec 24, 2023
Conversation

@DomInvivo (Collaborator) commented on Dec 20, 2023

Changelogs

  • Added embed_dim to the list of keys to look for when doing the mup kwargs

@maciej-sypetkowski I think this should fix your issue, although I can't verify it: on my end, the config with architecture.mup_scale_factor: 2 already works, and I can't reproduce the failure without knowing how you do your scaling. At the very least, the attn_layer keys in mup_base_params.yaml are no longer null.
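
For readers unfamiliar with the change, here is a minimal sketch of the kind of kwargs scaling involved; the function and key names are illustrative, not the repo's exact API:

```python
# Hypothetical sketch -- names are illustrative, not graphium's exact API.
# The gist of the fix: "embed_dim" is now among the width-like keys that get
# scaled when building the mup base shapes, so the attn_layer entries in
# mup_base_params.yaml are no longer null.
def scale_mup_kwargs(kwargs: dict, scale_factor: float) -> dict:
    width_keys = ["in_dim", "out_dim", "hidden_dim", "embed_dim"]  # "embed_dim" newly included
    scaled = dict(kwargs)
    for key in width_keys:
        if scaled.get(key) is not None:
            scaled[key] = round(scaled[key] * scale_factor)
    return scaled

# e.g. with architecture.mup_scale_factor: 2
print(scale_mup_kwargs({"embed_dim": 64, "num_heads": 4}, 2.0))
# {'embed_dim': 128, 'num_heads': 4}
```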

IMPORTANT

When this PR is merged, it will affect the reproducibility of models that use AttentionLayerMup, such as GPSLayerPyg, since mup will now affect the learning rate of these layers, whereas they were previously ignored.
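
For context on why this matters for optimization (a general muP property rather than anything specific to this PR; the exact scaling depends on the optimizer and on how the mup library is configured):

```python
# Rough illustration only, assuming an Adam-style optimizer and the standard
# muP rule for matrix-like weights; consult the mup library for exact behavior.
base_lr = 1e-3
width_mult = 2.0                       # e.g. architecture.mup_scale_factor
attn_weight_lr = base_lr / width_mult  # attention projections now get this scaled
                                       # lr instead of being skipped as before
print(attn_weight_lr)  # 0.0005
```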

codecov bot commented Dec 20, 2023

Codecov Report

Merging #494 (4045fcf) into main (8cbf2d0) will increase coverage by 0.17%.
Report is 23 commits behind head on main.
The diff coverage is 100.00%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #494      +/-   ##
==========================================
+ Coverage   71.35%   71.52%   +0.17%     
==========================================
  Files          94       93       -1     
  Lines        8718     8707      -11     
==========================================
+ Hits         6221     6228       +7     
+ Misses       2497     2479      -18     
Flag Coverage Δ
unittests 71.52% <100.00%> (+0.17%) ⬆️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
ipu 49.14% <ø> (ø)

Comment on lines +1330 to +1332
assert (
    x[k] % num_heads == 0
), f"embed_dim={x[k]} is not divisible by num_heads={num_heads}"
@DomInvivo (Collaborator, Author):
I don't think it's needed, since there's already another assertion in AttentionLayerMup. @maciej-sypetkowski, can you check whether it still works if we remove this part and scale by a factor that leaves embed_dim not divisible by num_heads?

Collaborator:
Yes, it's not needed
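
For illustration of the point above (this uses PyTorch's generic attention layer, not this repo's AttentionLayerMup, which is said to carry its own assertion):

```python
import torch.nn as nn

# The attention layer itself already enforces the divisibility constraint,
# so the extra assert in the diff above is redundant.
nn.MultiheadAttention(embed_dim=130, num_heads=4)
# AssertionError: embed_dim must be divisible by num_heads
```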

@maciej-sypetkowski (Collaborator) left a review comment:

LGTM. Tested and it works

env.yml Outdated
@@ -17,6 +17,7 @@ dependencies:
- pandas >=1.0
- scikit-learn
- fastparquet
- networkx
Collaborator:
Why is it needed now?

@DomInvivo merged commit f698df4 into main on Dec 24, 2023
7 checks passed
DomInvivo added a commit that referenced this pull request Dec 24, 2023
Removed double check of embed_dim/num_heads, discussed in PR #494
@DomInvivo mentioned this pull request on Dec 24, 2023