Asymmetric Encoder and Decoder Configuration for Megatron Models #4568
Conversation
Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Great PR! See comments.
nemo/collections/nlp/modules/common/megatron/token_level_encoder_decoder.py
LGTM! See minor comments and questions
nemo/collections/nlp/models/language_modeling/megatron_lm_encoder_decoder_model.py
nemo/collections/nlp/modules/common/megatron/megatron_encoder_decoder.py
@@ -59,9 +59,6 @@ def __init__(
    encoder_attn_mask_type=AttnMaskType.padding,
    hidden_dropout=0.1,
    attention_dropout=0.1,
What happened to
position_embedding_type='learned_absolute',
relative_attention_num_buckets=32,
relative_attention_max_distance=128
?
I've removed RPE support from perceivers for now.
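The commit history mentions a "Validate RPE and perceivers" change. A minimal sketch of what such a check could look like, assuming a plain-dict config and illustrative key names (`arch`, `position_embedding_type`) rather than NeMo's actual API:

```python
def validate_perceiver_rpe(cfg: dict) -> None:
    """Reject relative position embeddings (RPE) when a perceiver encoder is used.

    Hypothetical validation sketch; key names are illustrative.
    """
    if cfg.get("arch") == "perceiver" and cfg.get("position_embedding_type") == "relative":
        raise ValueError(
            "Relative position embeddings (RPE) are not supported with perceiver encoders."
        )
```

Failing fast at config-validation time like this surfaces an unsupported combination before any model weights are built.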
@@ -130,6 +130,7 @@ def __init__(
    model_type=parent_model_type,
    transformer_block_type=transformer_block_type,
    headscale=headscale,
    gradient_accumulation_fusion=False,  # TODO: This has to be False for enc-dec models for now.
Will the model work if gradient accumulation fusion is set to true?
Yes, this is just a JIT fusion that was implemented specifically for GPT; I turned it off explicitly here for T5.
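The behavior discussed above can be sketched as a small helper that forces the GPT-specific fusion off for any other model type. This is an illustrative sketch, not NeMo's actual code; the `model_type` strings are assumed:

```python
def resolve_grad_accum_fusion(model_type: str, requested: bool) -> bool:
    """Return the effective gradient_accumulation_fusion setting.

    The fused kernel was implemented only for GPT, so enc-dec (T5-style)
    models must run with the fusion disabled for now, regardless of what
    the user requested.
    """
    if model_type != "gpt":  # hypothetical enum value
        return False
    return requested
```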
decoder_arch,
vocab_size,
hidden_size,
encoder_cfg: DictConfig,
Way cleaner! Nice!
It's still hard to keep track of args everywhere, but we're getting close to making it clean :)
    return kv_channels

def _validate_enc_dec_hidden_size(self, encoder_cfg, decoder_cfg):
    if encoder_cfg.hidden_size != decoder_cfg.hidden_size:
OK, enc/dec hidden size is validated here.
    hidden_size % num_attention_heads == 0
), 'hidden_size must be divisible by num_attention_heads if kv_channels is None'
kv_channels = hidden_size // num_attention_heads
encoder_kv_channels, decoder_kv_channels = self._validate_config()
Why do we need those values encoder_kv_channels, decoder_kv_channels?
Because these are typically provided as None in the yaml and then computed internally based on hidden size and num attention heads.
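The derivation described above can be sketched as a standalone helper (names are illustrative; the assertion message matches the one in the diff):

```python
def resolve_kv_channels(hidden_size: int, num_attention_heads: int, kv_channels=None) -> int:
    """Derive kv_channels when the YAML leaves it as None."""
    if kv_channels is None:
        assert (
            hidden_size % num_attention_heads == 0
        ), "hidden_size must be divisible by num_attention_heads if kv_channels is None"
        kv_channels = hidden_size // num_attention_heads
    return kv_channels
```

For example, `hidden_size=768` with 12 heads yields 64 channels per head; an explicit `kv_channels` in the config passes through untouched.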
LGTM!
…DIA#4568)
* Initial asymmetric
* Update YAML
* Fix other yaml files
* Style
* Remove unused import
* Update configs
* NMT fixes
* Fix
* Add missing arg
* Style
* Jenkins test updates
* More fixes
* Rank check fix
* Add enc/dec specific rpe configs
* Style
* Fix
* CI
* Fix perceiver and model type
* Remove RPE args from perceiver
* Validate RPE and perceivers
* Style
* Fix layer type
* deep copy config and better backward compat
* Perceiver compatibility and headscale changes
* Fix shapes
* Style
* Update CI test
* Fix
* Fix heads
* Fix
* Style
* Address comments

Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca>
Co-authored-by: Micha Livne <michalivne@users.noreply.github.com>
Signed-off-by: David Mosallanezhad <dmosallanezh@nvidia.com>
Signed-off-by: MaximumEntropy sandeep.subramanian.1@umontreal.ca
What does this PR do?
Adds the ability to configure the encoder and decoder of Megatron encoder-decoder models asymmetrically, e.g. with different numbers of layers, attention heads, or architectures on each side.
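To illustrate what an asymmetric configuration could look like, here is a hedged YAML sketch modeled loosely on NeMo-style configs; the exact key names and structure are assumptions, not copied from the repo:

```yaml
# Illustrative sketch: separate encoder/decoder blocks allow asymmetric
# depth and head counts. hidden_size must still match (validated at init).
model:
  encoder:
    arch: transformer
    num_layers: 12
    hidden_size: 768          # must equal decoder hidden_size
    num_attention_heads: 12
    kv_channels: null         # derived as hidden_size // num_attention_heads
  decoder:
    arch: transformer
    num_layers: 6             # asymmetric: shallower than the encoder
    hidden_size: 768
    num_attention_heads: 8
    kv_channels: null
```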
Collection: NLP