Add Module-level Adapters, Save-Restore and tests #4114

titu1994 · 2022-05-05T04:39:52Z

What does this PR do ?

Extends adapter support: multiple module-specific adapters, save and restore of just the adapters rather than the full model (saving several hundred megabytes to gigabytes of disk space), and a suite of tests for the new functionality.

Collection: [Core, ASR]

Changelog

Adds support for both "global" and "module"-specific adapters. Module-specific adapters target individual modules, independently of the others.
Adds support for saving and restoring just the adapter modules, independently of the original model weights. This saves significant disk space, since the adapters are usually a tiny fraction of the model's parameter count.
Adds a battery of tests for the new capabilities.
Adds support for dropout to adapter modules.
Adds support for stochastic dropout to all adapters via ResidualAddStrategy.
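
To illustrate the last changelog item, here is a minimal sketch of a residual add that randomly skips the adapter branch during training. It is not the NeMo ResidualAddStrategy implementation: the function name `residual_add`, its parameters, and the choice to skip without rescaling are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
import random


def residual_add(x, adapter_out, drop_prob=0.0, training=True, rng=random):
    """Return x + adapter_out, randomly skipping the adapter branch in training.

    Hypothetical helper for illustration only; not the NeMo API.
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    # During training, drop the whole adapter branch with probability drop_prob,
    # so the input passes through unchanged.
    if training and drop_prob > 0.0 and rng.random() < drop_prob:
        return list(x)
    # Otherwise (and always at inference) add the adapter output as a residual.
    return [xi + ai for xi, ai in zip(x, adapter_out)]
```

With `drop_prob=0.0` or `training=False` the adapter output is always added, so inference behaviour is deterministic.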

Usage

Subclass implementations can choose between global adapters, module adapters, or both. A basic implementation of adapters for a toy model is provided in tests/core/mixins/adapters/test_adapter_model_mixin.py.

# Global adapter (adds the "abc" adapter to all supported modules: encoder + decoder + others)
model.add_adapter(name="abc", cfg=cfg)

# Module adapter (adds the "abc" adapter to the decoder only)
model.add_adapter(name="decoder:abc", cfg=cfg)

# Save all of the adapters (only the adapters themselves)
model.save_adapters("adapters.pt", name=None)
# OR save a single adapter by its global or module adapter name
model.save_adapters("adapters.pt", name="decoder:abc")

# Restore one or all adapters (only the adapters themselves)
new_model.load_adapters("adapters.pt", name=None)
# OR restore a single adapter by the exact module name from the state dict
new_model.load_adapters("adapters.pt", name="<exact module name from state dict>",
                        map_location='cpu', strict=True)
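
The naming and adapter-only saving described above can be sketched in plain Python. This is an illustration under stated assumptions, not the NeMo implementation: the helpers `resolve_adapter_name` and `adapter_only_state`, the list of known modules, and the `<module>.adapters.<adapter>.<param>` key layout are all hypothetical, and a plain dict stands in for a model state dict.

```python
def resolve_adapter_name(name, known_modules=("encoder", "decoder")):
    """Split 'module:adapter' into (module, adapter).

    A bare name such as 'abc' is treated as a global adapter and returns
    an empty module string. Hypothetical helper, not the NeMo API.
    """
    if ":" in name:
        module, adapter = name.split(":", 1)
        if module not in known_modules:
            raise ValueError(f"Unknown module '{module}' in adapter name '{name}'")
        return module, adapter
    return "", name


def adapter_only_state(state_dict, name=None, marker="adapters"):
    """Keep only adapter entries, optionally restricted to a single adapter.

    Assumes keys look like '<module>.adapters.<adapter>.<param>'.
    """
    module, adapter = resolve_adapter_name(name) if name is not None else (None, None)
    kept = {}
    for key, value in state_dict.items():
        parts = key.split(".")
        if marker not in parts[:-1]:
            continue  # not an adapter parameter, e.g. a base model weight
        idx = parts.index(marker)
        key_module = ".".join(parts[:idx])
        key_adapter = parts[idx + 1]
        # name=None keeps every adapter; a global name matches any module.
        if name is None or (key_adapter == adapter and module in ("", key_module)):
            kept[key] = value
    return kept
```

Filtering before serialization is what keeps the saved file small in this sketch: only the entries under the adapter marker are written, and the base model weights are left out.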

Before your PR is "Ready for review"

Pre checks:

Make sure you read and followed Contributor guidelines
Did you write any new necessary tests?
Did you add or update any necessary documentation?
Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

New Feature
Bugfix
Documentation

lgtm-com · 2022-05-06T18:25:03Z

This pull request introduces 1 alert when merging cf85cf3 into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-08T03:41:26Z

This pull request introduces 1 alert when merging 5462c37 into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-08T05:03:31Z

This pull request introduces 1 alert when merging 644247d into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-08T23:39:09Z

This pull request introduces 1 alert when merging 976166e into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-09T00:52:49Z

This pull request introduces 1 alert when merging aa19bf6 into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-09T01:22:20Z

This pull request introduces 1 alert when merging 3496141 into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-09T01:55:45Z

This pull request introduces 1 alert when merging 4fe248c into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-09T10:07:47Z

This pull request introduces 1 alert when merging f6ab1e5 into ddd8719 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-09T18:37:34Z

This pull request introduces 1 alert when merging e0bc4e1 into 470587a - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-10T04:51:52Z

This pull request introduces 1 alert when merging 7ea9643 into 650718f - view on LGTM.com

new alerts:

1 for Unused local variable

ericharper · 2022-05-10T17:21:57Z

/blossom-ci

ericharper · 2022-05-10T17:24:36Z

/blossom-ci

titu1994 · 2022-05-12T22:30:45Z

/blossom-ci

lgtm-com · 2022-05-13T05:58:11Z

This pull request introduces 1 alert when merging 0904b9e into 1311f4f - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-13T06:22:00Z

This pull request introduces 1 alert when merging 777cf7e into 1311f4f - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-13T23:48:46Z

This pull request introduces 1 alert when merging bcca9cf into 08df199 - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-14T07:10:48Z

This pull request introduces 1 alert when merging 7b63804 into 27129ab - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-14T08:10:09Z

This pull request introduces 1 alert when merging eaa973c into 27129ab - view on LGTM.com

new alerts:

1 for Unused local variable

lgtm-com · 2022-05-16T21:33:18Z

This pull request introduces 1 alert when merging 7169d13 into ed1ffeb - view on LGTM.com

new alerts:

1 for Unused local variable

erastorgueva-nv · 2022-05-16T23:06:24Z

examples/asr/asr_adapters/train_asr_adapter.py

@@ -120,7 +121,7 @@ def main(cfg):
        raise ValueError("Cannot set `cfg.model.nemo_model` and `cfg.model.pretrained_model`. Select one only.")


I'd write "Cannot set both cfg.model.nemo_model and cfg.model.pretrained_model. Set one only."

lgtm-com · 2022-05-17T19:15:57Z

This pull request introduces 1 alert when merging a624f32 into 8318980 - view on LGTM.com

new alerts:

1 for Unused local variable

* First draft of model level tests and support for multiple types adapters in same model Signed-off-by: smajumdar <titu1994@gmail.com>
* Add save restore tests for adapters Signed-off-by: smajumdar <titu1994@gmail.com>
* Add save restore tests for adapters Signed-off-by: smajumdar <titu1994@gmail.com>
* Add adapter only save and restore Signed-off-by: smajumdar <titu1994@gmail.com>
* Update base adapter config Signed-off-by: smajumdar <titu1994@gmail.com>
* Add tests Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix collection of get enabled adapters, limiting to each module's scope Signed-off-by: smajumdar <titu1994@gmail.com>
* Update docs and add support for resolution of module adapter names Signed-off-by: smajumdar <titu1994@gmail.com>
* Update ASR adapters to only support module adapters Signed-off-by: smajumdar <titu1994@gmail.com>
* Add state dict match test Signed-off-by: smajumdar <titu1994@gmail.com>
* Fix name resolution for set_enabled_adapters Signed-off-by: smajumdar <titu1994@gmail.com>
* Correct case where name is none for set adapter Signed-off-by: smajumdar <titu1994@gmail.com>
* Correct case where there are no adapters to save Signed-off-by: smajumdar <titu1994@gmail.com>
* Update config for training Signed-off-by: smajumdar <titu1994@gmail.com>
* Force update to internal config upon get or set Signed-off-by: smajumdar <titu1994@gmail.com>
* Add spec augment update support to adapters Signed-off-by: smajumdar <titu1994@gmail.com>
* Correct config update Signed-off-by: smajumdar <titu1994@gmail.com>
* Add dropout support to linear adapters Signed-off-by: smajumdar <titu1994@gmail.com>
* Add type to config Signed-off-by: smajumdar <titu1994@gmail.com>
* Add stochastic depth regularization to adapter merge strategy and related tests Signed-off-by: smajumdar <titu1994@gmail.com>
* Add support for dynamic strategy change Signed-off-by: smajumdar <titu1994@gmail.com>
* Add support for dynamic strategy change Signed-off-by: smajumdar <titu1994@gmail.com>
* Add more tests Signed-off-by: smajumdar <titu1994@gmail.com>
* Add more tests Signed-off-by: smajumdar <titu1994@gmail.com>
* Remove logging of adapter name Signed-off-by: smajumdar <titu1994@gmail.com>
* Update changes for reviews Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Refactor the utility methods Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Refactor the utility methods Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Fixed configs for optim and spec augment Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Fixed configs for optim and spec augment Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Rename method to subclassable private Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Add support for adapter module names to be pre-specified in config Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Fix imports Signed-off-by: smajumdar <smajumdar@nvidia.com>
* Fix typos Signed-off-by: smajumdar <smajumdar@nvidia.com>

titu1994 marked this pull request as draft May 5, 2022 04:40

titu1994 marked this pull request as ready for review May 6, 2022 01:09

titu1994 force-pushed the multi_adapter branch from 3e8658e to d90679d Compare May 6, 2022 01:42

titu1994 requested review from sam1373, erastorgueva-nv and VahidooX May 6, 2022 02:01

sam1373 reviewed May 11, 2022

nemo/core/classes/modelPT.py (resolved)

examples/asr/asr_adapters/train_asr_adapter.py (resolved)

examples/asr/asr_adapters/train_asr_adapter.py (resolved)

titu1994 force-pushed the multi_adapter branch from 11799e4 to 0904b9e Compare May 13, 2022 05:35

titu1994 force-pushed the multi_adapter branch from bcca9cf to 7b63804 Compare May 14, 2022 07:00

titu1994 force-pushed the multi_adapter branch from eaa973c to 7c04286 Compare May 16, 2022 21:15

erastorgueva-nv reviewed May 16, 2022

titu1994 added 24 commits May 17, 2022 11:55

Fix name resolution for set_enabled_adapters

49c465b

Signed-off-by: smajumdar <titu1994@gmail.com>

Correct case where name is none for set adapter

9282c5e

Signed-off-by: smajumdar <titu1994@gmail.com>

Correct case where there are no adapters to save

b37cb10

Signed-off-by: smajumdar <titu1994@gmail.com>

Update config for training

6abada1

Signed-off-by: smajumdar <titu1994@gmail.com>

Force update to internal config upon get or set

b45da66

Signed-off-by: smajumdar <titu1994@gmail.com>

Add spec augment update support to adapters

5160410

Signed-off-by: smajumdar <titu1994@gmail.com>

Correct config update

1e260ab

Signed-off-by: smajumdar <titu1994@gmail.com>

Add dropout support to linear adapters

33c25d3

Signed-off-by: smajumdar <titu1994@gmail.com>

Add type to config

f415b8a

Signed-off-by: smajumdar <titu1994@gmail.com>

Add stochastic depth regularization to adapter merge strategy and related tests

d2292ac

Signed-off-by: smajumdar <titu1994@gmail.com>

Add support for dynamic strategy change

33eb4a3

Signed-off-by: smajumdar <titu1994@gmail.com>

Add support for dynamic strategy change

3fb1654

Signed-off-by: smajumdar <titu1994@gmail.com>

Add more tests

3a235d2

Signed-off-by: smajumdar <titu1994@gmail.com>

Add more tests

c956167

Signed-off-by: smajumdar <titu1994@gmail.com>

Remove logging of adapter name

c5a38b7

Signed-off-by: smajumdar <titu1994@gmail.com>

Update changes for reviews

a6cd5ba

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Refactor the utility methods

aa6935b

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Refactor the utility methods

ca0f543

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Fixed configs for optim and spec augment

834d0f6

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Fixed configs for optim and spec augment

16f4d62

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Rename method to subclassable private

d857c92

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Add support for adapter module names to be pre-specified in config

c00ad25

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Fix imports

8b0d6a8

Signed-off-by: smajumdar <smajumdar@nvidia.com>

Fix typos

a624f32

Signed-off-by: smajumdar <smajumdar@nvidia.com>

titu1994 force-pushed the multi_adapter branch from 7169d13 to a624f32 Compare May 17, 2022 18:55

sam1373 approved these changes May 17, 2022

titu1994 merged commit 89994de into NVIDIA:main May 17, 2022

titu1994 deleted the multi_adapter branch May 17, 2022 18:59
