[AutoMM] Support HPO presets #2839
Conversation
LGTM with a small comment.
"model.hf_text.checkpoint_name": "google/electra-small-discriminator", | ||
"model.timm_image.checkpoint_name": "mobilenetv3_large_100", | ||
"model.document_transformer.checkpoint_name": "microsoft/layoutlmv2-base-uncased", | ||
"optimization.learning_rate": 4e-4, |
Do we want to specify the learning rate here? It seems the other default settings only include checkpoint names.
This learning rate is used for small backbones; it has been here since the last release.
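(For context, a hedged sketch of how these preset keys map to fit-time overrides; the dataset and label name below are hypothetical, not from this PR:)

```python
from autogluon.multimodal import MultiModalPredictor

# Hypothetical data/label; illustrates that the same dotted keys used in
# the presets can also be passed directly as fit-time hyperparameters.
predictor = MultiModalPredictor(label="label")
predictor.fit(
    train_data=train_df,  # assumed pandas DataFrame
    hyperparameters={
        "model.hf_text.checkpoint_name": "google/electra-small-discriminator",
        "optimization.learning_rate": 4e-4,
    },
)
```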
default_tunable_hyperparameters = {
    "optimization.learning_rate": tune.loguniform(1e-5, 1e-2),
    "optimization.optim_type": tune.choice(["adamw", "sgd"]),
    "optimization.max_epochs": tune.choice(list(range(5, 31))),
@FANGAreNotGnu I'm not sure if this default setting works for detection, since detection usually requires more training epochs. We might need to override these values in detection_hpo presets.
This just provides the defaults; each problem type can further customize them.
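(A hedged sketch of the kind of per-problem override discussed here; detection_tunable_hyperparameters and the epoch range are assumed names/values for illustration, not from this PR:)

```python
from ray import tune

# Start from the shared defaults and widen the epoch range for detection,
# which usually needs longer training schedules.
detection_tunable_hyperparameters = dict(default_tunable_hyperparameters)
detection_tunable_hyperparameters["optimization.max_epochs"] = tune.choice(
    list(range(30, 101))  # assumed range for illustration
)
```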
default_hyperparameter_tune_kwargs = {
    "searcher": "bayes",
    "scheduler": "ASHA",
    "num_trials": 512,
Will this number be too large?
I'm not sure. It kind of depends on our search space. How many trials do you think are reasonable?
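(Worth noting that the preset value is only a default; a hedged sketch of capping the budget at fit time, assuming fit-time kwargs take precedence over the preset:)

```python
predictor.fit(
    train_data=train_df,  # assumed DataFrame
    presets="medium_quality_hpo",
    hyperparameter_tune_kwargs={"num_trials": 20},  # override the 512 default
)
```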
"optimization.learning_rate": tune.loguniform(1e-5, 1e-2), | ||
"optimization.optim_type": tune.choice(["adamw", "sgd"]), | ||
"optimization.max_epochs": tune.choice(list(range(5, 31))), | ||
"env.batch_size": tune.choice([32, 64, 128, 256]), |
Also try smaller batch sizes?
We can consider not tuning batch size for now (and having a separate batch-size tuning logic) and focus on selecting the learning rate.
Batch size can also affect the performance. @sxjscience Do you mean tuning per_gpu_batch_size (which should not affect the performance) with Lightning's tuner?
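(For reference, a minimal sketch of the Lightning batch-size tuner mentioned above, as of pytorch_lightning 1.x; `model` is assumed to be a LightningModule exposing a batch_size attribute:)

```python
import pytorch_lightning as pl

# Doubles batch_size until OOM and keeps the largest size that fits;
# this finds a feasible per-GPU batch size rather than searching it as
# a performance hyperparameter.
trainer = pl.Trainer(auto_scale_batch_size="power")
trainer.tune(model)  # assumed LightningModule with a `batch_size` attribute
```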
def parse_presets_str(presets: str):
    use_hpo = False
    if presets.endswith("_hpo"):
Will this be case sensitive?
presets is already converted to lower case in predictor init.
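(A plausible completion of the snippet, for readers following along; the actual return shape in the PR may differ:)

```python
def parse_presets_str(presets: str):
    # `presets` is lower-cased by the predictor before reaching here,
    # so endswith() is effectively case-insensitive.
    use_hpo = False
    if presets.endswith("_hpo"):
        use_hpo = True
        presets = presets[: -len("_hpo")]  # "best_quality_hpo" -> "best_quality"
    return presets, use_hpo
```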
@@ -604,6 +718,10 @@ def get_automm_presets(problem_type: str, presets: str):
    hyperparameter_tune_kwargs
        Hyperparameter tuning strategy and kwargs (for example, how many HPO trials to run).
    """
    if not presets:
        presets = DEFAULT
    if presets == "hpo":
This is case sensitive; consider moving the line (presets = presets.lower()) before it.
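(A sketch of the suggested reordering; the `...` stands for the existing hpo handling not shown in the diff:)

```python
if not presets:
    presets = DEFAULT
presets = presets.lower()  # normalize before any string comparison
if presets == "hpo":
    ...  # existing handling continues unchanged
```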
We can pick a selection criterion for each catalog; for example, we can measure the model size / training time (ideally we should report a curve of training throughput vs. performance). For now, we can consider adopting the following rules:
We can thus also consider backbones in https://www.sbert.net/docs/pretrained_models.html, and use other common backbones like
LGTM overall. I'm not sure whether we should also search over 'batch_size', since we usually want to use as large a batch size as possible. The HPO is mainly centered on model selection / searching for the best tuning method.
Issue #, if available:
Description of changes:
Support HPO presets: high_quality_hpo, medium_quality_hpo, best_quality_hpo, or hpo.
Text backbone candidates with the number of parameters:
medium quality
high quality
best quality
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.