
[Relax][Training] Trainer API #115

Merged (15 commits) on Feb 10, 2023

Conversation

@SiriusNEO (Contributor) commented Jan 30, 2023

This PR introduces a wrapper for Relax training. The trainer handles the following internally:

  • Maintains (stores and updates) the parameters of the module.
  • Merges the backbone and the specified loss function.
  • Builds, compiles, and runs the module.
  • Builds, compiles, and runs the optimizer (using the same vm_config as the module).

It also provides interfaces for loading and exporting parameters.

Example:

```
trainer = Trainer(MLP, [1, 2], "main")  # [1, 2] means input[1] and input[2] are parameters of this module.
trainer.set_loss(MSELoss(reduction="sum"), pred_sinfo, pred_sinfo)
trainer.set_vm_config(target="llvm")
trainer.set_optimizer(optim_type=SGD, lr=0.001)
trainer.setup()
trainer.rand_init_params()
trainer.forward(*fwd_inputs)
trainer.backward(*bwd_inputs)
```
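To make the flow above concrete without requiring a TVM build, here is a minimal, framework-free Python sketch of the same trainer pattern (maintain params, attach a loss, take SGD steps on backward). This is NOT the Relax `Trainer` API; `ToyTrainer`, the numeric gradient, and the 1-D linear model are hypothetical stand-ins for illustration only — the real trainer compiles the module and uses Relax automatic differentiation.

```python
class ToyTrainer:
    """A toy analogue of the Trainer pattern: owns params, merges
    backbone + loss, and updates params on backward()."""

    def __init__(self, backbone):
        self.backbone = backbone  # forward function: (params, x) -> pred
        self.params = None
        self.loss_fn = None
        self.lr = None

    def set_loss(self, loss_fn):
        self.loss_fn = loss_fn    # (pred, label) -> scalar loss
        return self

    def set_optimizer(self, lr):
        self.lr = lr              # plain SGD for this sketch
        return self

    def rand_init_params(self):
        self.params = [0.0, 0.0]  # [weight, bias] of a 1-D linear model

    def forward(self, x):
        return self.backbone(self.params, x)

    def backward(self, x, label):
        # Forward-difference gradient + one SGD step. A real trainer
        # would use autodiff instead of numeric differentiation.
        eps = 1e-6
        base = self.loss_fn(self.forward(x), label)
        grads = []
        for i in range(len(self.params)):
            self.params[i] += eps
            grads.append((self.loss_fn(self.forward(x), label) - base) / eps)
            self.params[i] -= eps
        for i, g in enumerate(grads):
            self.params[i] -= self.lr * g
        return base  # loss before this update step

linear = lambda p, x: p[0] * x + p[1]
mse = lambda pred, label: (pred - label) ** 2

trainer = ToyTrainer(linear)
trainer.set_loss(mse).set_optimizer(lr=0.05)
trainer.rand_init_params()
losses = [trainer.backward(2.0, 5.0) for _ in range(50)]
assert losses[-1] < losses[0]  # loss decreases toward the target
```

The sketch mirrors the builder-style calls in the example (`set_loss`, `set_optimizer`, `rand_init_params`, `forward`, `backward`); the essential design point is that the trainer, not the caller, owns and mutates the parameter state between steps.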

@SiriusNEO SiriusNEO marked this pull request as ready for review February 1, 2023 01:49
@SiriusNEO SiriusNEO changed the title [WIP][Relax][Training] Trainer API [Relax][Training] Trainer API Feb 1, 2023
@SiriusNEO SiriusNEO marked this pull request as draft February 1, 2023 15:42
@SiriusNEO SiriusNEO marked this pull request as ready for review February 4, 2023 11:10
MasterJH5574 pushed a commit that referenced this pull request Feb 8, 2023
* Add gpu ci.

* Update autotir gpu test.
spectrometerHBH pushed a commit to spectrometerHBH/relax that referenced this pull request Feb 9, 2023
* Add gpu ci.

* Update autotir gpu test.
@MasterJH5574 (Member) left a comment

Only one minor point. Besides, would you like to update the PR description a bit so that it is consistent with the new API?

@SiriusNEO SiriusNEO merged commit a65d808 into mlc-ai:relax Feb 10, 2023
MasterJH5574 pushed a commit that referenced this pull request Feb 12, 2023