
[CodeCamp2023-470] Runner supports setting the number of iterations for each epoch #1292

Merged
merged 56 commits into open-mmlab:main on Oct 8, 2023

Conversation

ShuRaymond
Contributor

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

Motivation

One of the OpenMMLab CodeCamp tasks.

Modification

Modified _flexible_runner.py and runner.py so that FlexibleRunner supports setting the number of iterations per epoch, which saves debugging time.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects, like MMDet or MMCls.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@CLAassistant

CLAassistant commented Aug 4, 2023

CLA assistant check
All committers have signed the CLA.

@ShuRaymond ShuRaymond reopened this Aug 4, 2023
@ShuRaymond ShuRaymond closed this Aug 4, 2023
@ShuRaymond ShuRaymond reopened this Aug 4, 2023
@HAOCHENYE
Collaborator

Hi, we should also update a unit test to validate this feature works as expected 😄

@zhouzaida
Member

Hi @ShuRaymond , thanks for your contribution.

Here are several comments:

  1. No need to modify FlexibleRunner, as num_batch_per_epoch can be passed to the loop directly through train_cfg, val_cfg, or test_cfg.
  2. No need to update IterBasedTrainLoop.
  3. Unit tests need to be added.
  4. Documentation needs to be updated.
    • docs/zh_cn/common_usage/debug_tricks.md
    • docs/en/common_usage/debug_tricks.md
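
For illustration, a minimal sketch of the configuration shape described in item 1 above. The values are placeholders, and whether the validation and test loops accept the field in exactly this way is an assumption at this point in the review:

```python
# Sketch only: pass num_batch_per_epoch to the loops through the loop configs.
train_cfg = dict(by_epoch=True, max_epochs=3, num_batch_per_epoch=2)
val_cfg = dict(num_batch_per_epoch=2)   # assumed mirror of the train_cfg usage
test_cfg = dict(num_batch_per_epoch=2)  # assumed mirror of the train_cfg usage
```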

@zhouzaida zhouzaida linked an issue Aug 7, 2023 that may be closed by this pull request
@ShuRaymond ShuRaymond requested a review from C1rN09 as a code owner August 7, 2023 12:36
@ShuRaymond
Contributor Author

Hi, we should also update a unit test to validate this feature works as expected 😄

Thanks for the reminder, I am working on it.

@ShuRaymond
Contributor Author

Hi @ShuRaymond , thanks for your contribution.

Here are several comments:

  1. No need to modify FlexibleRunner, as num_batch_per_epoch can be passed to the loop directly through train_cfg, val_cfg, or test_cfg.

  2. No need to update IterBasedTrainLoop.

  3. Unit tests need to be added.

  4. Documentation needs to be updated.

    • docs/zh_cn/common_usage/debug_tricks.md
    • docs/en/common_usage/debug_tricks.md

Thanks for the reminder and the guidance; it's done.

@zhouzaida
Member

Hi, we also need to add several unit tests (checking whether num_batch_per_epoch works as expected) in the following methods:

def test_train(self):

def test_val(self):

def test_test(self):

  • test_train
def test_train(self):
    # 15 test num_batch_per_epoch
    cfg = copy.deepcopy(self.epoch_based_cfg)
    cfg.train_cfg = dict(
        by_epoch=True,
        max_epochs=3,
        num_batch_per_epoch=2,
    )
    runner = Runner.from_cfg(cfg)
    runner.train()
    self.assertEqual(runner.iter, 3 * 2)
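
A possible analogous check for test_val, mirroring the example above. This is only a sketch: whether the field is read from val_cfg or from the dataloader config follows the approach finally adopted, so treat its exact placement here as an assumption.

```python
def test_val(self):
    # sketch: limit validation to 2 batches per epoch
    cfg = copy.deepcopy(self.epoch_based_cfg)
    cfg.val_cfg = dict(num_batch_per_epoch=2)  # placement of the field is assumed
    runner = Runner.from_cfg(cfg)
    runner.val()
    # expectation: the validation loop stops after 2 batches
```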

@zhouzaida
Member

Also, the docstring needs to be updated here:

corresponding milestone. Defaults to None.

num_batch_per_epoch (int, optional): 
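
For example, the new entry could read roughly as follows (the wording is only a suggestion, not taken from the PR):

```python
    num_batch_per_epoch (int, optional): The number of batches to run in each
        epoch. Handy for quick debugging on large datasets. Defaults to None,
        which means the whole dataloader is iterated.
```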

@@ -40,6 +40,7 @@ def __init__(
max_epochs: int,
val_begin: int = 1,
val_interval: int = 1,
num_batch_per_epoch: Optional[int] = None,
Member

Adding a new parameter in the middle position may cause a bc issue. Suggest moving it to the end.
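
To illustrate the concern with generic Python (not MMEngine code): inserting a new parameter before existing ones silently re-binds positional arguments at existing call sites, while appending it keeps old calls working.

```python
def loop_old(max_epochs, val_begin=1, val_interval=1):
    return max_epochs, val_begin, val_interval

# New parameter inserted in the middle: existing positional calls change meaning.
def loop_mid(max_epochs, val_begin=1, num_batch_per_epoch=None, val_interval=1):
    return max_epochs, val_begin, val_interval

# New parameter appended at the end: existing positional calls are unaffected.
def loop_end(max_epochs, val_begin=1, val_interval=1, num_batch_per_epoch=None):
    return max_epochs, val_begin, val_interval

print(loop_old(3, 1, 2))  # (3, 1, 2)
print(loop_mid(3, 1, 2))  # (3, 1, 1) -- the 2 now binds to num_batch_per_epoch
print(loop_end(3, 1, 2))  # (3, 1, 2) -- behaves exactly as before
```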

Example of a training script

# Copyright (c) OpenMMLab. All rights reserved.
Collaborator

Suggested change
# Copyright (c) OpenMMLab. All rights reserved.


## Training for a fixed number of iterations (epoch-based training)

During the process of debugging code, sometimes it is necessary to train for several epochs, such as debugging the validation process or checking whether the checkpoint saving meets expectations. However, if the dataset is too large, it may take a long time to complete one epoch, in which case the cfg parameter can be added.
Collaborator

Suggested change
During the process of debugging code, sometimes it is necessary to train for several epochs, such as debugging the validation process or checking whether the checkpoint saving meets expectations. However, if the dataset is too large, it may take a long time to complete one epoch, in which case the cfg parameter can be added.
During the process of debugging code, sometimes it is necessary to train for several epochs, such as debugging the validation process or checking whether the checkpoint saving meets expectations. However, if the dataset is too large, it may take a long time to complete one epoch, in which case the `num_batch_per_epoch` could be configured:

from mmengine.model import BaseModel
from mmengine.runner import Runner

os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'
Collaborator

Why should we configure this env variable?

Contributor Author

This was to work around a conflict between my conda environment and the torch package. I will delete it in a commit.

Comment on lines 159 to 161
Take `MMEngine` as an example (refer to the [documentation](https://mmengine.readthedocs.io/zh_CN/latest/get_started/installation.html) for installing MMEngine).

Example of a training script
Collaborator

Suggested change
Take `MMEngine` as an example (refer to the [documentation](https://mmengine.readthedocs.io/zh_CN/latest/get_started/installation.html) for installing MMEngine).
Example of a training script



Fast debugging is achieved by adding the `num_batch_per_epoch` parameter to `train_dataloader` and `val_dataloader`.
Collaborator

Suggested change
Fast debugging is achieved by adding the `num_batch_per_epoch` parameter to `train_dataloader` and `val_dataloader`.
Fast debugging is achieved by configuring `num_batch_per_epoch` in `train_dataloader` and `val_dataloader`. You can quickly debug the validation code after just 5 training iterations.


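For reference, a minimal sketch of the dataloader configuration described above. The batch size, dataset, and sampler values are placeholders rather than taken from the PR; train_set and val_set stand for dataset objects defined earlier in the script.

```python
train_dataloader = dict(
    batch_size=32,
    dataset=train_set,  # placeholder: a dataset built earlier in the script
    sampler=dict(type='DefaultSampler', shuffle=True),
    num_batch_per_epoch=5)  # each training epoch runs only 5 batches

val_dataloader = dict(
    batch_size=32,
    dataset=val_set,  # placeholder: a dataset built earlier in the script
    sampler=dict(type='DefaultSampler', shuffle=False),
    num_batch_per_epoch=5)  # validation also stops after 5 batches
```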

Run the training script. You can see that each epoch finishes after running only 5 batches. Compared with the original setup, debugging is faster and more flexible.
Collaborator

Suggested change
Run the training script. You can see that each epoch finishes after running only 5 batches. Compared with the original setup, debugging is faster and more flexible.

@zhouzaida zhouzaida changed the title [CodeCamp2023-470] FlexibleRunner supports setting the number of iterations for each epoch, allowing for time-saving debugging. [CodeCamp2023-470] FlexibleRunner supports setting the number of iterations for each epoch Oct 8, 2023
@zhouzaida
Member

FlexibleRunner supports setting the number of iterations for each epoch

@zhouzaida zhouzaida changed the title [CodeCamp2023-470] FlexibleRunner supports setting the number of iterations for each epoch [CodeCamp2023-470] Runner supports setting the number of iterations for each epoch Oct 8, 2023
@zhouzaida zhouzaida merged commit b8a3167 into open-mmlab:main Oct 8, 2023
16 of 19 checks passed

Successfully merging this pull request may close these issues.

[Feature] Support setting num_batch_per_epoch for debugging