add benchmark regression test script with tmux #849

Merged
merged 10 commits into from
Aug 25, 2021

Conversation

liqikai9
Collaborator

@liqikai9 liqikai9 commented Aug 10, 2021

Motivation

Before each monthly or quarterly release of our codebase, we would like to run benchmark regression tests on the previously released models and algorithms, with support for different priority levels.

The basic feature is to read a config file containing a model list and runtime parameters, then automatically run the resulting tasks in separate panes and windows managed by tmux.

The model priorities are as follows. P0: core, P1: important, P2: less important, P3: least important. You can assign a priority to each model and choose the priority levels for inference and training tasks separately.

The script runs multiple benchmark regression tasks without the need to start many terminals manually. Because the tasks run inside tmux, it also avoids the common inconvenience of losing them to a network interruption when working on remote servers.

Modification

We added the folder .dev_scripts containing two files: benchmark_regression_cfg_tmpl.yaml and benchmark_regression.py. In addition, so that the work-dir of the inference task can be specified, we added a --work-dir argument to the script $mmpose/tools/test.py and modified the code accordingly.
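A minimal sketch of how a --work-dir option could be declared with argparse. The positional arguments, default value, and help strings below are assumptions made for illustration; they are not copied from tools/test.py.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical, trimmed-down parser; the real tools/test.py has more options.
    parser = argparse.ArgumentParser(description='test a model (sketch)')
    parser.add_argument('config', help='path to the test config file')
    parser.add_argument('checkpoint', help='path or URL of the checkpoint')
    # argparse exposes '--work-dir' as the attribute 'work_dir'.
    parser.add_argument(
        '--work-dir',
        default=None,
        help='directory in which to save the evaluation results')
    return parser.parse_args(argv)


args = parse_args(['cfg.py', 'ckpt.pth', '--work-dir', 'work_dirs/demo'])
print(args.work_dir)
```

Leaving the default as None lets the caller fall back to whatever directory it used before the option existed.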

Arguments

The script is based on $mmpose/tools/slurm_test.sh and $mmpose/tools/slurm_train.sh. It supports running test and train tasks with custom priority and runtime setting parameters, which can be specified in the config file.

To run the script, a config file containing multiple models is required. An example is benchmark_regression_cfg_tmpl.yaml under the directory $mmpose/.dev_scripts, which serves as a template showing the available fields. It has a model_list field that contains the priority levels, and each priority level lists multiple models.

Specifically, the config file must specify each model's priority, the path to its model config file, and the corresponding checkpoint file. For example,

model_list:
    P0: # the priority of the models, P0: core, P1: important, P2: less important, P3: least important
      -   config: configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/res50_coco_256x192.py  # path to the config file
          checkpoint: https://download.openmmlab.com/mmpose/top_down/resnet/res50_coco_256x192-ec54d7f3_20200709.pth # path or url to the checkpoint file
          task_name: res50_coco_256x192 # the job name in slurm will be specified according to this field and the mode. If not specified, use the basename of the config file

          test:...

          train:...


      -   ...

    P1:...

The priority fields such as P0 and P1 let you assign different priorities to different models. You can add more models as needed under the corresponding priority field.
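The layout described above can be walked with a few lines of Python. This is only a sketch: the dictionary stands in for the parsed YAML, the P1 entry is made up for illustration, and stripping the file extension when falling back to the config basename is an assumption rather than something stated in the PR.

```python
import os.path as osp

# Hypothetical in-memory equivalent of the parsed config file.
cfg = {
    'model_list': {
        'P0': [{
            'config': 'configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/'
                      'coco/res50_coco_256x192.py',
            'checkpoint': 'https://download.openmmlab.com/mmpose/top_down/'
                          'resnet/res50_coco_256x192-ec54d7f3_20200709.pth',
            'task_name': 'res50_coco_256x192',
        }],
        # Made-up entry without task_name, to exercise the fallback.
        'P1': [{
            'config': 'configs/example/hypothetical_model.py',
            'checkpoint': 'checkpoints/hypothetical_model.pth',
        }],
    }
}


def resolve_task_name(model):
    """Use task_name if present, else the config basename (extension dropped)."""
    default = osp.splitext(osp.basename(model['config']))[0]
    return model.get('task_name', default)


def iter_tasks(cfg):
    """Yield (priority, task_name) pairs in priority order."""
    for priority in sorted(cfg['model_list']):
        for model in cfg['model_list'][priority]:
            yield priority, resolve_task_name(model)


tasks = list(iter_tasks(cfg))
print(tasks)
```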

For a more detailed description of the arguments, please refer to the script $mmpose/.dev_scripts/benchmark_regression.py.

Usage

Here is a simple example of running the script.

cd $mmpose
python ./.dev_scripts/benchmark_regression.py [--config ${/path/to/model_list}] [--session-name ${SESSION_NAME}] [--priority ${TEST_PRIORITY} ${TRAIN_PRIORITY}]

Note that ${TEST_PRIORITY} and ${TRAIN_PRIORITY} are the largest priority numbers to run for test and train tasks, respectively.
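One way to read the two priority values is as upper bounds: a value of n selects the models listed under P0 through Pn. The sketch below implements that reading; the interpretation, the function name select_models, and the toy model names are all assumptions, so check benchmark_regression.py for the actual behavior.

```python
def select_models(model_list, max_priority):
    """Collect the models under P0..P<max_priority>, in priority order."""
    selected = []
    for level in range(max_priority + 1):
        # Priority levels missing from the config are simply skipped.
        selected.extend(model_list.get(f'P{level}', []))
    return selected


# Toy model list; real entries would be dicts with config/checkpoint fields.
model_list = {
    'P0': ['model_a'],
    'P1': ['model_b', 'model_c'],
    'P2': ['model_d'],
}

test_priority, train_priority = 2, 0
test_models = select_models(model_list, test_priority)
train_models = select_models(model_list, train_priority)
print(test_models, train_models)
```

Under this reading, test tasks would cover all four toy models while train tasks would cover only the P0 model.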

Running the above script with default parameters starts a new tmux session in which each pane runs one task independently. Enjoy it!
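To make the one-task-per-pane idea concrete, here is a rough sketch that only builds the tmux command lines and does not execute them. The function name and the example task commands are hypothetical, the pane targets assume tmux's default base-index and pane-base-index of 0, and the actual script may organize windows and panes differently.

```python
import shlex


def build_tmux_commands(session_name, task_cmds):
    """Return tmux command lines that run each task in its own pane."""
    if not task_cmds:
        return []
    # A detached session starts with one window containing one pane.
    cmds = [['tmux', 'new-session', '-d', '-s', session_name]]
    for _ in task_cmds[1:]:
        # One extra pane per remaining task; re-tile so splits keep fitting.
        cmds.append(['tmux', 'split-window', '-t', session_name])
        cmds.append(['tmux', 'select-layout', '-t', session_name, 'tiled'])
    for pane, task in enumerate(task_cmds):
        target = f'{session_name}:0.{pane}'
        # send-keys types the command into the pane; 'Enter' submits it.
        cmds.append(['tmux', 'send-keys', '-t', target, task, 'Enter'])
    return cmds


commands = build_tmux_commands(
    'benchmark', ['echo test res50', 'echo train res50'])
for cmd in commands:
    print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run to actually create the session, after which `tmux attach -t benchmark` shows the panes.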

@codecov

codecov bot commented Aug 10, 2021

Codecov Report

Merging #849 (5f3e95b) into master (24dbb01) will increase coverage by 0.05%.
The diff coverage is 96.92%.

❗ Current head 5f3e95b differs from pull request most recent head 704bbbf. Consider uploading reports for the commit 704bbbf to get more accurate results

@@            Coverage Diff             @@
##           master     #849      +/-   ##
==========================================
+ Coverage   83.59%   83.64%   +0.05%     
==========================================
  Files         176      178       +2     
  Lines       14145    14195      +50     
  Branches     2364     2367       +3     
==========================================
+ Hits        11824    11874      +50     
- Misses       1713     1714       +1     
+ Partials      608      607       -1     
Flag Coverage Δ
unittests 83.57% <96.92%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
mmpose/models/heads/hmr_head.py 100.00% <ø> (ø)
mmpose/models/utils/smpl.py 96.29% <96.29%> (ø)
mmpose/models/__init__.py 100.00% <100.00%> (ø)
mmpose/models/builder.py 100.00% <100.00%> (ø)
mmpose/models/detectors/mesh.py 90.75% <100.00%> (+0.04%) ⬆️
mmpose/models/utils/__init__.py 100.00% <100.00%> (ø)
mmpose/datasets/pipelines/shared_transform.py 88.50% <0.00%> (+0.50%) ⬆️


Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 24dbb01...704bbbf. Read the comment docs.

@jin-s13 jin-s13 requested a review from ly015 August 10, 2021 12:06
@ly015 ly015 mentioned this pull request Aug 11, 2021
8 tasks
liqikai9 and others added 3 commits August 16, 2021 14:05
modify the config and rename the filename
modify the script and rename the filename
@liqikai9 liqikai9 requested a review from ly015 August 16, 2021 07:28
liqikai9 and others added 3 commits August 16, 2021 15:38
using mmcv.load to avoid introducing the extra dependency on yaml
@jin-s13 jin-s13 self-requested a review August 18, 2021 09:18
eval: mAP # evaluation metric, which depends on the dataset, e.g., "mAP" for MSCOCO
fuse-conv-bn:
gpu_collect:
P0: # the priority of the models, P0: core, P1: important, P2: less important, P3: least important
Member


A few suggestions about the config file:

  • Rename this file "benchmark_regression_cfg_tmpl.yaml" or something, which serves as a template to show the full content that a config file could include. We will add a more compact config file with a full model list and only necessary arguments.
  • Use test instead of infer as the mode name.
  • gpus_per_node can be set to 8 for all models and modes.

Collaborator Author


Got these suggestions.

@ly015 ly015 merged commit f39830b into open-mmlab:master Aug 25, 2021
ly015 added a commit that referenced this pull request Aug 30, 2021
* Fix import and deprecation issues in unit tests (#871)

* fix some bugs in the unit test of smpl model.

* reorganize `tests/` to solve importing issue (PEP 420)

* fix deprecation warnings in unit tests

Co-authored-by: ly015 <liyining0712@gmail.com>

* add benchmark regression test script with tmux (#849)

* test the simple case using tmux to run multiple benchmark regression test tasks

* modify and rename the config file and script

* Delete config_list.yaml

* modify the config and rename the filename

* Delete test_benchmark_tmux.py

* modify the script and rename the filename

* Update setup.cfg

* using mmcv.load to avoid introducing the extra dependency on yaml

* fix some typo

* refactor the config file and modify the script accordingly

* modify the config and script

* rename the config file

* Correct dataset preparation guide of WFLW (#873)

* add pr template (#875)

* add CITATION.cff and update setup.py (#876)

* Add copyright header and pre-commit hook (#872)

* Add pre-commit hook to automatically add copyright file header

* update files with copyright header

*  Limit copyright checking in the first 2 lines of a file
* Exclude configs in demo/

* set max-header-lines as 5

* rebase to master and add copyright to new files
* move benchmark_regression into .dev_scripts/benchmark

* Translate tasks/2d_body_keypoint.md (#842)

* 2rd PR remove poseval

* fix lint

* revise the CN version

Co-authored-by: ly015 <liyining0712@gmail.com>

* fix some bugs in the unit test of smpl model.

* * reorganiz `tests/` to solve importing issue (PEP 420)

* add dataset info

* fix lint

* * fix wrongly modified parts in previous rebase
* fix lint

* rename datasets/_base_ as datasets/base

* resolve compatibility of pose_limb_color

* Add dummy dataset base classes with old names for compatibility

* * Rewrite relative unittest based on dataset_info
* Add bc-breaking test for functions related to dataset_info
* Rename DatasetInfo.dataset_info as DatasetInfo._dataset_info
* Fix dataset_info of h36m dataset

* Handle breaking change pose_limb_color -> pose_link_color

* add unittest for old-fashioned dataset initialization without dataset_info

* resolve naming conflict in unittests

Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
Co-authored-by: ly015 <liyining0712@gmail.com>
ly015 added a commit that referenced this pull request Sep 2, 2021
ly015 added a commit that referenced this pull request Sep 7, 2021
* add dataset info (#663)

* Fix import and deprecation issues in unit tests (#871)

* fix some bugs in the unit test of smpl model.

* reorganize `tests/` to solve importing issue (PEP 420)

* fix deprecation warnings in unit tests

Co-authored-by: ly015 <liyining0712@gmail.com>

* add benchmark regression test script with tmux (#849)

* test the simple case using tmux to run multiple benchmark regression test tasks

* modify and rename the config file and script

* Delete config_list.yaml

* modify the config and rename the filename

* Delete test_benchmark_tmux.py

* modify the script and rename the filename

* Update setup.cfg

* using mmcv.load to avoid introducing the extra dependency on yaml

* fix some typo

* refactor the config file and modify the script accordingly

* modify the config and script

* rename the config file

* Correct dataset preparation guide of WFLW (#873)

* add pr template (#875)

* add CITATION.cff and update setup.py (#876)

* Add copyright header and pre-commit hook (#872)

* Add pre-commit hook to automatically add copyright file header

* update files with copyright header

*  Limit copyright checking in the first 2 lines of a file
* Exclude configs in demo/

* set max-header-lines as 5

* rebase to master and add copyright to new files
* move benchmark_regression into .dev_scripts/benchmark

* Translate tasks/2d_body_keypoint.md (#842)

* 2rd PR remove poseval

* fix lint

* revise the CN version

Co-authored-by: ly015 <liyining0712@gmail.com>

* fix some bugs in the unit test of smpl model.

* * reorganiz `tests/` to solve importing issue (PEP 420)

* add dataset info

* fix lint

* * fix wrongly modified parts in previous rebase
* fix lint

* rename datasets/_base_ as datasets/base

* resolve compatibility of pose_limb_color

* Add dummy dataset base classes with old names for compatibility

* * Rewrite relative unittest based on dataset_info
* Add bc-breaking test for functions related to dataset_info
* Rename DatasetInfo.dataset_info as DatasetInfo._dataset_info
* Fix dataset_info of h36m dataset

* Handle breaking change pose_limb_color -> pose_link_color

* add unittest for old-fashioned dataset initialization without dataset_info

* resolve naming conflict in unittests

Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
Co-authored-by: ly015 <liyining0712@gmail.com>

* fix typo

* fix typo

Co-authored-by: Jas <jinsheng@sensetime.com>
Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
@liqikai9 liqikai9 deleted the test_benchmark_script branch March 8, 2022 07:59
shuheilocale pushed a commit to shuheilocale/mmpose that referenced this pull request May 6, 2023
* test the simple case using tmux to run multiple benchmark regression test tasks

* modify and rename the config file and script

* Delete config_list.yaml

* modify the config and rename the filename

* Delete test_benchmark_tmux.py

* modify the script and rename the filename

* Update setup.cfg

* using mmcv.load to avoid introducing the extra dependency on yaml

* fix some typo

* refactor the config file and modify the script accordingly

* modify the config and script

* rename the config file
shuheilocale pushed a commit to shuheilocale/mmpose that referenced this pull request May 6, 2023
* add dataset info (open-mmlab#663)

* Fix import and deprecation issues in unit tests (open-mmlab#871)

* fix some bugs in the unit test of smpl model.

* reorganize `tests/` to solve importing issue (PEP 420)

* fix deprecation warnings in unit tests

Co-authored-by: ly015 <liyining0712@gmail.com>

* add benchmark regression test script with tmux (open-mmlab#849)

* test the simple case using tmux to run multiple benchmark regression test tasks

* modify and rename the config file and script

* Delete config_list.yaml

* modify the config and rename the filename

* Delete test_benchmark_tmux.py

* modify the script and rename the filename

* Update setup.cfg

* using mmcv.load to avoid introducing the extra dependency on yaml

* fix some typo

* refactor the config file and modify the script accordingly

* modify the config and script

* rename the config file

* Correct dataset preparation guide of WFLW (open-mmlab#873)

* add pr template (open-mmlab#875)

* add CITATION.cff and update setup.py (open-mmlab#876)

* Add copyright header and pre-commit hook (open-mmlab#872)

* Add pre-commit hook to automatically add copyright file header

* update files with copyright header

*  Limit copyright checking in the first 2 lines of a file
* Exclude configs in demo/

* set max-header-lines as 5

* rebase to master and add copyright to new files
* move benchmark_regression into .dev_scripts/benchmark

* Translate tasks/2d_body_keypoint.md (open-mmlab#842)

* 2rd PR remove poseval

* fix lint

* revise the CN version

Co-authored-by: ly015 <liyining0712@gmail.com>

* fix some bugs in the unit test of smpl model.

* * reorganiz `tests/` to solve importing issue (PEP 420)

* add dataset info

* fix lint

* * fix wrongly modified parts in previous rebase
* fix lint

* rename datasets/_base_ as datasets/base

* resolve compatibility of pose_limb_color

* Add dummy dataset base classes with old names for compatibility

* * Rewrite relative unittest based on dataset_info
* Add bc-breaking test for functions related to dataset_info
* Rename DatasetInfo.dataset_info as DatasetInfo._dataset_info
* Fix dataset_info of h36m dataset

* Handle breaking change pose_limb_color -> pose_link_color

* add unittest for old-fashioned dataset initialization without dataset_info

* resolve naming conflict in unittests

Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
Co-authored-by: ly015 <liyining0712@gmail.com>

* fix typo

* fix typo

Co-authored-by: Jas <jinsheng@sensetime.com>
Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
HAOCHENYE pushed a commit to HAOCHENYE/mmpose that referenced this pull request Jun 27, 2023
…lab#849)

* [Enhance] Ensure metrics is not empty when saving best ckpts

* fix warn to warning

* delete a unnecessary method
ajgrafton pushed a commit to ajgrafton/mmpose that referenced this pull request Mar 6, 2024
ajgrafton pushed a commit to ajgrafton/mmpose that referenced this pull request Mar 6, 2024

3 participants