[Fix] add seed to distributed sampler #250

Merged
fangyixiao18 merged 2 commits into open-mmlab:dev_v0.8.0 on Mar 30, 2022

fangyixiao18
Collaborator

Thanks for your contribution; we appreciate it a lot. Following the instructions below will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Modify the distributed sampler so that its manual_seed behavior matches the original PyTorch implementation.
Refer to open-mmlab/mmdetection#7432 and open-mmlab/mmdetection#7440

Modification

Add a seed to the distributed sampler and seed the generator with g.manual_seed(self.epoch + self.seed).
Synchronize the seed across processes in the init function.
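
The effect of seeding with `self.epoch + self.seed` can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class name `SeededShardSampler` and the helper `epoch_shards` are hypothetical, and Python's `random.Random` stands in for `torch.Generator` so that the example runs without PyTorch or a process group. It assumes every rank has already received the same seed, which is what the synchronization step is meant to guarantee.

```python
import random


class SeededShardSampler:
    """Toy stand-in for a distributed sampler (hypothetical, for illustration)."""

    def __init__(self, dataset_size, num_replicas, rank, seed=0, shuffle=True):
        self.dataset_size = dataset_size
        self.num_replicas = num_replicas
        self.rank = rank
        self.seed = seed  # assumed to be identical on every rank
        self.shuffle = shuffle
        self.epoch = 0
        # Ceil division, so that every rank yields the same number of samples.
        self.num_samples = -(-dataset_size // num_replicas)
        self.total_size = self.num_samples * num_replicas

    def set_epoch(self, epoch):
        self.epoch = epoch

    def __iter__(self):
        indices = list(range(self.dataset_size))
        if self.shuffle:
            # Plays the role of g.manual_seed(self.epoch + self.seed).
            rng = random.Random(self.epoch + self.seed)
            rng.shuffle(indices)
        # Pad by repeating indices until the list divides evenly among ranks.
        repeats = self.total_size // len(indices) + 1
        indices = (indices * repeats)[: self.total_size]
        # Each rank takes every num_replicas-th index, starting at its own rank.
        return iter(indices[self.rank : self.total_size : self.num_replicas])

    def __len__(self):
        return self.num_samples


def epoch_shards(seed, epoch, dataset_size=12, world_size=4):
    """Simulate all ranks in one process and return one index list per rank."""
    shards = []
    for rank in range(world_size):
        sampler = SeededShardSampler(dataset_size, world_size, rank, seed=seed)
        sampler.set_epoch(epoch)
        shards.append(list(sampler))
    return shards
```

Because all simulated ranks share one seed, they shuffle identically and their shards partition the dataset. If the ranks used different seeds, each would slice a different permutation, and some samples could be seen twice while others are skipped within an epoch.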

BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

Checklist

Before PR:

  • Pre-commit or other linting tools are used to fix the potential lint issues.
  • Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
  • The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  • The documentation has been modified accordingly, like docstring or example tutorials.

After PR:

  • If the modification may affect downstream or other related projects, this PR should be tested with those projects, such as MMDet or MMSeg.
  • CLA has been signed and all committers have signed the CLA in this PR.

@codecov

codecov bot commented Mar 23, 2022

Codecov Report

Merging #250 (b58024b) into dev_v0.8.0 (3dce8db) will decrease coverage by 0.16%.
The diff coverage is 35.00%.

@@              Coverage Diff               @@
##           dev_v0.8.0     #250      +/-   ##
==============================================
- Coverage       69.21%   69.05%   -0.17%     
==============================================
  Files             105      106       +1     
  Lines            3726     3745      +19     
  Branches          604      607       +3     
==============================================
+ Hits             2579     2586       +7     
- Misses           1041     1053      +12     
  Partials          106      106              
Flag         Coverage Δ
unittests    69.05% <35.00%> (-0.17%) ⬇️

Impacted Files                                        Coverage Δ
mmselfsup/datasets/builder.py                         76.92% <ø> (ø)
mmselfsup/utils/dist_utils.py                         31.25% <31.25%> (ø)
mmselfsup/datasets/samplers/distributed_sampler.py    16.66% <33.33%> (+0.62%) ⬆️
mmselfsup/utils/__init__.py                           100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@YuanLiuuuuuu
Collaborator

Please add UT to pass codecov

@fangyixiao18
Collaborator Author

> Please add UT to pass codecov

As this is a distributed function, it cannot be tested properly.

@fangyixiao18 fangyixiao18 merged commit 904c095 into open-mmlab:dev_v0.8.0 Mar 30, 2022
YuanLiuuuuuu pushed a commit that referenced this pull request Mar 31, 2022
* [Fix] add seed to distributed sampler

* fix lint
Jiahao000 added a commit that referenced this pull request Mar 31, 2022
* [Fix]: Fix mmcls upgrade bug (#235)

* [Feature]: Add multi machine dist_train (#232)

* [Feature]: Add multi machine dist_train

* [Fix]: Change bash to sh

* [Fix]: Fix missing sh suffix

* [Refactor]: Change bash to sh

* [Refactor] Add unit test (#234)

* [Refactor] add unit test

* update workflow

* update

* [Fix] fix lint

* update test

* refactor moco and densecl unit test

* fix lint

* add unit test

* update unit test

* remove modification

* [Feature]: Add MAE metafile (#238)

* [Feature]: Add MAE metafile

* [Fix]: Fix lint

* [Fix]: Change LARS to AdamW in the metafile of MAE

* [Fix] fix codecov bug (#241)

* [Fix] fix codecov bug

* update comment

* [Refactor] Using MMCls backbones (#233)

* [Refactor] using backbones from MMCls

* [Refactor] modify the unit test

* [Fix] modify default setting of out_indices

* [Docs] fix lint

* [Refactor] modify super init

* [Refactore] remove res_layer.py

* using mmcv PatchEmbed

* [Fix]: Fix outdated problem (#249)

* [Fix]: Fix outdated problem

* [Fix]: Update MoCov3 bibtex

* [Fix]: Use abs path in README

* [Fix]: Reformat MAE bibtex

* [Fix]: Reformat MoCov3 bibtex

* [Feature] Resume from the latest checkpoint automatically. (#245)

* [Feature] Resume from the latest checkpoint automatically.

* fix windows path problem

* fix lint

* add code reference

* [Docs] add docstring for ResNet and ResNeXt (#252)

* [Feature] support KNN benchmark (#243)

* [Feature] support KNN benchmark

* [Fix] add docstring and multi-machine testing

* [Fix] fix lint

* [Fix] change args format and check init_cfg

* [Docs] add benchmark tutorial

* [Docs] add benchmark results

* [Feature]: SimMIM supported (#239)

* [Feature]: SimMIM Pretrain

* [Feature]: Add mix precision and 16x128 config

* [Fix]: Fix config import bug

* [Fix]: Fix config bug

* [Feature]: Simim Finetune

* [Fix]: Log every 100

* [Fix]: Fix eval problem

* [Feature]: Add docstring for simmim

* [Refactor]: Merge layer wise lr decay to Default constructor

* [Fix]:Fix simmim evaluation bug

* [Fix]: Change model to be compatible to latest version of mmcls

* [Fix]: Fix lint

* [Fix]: Rewrite forward_train for classification cls

* [Feature]: Add UT

* [Fix]: Fix lint

* [Feature]: Add 32 gpus training for simmim ft

* [Fix]: Rename mmcls classifier wrapper

* [Fix]: Add docstring to SimMIMNeck

* [Feature]: Generate docstring for the forward function of simmim encoder

* [Fix]: Rewrite the class docstring for constructor

* [Fix]: Fix lint

* [Fix]: Fix UT

* [Fix]: Reformat config

* [Fix]: Add img resolution

* [Feature]: Add readme and metafile

* [Fix]: Fix typo in README.md

* [Fix]: Change BlackMaskGen to BlockwiseMaskGenerator

* [Fix]: Change the name of SwinForSimMIM

* [Fix]: Delete irrelevant files

* [Feature]: Create extra transformerfinetuneconstructor

* [Fix]: Fix lint

* [Fix]: Update SimMIM README

* [Fix]: Change SimMIMPretrainHead to SimMIMHead

* [Fix]: Fix the docstring of ft constructor

* [Fix]: Fix UT

* [Fix]: Recover deletion

Co-authored-by: Your <you@example.com>

* [Fix] add seed to distributed sampler (#250)

* [Fix] add seed to distributed sampler

* fix lint

* [Feature] Add ImageNet21k (#225)

* solve memory leak by limited implementation

* fix lint problem

Co-authored-by: liming <liming.ai@bytedance.com>

* [Refactor] change args format to '--a-b' (#253)

* [Refactor] change args format to `--a-b`

* modify tsne script

* modify 'sh' files

* modify getting_started.md

* modify getting_started.md

* [Fix] fix 'mkdir' error in prepare_voc07_cls.sh (#261)

* [Fix] fix positional parameter error (#260)

* [Fix] fix command errors in benchmarks tutorial (#263)

* [Docs] add brief installation steps in README.md (#265)

* [Docs] add colab tutorial (#247)

* [Docs] add colab tutorial

* fix lint

* modify the colab tutorial, using API to train the model

* modify the description

* remove #

* modify the command

* [Docs] translate 6_benchmarks.md into Chinese (#262)

* [Docs] translate 6_benchmarks.md into Chinese

* Update 6_benchmarks.md

change 基准 to 基准评测

* Update 6_benchmarks.md

(1)  Add Chinese translation of  ‘1 folder for ImageNet nearest-neighbor classification task’
(2) 数据预准备 -> 数据准备

* [Docs] remove install scripts in README (#267)

* [Docs] Update version information in dev branch (#268)

* update version to v0.8.0

* fix lint

* [Fix]: Install the latest mmcls

* [Fix]: Add SimMIM in RAEDME

Co-authored-by: Yuan Liu <30762564+YuanLiuuuuuu@users.noreply.github.com>
Co-authored-by: Jiahao Xie <52497952+Jiahao000@users.noreply.github.com>
Co-authored-by: Your <you@example.com>
Co-authored-by: Ming Li <73068772+mitming@users.noreply.github.com>
Co-authored-by: liming <liming.ai@bytedance.com>
Co-authored-by: RenQin <45731309+soonera@users.noreply.github.com>
Co-authored-by: YuanLiuuuuuu <3463423099@qq.com>
@fangyixiao18 fangyixiao18 deleted the sampler-seed branch April 1, 2022 06:18