
Fix Dynamic BN for inefficient memory use #62

Merged
merged 14 commits into from Dec 14, 2022

Conversation

yuikosakuma1
Contributor

This PR addresses the following two issues:

  1. The Dynamic BN operation contained undesirable graph construction that caused a linear increase in memory use at each iteration.
    This PR fixes the bug by modifying DynamicBatchNorm.

You can check this by adding the following code in nnabla_nas/runner/searcher/ofa.py, L253, or anywhere in the training loop:

from nnabla_ext.cuda.init import print_memory_cache_map
if self.comm.rank == 0:
    print_memory_cache_map()

This prints something like cache_map(device_id: 0, mem_type: small, used: 1.86GB, free: 10.55MB).
Check the used value and confirm that it no longer increases after several iterations.
This can be verified by running an elastic expand_ratio search for any OFA model. For example:

python main.py experiment=classification/ofa/ofa_mbv3/imagenet_search_expand_phase2
  2. clear_memory_cache() was removed because it is not necessary and may slow down training.
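The leak pattern in item 1 can be illustrated with a minimal, self-contained Python sketch. The classes and names below are hypothetical, not NNabla's actual DynamicBatchNorm implementation: the point is only that rebuilding and retaining per-iteration objects (e.g. freshly sliced parameters hooked into a new graph each forward pass) grows memory linearly, while reusing a cached object per active configuration keeps it flat.

```python
# Hypothetical sketch of the leak pattern, not the actual NNabla code.

class LeakyDynamicBN:
    """Anti-pattern: every forward call builds a new per-width slice
    and keeps it reachable, so memory grows by one object per iteration."""
    def __init__(self, max_features):
        self.gamma = [1.0] * max_features
        self.retained = []                       # grows every iteration

    def forward(self, active_features):
        sliced = self.gamma[:active_features]    # new object each call
        self.retained.append(sliced)             # retained -> linear growth
        return sliced


class FixedDynamicBN:
    """Fix: cache one slice per distinct active width and reuse it,
    so memory use is bounded by the number of widths, not iterations."""
    def __init__(self, max_features):
        self.gamma = [1.0] * max_features
        self._cache = {}                         # at most one entry per width

    def forward(self, active_features):
        if active_features not in self._cache:
            self._cache[active_features] = self.gamma[:active_features]
        return self._cache[active_features]


leaky, fixed = LeakyDynamicBN(8), FixedDynamicBN(8)
for _ in range(100):
    leaky.forward(4)
    fixed.forward(4)
print(len(leaky.retained), len(fixed._cache))  # 100 vs 1
```

This is why the used value reported by print_memory_cache_map() plateaus after the fix: the per-iteration allocations stop accumulating.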

@yuikosakuma1
Contributor Author

@hyingho Please merge this PR.

@hyingho hyingho merged commit 2268b35 into master Dec 14, 2022
@hyingho hyingho deleted the feature/20220818-fix-dynamicbn branch December 14, 2022 11:04
@hyingho hyingho added the release-note-bugfix Auto-release; Bugfix label Apr 4, 2023