Commit

Add a new pre-commit-hook to automatically add a copyright (open-mmlab#96)

* Add a new pre-commit-hook to automatically add a copyright

* add check-algo-readme

Co-authored-by: qiufeng <44188071+wutongshenqiu@users.noreply.github.com>

* fix alg-readme lints

Co-authored-by: qiufeng <44188071+wutongshenqiu@users.noreply.github.com>
pppppM and wutongshenqiu committed Mar 2, 2022
1 parent b07abd9 commit 9ad889e
Showing 24 changed files with 34 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .pre-commit-config.yaml
@@ -40,3 +40,9 @@ repos:
hooks:
- id: docformatter
args: ["--in-place", "--wrap-descriptions", "79"]
- repo: https://github.com/open-mmlab/pre-commit-hooks
rev: v0.2.0
hooks:
- id: check-algo-readme
- id: check-copyright
args: [ "mmrazor", "tests", "tools"]
1 change: 1 addition & 0 deletions configs/distill/cwd/README.md
@@ -1,6 +1,7 @@
# CWD
> [Channel-wise Knowledge Distillation for Dense Prediction](https://arxiv.org/abs/2011.13256)
<!-- [ALGORITHM] -->
## Abstract

Knowledge distillation (KD) has been proven to be a simple and effective tool for training compact models. Almost all KD variants for dense prediction tasks align the student and teacher networks' feature maps in the spatial domain, typically by minimizing point-wise and/or pair-wise discrepancy. Observing that in semantic segmentation, some layers' feature activations of each channel tend to encode saliency of scene categories (analogue to class activation mapping), we propose to align features channel-wise between the student and teacher networks. To this end, we first transform the feature map of each channel into a probability map using softmax normalization, and then minimize the Kullback-Leibler (KL) divergence of the corresponding channels of the two networks. By doing so, our method focuses on mimicking the soft distributions of channels between networks. In particular, the KL divergence enables learning to pay more attention to the most salient regions of the channel-wise maps, presumably corresponding to the most useful signals for semantic segmentation. Experiments demonstrate that our channel-wise distillation outperforms almost all existing spatial distillation methods for semantic segmentation considerably, and requires less computational cost during training. We consistently achieve superior performance on three benchmarks with various network structures.
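
The loss described in this abstract (a per-channel spatial softmax followed by a KL divergence between teacher and student) can be written in a few lines of PyTorch. The snippet below is only an illustration of the idea, not the CWD loss actually implemented in mmrazor; the temperature `tau` and the reduction choice are assumptions.

```python
import torch.nn.functional as F


def channel_wise_kd_loss(student_feat, teacher_feat, tau=1.0):
    """Sketch of channel-wise distillation.

    Both inputs have shape (N, C, H, W). Each channel's spatial map is
    turned into a distribution with a softmax, then the student channel
    is pushed towards the teacher channel with a KL divergence.
    """
    n, c, _, _ = student_feat.shape
    s = F.log_softmax(student_feat.view(n, c, -1) / tau, dim=-1)
    t = F.softmax(teacher_feat.view(n, c, -1) / tau, dim=-1)
    # KL(teacher || student), scaled by tau^2 as is common in distillation
    return F.kl_div(s, t, reduction='batchmean') * (tau ** 2)
```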
4 changes: 4 additions & 0 deletions configs/distill/wsld/README.md
@@ -1,7 +1,11 @@
# WSLD



> [Rethinking Soft Labels for Knowledge Distillation: A Bias-Variance Tradeoff Perspective](https://arxiv.org/abs/2102.00650)
<!-- [ALGORITHM] -->
## Abstract
Knowledge distillation is an effective approach to leverage a well-trained network
or an ensemble of them, named as the teacher, to guide the training of a student
network. The outputs from the teacher network are used as soft labels for supervising the training of a new network. Recent studies (Müller et al., 2019; Yuan
1 change: 1 addition & 0 deletions configs/nas/darts/README.md
@@ -1,6 +1,7 @@
# DARTS
> [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055)
<!-- [ALGORITHM] -->
## Abstract

This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
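
At the core of DARTS is the continuous relaxation: instead of picking one operation per edge, each edge computes a softmax-weighted mixture of all candidate operations, so the mixing weights can be learned by gradient descent. Below is a minimal sketch of such a mixed operation, with a toy candidate set rather than mmrazor's actual search space; names and the operation list are illustrative assumptions.

```python
import torch
import torch.nn as nn


class MixedOp(nn.Module):
    """Softmax-weighted mixture over candidate operations (DARTS-style)."""

    def __init__(self, channels):
        super().__init__()
        # Toy candidate set; the real search space contains more operations.
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.MaxPool2d(3, stride=1, padding=1),
        ])
        # One architecture parameter per candidate, learned by gradient descent.
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = torch.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```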
2 changes: 2 additions & 0 deletions configs/nas/detnas/README.md
@@ -1,6 +1,8 @@
# DetNAS

> [DetNAS: Backbone Search for Object Detection](https://arxiv.org/abs/1903.10979)
<!-- [ALGORITHM] -->
## Abstract

Object detectors are usually equipped with backbone networks designed for image classification. It might be sub-optimal because of the gap between the tasks of image classification and object detection. In this work, we present DetNAS to use Neural Architecture Search (NAS) for the design of better backbones for object detection. It is non-trivial because detection training typically needs ImageNet pre-training while NAS systems require accuracies on the target detection task as supervisory signals. Based on the technique of one-shot supernet, which contains all possible networks in the search space, we propose a framework for backbone search on object detection. We train the supernet under the typical detector training schedule: ImageNet pre-training and detection fine-tuning. Then, the architecture search is performed on the trained supernet, using the detection task as the guidance. This framework makes NAS on backbones very efficient. In experiments, we show the effectiveness of DetNAS on various detectors, for instance, one-stage RetinaNet and the two-stage FPN. We empirically find that networks searched on object detection shows consistent superiority compared to those searched on ImageNet classification. The resulting architecture achieves superior performance than hand-crafted networks on COCO with much less FLOPs complexity.
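
The one-shot supernet mentioned above contains every candidate block at every layer; during supernet training each forward pass samples a single path (one block per layer), and the search later scores sampled paths on the detection task. A deliberately simplified sketch of one such searchable layer follows; the names are illustrative, not mmrazor's modules.

```python
import random
import torch.nn as nn


class OneShotLayer(nn.Module):
    """One searchable layer: several candidate blocks, one used per forward."""

    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)

    def forward(self, x, choice=None):
        if choice is None:  # uniform sampling during supernet training
            choice = random.randrange(len(self.candidates))
        return self.candidates[choice](x)
```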
1 change: 1 addition & 0 deletions configs/nas/spos/README.md
@@ -1,6 +1,7 @@
# SPOS
> [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420)
<!-- [ALGORITHM] -->

## Abstract

2 changes: 2 additions & 0 deletions configs/pruning/autoslim/README.md
@@ -1,6 +1,8 @@
# AutoSlim
> [AutoSlim: Towards One-Shot Architecture Search for Channel Numbers](https://arxiv.org/abs/1903.11728)
<!-- [ALGORITHM] -->

## Abstract

We study how to set channel numbers in a neural network to achieve better accuracy under constrained resources (e.g., FLOPs, latency, memory footprint or model size). A simple and one-shot solution, named AutoSlim, is presented. Instead of training many network samples and searching with reinforcement learning, we train a single slimmable network to approximate the network accuracy of different channel configurations. We then iteratively evaluate the trained slimmable model and greedily slim the layer with minimal accuracy drop. By this single pass, we can obtain the optimized channel configurations under different resource constraints. We present experiments with MobileNet v1, MobileNet v2, ResNet-50 and RL-searched MNasNet on ImageNet classification. We show significant improvements over their default channel configurations. We also achieve better accuracy than recent channel pruning methods and neural architecture search methods.
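
The greedy step described above, evaluate the trained slimmable model, slim whichever layer loses the least accuracy, and repeat until the resource constraint is met, can be sketched as follows. All helpers (`evaluate`, `budget_met`, `shrink_step`, ...) are hypothetical placeholders for this illustration, not mmrazor APIs.

```python
def greedy_slim(channel_cfg, evaluate, shrink_step, min_channels, budget_met):
    """Greedily slim one layer at a time (AutoSlim-style sketch).

    `channel_cfg` maps layer name -> current channel count.
    `evaluate(cfg)` returns validation accuracy of the slimmable model under
    `cfg`; `budget_met(cfg)` says whether the resource constraint (FLOPs,
    latency, ...) is satisfied. All helpers are hypothetical placeholders.
    """
    cfg = dict(channel_cfg)
    while not budget_met(cfg):
        best_layer, best_acc = None, float('-inf')
        for layer, channels in cfg.items():
            if channels - shrink_step < min_channels:
                continue
            trial = dict(cfg)
            trial[layer] = channels - shrink_step
            acc = evaluate(trial)  # accuracy after slimming this layer
            if acc > best_acc:
                best_layer, best_acc = layer, acc
        if best_layer is None:  # nothing left to slim
            break
        cfg[best_layer] -= shrink_step
    return cfg
```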
1 change: 1 addition & 0 deletions mmrazor/apis/mmdet/train.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
import random
import warnings

1 change: 1 addition & 0 deletions mmrazor/core/builder.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.utils import Registry, build_from_cfg

SEARCHERS = Registry('search')
1 change: 1 addition & 0 deletions mmrazor/core/hooks/__init__.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .drop_path_prob import DropPathProbHook
from .sampler_seed import DistSamplerSeedHook
from .search_subnet import SearchSubnetHook
1 change: 1 addition & 0 deletions mmrazor/core/optimizer/__init__.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .builder import build_optimizers

__all__ = ['build_optimizers']
1 change: 1 addition & 0 deletions mmrazor/core/optimizer/builder.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from mmcv.runner import build_optimizer


1 change: 1 addition & 0 deletions mmrazor/core/runners/__init__.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .epoch_based_runner import MultiLoaderEpochBasedRunner
from .iter_based_runner import MultiLoaderIterBasedRunner

1 change: 1 addition & 0 deletions mmrazor/core/searcher/__init__.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .evolution_search import EvolutionSearcher
from .greedy_search import GreedySearcher

1 change: 1 addition & 0 deletions mmrazor/core/utils/lr.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
def set_lr(runner, lr_groups, freeze_optimizers=[]):
"""Set specified learning rate in optimizer."""
if isinstance(runner.optimizer, dict):
1 change: 1 addition & 0 deletions mmrazor/datasets/__init__.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .utils import split_dataset

__all__ = ['split_dataset']
1 change: 1 addition & 0 deletions mmrazor/datasets/utils.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from torch.utils.data import random_split


@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
import copy

import torch
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from mmcls.models.builder import HEADS
from mmcls.models.heads import LinearClsHead
from torch import nn
1 change: 1 addition & 0 deletions mmrazor/models/ops/darts_series.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
import torch
import torch.nn as nn
from mmcv.cnn import build_norm_layer
1 change: 1 addition & 0 deletions mmrazor/utils/__init__.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
from .setup_env import setup_multi_processes

__all__ = ['setup_multi_processes']
1 change: 1 addition & 0 deletions tests/data/cwd_pspnet.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
norm_cfg = dict(type='BN',requires_grad=True)


1 change: 1 addition & 0 deletions tests/data/detnas_frcnn_shufflenet_fpn.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
norm_cfg = dict(type='SyncBN', requires_grad=True)
model = dict(
type='mmdet.FasterRCNN',
1 change: 1 addition & 0 deletions tests/data/retinanet.py
@@ -1,3 +1,4 @@
# Copyright (c) OpenMMLab. All rights reserved.
# small RetinaNet
num_classes=3

