# cvpods tutorials

In this tutorial, we will show the most common operations you might need when using cvpods.
Here we assume **you have successfully cloned and built cvpods on your local machine**.

## 1. Why do we need cvpods?

* Support various tasks (classification, self-supervised, detection, segmentation (semantic, panoptic, instance), keypoints, etc.)
* Support various data sets (ImageNet, COCO, Objects365, CityScapes, CityPerson, WiderFace, VOC, LVIS, etc.)
* Support various augmentations (cvpack2 implements most common augmentations and is compatible with all augmentations in torchvision)
* Seamless switching/collaboration of various tasks
* Taking self-supervised learning as an example, pre-training, fine-tuning and training on downstream tasks (such as detection) all run in the same codebase. This removes the need to convert parameter formats or manually move files between codebases, simplifies passing parameters and controlling the training process, and significantly improves research efficiency
* Conveniently share/open source your own results with others/communities
* Support your company's nori / dpflow and other useful tools
* The same experimental organization habits as cvpack and basedet
* Keep absorbing the achievements of the open source community to stay up to date
* Fast development, fast training, (TODO: release fast)

### 1.1 Major Features
* **Clean & simple & flexible development**: When using detectron2, if you want to implement a new module such as CustomRetinanet, you need to register it to meta_arch, specify it in xxx_config.yaml, and still explicitly invoke `from net import CustomRetinanet` so that the registry can retrieve your module. This means you need to copy train_net.py from the common tools directory and insert that line.
* **Flexible and easy-to-use configuration system**: When adding new config options in Detectron2, you need to add them to config/defaults.py first and then modify config.yaml. In cvpods you only need to add them to config.py once. When debugging a component, you may need to set SOLVER.IMS_PER_BATCH to 2. Previously you had to change it in the config and then change it back to 16 once the code ran correctly, which is inconvenient. cvpods therefore allows you to update the config dynamically, for example: `pods_train --num-gpus 1 SOLVER.IMS_PER_BATCH 2`.
* **Task specific incremental updating**: For example, if you need to modify RetinaNet-related configurations, you only need to modify retinanet_config.py and do not need to care about other common configs. We also avoid putting the configurations of all kinds of methods into one base config file (such as detectron2/config/defaults.py), as detectron2, maskrcnn_benchmark and mmdetection do. So the RetinaNet config does not include ROI_HEADS or MASK_HEADS configurations, only the components it needs.
* **Efficient experiments management**: When you need to implement a new model, you can either copy a project from examples, inherit from a network such as RetinaNet or FasterRCNN, and define your custom functions; or you can add a new base / commonly used model (such as FCOS) into `cvpods/modeling/meta_arch` and use it like a library.
* **Versatile tasks & datasets support**:
  * Detection, Segmentation (Semantic, Panoptic, Instance), Keypoint, Self-supervised Learning, 3D Detection & Segmentation, etc.
  * COCO, Objects365, WiderFace, VOC, LVIS, CityPersons, ImageNet, CrowdHuman, CityScapes, ModelNet40, ScanNet, KITTI, nuScenes, etc.
* **Global training / testing scripts**: You just need to invoke `pods_train/test --num-gpus x` in your playground; your projects only need to include project-specific configs and network modules.
* **Compatible with detectron2**: All models in detectron2 can be easily migrated into cvpods.
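
The command-line override mentioned above (`SOLVER.IMS_PER_BATCH 2`) can be pictured as a list of dotted-key/value pairs merged into a nested config. The following is a minimal sketch of that idea on a plain `dict`; the function name `apply_overrides` and the casting rule are assumptions for illustration, not the cvpods implementation.

```python
def apply_overrides(cfg, overrides):
    """Apply ["A.B", value, ...] pairs to a nested dict in place (hypothetical helper)."""
    if len(overrides) % 2 != 0:
        raise ValueError("overrides must be KEY VALUE pairs")
    for dotted_key, value in zip(overrides[0::2], overrides[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for name in parents:
            node = node[name]  # unknown sections raise KeyError
        if leaf not in node:
            raise KeyError(f"unknown option: {dotted_key}")
        old = node[leaf]
        # Command-line values arrive as strings; cast numbers to the default's type.
        if isinstance(value, str) and isinstance(old, (int, float)) and not isinstance(old, bool):
            value = type(old)(value)
        node[leaf] = value
    return cfg


cfg = {"SOLVER": {"IMS_PER_BATCH": 16, "IMS_PER_DEVICE": 2}}
apply_overrides(cfg, ["SOLVER.IMS_PER_BATCH", "2"])
print(cfg["SOLVER"])  # {'IMS_PER_BATCH': 2, 'IMS_PER_DEVICE': 2}
```

Rejecting unknown keys means a typo in an option name fails immediately instead of silently creating a new option.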

## 2. Recommended usage
Before we introduce the specific details, let us first get a rough feel for how easy cvpods is to use. We list the models in cvpods below; the playground is organized in `Task/Dataset/Method/Experiment` order.

In [13]:
!tree -d /home/poodarchu/Developments/repos/cvpods/playground -I "__pycache__"

/home/poodarchu/Developments/repos/cvpods/playground
|-- detection
|   |-- citypersons
|   |   |-- fcos
|   |   |   `-- fcos.res50.fpn.citypersons.640size.1x
|   |   |-- rcnn
|   |   |   `-- faster_rcnn.res50.fpn.citypersons.640size.1x
|   |   `-- retinanet
|   |       `-- retinanet.res50.fpn.citypersons.640size.1x
|   |-- coco
|   |   |-- atss
|   |   |   `-- atss.res50.fpn.coco.800size.1x
|   |   |-- centernet
|   |   |   |-- centernet.res18.coco.512size
|   |   |   `-- centernet.res50.coco.512size
|   |   |-- detr
|   |   |   `-- detr.res50.c5.coco.multiscale.150e.bs16
|   |   |-- efficientdet
|   |   |   |-- effdet0.effnet0.bifpn.coco.512size.300e
|   |   |   |-- effdet0.effnet0.bifpn.coco.512size.300e.syncbn

Let's take `FCOS` as an example.

```shell
cd playground/detection/coco/fcos/fcos.res50.fpn.coco.800size.1x
```

In [16]:
!ls ~/Developments/repos/cvpods/playground/detection/coco/fcos/fcos.res50.fpn.coco.800size.1x
!cat ~/Developments/repos/cvpods/playground/detection/coco/fcos/fcos.res50.fpn.coco.800size.1x/README.md

README.md  config.py  net.py
# fcos.res50.fpn.coco.800size.1x

## Evaluation results for bbox:  

```  
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.387
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.575
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.417
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.226
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.427
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.499
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.320
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.534
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.570
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.364
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.617
 Average Recall 
```

This is a typical experiment in cvpods, which has 3 files (README.md is generated automatically). After you run the project, you'll find a softlink `log` pointing to your config.OUTPUT_DIR, which stores all the logs/tensorboard/predictions of this experiment.

We think these are very useful features that help researchers keep track of their results and get rid of annoying details. Unfortunately, as far as we know, no other codebase provided these features before cvpods.

To run this project:
```shell
pods_train --num-gpus 8
```
To eval your result:
```shell
pods_test --num-gpus 8
```

We help you to save time typing long commands like
```shell
python main.py -a resnet50 --dist-url 'tcp://127.0.0.1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0 [imagenet-folder with train and val folders]
```
every time you run an experiment.

The benefits of this kind of organization are that:
* Whenever you need to try some ideas, you can just copy an existing experiment and modify the code **without** affecting your existing experiments, which also helps avoid bugs.
* You can easily write your own analyzers to analyze a group of experiments.
* Even after a long time, you can still quickly find your code and experimental results at that time.
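
Because every experiment lives at `Task/Dataset/Method/Experiment`, an analyzer can recover those four fields from the directory path alone. Here is a minimal sketch assuming that layout; `parse_experiment` and `group_by_method` are hypothetical helpers, not part of cvpods.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def parse_experiment(path, root="playground"):
    """Split '<...>/playground/Task/Dataset/Method/Experiment' into named fields."""
    parts = PurePosixPath(path).parts
    start = parts.index(root) + 1  # ValueError if the root is not in the path
    task, dataset, method, experiment = parts[start:start + 4]
    return {"task": task, "dataset": dataset, "method": method, "experiment": experiment}


def group_by_method(paths):
    """Group experiment names by method so they can be compared side by side."""
    groups = defaultdict(list)
    for path in paths:
        info = parse_experiment(path)
        groups[info["method"]].append(info["experiment"])
    return dict(groups)


paths = [
    "playground/detection/coco/fcos/fcos.res50.fpn.coco.800size.1x",
    "playground/detection/coco/atss/atss.res50.fpn.coco.800size.1x",
    "playground/detection/citypersons/fcos/fcos.res50.fpn.citypersons.640size.1x",
]
print(group_by_method(paths)["fcos"])
```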

Basically all modules that are frequently modified support customization in their own experiments, e.g., `Optimizers / Schedulers / Datasets / Augmentations`. You can also specify your own train loop in a specific experiment.

## 3. Project organization
First, let's have a look at the overall structure of cvpods.

In [4]:
!tree -d /home/poodarchu/Developments/repos/cvpods/cvpods -I "__pycache__"

/home/poodarchu/Developments/repos/cvpods/cvpods
|-- analyser
|   `-- tide
|       `-- errors
|-- checkpoint
|-- configs
|-- data
|   |-- datasets
|   |-- samplers
|   `-- transforms
|-- engine
|-- evaluation
|-- export
|-- layers
|   `-- csrc
|       |-- border_align
|       |-- box_iou_rotated
|       |-- cocoeval
|       |-- deformable
|       |-- ml_nms
|       |-- nms_rotated
|       |-- PSROIPool
|       |-- ROIAlign
|       |-- ROIAlignRotated
|       |-- sigmoid_focal_loss
|       |-- SwapAlign2Nat
|       `-- tree_filter
|-- modeling
|   |-- backbone
|   |   `-- dynamic_

* The configuration system is defined in `cvpods/configs`.
* All experiments in cvpods are managed by the common `DefaultTrainer` in cvpods/engine. So you can find all components related to the train loop here (e.g., Hooks, Trainer, DDP).
* Optimizers and LR Schedulers are defined in `cvpods/solver`.
* Dataset / Augmentations / Dataset Samplers are all defined in `cvpods/data`. You can easily implement your own datasets / augmentations following our guides.
* Evaluation is an individual module in cvpods, which is defined in `cvpods/evaluation`.
* All custom operators like `DeformConv` are defined in `cvpods/layers`.

## 4. Guides of each component of cvpods

### 4.1 Use Config

Let's have a look at the configs. 
> One important thing to notice is that all of our configs are for 8 GPUs by default. When you use a different number of GPUs, cvpods automatically scales related options such as iters. For SGD optimizers, it also scales the lr automatically.

In [5]:
!tree /home/poodarchu/Developments/repos/cvpods/cvpods/configs

/home/poodarchu/Developments/repos/cvpods/cvpods/configs
|-- base_classification_config.py
|-- base_config.py
|-- base_detection_config.py
|-- config_helper.py
|-- dynamic_routing_config.py
|-- efficientdet_config.py
|-- fcos_config.py
|-- __init__.py
|-- keypoint_config.py
|-- panoptic_seg_config.py
|-- pointrend_config.py
|-- rcnn_config.py
|-- rcnn_fpn_config.py
|-- retinanet_config.py
|-- segm_config.py
|-- solo_config.py
|-- ssd_config.py
`-- yolo_config.py

0 directories, 18 files


The configuration system of cvpods is built on logical inheritance. The root of all configs is ```base_config.py```. BaseConfig holds task-independent attributes, such as the configurations of the trainer, input, seed, etc. To add support for detection, we add the components common to current detection models, such as backbones, FPN, NMS, etc. Both one-stage and two-stage detectors can inherit BaseDetectionConfig and add their own attributes. For example, RCNN models usually have RCNN heads, while RetinaNet does not require these attributes. Any new task can be added in this way. Configs for tasks beyond object detection contain their own unique attributes, such as `Keypoint`, `Mask`, etc.

The benefit of this paradigm is that the components for training a model are cleaner and logically clearer.
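
To make the inheritance idea concrete, here is a self-contained sketch. The class names mirror the ones mentioned above, but the `_defaults` attribute, the option names, and the merge logic are assumptions chosen for illustration; the real cvpods classes are organized differently.

```python
import copy


def deep_update(base, new):
    """Recursively merge `new` into `base`, keeping untouched sibling keys."""
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_update(base[key], value)
        else:
            base[key] = copy.deepcopy(value)
    return base


class BaseConfig(dict):
    """Task-independent options (hypothetical subset)."""
    _defaults = dict(SEED=-1, INPUT=dict(FORMAT="BGR"))

    def __init__(self):
        super().__init__()
        # Walk the hierarchy from root to leaf so children extend or override parents.
        for cls in reversed(type(self).__mro__):
            deep_update(self, vars(cls).get("_defaults", {}))


class BaseDetectionConfig(BaseConfig):
    """Options shared by all detectors (hypothetical values)."""
    _defaults = dict(MODEL=dict(FPN=dict(OUT_CHANNELS=256)))


class RCNNConfig(BaseDetectionConfig):
    _defaults = dict(MODEL=dict(ROI_HEADS=dict(NUM_CLASSES=80)))


class RetinaNetConfig(BaseDetectionConfig):
    _defaults = dict(MODEL=dict(RETINANET=dict(NUM_CLASSES=80)))


retinanet_cfg = RetinaNetConfig()
print(sorted(retinanet_cfg["MODEL"]))  # ['FPN', 'RETINANET']
```

The RetinaNet config inherits the shared FPN options but never sees `ROI_HEADS`, which is the point of keeping method-specific options out of the base file.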

In [7]:
from cvpods.configs.base_config import config

In [8]:
print(type(config))
print(isinstance(config, dict))
print(dir(config))

<class 'cvpods.configs.base_config.BaseConfig'>
True
['CUDNN_BENCHMARK', 'DATALOADER', 'DATASETS', 'GLOBAL', 'INPUT', 'MODE', 'MODEL', 'OUTPUT_DIR', 'SEED', 'SOLVER', 'TEST', 'TRAINER', 'VIS_PERIOD', '__class__', '__contains__', '__delattr__', '__delitem__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_register_configuration', 'clear', 'copy', 'diff', 'find', 'fromkeys', 'funcname_not_in_attr', 'get', 'items', 'keys', 'link_log', 'merge', 'merge_from_list', 'pop', 'popitem', 'setdefault', 'update', 'values']


As you can see, our `config` is a subclass of the Python `dict`, which means you can easily manipulate it using APIs you are already familiar with.

#### 4.1.1 Frequently used options of Config
* Input-related
  * `config.INPUT.FORMAT`: specifies "RGB" / "BGR" / another channel format for 2D images. By default, we use "RGB" for classification and "BGR" for other tasks for convenience.
  * `config.DATALOADER.NUM_WORKERS`: when you need to debug the data loading module, set it to 0.
* Trainer-related
  * `config.SOLVER.LR_SCHEDULER.MAX_ITER/MAX_EPOCH`: you can specify **one of the two** options to achieve iter-wise or epoch-wise training, and the training log will automatically follow the corresponding pattern. Once you choose ITER / EPOCH, other options like `WARMUP_ITERS / EVAL_PERIOD` should follow the same logic.
  * `config.SOLVER.OPTIMIZER.NAME`: By default, we use "D2SGD" for detection and "SGD" for other tasks.
  * `config.SOLVER.IMS_PER_DEVICE/IMS_PER_BATCH`: Since configs assume 8 GPUs by default, `IMS_PER_BATCH` should be `8x` of `IMS_PER_DEVICE`.
  * `config.SOLVER.BATCH_SUBDIVISIONS`: We support subdivision training. For example, if you set it to 16, you can use 16 iters to simulate a `16 x IMS_PER_BATCH` update. LR will be scaled automatically.
  * `config.TRAINER.FP16`: You can set `ENABLED=True` to run your experiments in FP16 mode.
  * `config.SOLVER.LR_SCHEDULER.EPOCH_WISE`: Defaults to `False`; the lr is updated once per epoch if set to `True`.

For more detailed options of different tasks, please refer to corresponding configs.
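
The arithmetic behind `IMS_PER_DEVICE`, `IMS_PER_BATCH` and `BATCH_SUBDIVISIONS` can be written down in a few lines. This is only a sketch: the helper names are hypothetical, and the linear lr rule in particular is an assumption, since this tutorial only states that the lr is scaled automatically without giving the formula.

```python
def batch_plan(ims_per_device, num_gpus, batch_subdivisions=1):
    """Return the per-iteration batch size and the images covered by one update."""
    ims_per_batch = ims_per_device * num_gpus
    return {
        "IMS_PER_BATCH": ims_per_batch,
        "images_per_update": ims_per_batch * batch_subdivisions,
    }


def linearly_scaled_lr(base_lr, num_gpus, reference_gpus=8):
    """ASSUMPTION: linear scaling relative to the 8-GPU default configs."""
    return base_lr * num_gpus / reference_gpus


print(batch_plan(ims_per_device=2, num_gpus=8))
print(batch_plan(ims_per_device=2, num_gpus=8, batch_subdivisions=16))
print(linearly_scaled_lr(0.01, num_gpus=4))
```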

#### 4.1.2 Write your own `config.py`

Your config should inherit from one of the configs in cvpods (all configs inherit from BaseConfig).
If an option already exists in the parent config, you can either overwrite it or keep the parent setting.
Besides existing options, you can add any new options in your own config file **without defining one option in many files.** Every cvpods setting only needs to be added in one place and can then be updated incrementally.


### 4.2 Add new models
We recommend reusing modules that already exist in cvpods. The model definition is nothing special in cvpods.
You can migrate your previous model directly into cvpods.

There are a few things to note:
* The `__init__` function of your class should take only the config as input. Put all your custom options in the config and parse them in the `__init__` function. The benefit of this design is that your model can learn the state of the training loop if necessary, even if there is no training code in your experiment directory.
* Set `self.device` at the top of `__init__` and call `self.to(self.device)` at the end of `__init__`.
* By default, the `collate` function of the data_loader only puts all items together, so you need to process them (e.g., concatenate, move to GPU) in the model's `forward`.

### 4.3 Add new datasets

#### 4.3.1 Add basic dataset
If you need to add a new data set, please add it as follows:

1. In the cvpods/data/datasets/paths_route.py file, add the path of the current data set, evaluator_type and other information. Here we take imagenet as an example:
  ```python
  _PREDEFINED_SPLITS_IMAGENET = {
      "dataset_type": "ImageNetDataset",
      "evaluator_type": {
          "imagenet": "classification"
      },
      "imagenet": {
          "imagenet_train": ("imagenet", "train"),
          "imagenet_val": ("imagenet", "val"),
      }
  }
  PATH_ROUTES.register(_PREDEFINED_SPLITS_IMAGENET, "IMAGENET")
  ```
  Here, `dataset_type` is the class name of the XxxDataset in cvpods/data/datasets/xxx.py that you want to create; `imagenet: {datasplit_name: (image_root, label_path)}` specifies each data split you need together with its image root and label path; for each newly added dataset, you also need to specify the evaluator to use in `evaluator_type`. All currently supported evaluators can be found in `cvpods/evaluation`. Note that the prefix of each datasplit_name must match the name used in `_PREDEFINED_SPLITS_<NAME>` (for example, `imagenet_train` and `_PREDEFINED_SPLITS_IMAGENET`) to be indexed correctly.

2. Inherit `BaseDataset` and add xxx.py under the cvpods/data/datasets directory (or your own experiment dir), where xxx is the name of the corresponding dataset, such as `coco.py`. Several methods need to be implemented: `__getitem__`, `__len__`, `_load_annotations`, and `_get_metadata`.
   * `_load_annotations` specifies how to read the data and returns a list. Each element in the list represents one sample, and its format is defined in `cvpods/data/base_dataset.py`.
   * `_get_metadata` specifies how to obtain the meta information of the current dataset (such as the category-id mapping, the color used to visualize each category, etc.), and returns a dict.
   * `__getitem__` performs three steps:
     * read image: read the data and handle read exceptions
     * apply transforms: augment the indexed sample in the order of `self.transforms`
     * formatter: unify the output format, such as converting everything to the Instance type
3. Add the Dataset to `datasets/__init__.py` in the same level of datasets folder.
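
The steps above can be sketched with a toy dataset. The `BaseDataset` below is a stand-in written for this example (the real class lives in `cvpods/data/base_dataset.py` and has more responsibilities), and the hook names `_load_annotations`, `_get_metadata` and `_apply_transforms` as well as the sample format are assumptions.

```python
class BaseDataset:
    """Stand-in base class: loads annotations and metadata once, applies transforms."""

    def __init__(self, transforms=()):
        self.transforms = list(transforms)
        self.dataset_dicts = self._load_annotations()
        self.meta = self._get_metadata()

    def __len__(self):
        return len(self.dataset_dicts)

    def _apply_transforms(self, image, annotations):
        # Apply each transform in the order given by self.transforms.
        for transform in self.transforms:
            image, annotations = transform(image, annotations)
        return image, annotations


class ToyDataset(BaseDataset):
    def _load_annotations(self):
        # One dict per sample; a real dataset would parse a label file here.
        return [
            {"image": [1, 2, 3], "annotations": {"label": 0}},
            {"image": [4, 5, 6], "annotations": {"label": 1}},
        ]

    def _get_metadata(self):
        return {"classes": ["cat", "dog"]}

    def __getitem__(self, index):
        record = self.dataset_dicts[index]
        # 1) read the data (a real dataset would load the image and handle read errors)
        image, annotations = list(record["image"]), dict(record["annotations"])
        # 2) apply the transforms in order
        image, annotations = self._apply_transforms(image, annotations)
        # 3) unify the output format
        return {"image": image, "annotations": annotations}


def flip(image, annotations):
    """Toy transform: reverse the 1-D 'image'."""
    return image[::-1], annotations


dataset = ToyDataset(transforms=[flip])
print(len(dataset), dataset[0]["image"])  # 2 [3, 2, 1]
```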

#### 4.3.2 Add composed dataset
The above shows how to add a new basic dataset. This section shows how to combine existing datasets to form a more complex/variable dataset. The currently supported wrapped datasets can be found in `cvpods/data/wrapped_dataset.py`.

Here we use `ConcatDataset` as an example; it concatenates multiple dataset pipelines to compose the final dataset.

```python
from types import SimpleNamespace

import numpy as np
from torch.utils.data.dataset import ConcatDataset as _ConcatDataset


class ConcatDataset(_ConcatDataset):
    """A wrapper of concatenated dataset.
    Same as :obj:`torch.utils.data.dataset.ConcatDataset`, but
    concat the group flag for image aspect ratio.
    Args:
        datasets (list[:obj:`Dataset`]): A list of datasets.
    """

    def __init__(self, datasets):
        super(ConcatDataset, self).__init__(datasets)
        # Concatenate per-sample aspect ratios so grouped sampling still works.
        if hasattr(self.datasets[0], 'aspect_ratios'):
            aspect_ratios = [d.aspect_ratios for d in self.datasets]
            self.aspect_ratios = np.concatenate(aspect_ratios)
        # Merge the metadata of all datasets; later datasets override earlier keys.
        if hasattr(self.datasets[0], 'meta'):
            self.meta = {}
            for d in self.datasets:
                self.meta.update(d.meta)
            self.meta = SimpleNamespace(**self.meta)
```

Mixup can also be added in this way.

#### 4.3.3 Use the existing Dataset
How to use the supported data set for training? 
Taking ImageNet classification as an example, you only need to add the corresponding datasplit_name in the config file. The code is here:
```python
    DATASETS=dict(
        TRAIN=("imagenet_train", ),
        TEST=("imagenet_val", ),
    ),
```

### 4.4 Add new augmentations
We implement many augmentations used in various tasks. Besides, **we support all torchvision augmentations** as well.

#### 4.4.1 Use existing augmentations
Here we use a complex augmentation pipeline to show the power of cvpods:
```python
INPUT=dict(
        AUG=dict(
            TRAIN_PIPELINES=dict(
                input_stream1=[("RepeatList", dict(transforms=[
                    ("Torch_Compose", transforms.Compose([
                        transforms.RandomResizedCrop(64, scale=(0.2, 1.)),
                        transforms.RandomApply([
                                transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
                    ])),
                    ("GaussianBlur", dict(sigma=[.1, 2.], p=0.5)),
                    ("Torch_Compose", transforms.Compose([
                        transforms.RandomGrayscale(p=0.2),
                        transforms.RandomHorizontalFlip(),
                    ])),
                ], repeat_times=3)),],
                input_stream2=[("RepeatList", dict(transforms=[
                    ("Torch_Compose", transforms.Compose([
                        transforms.RandomResizedCrop(64, scale=(0.2, 1.)),
                        transforms.RandomHorizontalFlip(),
                    ])),
                    ("RandomList", dict(
                        transforms=[
                            ("AutoAugment", dict(
                                name=name,
                                prob=0.5,
                                magnitude=10,
                                hparams=dict(
                                    translate_const=int(224 * 0.45),
                                    img_mean=tuple([min(255, round(255 * x)) for x in [0.406, 0.456, 0.485]]),
                                    magnitude_std=0.5))
                            ) for name in _RAND_TRANSFORMS],
                        num_layers=2,
                        choice_weights=[_RAND_CHOICE_WEIGHTS_0[k] / sum([_RAND_CHOICE_WEIGHTS_0[k] for k in _RAND_TRANSFORMS]) for k in _RAND_TRANSFORMS],
                    )),
                ], repeat_times=3)),],
            )
        )
    ),
```
Don't be afraid: here is a simpler pipeline from the classification task in cvpods:
```python
    INPUT=dict(
        AUG=dict(
            TRAIN_PIPELINES=[
                ("Torch_Compose", transforms.Compose([
                    transforms.RandomResizedCrop(224),
                    transforms.RandomHorizontalFlip(),
                    transforms.ToTensor(),
                    transforms.Normalize(
                        mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])
                    ])
                ),
            ],
            TEST_PIPELINES=[
                ("Torch_Compose", transforms.Compose([
                    transforms.Resize(256),
                    transforms.CenterCrop(224),
                    transforms.ToTensor(),
                    transforms.Normalize(
                        mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])
                    ])
                ),
            ],
        )
    ),
```

As you can see, `config.INPUT.AUG.TRAIN/TEST_PIPELINES` can be a list or a dict. If you need multiple input streams (which is very common in self-supervised learning tasks), you can follow the first example above to specify multiple inputs. Most tasks do not require multiple pipelines, so you can follow the second example above to specify your augmentations.

All currently supported augmentations can be found in `cvpods/data/transforms/transform_gen.py`.
Each augmentation in the pipeline follows the format
```python
(name_str, args_dict)
```
For example, `GaussianBlur` is specified as:
```python
("GaussianBlur", dict(sigma=[.1, 2.], p=0.5))
```

If you need to use torchvision augmentations, make sure `name_str` starts with `Torch_`, and just put the augmentation in `args_dict` like above.
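
As a rough picture of how `(name_str, args_dict)` entries could be turned into callables, including the `Torch_` prefix rule, consider the sketch below. The registry, the toy transforms `Scale` and `AddOffset`, and `build_pipeline` are all hypothetical and operate on plain lists so that the example runs without cvpods or torchvision.

```python
TRANSFORMS = {}  # hypothetical name -> class registry


def register(cls):
    """Class decorator that records a transform under its class name."""
    TRANSFORMS[cls.__name__] = cls
    return cls


@register
class Scale:
    def __init__(self, factor=1):
        self.factor = factor

    def __call__(self, values):
        return [v * self.factor for v in values]


@register
class AddOffset:
    def __init__(self, offset=0):
        self.offset = offset

    def __call__(self, values):
        return [v + self.offset for v in values]


def build_pipeline(specs):
    """Turn [(name_str, args), ...] into a list of callables."""
    transforms = []
    for name, args in specs:
        if name.startswith("Torch_"):
            transforms.append(args)  # already a ready-made callable
        else:
            transforms.append(TRANSFORMS[name](**args))  # build from keyword args
    return transforms


def run_pipeline(transforms, data):
    for transform in transforms:
        data = transform(data)
    return data


pipeline = build_pipeline([
    ("Scale", dict(factor=2)),
    ("Torch_Compose", lambda values: values[::-1]),  # stands in for transforms.Compose([...])
    ("AddOffset", dict(offset=1)),
])
print(run_pipeline(pipeline, [1, 2, 3]))  # [7, 5, 3]
```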

#### 4.4.2 Add new augmentations

It's very easy to add your own augmentations.
Here we take a filter for 3D point clouds as an example:
```python
@TRANSFORMS.register()
class FilterNumPoints(object):
    def __init__(self, num_points=-1):
        self.num_points = num_points

    def __call__(self, points, info):

        point_counts = box_ops.points_count_rbbox(
            points, info["annotations"]["point_boxes"])
        # TODO: check why point_counts does not equal to info.annotations.num_points_in_gt
        mask = point_counts >= self.num_points
        _dict_select(info["annotations"], mask)

        return points, info
```
You just need to make sure your augmentation is callable, and handle your data and annotations as you wish.
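
The `_dict_select` helper used above is not shown in this tutorial. A minimal sketch of what such a helper might do, assuming it filters every per-object field of the annotations in place with a boolean mask, is given below using plain lists; the name `dict_select` and the field names are hypothetical.

```python
def dict_select(annotations, mask):
    """Keep only the entries where mask is True, for every field, in place."""
    for key, value in annotations.items():
        if isinstance(value, dict):
            dict_select(value, mask)  # recurse into nested annotation groups
        else:
            annotations[key] = [v for v, keep in zip(value, mask) if keep]


annotations = {
    "names": ["car", "pedestrian", "cyclist"],
    "point_counts": [12, 3, 7],
}
min_points = 5
mask = [count >= min_points for count in annotations["point_counts"]]
dict_select(annotations, mask)
print(annotations["names"])  # ['car', 'cyclist']
```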

### 4.5 Add new evaluation metrics
The steps for adding a new Evaluator are as follows:

* Inherit DatasetEvaluator and implement `process` and `evaluate`.
  * `process` handles the predictions of each batch to obtain the intermediate results used to calculate the metrics, and stores them in self.predictions
  * `evaluate` takes the results stored in self.predictions, calculates the metrics, and stores them in self.results
* In tools/train(test)_net.py, specify the evaluator to use according to the corresponding evaluator_type added in paths_route. A dataset can use multiple evaluators.
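
The `process` / `evaluate` split can be sketched with a toy accuracy evaluator. The `DatasetEvaluator` below is a stand-in defined for this example, and the input/output keys (`label`, `pred_class`) are assumptions; the real base class in `cvpods/evaluation` may expect different formats.

```python
class DatasetEvaluator:
    """Stand-in interface: accumulate per-batch results, then compute metrics."""

    def reset(self):
        pass

    def process(self, inputs, outputs):
        raise NotImplementedError

    def evaluate(self):
        raise NotImplementedError


class AccuracyEvaluator(DatasetEvaluator):
    def __init__(self):
        self.reset()

    def reset(self):
        self.predictions = []
        self.results = {}

    def process(self, inputs, outputs):
        # Store only what the metric needs: (predicted class, ground-truth label).
        for sample, output in zip(inputs, outputs):
            self.predictions.append((output["pred_class"], sample["label"]))

    def evaluate(self):
        correct = sum(pred == label for pred, label in self.predictions)
        self.results = {"accuracy": correct / max(len(self.predictions), 1)}
        return self.results


evaluator = AccuracyEvaluator()
evaluator.process([{"label": 0}, {"label": 1}], [{"pred_class": 0}, {"pred_class": 1}])
evaluator.process([{"label": 1}, {"label": 0}], [{"pred_class": 1}, {"pred_class": 1}])
print(evaluator.evaluate())  # {'accuracy': 0.75}
```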

## 5. Conclusion

Congratulations, now you should be very comfortable with how to use cvpods. If you encounter any other problems in use, or want to contribute code to cvpods, please refer to our README.