Home

Possible Questions for Developers

Pipelines are functions that either run evaluation or complete one step of training.

Trainers: a trainer can accept different types of data (passed directly from the dataset's collate_fn), depending on its implementation. The other inputs are essentially fixed.

Please refer to the function documentation of compound_annotation in visualDet3D/utils/utils.py. This is a legacy function from RetinaNet.

Link for the function inside the current master version

You could first have a look at the heads inside this repo.

Suggestions:

Write a new subclass and modify the init_layers method to build the new structure.
Write a completely new Head class, keeping the forward, get_bboxes and loss methods as the main interfaces and changing everything else inside. (If you need to remove anchors, you may need to write a new Detector class, as in mmdetection.)
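
The first suggestion might look roughly like the sketch below. BaseHead, DeeperHead and the tuple-based layer specs are hypothetical stand-ins used only to show the pattern; they are not the repo's classes, and a real head would build torch.nn layers inside init_layers instead.

```python
class BaseHead:
    """Hypothetical stand-in for an existing head (not the repo's class)."""

    def __init__(self, num_features_in, num_classes):
        self.num_features_in = num_features_in
        self.num_classes = num_classes
        self.layers = {}
        # The constructor delegates structure building to init_layers,
        # so a subclass only has to override that one method.
        self.init_layers()

    def init_layers(self):
        # Layer specs as (kind, in_channels, out_channels); placeholders
        # for real network layers.
        self.layers["cls"] = [("conv3x3", self.num_features_in, self.num_classes)]
        self.layers["reg"] = [("conv3x3", self.num_features_in, 4)]

    # Main interfaces that the rest of the code is assumed to call.
    def forward(self, features):
        raise NotImplementedError

    def get_bboxes(self, cls_scores, reg_preds):
        raise NotImplementedError

    def loss(self, cls_scores, reg_preds, annotations):
        raise NotImplementedError


class DeeperHead(BaseHead):
    """New structure: one extra hidden conv per branch, same interfaces."""

    def init_layers(self):
        hidden = 256  # assumed hidden width, chosen for illustration
        self.layers["cls"] = [
            ("conv3x3", self.num_features_in, hidden),
            ("conv3x3", hidden, self.num_classes),
        ]
        self.layers["reg"] = [
            ("conv3x3", self.num_features_in, hidden),
            ("conv3x3", hidden, 4),
        ]
```

Because only init_layers changes, any code that calls forward, get_bboxes and loss keeps working with the new subclass.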

Main design ideas for data augmentations in this repo:

For training 3D detection, we should also preprocess the calibration matrix $P$ whenever we preprocess the image/labels. This is all done within the augmentation pipeline.
Models are trained and run inference with the augmented images and calibration matrices.
2D-related prediction results are corrected by aligning the augmented $P$ to the original $P$ in the evaluator pipelines.

There could be better ways to deal with this problem, but the above describes the current repo.
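
As an illustration of these design ideas, here is a minimal pure-Python sketch. It assumes $P$ is a 3x4 projection matrix and that the augmentation consists only of a resize followed by a crop. The helper names (resize_P, crop_P, box_to_original) are hypothetical and are not the repo's functions.

```python
def matmul(A, B):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def resize_P(P, sx, sy):
    """Resizing the image by (sx, sy) scales the first two rows of P."""
    S = [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]
    return matmul(S, P)


def crop_P(P, left, top):
    """Cropping `left`/`top` pixels shifts pixel coordinates by (-left, -top)."""
    T = [[1.0, 0.0, -float(left)], [0.0, 1.0, -float(top)], [0.0, 0.0, 1.0]]
    return matmul(T, P)


def box_to_original(box, P_aug, P_orig):
    """Map a 2D box (x1, y1, x2, y2) from augmented to original pixels.

    Assumes the augmentation only resized and cropped the image, so that
    u_aug = sx * u_orig + dx and v_aug = sy * v_orig + dy.
    """
    sx = P_aug[0][0] / P_orig[0][0]
    sy = P_aug[1][1] / P_orig[1][1]
    dx = P_aug[0][2] - sx * P_orig[0][2]
    dy = P_aug[1][2] - sy * P_orig[1][2]
    x1, y1, x2, y2 = box
    return ((x1 - dx) / sx, (y1 - dy) / sy, (x2 - dx) / sx, (y2 - dy) / sy)


# Example values are made up for illustration.
P_orig = [[700.0, 0.0, 600.0, 40.0],
          [0.0, 700.0, 180.0, 0.0],
          [0.0, 0.0, 1.0, 0.0]]
P_aug = crop_P(resize_P(P_orig, 0.5, 0.5), left=10, top=20)
box_orig = box_to_original((40.0, 30.0, 90.0, 80.0), P_aug, P_orig)
```

The model only ever sees the augmented image together with P_aug, so its 3D predictions stay geometrically consistent; only the 2D boxes need the final mapping back to original pixel coordinates.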