Preparation for CLOCs #2
Hello, for SECOND-V1.5 (for newer versions of SECOND it should be very similar), check the file `voxelnet.py` (https://github.com/traveller59/second.pytorch/blob/v1.5/second/pytorch/models/voxelnet.py). At line 377, `batch_box_preds = preds_dict["box_preds"]` gives the raw output (encodings of bounding boxes before NMS) from the SECOND network. First you need to decode them (line 387); the decoded boxes are `[x, y, z, w, l, h, r]` in lidar coordinates. If you need them in camera coordinates, use the functions in https://github.com/traveller59/second.pytorch/blob/v1.5/second/pytorch/core/box_torch_ops.py to transform them. `box_torch_ops.py` also provides many other useful 2D/3D bounding box and coordinate-transformation functions.
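The decoding step mentioned above can be sketched as follows. This is a minimal NumPy version of the standard SECOND residual decoding (the repo itself does the equivalent step in torch via `box_torch_ops`); the function name here and the omission of the bottom-center z adjustment that the repo applies are simplifications for illustration.

```python
import numpy as np

def second_box_decode(box_encodings, anchors):
    """Decode SECOND-style box residuals against anchors.

    Both arrays have shape (N, 7): [x, y, z, w, l, h, r] in lidar
    coordinates. Illustrative sketch of the standard SECOND/VoxelNet
    residual encoding; the repo's torch version also shifts z between
    bottom-center and geometric-center, which is omitted here.
    """
    xa, ya, za, wa, la, ha, ra = np.split(anchors, 7, axis=-1)
    xt, yt, zt, wt, lt, ht, rt = np.split(box_encodings, 7, axis=-1)
    diag = np.sqrt(la**2 + wa**2)   # anchor ground-plane diagonal
    x = xt * diag + xa              # centers: offsets scaled by anchor size
    y = yt * diag + ya
    z = zt * ha + za
    w = np.exp(wt) * wa             # sizes: log-space residuals
    l = np.exp(lt) * la
    h = np.exp(ht) * ha
    r = rt + ra                     # yaw: direct offset
    return np.concatenate([x, y, z, w, l, h, r], axis=-1)
```

With zero encodings the decoded boxes equal the anchors, which is a quick sanity check that the residual formulation is wired up correctly.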
Hi,
Currently I have a trained YOLOv4 model on the BDD dataset and a trained SECOND v1.6 on the KITTI dataset. My questions are:
@CodeDragon18
Alright, I'll work on it in the meantime! Thanks!
Hi, just curious and confused about the training. I used 90% for training and 10% for validation on the 7480-frame KITTI training dataset. If I were to run inference without NMS, would I have to reuse the training dataset as the inference set? Wouldn't it be contradictory to use the same dataset for both training and inference?
Yes, you are right. Ideally, one would divide the dataset into 3 parts: part 1 for training the 3D and 2D detectors, part 2 for training CLOCs, and part 3 for validation only. But for KITTI there are two caveats. First, the 3712-frame mini-training and 3769-frame validation split is so popular that many researchers use it for their experiments, so it is good to show results on the 3769-frame validation set for comparison. Second, KITTI is a relatively small dataset; I think it is too small to divide into 3 parts. So I just use the popular 3712-frame mini-training set to train the 3D/2D detectors and CLOCs, and validate on the 3769-frame validation set. This is NOT the ideal way to train, but even so I still get some improvements. For other, larger datasets (such as nuScenes, Waymo and Argoverse), dividing into 3 parts would be the better choice.
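For a larger dataset, the three-way partition described above can be sketched like this. The function name and split fractions are illustrative only, not something from the CLOCs codebase; the thread instead reuses KITTI's mini-train/val split.

```python
import random

def three_way_split(num_frames, seed=0, frac=(0.5, 0.3)):
    """Partition frame indices into (detector-train, CLOCs-train, val).

    Illustrative sketch: part 1 trains the 3D/2D detectors, part 2
    trains CLOCs, part 3 is held out for validation. The remainder
    after the first two fractions becomes the validation set.
    """
    idx = list(range(num_frames))
    random.Random(seed).shuffle(idx)  # deterministic shuffle for reproducibility
    n1 = int(frac[0] * num_frames)
    n2 = n1 + int(frac[1] * num_frames)
    return sorted(idx[:n1]), sorted(idx[n1:n2]), sorted(idx[n2:])
```

The three parts are disjoint, so detections fed to CLOCs during its training never come from frames the base detectors were fit on.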
So for now, should I retrain YOLOv4 with any random 3712 frames, and run inference again on all 7480 frames for the CLOCs input? What was SECOND's training setup like? I believe it would be more ideal to train YOLOv4 in the same way as the SECOND model you used. I might try to train on nuScenes if I am able to successfully train CLOCs on KITTI.
Yes, it would be better to train YOLOv4 with the 3712 frames, provided 3712 frames are enough to train YOLOv4.
Alright. I'll attempt to train YOLOv4 with the 3712-frame mini-training set first, and will get back to CLOCs soon!
Did you succeed? I also want to do this...
Hi, I am planning to do a fusion with YOLOv4 and SECOND/PointPillars. Would you be providing a tutorial/guide for extracting the bounding boxes before NMS?