Training setups (tested with different GPUs) #47

emilyemliyM · 2022-03-18T02:46:41Z

Dear author,

Thanks for sharing the code.

I'm trying to reproduce the metrics from the paper, but haven't been successful yet.
Could you share the training parameters and hardware used for the experiments?
Also, for metrics such as IoU in the paper, do you mean mIoU or just the IoU of the moving class?

Thanks!

Chen-Xieyuanli · 2022-03-18T08:33:56Z

Hey @mengshiyu0109, the training parameters used for the paper are the defaults. We tested on Quadro 4000, 5000, 6000, RTX 2080 Ti, and TITAN GPUs and got similar results.

IoU reported in our paper is the one for moving objects only.

Note that the 62 IoU result was obtained by adding KNN and semantics. Without semantics, the performance is around 58 IoU on the test set. You may first check whether KNN is enabled in the config file.

@MaxChanger could you please also share your setup for training LMNet here?

MaxChanger · 2022-03-18T12:53:32Z

Yeah,
Hi @mengshiyu0109, I have trained and tested LMNet on 3×2080Ti and on a 3090, and can generally achieve accuracy similar to that reported in the paper.
You could try setting batch_size in salsanext_mos.yml to 24, and then use 3×2080Ti or more GPU cards with slightly smaller memory (as long as bs=24 is guaranteed).

In addition, the IoU in the paper refers specifically to MovingIoU, but checkpoints saved during training are selected based on mean_IoU (the average over static and moving).
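To illustrate the difference between the two metrics, here is a minimal sketch in plain Python. It is not the repository's evaluation code; the function names (`class_iou`, `iou_from_labels`) and the toy labels are made up for illustration. It computes the per-class IoU as TP / (TP + FP + FN) and then averages the static and moving values.

```python
def class_iou(tp, fp, fn):
    """IoU for a single class: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def iou_from_labels(pred, gt, classes=("static", "moving")):
    """Return per-class IoU plus the mean over the given classes."""
    ious = {}
    for c in classes:
        pairs = list(zip(pred, gt))
        tp = sum(p == c and g == c for p, g in pairs)
        fp = sum(p == c and g != c for p, g in pairs)
        fn = sum(p != c and g == c for p, g in pairs)
        ious[c] = class_iou(tp, fp, fn)
    ious["mean"] = sum(ious[c] for c in classes) / len(classes)
    return ious


# Toy example (hypothetical data): 10 points, 2 of them moving.
gt = ["static"] * 8 + ["moving"] * 2
pred = ["static"] * 7 + ["moving"] * 2 + ["static"]

result = iou_from_labels(pred, gt)
print(result)
```

In this toy case the static class dominates, so the mean is noticeably higher than the moving IoU. A checkpoint chosen by the mean can therefore look good even when the moving-class number is low.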

By the way, this code may be non-deterministic; you can set the following flags:

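import random

import numpy as np
import torch

def set_seed(seed=1024):
    # Seed the Python, NumPy, and PyTorch (CPU and CUDA) random number generators.
    random.seed(seed)
    # os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.cuda.manual_seed_all(seed) # if you are using multi-GPU.
    # Disable cuDNN autotuning and request deterministic cuDNN kernels.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True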

emilyemliyM · 2022-03-18T14:39:21Z

Thank you very much.
I am still in the training stage. I am only looking at the moving-class IoU, but I get just about 20%, so I haven't tried the test part yet.

Following your replies, I will try again now.
I have more confidence in this topic now. I had tried several methods but could not get good metrics for the moving class.

Thanks.

Chen-Xieyuanli · 2022-03-18T14:50:51Z

@MaxChanger Thanks for the report!

@mengshiyu0109 you may first check whether you can generate similar results with our pre-trained model to see whether the setup is correct or not.

emilyemliyM · 2022-03-19T14:50:44Z

Thanks a lot for your reply!
I would like to ask: what mIoU value did you obtain during training, before moving on to the test step?

MaxChanger · 2022-03-19T15:15:21Z

Hi, @mengshiyu0109. During my training, best_val_iou in TensorBoard is around 0.84 at epoch ~120 (I guess anything greater than 0.82 should be fine). The non-determinism may also cause some fluctuations.
After this, you can use python infer.py xxxx to generate predicted labels and python utils/evaluate_mos.py xxx to evaluate. The moving IoU on the validation set should be around 0.60 (0.59~0.618).

Chen-Xieyuanli added the good first issue Good for newcomers label Mar 18, 2022

Chen-Xieyuanli changed the title ~~traininr parameters,thanks~~ Training setups (tested with different GPUs) Mar 18, 2022
