Finetune the ConvNeXt-L on KITTI #26

FangjunWang · 2023-12-05T02:12:55Z

Hello and nice work! My question is how to fine-tune the model on KITTI.
I tried the script ./finetune/train_ft_SQLdepth.py but cannot get good enough results: only abs_rel 0.0494 and rmse 2.182.

hisfog · 2023-12-05T02:42:46Z

Did you load a pre-trained model (self-supervised pre-trained), and what are your SSL scores?

FangjunWang · 2023-12-05T02:58:20Z

Thank you for your reply! Yes, I loaded a pre-trained model that was trained with the script train.py for 20 epochs. What is the SSL score? The training loss is around 0.3~0.4, and the validation silog is 7.467.

hisfog · 2023-12-05T03:02:35Z

I mean, what are your SSL model's metrics: AbsRel, RMSE, etc.?

FangjunWang · 2023-12-05T03:03:52Z

AbsRel

hisfog · 2023-12-05T03:09:13Z

Em, AbsRel = ? I mean your SSL model's evaluation results on KITTI, not the SiLog loss.
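To make the distinction concrete, here is a minimal sketch of how these quantities are commonly computed in depth-estimation codebases. It is an assumption that the scripts in this repo use the same definitions; the function name `depth_metrics` is made up for illustration, and it works on plain lists of positive depths rather than images.

```python
import math


def depth_metrics(gt, pred):
    """Common depth metrics over paired lists of positive depths (metres).

    Assumed definitions (not taken from this repo's code):
      abs_rel = mean(|gt - pred| / gt)
      rmse    = sqrt(mean((gt - pred)^2))
      a1      = fraction of points with max(gt/pred, pred/gt) < 1.25
      silog   = 100 * sqrt(var(log(pred) - log(gt)))
    """
    n = len(gt)
    pairs = list(zip(gt, pred))
    abs_rel = sum(abs(g - p) / g for g, p in pairs) / n
    rmse = math.sqrt(sum((g - p) ** 2 for g, p in pairs) / n)
    a1 = sum(max(g / p, p / g) < 1.25 for g, p in pairs) / n
    log_diff = [math.log(p) - math.log(g) for g, p in pairs]
    mean_sq = sum(d * d for d in log_diff) / n
    mean = sum(log_diff) / n
    # Clamp at zero to guard against tiny negative values from rounding.
    silog = 100 * math.sqrt(max(mean_sq - mean ** 2, 0.0))
    return {"abs_rel": abs_rel, "rmse": rmse, "a1": a1, "silog": silog}


# A prediction that is exactly twice the ground truth everywhere:
# silog is (numerically) zero, while abs_rel is 1.0.
print(depth_metrics([2.0, 4.0], [4.0, 8.0]))
```

Under these definitions silog ignores a global scale error, which is why a silog value cannot stand in for AbsRel or RMSE.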

FangjunWang · 2023-12-05T04:10:22Z

The SSL model's evaluation results on KITTI are: abs_rel: 0.060, rmse: 2.642.

hisfog · 2023-12-05T04:48:17Z

That's interesting, you got better SSL scores but worse SSL+Sup scores.

hisfog · 2023-12-05T04:51:53Z

Can you provide your fine-tuning args? I think you should choose a much smaller learning_rate.

FangjunWang · 2023-12-05T04:57:51Z

--name cvnXt_075_1130
--root weights/inc_kitti_exps
--load_weights_folder weights/convnext_large/cvnXt_075/models/weights_15
--epochs 5
--bs 8
--lr 1e-5
--wd 0.01
--div_factor 10
--final_div_factor 100
--validate_every 250
--dataset kitti
--workers 8
--w_chamfer 0
--data_path datasets/KITTI/raw
--gt_path datasets/KITTI/gts/train
--filenames_file ./finetune/train_test_inputs/kitti_eigen_train_files_with_gt.txt
--input_height 320
--input_width 1024
--min_depth 0.001
--max_depth 80
--do_random_rotate
--degree 1.0
--data_path_eval datasets/KITTI/raw
--gt_path_eval datasets/KITTI/gts/val
--filenames_file_eval ./finetune/train_test_inputs/kitti_eigen_test_files_with_gt.txt
--min_depth_eval 1e-3
--max_depth_eval 80
--do_kb_crop
--garg_crop
--same_lr

hisfog · 2023-12-05T05:01:17Z

--epochs 5 --bs 8 --lr 1e-5

I recommend --bs 16 and I think lr should be smaller, 1e-6, 5e-6, etc.
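For readers weighing these values: the flags --lr, --div_factor and --final_div_factor in the args above look like the parameters of a one-cycle learning-rate schedule. That is an assumption, since the scheduler code is not shown in this thread. Under the conventions of PyTorch's OneCycleLR, the schedule starts at max_lr / div_factor and ends at that starting value divided by final_div_factor. The helper name `one_cycle_bounds` is made up for this sketch.

```python
def one_cycle_bounds(max_lr, div_factor, final_div_factor):
    """Return (initial_lr, max_lr, min_lr) following OneCycleLR's conventions.

    initial_lr = max_lr / div_factor
    min_lr     = initial_lr / final_div_factor
    """
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    return initial_lr, max_lr, min_lr


# The learning rate from the posted args, plus the two smaller suggestions.
for lr in (1e-5, 5e-6, 1e-6):
    start, peak, end = one_cycle_bounds(lr, div_factor=10, final_div_factor=100)
    print(f"--lr {lr:g}: start={start:g}, peak={peak:g}, end={end:g}")
```

If that assumption holds, --lr is the peak of the schedule, so lowering it scales the whole curve down rather than only the starting point.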

FangjunWang · 2023-12-05T05:04:25Z

Thanks! I will try.

Lavreniuk · 2024-01-24T13:50:58Z

Hi @FangjunWang, I am very excited to reproduce the ConvNeXt results as well. However, I am currently stuck on the first stage (SSL training). I ran the training using the following command:
python train.py ./args_files/hisfog/kitti/cvnXt_L_320x1024.txt

In cvnXt_L_320x1024.txt I changed only data_path, log_dir and batch_size=8 (instead of the original 16; as I understand it, you made the same change).
In other experiments I also tried a lower lr and removing the diff_lr argument, but saw no improvement.
After that I calculated the scores using the command:
evaluate_depth_config.py args_files/hisfog/kitti/cvnXt_L_320x1024.txt
where I changed load_weights_folder to my weights path.
I tried the weights from all epochs, and the best result is:
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3
0.096 | 0.765 | 4.455 | 0.176 | 0.908 | 0.966 | 0.983
which is much worse than the original and yours. Could you please help me understand what I did wrong in the first stage, so I can fix it and then move on to stage 2 (fine-tuning)?

Should I download a pretrained PoseNet or other weights? Or maybe I calculate the metrics the wrong way (but I checked on the downloaded ResNet model, and it reproduces the same score that @hisfog reports in the repo).
Could you please share your parameters, as well as brief instructions on what to do to reproduce the score?
I would be very thankful for any help.

FangjunWang · 2024-01-25T07:50:21Z

Hello, my parameters are:
--data_path datasets/KITTI/raw/
--log_dir weights/convnext_large
--model_name cvnXt_075
--dataset kitti
--eval_split eigen
--backbone convnext_large
--height 320
--width 1024
--batch_size 8
--num_epochs 20
--scheduler_step_size 10
--model_dim 32
--patch_size 32
--dim_out 64
--query_nums 64
--dec_channels 1024 512 256 128
--min_depth 0.001
--max_depth 80.0
--diff_lr
--use_stereo
--load_weights_folder weights/ConvNeXt_Large_SQLdepth
--eval_mono
--post_process
--save_pred_disps

I did not change anything else besides the parameters above. Hope this helps!

Lavreniuk · 2024-01-25T10:12:02Z

@FangjunWang, thank you for the quick response. I have the same parameters.
Do you train using only this command:
python train.py ./args_files/hisfog/kitti/cvnXt_L_320x1024.txt
and test using this one:
evaluate_depth_config.py args_files/hisfog/kitti/cvnXt_L_320x1024.txt

Also, did you do anything else, such as downloading pretrained weights or a pretrained PoseNet before training?

FangjunWang · 2024-01-26T06:20:56Z

Yes, I trained and evaluated the model using the same commands.
The only pretrained weights I load are convnext_large_22k_1k_224.pth.

Lavreniuk · 2024-01-29T09:27:33Z

@FangjunWang, for this you would need to change the params to:
--backbone convnext_large_in22ft1k
Did you do that, or did you manually point convnext_large to convnext_large_22k_1k_224.pth?

FangjunWang · 2024-01-29T09:36:12Z

I changed networks/Unet.py like this:
if backbone == "convnext_large":
    pretrained = False
    backbone_kwargs = {"checkpoint_path": "weights/convnext_large_22k_1k_224_filtered.pth"}
encoder = create_model(
    backbone, features_only=True, out_indices=backbone_indices,
    in_chans=in_channels, pretrained=pretrained, **backbone_kwargs
)

Lavreniuk · 2024-01-29T10:05:42Z

@FangjunWang, thanks.
Could you please email me at nick_93@ukr.net, so I can contact you directly with other questions?

Lavreniuk · 2024-01-31T11:05:28Z

@FangjunWang, I have tried convnext_large_22k_1k_224 as you suggested. It gives slightly better results, but the situation is similar.
For resnet50 I was able to mostly reproduce the original score:
abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3
0.084 | 0.646 | 3.972 | 0.163 | 0.923 | 0.969 | 0.983

But for ConvNeXt I found the following: it improves for the first 6-9 epochs, and after that it stops improving and gets worse and worse.
Did you get similar results, or did you see improvements at more or less every epoch?
epoch | abs_rel | sq_rel | rmse | rmse_log | a1 | a2 | a3
1 | 0.091 | 0.704 | 4.197 | 0.173 | 0.916 | 0.966 | 0.982
2 | 0.091 | 0.675 | 4.182 | 0.168 | 0.918 | 0.968 | 0.983
3 | 0.088 | 0.701 | 4.279 | 0.167 | 0.923 | 0.969 | 0.983
4 | 0.085 | 0.625 | 4.017 | 0.166 | 0.926 | 0.968 | 0.983
5 | 0.084 | 0.664 | 4.079 | 0.165 | 0.928 | 0.969 | 0.983
6 | 0.082 | 0.647 | 4.119 | 0.167 | 0.926 | 0.967 | 0.982
7 | 0.087 | 0.745 | 4.389 | 0.170 | 0.921 | 0.967 | 0.982
8 | 0.086 | 0.707 | 4.256 | 0.169 | 0.923 | 0.967 | 0.982
9 | 0.088 | 0.748 | 4.397 | 0.173 | 0.920 | 0.966 | 0.981
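
When checkpoints from every epoch are evaluated like this, picking the one to report can be done mechanically. A small sketch using the nine rows posted above; the names `RESULTS` and `best_epoch` are made up for illustration.

```python
# Per-epoch ConvNeXt results copied from the post above, in the order
# (abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3).
RESULTS = {
    1: (0.091, 0.704, 4.197, 0.173, 0.916, 0.966, 0.982),
    2: (0.091, 0.675, 4.182, 0.168, 0.918, 0.968, 0.983),
    3: (0.088, 0.701, 4.279, 0.167, 0.923, 0.969, 0.983),
    4: (0.085, 0.625, 4.017, 0.166, 0.926, 0.968, 0.983),
    5: (0.084, 0.664, 4.079, 0.165, 0.928, 0.969, 0.983),
    6: (0.082, 0.647, 4.119, 0.167, 0.926, 0.967, 0.982),
    7: (0.087, 0.745, 4.389, 0.170, 0.921, 0.967, 0.982),
    8: (0.086, 0.707, 4.256, 0.169, 0.923, 0.967, 0.982),
    9: (0.088, 0.748, 4.397, 0.173, 0.920, 0.966, 0.981),
}


def best_epoch(results, column=0):
    """Return the epoch with the lowest value in an error column.

    Columns 0-3 are errors (lower is better); for the accuracy columns
    4-6 (a1, a2, a3) max() would be needed instead of min().
    """
    return min(results, key=lambda epoch: results[epoch][column])


print("best by abs_rel:", best_epoch(RESULTS, column=0))
print("best by rmse:", best_epoch(RESULTS, column=2))
```

Note that the two criteria do not have to agree on the same epoch, so it helps to state which metric was used for selection.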

jerry-ryu · 2024-03-17T13:21:17Z

@Lavreniuk
I have the same problem as you. I've tried the ImageNet-pretrained ConvNeXt model and the PoseNet provided by @hisfog (#14). Could you help me if you make any progress?

Here are my parameters:

--data_path /mnt/RG/dataset/kitti_data
--log_dir /mnt/RG/SfMNeXt-Impl/boost
--model_name cvnXt_high
--dataset kitti
--eval_split eigen
--backbone convnext_large_in22ft1k
--height 320
--width 1024
--batch_size 8
--num_epochs 20
--scheduler_step_size 10
--model_dim 32
--patch_size 32
--dim_out 64
--query_nums 64
--dec_channels 1024 512 256 128
--min_depth 0.001
--max_depth 80.0
--diff_lr
--use_stereo
--load_weights_folder /mnt/RG/SfMNeXt-Impl/boost/cvnXt_low/models/weights_0
--eval_mono
--post_process
--pretrained_pose
--pose_net_path /mnt/RG/SfMNeXt-Impl/checkpoints/pose

Lavreniuk · 2024-03-17T14:22:00Z

Hi @jerry-ryu, I have not reproduced the results of the original repo, especially the much better results that were mentioned. I think you should train without the pretrained PoseNet, but maybe I am wrong.
From what I found in other issues, the situation is similar for ResNet and other models: people have been unable to reproduce them. So I have switched my interest to another model.

hisfog · 2024-03-17T15:08:38Z

Apologies for the delayed response. For reproducing results on KITTI, please DO NOT use the latest code release (I'm not sure what may cause the issues above). Instead, please use the following version:

git checkout 6a1e997f97caef8de080bb2873f71cfbad9a8047

which is consistent with the implementation in the SQLdepth paper, without any additional modifications.

jerry-ryu · 2024-03-17T15:38:42Z

@Lavreniuk @hisfog
Thank you for your kind response, I will try again and let you know.

jerry-ryu · 2024-03-19T08:39:50Z

@hisfog
Thank you so much, I was finally able to reproduce SQLdepth on resnet50 1024x320.

I will post my experimental results and argsfiles for those who want to train SQLdepth.

-Depth metrics:
paper:

ResNet50 320x1024 trained:

ConvNext 192x640 trained:

ResNet50 320x1024

-args:

Do not use the latest code release:
git checkout 6a1e997f97caef8de080bb2873f71cfbad9a8047

args files

--data_path /mnt2/RG/data
--log_dir /mnt2/RG/SfMNeXt-Impl/sqldepth_log/
--model_name resnet_320x1024
--dataset kitti 
--eval_split eigen
--backbone resnet_lite
--height 320 
--width 1024
--batch_size 10
--num_epochs 25
--scheduler_step_size 15
--model_dim 32
--patch_size 20
--dim_out 128
--query_nums 128
--num_features 256
--num_layers 50
--min_depth 0.001
--max_depth 80.0
--use_stereo
--load_weights_folder /mnt2/RG/SfMNeXt-Impl/sqldepth_log/resnet_320x1024/models/weights_24
--eval_mono
--post_process

ConvNext 192x640

(Due to lack of gpu capacity, 192x640 was used instead of 320x1024)
-args:

Do not use the latest code release:
git checkout 6a1e997f97caef8de080bb2873f71cfbad9a8047

args files

--data_path /mnt2/RG/data
--log_dir /mnt2/RG/SfMNeXt-Impl/sqldepth_log/
--model_name cvnXt_192x640
--dataset kitti 
--eval_split eigen 
--backbone convnext_large
--height 192 
--width 640
--batch_size 8
--num_epochs 25
--scheduler_step_size 15
--model_dim 32
--patch_size 16
--dim_out 64
--query_nums 64
--dec_channels 1024 512 256 128
--min_depth 0.001
--max_depth 80.0
--use_stereo
--load_weights_folder /mnt2/RG/SfMNeXt-Impl/sqldepth_log/cvnXt_192x640/models/weights_24
--eval_mono
--post_process

Thank you again for your wonderful code, and congratulations on the paper's acceptance!

p.s.
I don't think there's any special change between the commit you told me and the latest code, so if you have any ideas about what made the experimental results significantly different, I'd appreciate it if you could tell me.

NoelShin · 2024-03-31T14:57:27Z

Thank you @hisfog and @jerry-ryu for the kind responses and sharing the experiment settings.

Background: I was in the same situation, where I couldn't get results similar to the numbers reported in the paper when using the latest code. Now that I know about this issue, I'm training with the suggested commit, but I'm curious what caused the difference in my results.

I checked the differences between the latest commit and 6a1e997, and the most notable difference I can find was the filename changes in splits/eigen_zhou/train_files.txt which can possibly affect the training. @hisfog, do you think this is the cause?
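
For anyone who wants to repeat this check, a minimal sketch of comparing two versions of a split file follows. The function name `compare_splits` and the sample entries are hypothetical and only illustrate the format; the real inputs would be the two versions of splits/eigen_zhou/train_files.txt.

```python
def compare_splits(old_lines, new_lines):
    """Compare two split files given as iterables of lines.

    Blank lines are ignored and entries are compared as whole stripped
    lines, so any change in a filename shows up as one entry removed
    and one entry added.
    """
    old = {line.strip() for line in old_lines if line.strip()}
    new = {line.strip() for line in new_lines if line.strip()}
    return {
        "only_old": sorted(old - new),
        "only_new": sorted(new - old),
        "shared": len(old & new),
    }


# Hypothetical entries, made up for illustration only.
old_split = ["seq_a 10 l", "seq_a 11 l", "seq_b 5 r"]
new_split = ["seq_a 10 l", "seq_b 5 r", "seq_c 7 l"]
print(compare_splits(old_split, new_split))
```

The older version of the file can be obtained without switching branches via `git show 6a1e997:splits/eigen_zhou/train_files.txt`, and both versions can then be read line by line and passed to the function.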

jerry-ryu · 2024-03-31T15:34:59Z

@NoelShin
I looked it up after seeing your reply, and it seems quite reasonable. Thank you for finding it!!

XIAN-XIAN-X · 2024-04-12T15:04:50Z

Hello! I noticed that too. Do you know which paper the old split came from?

chaoying0115 · 2024-05-14T06:12:20Z

Hello, I tried to reproduce resnet50 at 640x192, but the results I get are far off.

This is my args_res50_kitti_192x640_train.txt:

--data_path /home/ccy/project/kitti_data/
--dataset kitti
--eval_split eigen
--height 192
--width 640
--batch_size 6
--num_epochs 25
--model_dim 64
--patch_size 16
--query_nums 120
--scheduler_step_size 15
--eval_mono
--load_weights_folder /home/Process3/tmp/mdp/res50_models/weights_19
--post_process
--min_depth 0.001
--max_depth 80.0
--ext jpg
--model_name mdp2
--log_dir /home/ccy/tmp/

This is args_res50_kitti_192x640_eval.txt:
--data_path /home/ccy/project/kitti_data/
--dataset kitti
--eval_split eigen
--height 192
--width 640
--batch_size 6
--model_dim 64
--patch_size 16
--query_nums 120
--eval_mono
--load_weights_folder /home/ccy/tmp/mdp2/models/weights_8/
--post_process
--min_depth 0.01
--max_depth 80.0
--save_pred_disps

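The dataset I use is kitti_data, prepared the same way as for monodepth2.

I have been investigating for a long time without finding the cause. I look forward to your reply and guidance. Thank you.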
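
When training and evaluation use separate args files like the two above, a quick diff can surface settings that drifted apart. A minimal sketch, assuming one `--flag value...` entry per line; the helper names `parse_args_text` and `diff_args` are made up here, and the embedded strings are excerpts of the two files posted above.

```python
def parse_args_text(text):
    """Map each --flag to its list of values (empty list for bare switches)."""
    parsed = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].startswith("--"):
            continue  # skip blank or malformed lines
        parsed[tokens[0][2:]] = tokens[1:]
    return parsed


def diff_args(a, b):
    """Return {flag: (value_in_a, value_in_b)} for flags that differ.

    A flag missing from one side is reported with None on that side.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Excerpts of the train and eval args files posted above.
TRAIN_ARGS = """--height 192
--width 640
--model_dim 64
--patch_size 16
--query_nums 120
--eval_mono
--post_process
--min_depth 0.001
--max_depth 80.0
"""
EVAL_ARGS = """--height 192
--width 640
--model_dim 64
--patch_size 16
--query_nums 120
--eval_mono
--post_process
--min_depth 0.01
--max_depth 80.0
--save_pred_disps
"""

for flag, (train_value, eval_value) in diff_args(
    parse_args_text(TRAIN_ARGS), parse_args_text(EVAL_ARGS)
).items():
    print(f"--{flag}: train={train_value} eval={eval_value}")
```

On these excerpts the diff flags --min_depth (0.001 for training, 0.01 for evaluation) and the eval-only --save_pred_disps. Whether the --min_depth mismatch contributes to the gap is not established in this thread.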

hisfog mentioned this issue Dec 21, 2023

Promblem about reproducing the results #13

Open

hisfog mentioned this issue Apr 10, 2024

The training results differ from the paper by 33% #36

Open
