
How to train the zero-shot model? #34

Closed
hwanyu112 opened this issue Oct 22, 2022 · 2 comments

Comments

@hwanyu112

Hi! Thanks for your interesting work!
I have been trying to reproduce the zero-shot experiments in the paper, but like #19 (comment), I get a much lower mIoU than yours.

Here are my scripts:

train_lseg_zs.py:

from modules.lseg_module_zs import LSegModuleZS
from utils import do_training, get_default_argument_parser

if __name__ == "__main__":
    parser = LSegModuleZS.add_model_specific_args(get_default_argument_parser())
    args = parser.parse_args()
    do_training(args, LSegModuleZS)

command:

python -u train_lseg_zs.py --backbone clip_resnet101 --exp_name lsegzs_pascal_f0 --dataset pascal \
--widehead --no-scaleinv --arch_option 0 --ignore_index 255 --fold 0 --nshot 0 --batch_size 8

Default arguments: base_lr=0.004, weight_decay=1e-4, momentum=0.9

I wonder where the problem is. Could you please share your training scripts for the zero-shot experiments?

@hwanyu112
Author

When I set base_lr=0.09, I get a higher mIoU. Could you please provide your full set of hyperparameters and the number of training epochs for each dataset? Thanks a lot.

@Boyiliee
Collaborator

Hi @hwanyu112 ,

Thanks so much for your interest in LSeg!

We provide the training script for the ADE20K dataset, which you can easily adapt for the zero-shot experiments. For FSS-1000, reproducing the results should be straightforward. For the COCO and PASCAL datasets, which have very few classes, you need to stop early (you should be able to get the optimal results from the models at epochs 0-3) and run a hyper-parameter sweep to find the best learning rate; the optimal learning rate should be smaller than the one used for FSS-1000.
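The early-stop plus learning-rate-sweep procedure described above could be sketched as follows. This is only an illustrative outline: `evaluate_miou` is a hypothetical stand-in for actually training with `train_lseg_zs.py --base_lr ...` and evaluating the checkpoints from each of the first few epochs.

```python
def sweep(lrs, max_epochs, evaluate_miou):
    """Grid-search learning rates, keeping only the first few epochs
    (early stop). Returns (best_lr, best_epoch, best_miou).

    evaluate_miou(lr, epoch) -> float is a placeholder for "train with
    this learning rate and report validation mIoU at this epoch".
    """
    best_lr, best_epoch, best_miou = None, None, float("-inf")
    for lr in lrs:
        # Early stop: only consider checkpoints from epochs 0..max_epochs-1.
        for epoch in range(max_epochs):
            miou = evaluate_miou(lr, epoch)
            if miou > best_miou:
                best_lr, best_epoch, best_miou = lr, epoch, miou
    return best_lr, best_epoch, best_miou
```

In practice each `evaluate_miou` call would launch a training run (or read a saved checkpoint's validation score), and the sweep would try learning rates below the FSS-1000 value.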

Hope this helps!

Best,
Boyi
