How to properly train SPTNet #4

Closed
Iranb opened this issue Mar 14, 2024 · 3 comments

Iranb commented Mar 14, 2024

Thank you for your cool work on GCD.

I ran the training script from the README to train a model on the CUB dataset, starting from the DINO-pretrained weights, but the results look wrong.

CUDA_VISIBLE_DEVICES=0 python train_spt.py \
    --dataset_name 'CUB' \
    --batch_size 128 \
    --grad_from_block 11 \
    --epochs 1000 \
    --num_workers 8 \
    --use_ssb_splits \
    --sup_weight 0.35 \
    --weight_decay 5e-4 \
    --transform 'imagenet' \
    --lr 1 \
    --lr2 0.05 \
    --prompt_size 1 \
    --freq_rep_learn 20 \
    --pretrained_model_path ./pretrained/dino_vitbase16_pretrain.pth \
    --prompt_type 'all' \
    --eval_funcs 'v2' \
    --warmup_teacher_temp 0.07 \
    --teacher_temp 0.04 \
    --warmup_teacher_temp_epochs 10 \
    --memax_weight 1 \
    --model_path ./model_save

Here is result.txt, which records the accuracy at each epoch over the 1000 training epochs.

result.txt

What parameters do I need to modify to reproduce the results in the paper?

I look forward to your response, and thank you once again for your great work!

Collaborator

whj363636 commented Mar 14, 2024

Hi @Iranb

Thank you for your interest! I have reviewed the results you provided. Please take note of the reminder below the training scripts. Our method is designed to improve compatibility with GCD by adjusting both the model parameters and the prompt parameters. To reproduce the results in the paper, you need to obtain a SimGCD pretrained model and point 'pretrained_model_path' to it, since SimGCD is the model we used in the paper. That said, our method can be applied to any other pretrained model; SimGCD is just one option. If you use a different pretrained model, you will need to select the hyperparameters and training scheme carefully.
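As a rough sketch of that change, assuming the remaining flags stay exactly as in the command posted above: only the value of --pretrained_model_path differs. The checkpoint path below is a placeholder, not a file shipped with the repository, and the existence check is an added convenience rather than part of the training script.

```shell
#!/bin/sh
# Placeholder path (an assumption): point this at your own SimGCD checkpoint for CUB.
SIMGCD_CKPT="${SIMGCD_CKPT:-./pretrained/simgcd_cub.pth}"

# Same flags as in the original command; only --pretrained_model_path differs.
set -- python train_spt.py \
    --dataset_name CUB \
    --batch_size 128 \
    --grad_from_block 11 \
    --epochs 1000 \
    --num_workers 8 \
    --use_ssb_splits \
    --sup_weight 0.35 \
    --weight_decay 5e-4 \
    --transform imagenet \
    --lr 1 \
    --lr2 0.05 \
    --prompt_size 1 \
    --freq_rep_learn 20 \
    --pretrained_model_path "$SIMGCD_CKPT" \
    --prompt_type all \
    --eval_funcs v2 \
    --warmup_teacher_temp 0.07 \
    --teacher_temp 0.04 \
    --warmup_teacher_temp_epochs 10 \
    --memax_weight 1 \
    --model_path ./model_save

# Launch only if the checkpoint exists; otherwise print the command as a dry run.
if [ -f "$SIMGCD_CKPT" ]; then
    CUDA_VISIBLE_DEVICES=0 "$@"
else
    echo "checkpoint not found, would run: $*"
fi
```

Running the dry-run branch first is a cheap way to confirm the path before committing to a 1000-epoch run.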

Author

Iranb commented Mar 14, 2024

Thank you for your response. From my earlier replication experiments with SimGCD, I did find that the current method's loss design is sensitive to hyperparameters. I will retry SPTNet starting from the pretrained SimGCD weights. 😊

Collaborator

whj363636 commented Mar 14, 2024

Sure. The default settings from the SimGCD paper should be sufficient to obtain a good enough pretrained model (for CUB, be aware that the hyperparameters differ between their latest version and earlier ones). You can then build SPTNet on top of it. Please feel free to reopen this issue if you run into any difficulties.

whj363636 changed the title from "Train 1000 epoch on CUB dataset but got wrong results?" to "How to properly train SPTNet" on Mar 14, 2024