
Evaluation script to reproduce numbers in SAM-HQ paper #46

Open
ankitgoyalumd opened this issue Jul 22, 2023 · 4 comments


@ankitgoyalumd

Great work! I am looking for a script that reproduces the IoU and boundary IoU numbers. I looked into the train folder, where an evaluation example is shown; however, it uses the checkpoint sam_vit_l_0b3195.pth.

The masks predicted from this checkpoint are of extremely poor quality, leading me to believe I should have been using sam_hq_vit_l.pth, shown in the main README of the repo. However, when I pass sam_hq_vit_l.pth to the --checkpoint argument of train.py along with the --eval flag, it fails to load the checkpoint and errors out because the keys do not match.

Please advise how I can reproduce results.

@ymq2017
Collaborator

ymq2017 commented Jul 24, 2023

Hi, the evaluation script in the train folder is
python -m torch.distributed.launch --nproc_per_node=1 train.py --checkpoint ./pretrained_checkpoint/sam_vit_l_0b3195.pth --model-type vit_l --output work_dirs/hq_sam_l --eval --restore-model work_dirs/hq_sam_l/epoch_11.pth
In this script, we load sam_vit_l_0b3195.pth for the encoder output and load the additional parameters trained by us with the argument --restore-model. During training, we only learn a small number of parameters and save only those. For evaluation on the HQ datasets, we only need to load this group of parameters.
An example checkpoint for --restore-model can be found here. You can also train it yourself.

@ankitgoyalumd
Author

ankitgoyalumd commented Jul 30, 2023 via email

@ymq2017
Collaborator

ymq2017 commented Jul 31, 2023

Hi, when evaluating the four HQ datasets, we only need to load the mask_decoder of hq_sam. We do it this way because it saves storage space during training. For example,
--restore-model work_dirs/hq_sam_l/epoch_11.pth
Here epoch_11.pth is the decoder part of sam_hq_vit_l.pth. An example checkpoint for --restore-model can be found in this link. You can also train it yourself.
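To illustrate the relationship between the two checkpoints, here is a minimal sketch of filtering a full state dict down to just its decoder entries. The key prefix "mask_decoder." is an assumption about the state-dict layout, and the dummy dict stands in for what torch.load("sam_hq_vit_l.pth") would return; this is not the repo's actual save/restore code.

```python
# Hypothetical sketch: keep only the mask-decoder weights from a full
# SAM-HQ state dict, the way --restore-model expects a decoder-only file.
# The "mask_decoder." prefix is assumed, not taken from the repo.

def extract_decoder_state(state_dict, prefix="mask_decoder."):
    """Return only the entries whose keys start with `prefix`."""
    return {k: v for k, v in state_dict.items() if k.startswith(prefix)}

# A plain dict stands in for torch.load("sam_hq_vit_l.pth") here.
full = {
    "image_encoder.patch_embed.weight": "...",
    "mask_decoder.iou_token.weight": "...",
    "mask_decoder.hf_token.weight": "...",
}
decoder_only = extract_decoder_state(full)
# decoder_only now holds only the two mask_decoder entries; saving it
# (e.g. with torch.save) would yield a small decoder-only checkpoint.
```

This also explains the key-mismatch error above: passing the full sam_hq_vit_l.pth to --restore-model supplies encoder keys that the decoder-only restore path does not expect.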

@vishakhalall

vishakhalall commented Aug 17, 2023

I faced this error too. I used the command python3 -m torch.distributed.launch --nproc_per_node=1 train.py --checkpoint ./pretrained_checkpoint/sam_vit_l_0b3195.pth --model-type vit_l --output work_dirs/hq_sam_l/ --eval --restore-model work_dirs/hq_sam_l/sam_hq_epoch_11.pth, where sam_hq_epoch_11.pth was the final checkpoint saved by training. However, when I tried any of the earlier checkpoints, it worked as expected.
