
Some questions about training #9

Closed
zbw0329 opened this issue Sep 23, 2021 · 10 comments

Comments

@zbw0329

zbw0329 commented Sep 23, 2021

Hi! Thanks for your excellent work!
I have a machine with two 1080 Ti GPUs, and I want to train your model on CIFAR10 with resnet18.

I used the parameters shown below, but it doesn't seem to work.
[screenshot]

The program gets stuck in this state:

[screenshot]

@zbw0329
Author

zbw0329 commented Sep 23, 2021

If I use these parameters:
[screenshot]
I get this output:

[screenshot]

@akuxcw
Collaborator

akuxcw commented Sep 26, 2021

Hi, if you want to train models on CIFAR10, you have to change the dataset and the data augmentation. The loading pipelines for ImageNet and CIFAR10 are different.

@zbw0329
Author

zbw0329 commented Sep 27, 2021

What is the difference between '--resume' and '--pretrained'?
How do I evaluate my model?

@zbw0329
Author

zbw0329 commented Sep 27, 2021

I changed the dataset-loading function and it works now.
[screenshot]

But when I evaluate my model, the output tensor has shape [1, 1000] and the target tensor has shape [1].
I think this is caused by the difference between CIFAR10 and ImageNet.
Do you think I should add a softmax to the end of the model?

@akuxcw
Collaborator

akuxcw commented Sep 28, 2021

> What is the difference between '--resume' and '--pretrained'?
> How do I evaluate my model?

"--resume" also loads the optimizer state, which is used to resume training after an unexpected interruption.
You can use linear evaluation or downstream tasks to evaluate the model.
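The resume/pretrained distinction can be sketched as follows (an illustrative checkpoint layout, not necessarily this repo's exact format): resuming restores the optimizer state and epoch counter so training continues where it left off, while loading pretrained weights initializes only the model.

```python
# Sketch: checkpoint handling for "--resume" vs "--pretrained".
import torch
import torch.nn as nn

def save_checkpoint(path, model, optimizer, epoch):
    torch.save({"epoch": epoch,
                "state_dict": model.state_dict(),
                "optimizer": optimizer.state_dict()}, path)

def resume(path, model, optimizer):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["state_dict"])
    # Restoring the optimizer keeps momentum buffers and LR schedule state,
    # so an interrupted run continues seamlessly.
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"] + 1  # continue from the next epoch

def load_pretrained(path, model):
    # Weights only; a fresh optimizer is created for the new task.
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["state_dict"])
```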

@akuxcw
Collaborator

akuxcw commented Sep 28, 2021

> But when I evaluate my model, the output tensor has shape [1, 1000] and the target tensor has shape [1].
> Do you think I should add a softmax to the end of the model?

I suspect the dataloader is wrong. You do need to change the model's classifier to fit CIFAR10, but the shapes of the output and target tensors are weird. Are you using batch_size = 1?

@zbw0329
Author

zbw0329 commented Sep 28, 2021

Yes, I set batch_size to 1 by mistake.
Thanks for your help.
I wish you success in your research.

@zbw0329 zbw0329 closed this as completed Sep 28, 2021
@zbw0329
Author

zbw0329 commented Sep 29, 2021

My Loss_clu stays the same throughout training. What could be happening?
[screenshot]

@zbw0329 zbw0329 reopened this Sep 29, 2021
@zbw0329
Author

zbw0329 commented Oct 12, 2021

Could you release the code for CIFAR10?
I notice that you have shown a CIFAR10 results table in your paper.

@akuxcw
Collaborator

akuxcw commented Oct 20, 2021

Hi, the results on CIFAR are linear evaluation / finetune results. We didn't try to pretrain models on CIFAR. The losses during training do decrease very slowly. Are your linear evaluation results on CIFAR reasonable?

@zbw0329 zbw0329 closed this as completed Nov 20, 2021