How to save and evaluate model after training? #4

Closed · dnaveenr opened this issue Sep 24, 2019 · 7 comments

@dnaveenr
I wanted to know how we could save the model weights and use them for inference on some test data.
I tried using torch.save() and torch.load() after completion of all epochs, but the results are quite bad and do not match the automatic evaluation on the test split.
Could you please provide some input on this?

@Confusezius
Owner

You would need to show me exactly what you do. Ideally, you do:

  1. opt = pkl.load(open('your_save_path/hypa.pkl', 'rb'))
  2. network = netlib.networkselect(opt)
  3. network.load_state_dict(torch.load('your_save_path/checkpoint.pth.tar')['state_dict'])
  4. _ = network.eval()
  5. with torch.no_grad(): do stuff

Also, you might not want to use the weights directly AFTER training, but those that provide the best validation performance (as is done automatically when calling eval.evaluate()). The reported results are based on weights found this way.
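
Put together, the steps above look roughly like the following. This is a minimal sketch based on that snippet: the filenames hypa.pkl and checkpoint.pth.tar and the netlib.networkselect call come from the steps above, while the save path, the map_location argument and the test_images tensor are placeholders you would substitute yourself.

    import pickle as pkl
    import torch
    import netlib  # network definitions from this repo

    save_path = 'your_save_path'  # training output folder (placeholder)

    # 1. Restore the training options pickled during training.
    with open(save_path + '/hypa.pkl', 'rb') as f:
        opt = pkl.load(f)

    # 2. Rebuild the network from those options and load the checkpoint weights.
    network = netlib.networkselect(opt)
    checkpoint = torch.load(save_path + '/checkpoint.pth.tar', map_location='cpu')
    network.load_state_dict(checkpoint['state_dict'])

    # 3. Switch to eval mode and embed test images without tracking gradients.
    _ = network.eval()
    with torch.no_grad():
        embeddings = network(test_images)  # test_images: a (B, 3, H, W) tensor (placeholder)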

@dnaveenr
Author

dnaveenr commented Sep 24, 2019

Thanks @Confusezius; I will test with your inputs. I was saving model.state_dict() after every epoch and picking the model file with the best validation performance.
I am working on image retrieval with L2 distance over the feature embeddings generated after training. I noticed that directly generating the embeddings with a pretrained ResNet gives results driven mostly by image colour features, so I am currently working on improving the embeddings.
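
For reference, L2-distance retrieval over such embeddings can be done along these lines. This is a rough sketch, not this repo's evaluation code; query_embeds and gallery_embeds are assumed to be precomputed embedding tensors.

    import torch

    # gallery_embeds: (N, D) embeddings of the database images (assumed precomputed)
    # query_embeds:  (M, D) embeddings of the query images
    def retrieve_topk(query_embeds, gallery_embeds, k=5):
        # Pairwise Euclidean (L2) distances between queries and gallery items.
        dists = torch.cdist(query_embeds, gallery_embeds, p=2)  # (M, N)
        # Indices of the k nearest gallery items for each query, closest first.
        topk_dists, topk_idx = torch.topk(dists, k=k, dim=1, largest=False)
        return topk_idx, topk_dists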

@Confusezius
Owner

@dnaveenr keep me posted, whether everything works out or not :)

@dnaveenr
Author

@Confusezius Sure. Will keep you posted.
Just wanted to check one more thing: since we are doing online sampling, does batch size play a major role in performance? The default batch size is 112, which I cannot fit on my GPU, so I am working with a batch size of 32. During your tests, did you try different batch sizes? Any suggestions?

@Confusezius
Owner

So for methods like ProxyNCA, where you use external proxies to create meaningful n-tuples for training, batch size does not matter much (to a certain degree). However, for triplet sampling methods like semihard or distance sampling, there is a notable drop in performance for small batch sizes, as the set of triplets you can create per choice of anchor gets smaller, which constrains the effectiveness of your sampling method. In that regard, I have seen significant performance changes for batch sizes below 100. You can try to counter this a bit by making --n_samples_per_class smaller, e.g. using 2 or 3 instead of 4, to uphold a higher diversity of negatives.
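
To make that trade-off concrete, here is a small back-of-the-envelope sketch (my own illustration, not code from this repo) of how batch size and --n_samples_per_class determine how many distinct classes, and hence negatives per anchor, each batch contains:

    def batch_composition(batch_size, samples_per_class):
        # Number of distinct classes that fit into one batch.
        classes_per_batch = batch_size // samples_per_class
        # For a given anchor: positives share its class, negatives do not.
        positives_per_anchor = samples_per_class - 1
        negatives_per_anchor = batch_size - samples_per_class
        return classes_per_batch, positives_per_anchor, negatives_per_anchor

    # Default setup vs. a smaller GPU budget:
    print(batch_composition(112, 4))  # (28, 3, 108) -> many classes and negatives per anchor
    print(batch_composition(32, 4))   # (8, 3, 28)   -> far fewer negatives per anchor
    print(batch_composition(32, 2))   # (16, 1, 30)  -> more classes, higher negative diversity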

@dnaveenr
Author

Thanks. Very insightful. I will try tweaking --n_samples_per_class and check.

@Confusezius
Owner

Feel free to reopen this if you run into any other issues :). Otherwise I'll close it for now.
