How to save and evaluate model after training? #4

Closed · dnaveenr opened this issue Sep 24, 2019 · 7 comments

@dnaveenr
I wanted to know how we could save the model weights and use them for inference on some test data.
I tried using torch.save() and torch.load() after completion of all epochs, but the results are quite bad and do not match the automatic evaluation on the test split.
Could you please provide some input on this?

@Confusezius
Owner

You would need to show me exactly what you do. Ideally, you do:

  1. opt = pkl.load(open('your_save_path/hypa.pkl', 'rb'))
  2. network = netlib.networkselect(opt)
  3. network.load_state_dict(torch.load('your_save_path/checkpoint.pth.tar')['state_dict'])
  4. _ = network.eval()
  5. with torch.no_grad(): do stuff

Also, you might not want to use the weights directly AFTER training, but those that provide the best validation performance (as is done automatically when calling eval.evaluate()). The reported results are based on weights found this way.
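
Put together, the steps above look roughly like the following. This is a minimal sketch based on that snippet: the filenames hypa.pkl and checkpoint.pth.tar and the netlib.networkselect call come from the steps above, while the save path, the map_location argument and the test_images tensor are placeholders you would substitute yourself.

    import pickle as pkl
    import torch
    import netlib  # network definitions from this repo

    save_path = 'your_save_path'  # training output folder (placeholder)

    # 1. Restore the training options pickled during training.
    with open(save_path + '/hypa.pkl', 'rb') as f:
        opt = pkl.load(f)

    # 2. Rebuild the network from those options and load the checkpoint weights.
    network = netlib.networkselect(opt)
    checkpoint = torch.load(save_path + '/checkpoint.pth.tar', map_location='cpu')
    network.load_state_dict(checkpoint['state_dict'])

    # 3. Switch to eval mode and embed test images without tracking gradients.
    _ = network.eval()
    with torch.no_grad():
        embeddings = network(test_images)  # test_images: a (B, 3, H, W) tensor (placeholder)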

@dnaveenr
Author

dnaveenr commented Sep 24, 2019

Thanks @Confusezius; I will test with your inputs. I was saving model.state_dict() after every epoch and picking the model file with the best validation performance.
I am working on image retrieval with L2 distance over the feature embeddings generated after training. I noticed that directly generating the embeddings with a pretrained ResNet gives results driven mostly by image colour features, so I am currently working on improving the embeddings.
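
For reference, L2-distance retrieval over such embeddings can be done along these lines. This is a rough sketch, not this repo's evaluation code; query_embeds and gallery_embeds are assumed to be precomputed embedding tensors.

    import torch

    # gallery_embeds: (N, D) embeddings of the database images (assumed precomputed)
    # query_embeds:  (M, D) embeddings of the query images
    def retrieve_topk(query_embeds, gallery_embeds, k=5):
        # Pairwise Euclidean (L2) distances between queries and gallery items.
        dists = torch.cdist(query_embeds, gallery_embeds, p=2)  # (M, N)
        # Indices of the k nearest gallery items for each query, closest first.
        topk_dists, topk_idx = torch.topk(dists, k=k, dim=1, largest=False)
        return topk_idx, topk_dists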

@Confusezius
Owner

@dnaveenr keep me posted, whether everything works out or not :)

@dnaveenr
Author

@Confusezius Sure. Will keep you posted.
Just wanted to check one more thing: since we are doing online sampling, does batch size play a major role in performance? The default batch size is 112, which I cannot fit on my GPU, so I am working with a batch size of 32. During your tests, did you try different batch sizes? Any suggestions?

@Confusezius
Owner

So for methods like ProxyNCA, where you use external proxies to create meaningful n-tuples for training, batch size does not matter much (to a certain degree). However, for triplet sampling methods like semihard or distance sampling, there is a notable drop in performance for small batch sizes, as the set of triplets you can create per choice of anchor gets smaller, which constrains the effectiveness of your sampling method. In that regard, I have seen significant performance changes for batch sizes below 100. You can try to counter this a bit by making --n_samples_per_class smaller, e.g. using 2 or 3 instead of 4, to uphold a higher diversity of negatives.
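
To make that trade-off concrete, here is a small back-of-the-envelope sketch (my own illustration, not code from this repo) of how batch size and --n_samples_per_class determine how many distinct classes, and hence negatives per anchor, each batch contains:

    def batch_composition(batch_size, samples_per_class):
        # Number of distinct classes that fit into one batch.
        classes_per_batch = batch_size // samples_per_class
        # For a given anchor: positives share its class, negatives do not.
        positives_per_anchor = samples_per_class - 1
        negatives_per_anchor = batch_size - samples_per_class
        return classes_per_batch, positives_per_anchor, negatives_per_anchor

    # Default setup vs. a smaller GPU budget:
    print(batch_composition(112, 4))  # (28, 3, 108) -> many classes and negatives per anchor
    print(batch_composition(32, 4))   # (8, 3, 28)   -> far fewer negatives per anchor
    print(batch_composition(32, 2))   # (16, 1, 30)  -> more classes, higher negative diversity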

@dnaveenr
Author

Thanks. Very insightful. I will try tweaking --n_samples_per_class and check.

@Confusezius
Owner

Feel free to reopen this if you run into any other issues :). Otherwise I'll close it for now.
