
It seems "top1" loss function do not work in this tf implementation? #3

Open
ownership-xyz opened this issue Nov 14, 2017 · 5 comments

Comments

@ownership-xyz

It seems "top1" loss function do not work in this tf implementation?
I follow the parameters setting like that it theano implementation. However, I just get a result of 0.48 and 0.17. (Original is 0.59 and 0.23)

I use tensorflow 1.2.
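For reference, with in-batch negative sampling the TOP1 loss from the paper is roughly the sketch below (my own sketch, assuming `logits` is the batch-by-batch score matrix with the target items on the diagonal; the small correction for the positive item appearing among the sampled negatives is left out):

```python
import tensorflow as tf

def top1_loss(logits):
    """TOP1 loss (GRU4Rec paper) with in-batch negative sampling.
    `logits` has shape (batch, batch); the diagonal holds the scores
    of the positive (target) items."""
    positives = tf.diag_part(logits)                  # r_i for each row
    diff = logits - tf.expand_dims(positives, 1)      # r_j - r_i
    per_example = tf.reduce_mean(
        tf.nn.sigmoid(diff) + tf.nn.sigmoid(tf.square(logits)), axis=1)
    return tf.reduce_mean(per_example)
```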

@Songweiping
Owner

Hi @IcyLiGit,
Sorry that you can't reproduce the results. First, I didn't test the code with the "top1" loss, so I don't know how to set "good" parameters for it either. Second, did you get the numbers 0.59 and 0.23 by running the Theano implementation?
BTW, I use the Adam optimizer by default; you may try RMSProp (the default optimizer in the Theano implementation) or others. Tuning the initial learning rate and dropout rate would also help.
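For example, swapping the optimizer is a one-line change in the TF 1.x graph (toy sketch only; `cost` here stands in for the actual loss tensor in the code):

```python
import tensorflow as tf

# Toy stand-in for the real loss tensor; replace with the actual `cost`.
w = tf.Variable(1.0)
cost = tf.square(w - 3.0)

lr = 0.001
# optimizer = tf.train.AdamOptimizer(lr)        # current default here
optimizer = tf.train.RMSPropOptimizer(lr)       # RMSProp, as suggested above
# optimizer = tf.train.AdagradOptimizer(lr)     # another option to try
train_op = optimizer.minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op)
```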
Good Luck!

Weiping

@ownership-xyz
Author

@Songweiping
I ran the original Theano code and it reproduced the results reported in the paper. Actually, in the Theano code the optimizer is Adagrad. However, I found that Adagrad and Adadelta do not work in the TensorFlow implementation.

RMSProp and Adam with the cross-entropy loss and softmax activation seem to work in your implementation. However, top1 and bpr only produce a result of 0.48 (not the 0.6 in the paper), and the loss seems to decrease more quickly in TF. (Maybe caused by overfitting? But I cannot find the differences between these two implementations....)
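For comparison, a minimal sketch of the BPR loss under the same in-batch-negatives assumption (again my own sketch, treating `logits` as the batch-by-batch score matrix and ignoring the diagonal term):

```python
import tensorflow as tf

def bpr_loss(logits):
    """BPR loss with in-batch negative sampling; `logits` has shape
    (batch, batch) with positive-item scores on the diagonal."""
    positives = tf.diag_part(logits)                 # r_i
    diff = tf.expand_dims(positives, 1) - logits     # r_i - r_j
    return -tf.reduce_mean(tf.log(tf.nn.sigmoid(diff) + 1e-24))
```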

@ownership-xyz
Author

ownership-xyz commented Nov 14, 2017

I use the same parameter settings in both implementations (softmax + cross-entropy + 0.5 dropout + 0.001 learning rate without decay). However, the reported losses are different.

Theano: [screenshot of training loss]

TensorFlow: [screenshot of training loss]

@Songweiping
Owner

Songweiping commented Nov 14, 2017

It seems that TF converges faster than Theano. So how about:

  1. decrease the number of training steps.
  2. more concretely, use validation data to prevent over-fitting (early stopping, see the sketch below).
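A minimal, framework-agnostic sketch of what I mean by point 2 (the `early_stop` helper and the numbers are just illustrative; in practice the metric would be recall on a held-out validation split):

```python
def early_stop(metric_history, patience=3):
    """Return True when the validation metric (higher is better, e.g.
    recall@20) has not improved for `patience` consecutive epochs."""
    if len(metric_history) <= patience:
        return False
    best_before = max(metric_history[:-patience])
    return max(metric_history[-patience:]) <= best_before

# Illustration with made-up validation recalls, one per epoch:
history = []
for recall in [0.51, 0.55, 0.56, 0.555, 0.552, 0.549]:
    history.append(recall)
    if early_stop(history):
        print('stop after epoch', len(history))
        break
```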

Weiping

@gds123

gds123 commented Jun 12, 2018

I'm seeing a similar issue too.
I added dynamic_rnn to Weiping's code, and the recall then drops to 0.43 for softmax + cross-entropy; the recall is also 0.43 for top1.

And it's not overfitting: I have checked the recall on the training data.
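Roughly, the dynamic_rnn wiring I mean looks like this (a sketch with made-up shapes and names, not the exact change I made):

```python
import tensorflow as tf

# Hypothetical shapes; the real code has its own batch/session handling.
batch_size, max_len, n_items, rnn_size = 32, 20, 50000, 100

item_ids = tf.placeholder(tf.int32, [batch_size, max_len])   # padded sessions
seq_len = tf.placeholder(tf.int32, [batch_size])              # true lengths

embedding = tf.get_variable('embedding', [n_items, rnn_size])
inputs = tf.nn.embedding_lookup(embedding, item_ids)

cell = tf.contrib.rnn.GRUCell(rnn_size)
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs,
                                         sequence_length=seq_len,
                                         dtype=tf.float32)
# outputs: (batch, max_len, rnn_size); item scores come from projecting
# these hidden states onto the output (softmax) weights.
```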
@Songweiping
@IcyLiGit
