
Network with one stage is even better than the one with two stages #24

Open
lucaslu1987 opened this issue Apr 15, 2018 · 10 comments

Comments

@lucaslu1987

Hi Marek,
I ran DANtest.py to check the performance of the different stages. However, the result of the network with one stage is even better. The dataset was initialized as described in the paper. Below are the test results (I used the center distance).
OneStage
Processing common subset of the 300W public test set (test sets of LFPW and HELEN)
Average error: 0.428566349505
Processing challenging subset of the 300W public test set (IBUG dataset)
Average error: 0.598229653794
Showing results for the entire 300W public test set (IBUG dataset, test sets of LFPW and HELEN)
Average error: 0.461809522334
TwoStage
Processing common subset of the 300W public test set (test sets of LFPW and HELEN)
Average error: 0.439570144885
Processing challenging subset of the 300W public test set (IBUG dataset)
Average error: 0.616254202783
Showing results for the entire 300W public test set (IBUG dataset, test sets of LFPW and HELEN)
Average error: 0.474188937071
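For context, the measure usually reported on 300-W is the mean point-to-point landmark distance normalized by an inter-eye distance. Below is a minimal sketch of that metric, assuming the 68-point markup and eye-center ("center distance") normalization; it is not necessarily exactly what DANtest.py computes.

```python
import numpy as np

def normalized_error(pred, gt):
    """Mean point-to-point error normalized by the distance between
    the eye centers (assumes the 68-point 300-W markup).

    pred, gt: arrays of shape (68, 2).
    """
    # Eye centers as the mean of the six landmarks of each eye
    # (0-based indices 36-41 and 42-47 in the 68-point markup).
    eye1 = gt[36:42].mean(axis=0)
    eye2 = gt[42:48].mean(axis=0)
    norm_dist = np.linalg.norm(eye1 - eye2)

    point_errors = np.linalg.norm(pred - gt, axis=1)
    return point_errors.mean() / norm_dist
```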

@MarekKowalski
Owner

Hi,

Those results are a bit surprising, as the error you obtained for the IBUG dataset seems to be much lower than in any paper I have seen so far. Are you sure you are measuring this correctly?

If you want, you can place the models you have in some public data storage like Dropbox and send me the link, so I can re-evaluate it. This would help rule out the possibility of measurement error.

Best regards,

Marek

@mariolew

@MarekKowalski Hi, in the training phase, did you keep the learning rate fixed at 0.001? If I use a learning rate of 0.001, the loss in stage 2 does not drop.

@MarekKowalski
Owner

Yes, the learning rate was kept the same. When you are training the second stage, are you updating the parameters of both stages or just stage 2?

In my experience, if you are updating parameters of both stages at the same time, the error takes much longer to drop.
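For reference, a minimal sketch of what updating only the stage-2 parameters could look like in Theano/Lasagne. The toy two-layer network below just stands in for the two stages; none of the variable names are the repo's actual ones.

```python
import lasagne
import theano
import theano.tensor as T

# Toy two-stage network used only to illustrate the parameter selection.
input_var = T.matrix('inputs')
target_var = T.matrix('targets')

l_in = lasagne.layers.InputLayer((None, 10), input_var=input_var)
stage1_out = lasagne.layers.DenseLayer(l_in, num_units=10)
stage2_out = lasagne.layers.DenseLayer(stage1_out, num_units=10)

loss = lasagne.objectives.squared_error(
    lasagne.layers.get_output(stage2_out), target_var).mean()

# All trainable parameters vs. the ones already present in stage 1.
all_params = lasagne.layers.get_all_params(stage2_out, trainable=True)
stage1_params = set(lasagne.layers.get_all_params(stage1_out, trainable=True))

# Update only the parameters introduced by stage 2.
stage2_params = [p for p in all_params if p not in stage1_params]

updates = lasagne.updates.adam(loss, stage2_params, learning_rate=0.001)
train_fn = theano.function([input_var, target_var], loss, updates=updates)
```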

@mariolew

@MarekKowalski Actually, I only update the parameters of stage 2, but I only see the error drop when I lower the learning rate.

@MarekKowalski
Owner

That's surprising. For how long do you train before checking whether the error starts dropping? Also, is that in your implementation or in the one in this repo?

@mariolew

@MarekKowalski I mean in my implementation: I only observe the error drop after lowering the learning rate. Someone has run into almost the same problem in another TensorFlow implementation. I've double-checked the functionality of the self-defined layer and found no problem, so I don't know whether Adam or batch normalization is implemented differently in Theano.

@MarekKowalski
Owner

I agree that it might be because TF has a different implementation of some part of the learning algorithm or the network.
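One easy thing to rule out is differing framework defaults by setting the hyper-parameters explicitly on both sides. A sketch, assuming the TensorFlow port uses tf.layers.batch_normalization and tf.train.AdamOptimizer (loss, params, prev_layer, x and is_training are placeholders, not real variables from either codebase); Lasagne's BatchNormLayer defaults (alpha=0.1, epsilon=1e-4) do not match the tf.layers defaults (momentum=0.99, epsilon=1e-3):

```python
# Theano/Lasagne side: spell out the defaults explicitly.
import lasagne
updates = lasagne.updates.adam(loss, params, learning_rate=0.001,
                               beta1=0.9, beta2=0.999, epsilon=1e-8)
bn = lasagne.layers.BatchNormLayer(prev_layer, alpha=0.1, epsilon=1e-4)

# TensorFlow 1.x side: use matching values rather than the TF defaults.
import tensorflow as tf
opt = tf.train.AdamOptimizer(learning_rate=0.001,
                             beta1=0.9, beta2=0.999, epsilon=1e-8)
# Lasagne's alpha=0.1 corresponds to momentum=0.9 here (both keep an
# exponential moving average of the batch statistics).
x = tf.layers.batch_normalization(x, momentum=0.9, epsilon=1e-4,
                                  training=is_training)
```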

@zjjMaiMai

@MarekKowalski @mariolew Hi,
In my TensorFlow implementation I have not run into this problem.
I use a private dataset, and the epochs are [15, 45].

[image attachment]

@jnulzl

jnulzl commented Apr 27, 2018

I would like to know: do both stages share parameters in the same layers?

@MarekKowalski
Owner

No, the parameter values are not shared between the stages.

Marek
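In other words, each stage instantiates its own layers with their own weights. A rough sketch in Lasagne (the layer sizes are only illustrative, and in the real DAN the second stage receives a transformed image and heatmap rather than the same input):

```python
import lasagne

def build_stage(incoming, name):
    # Each call creates new layers with freshly initialized weights,
    # so the two stages do not share any parameter values.
    net = lasagne.layers.Conv2DLayer(incoming, num_filters=64, filter_size=3,
                                     name=name + '_conv')
    net = lasagne.layers.DenseLayer(net, num_units=136, name=name + '_fc')
    return net

images = lasagne.layers.InputLayer((None, 1, 112, 112))
stage1 = build_stage(images, 'stage1')
stage2 = build_stage(images, 'stage2')   # separate weights from stage 1

# The two parameter lists have no overlap:
assert not set(lasagne.layers.get_all_params(stage1)) & \
       set(lasagne.layers.get_all_params(stage2))
```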
