Benchmark/mnist #5680

Closed
wants to merge 16 commits

Conversation

@dzhwinter (Contributor) commented Nov 15, 2017

Since this PR contains TensorFlow code, this benchmark will not be merged; the small fix has been included in our develop branch.

The comparison result is shown in #5862.

@helinwang (Contributor) previously approved these changes Nov 15, 2017

LGTM

@reyoung (Collaborator) left a comment

Do not change the default random seed to 1.

@qingqing01 (Contributor)

It would be better to add a test set to the demo.

@QiJune (Member) commented Nov 16, 2017

What is the purpose of this PR? Does it aim to compare accuracy with other frameworks, such as TensorFlow or PyTorch?

@dzhwinter (Contributor, Author)

I compare our refactorization with Paddle V1 / TensorFlow.
The reason I changed the random seed to 1: Paddle V1 cannot use a random seed of 0, because in the old config 0 means the seed is selected by the system. The change is just for comparing the results.
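As a reference, a minimal sketch of pinning the seeds on both sides of such a comparison (assuming TensorFlow 1.x and NumPy; the Paddle-side setting is indicated only as a comment, since the exact config option depends on the version):

import numpy as np
import tensorflow as tf

SEED = 1  # 0 is reserved in the old Paddle config to mean "seed chosen by the system"

np.random.seed(SEED)      # seeds any NumPy-generated initial values or data shuffling
tf.set_random_seed(SEED)  # graph-level seed for TensorFlow 1.x initializers
# Paddle side: set the same seed (1) in the training config so both runs start
# from comparable parameter initializations.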

@dzhwinter (Contributor, Author)

I will merge the TensorFlow code into this PR and make it clearer. Sorry for the confusion.

@QiJune (Member) commented Nov 16, 2017

And what is the result of this benchmark? Does this PR compare the computation batch by batch? If we initialize the parameters the same way, is each batch's result the same as TensorFlow's? We may need to keep the error against TensorFlow under 1e-6 with a double-precision build.
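A minimal sketch of such a batch-by-batch check, assuming the per-batch losses from each framework have already been collected into two lists (the framework-specific run loops are not shown):

import numpy as np

def compare_per_batch(paddle_losses, tf_losses, atol=1e-6):
    """Return True if the per-batch losses agree within atol in float64."""
    a = np.asarray(paddle_losses, dtype=np.float64)
    b = np.asarray(tf_losses, dtype=np.float64)
    max_err = float(np.max(np.abs(a - b)))
    print("max per-batch error: %.2e" % max_err)
    return max_err < atol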

attr=param_attr,
shape=param_shape,
dtype=dtype,
initializer=NormalInitializer(0.0, std, 1))
A Collaborator commented on the diff above:

How much effect does this initialization have?
I know this change makes it the same as the V2 initialization, but perhaps the default XavierInitializer is the more common choice. We should allow the user to supply an initializer; in fact, it can be supplied in param_attr. Adding an initializer here will ignore the initializer setting in param_attr and cause confusion.

To keep the MNIST sample code clean, it is better to change the example to set the initializer through param_attr. And if the different std does not have much effect, we don't even need to bother changing the MNIST example.
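A sketch of what setting the initializer through param_attr could look like, assuming a ParamAttr-style API with an initializer field as Fluid later exposed it; the module paths, input layer, and std value below are illustrative, not the exact API at the time of this PR:

import paddle.fluid as fluid

# Illustrative input layer and std value; the benchmark would use its own.
image = fluid.layers.data(name='image', shape=[784], dtype='float32')
std = 0.01

# The initializer is supplied through param_attr instead of being hard-coded
# inside the layer, so a user-provided setting is not silently overridden.
hidden = fluid.layers.fc(
    input=image,
    size=128,
    act='relu',
    param_attr=fluid.ParamAttr(
        initializer=fluid.initializer.NormalInitializer(loc=0.0, scale=std)))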

@dzhwinter (Contributor, Author)

Sorry for such a late reply; I was on a trip. I compared the forward cost, accuracy, and loss with V2/TensorFlow batch by batch. Because our default initializer is UniformInitializer, there is a difference of about -0.02 against the benchmark in the first several batches. With the NormalInitializer we have reached the same result as Paddle V2, down to the last decimal. I think we can conclude that our implementation achieves the same accuracy as Paddle V2.

TensorFlow uses a special random number generator algorithm, Philox, which we don't have. We can reach approximately the same result when we fill each Variable with the same random values, but there is a difference at the 6th to 8th decimal place caused by differences in the conv2d operator implementations.
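A minimal sketch of that shared-initialization setup: draw the initial weights once with NumPy and feed the same values into both frameworks, so any remaining per-batch difference comes from the operators rather than from the random number generators (TensorFlow 1.x shown; the Paddle side would copy the same array into its parameter tensor through its own API, which is not shown here):

import numpy as np
import tensorflow as tf

np.random.seed(1)
w_init = np.random.normal(0.0, 0.01, size=(784, 128)).astype(np.float64)

# TensorFlow 1.x: build the variable from the pre-drawn values instead of
# letting TF's Philox-based generator draw its own.
w_tf = tf.get_variable(
    'fc_w', shape=w_init.shape, dtype=tf.float64,
    initializer=tf.constant_initializer(w_init))
# Paddle side: load the same w_init array into the corresponding parameter
# before training, then compare the per-batch forward outputs.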

> We should allow the user to supply an initializer; in fact, it can be supplied in param_attr. Adding an initializer here will ignore the initializer setting in param_attr and cause confusion.

We are working on making the Initializer configurable. Sure, I will only change the MNIST example configuration in the benchmark test and set the fc layer to use XavierInitializer by default.
#5819
#5805
#5760

@dzhwinter closed this Jun 6, 2018