
Inaccuracies in tf.reduce_sum #5527

Closed
zergylord opened this issue Nov 10, 2016 · 11 comments
Labels: stat:awaiting tensorflower (Status - Awaiting response from tensorflower), type:feature (Feature requests)

@zergylord commented Nov 10, 2016

I've been trying to normalize some vectors by dividing by their sum, but have noticed that this doesn't always result in vectors that sum to one.

For example:

import numpy as np
import tensorflow as tf

sess = tf.Session()
foo = np.random.rand(100) * 100  # random floats between 0 and 100
print(sess.run(tf.reduce_sum(foo / tf.reduce_sum(foo))))  # should return 1, but often doesn't
print((foo / foo.sum()).sum())  # does return 1

While I realize this is possible with finite-precision floating-point arithmetic, I find it odd how readily it occurs, especially since the same operation in numpy doesn't encounter this issue. Is this a known issue? I've also noticed that tf.nn.softmax doesn't appear to suffer from this problem; is there a different op I need to use to ensure proper normalization?

@ppwwyyxx (Contributor)

I met a similar problem recently. The following code:

import numpy as np
import tensorflow as tf

data = np.random.rand(10000, 10000) + np.random.rand(10000, 1)
data = data.astype('float32')
vec = tf.placeholder(tf.float32, data.shape)
s = tf.reduce_sum(vec)
with tf.Session(config=tf.ConfigProto(intra_op_parallelism_threads=1)) as sess:
    np_sum = np.sum(data)
    tf_sum = sess.run(s, feed_dict={vec: data})
    diff = abs(tf_sum - np_sum)
    print(np_sum, diff)

sometimes gives an error of around 2e5, which is about 2e-3 per element over the 1e8 elements. This seems a bit large.
The difference is much smaller on GPU, though. I wonder how the reduction is implemented and how it differs from numpy's.

@shoyer (Contributor) commented Nov 10, 2016

NumPy uses pairwise summation in some cases:
https://en.wikipedia.org/wiki/Pairwise_summation

Not sure what TensorFlow does on the CPU.
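
For illustration, a minimal sketch of the pairwise idea (NumPy's actual implementation is in C with an unrolled base case; this just shows the recursion):

import numpy as np

def pairwise_sum(xs, block=128):
    # Rounding error grows roughly O(log n) instead of the O(n) of
    # naive left-to-right accumulation, because the partial sums
    # being combined stay comparable in magnitude.
    n = len(xs)
    if n <= block:
        s = np.float32(0.0)
        for x in xs:
            s += np.float32(x)
        return s
    mid = n // 2
    return pairwise_sum(xs[:mid], block) + pairwise_sum(xs[mid:], block)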

@sherrym (Contributor) commented Nov 11, 2016

@asimshankar could you please have someone look into this? Thanks!

@zergylord (Author)

@ppwwyyxx Thanks for the quick responses! Just wanted to mention that the inaccuracy is still much higher than numpy's when running on the GPU, even though it sounds like it's less severe there than on the TF CPU path.

@concretevitamin (Contributor)

@zergylord @ppwwyyxx I cannot reproduce either of these results myself. For the first snippet, both NumPy and TF return 1.0; for the second, the error I see is on the order of 100, not as drastic as reported.

Could you give some details about how you built your program, plus other environment/compiler info?

@asimshankar (Contributor)

@zergylord : To add to what @concretevitamin said - if you're seeing this on GPU a lot, do let us know about the CUDA version and the driver version. Mismatches between them often lead to messed up computations.

@asimshankar added the stat:awaiting response label and removed the stat:awaiting tensorflower label on Nov 14, 2016
@ppwwyyxx (Contributor) commented Nov 14, 2016

@concretevitamin Running on GPU, my snippet gives an error on the order of 100, but on CPU it's on the order of 1e5.

I'm using the binary at https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0rc2-cp27-none-linux_x86_64.whl. Tested on both i7-5930K+ubuntu16.04 and i7-6700HQ+Archlinux. My numpy.__version__ == '1.11.2'.

@concretevitamin (Contributor)

One weird thing is that this reproduces only non-deterministically: sometimes I see negligible error, sometimes around 1%.

It seems this is working as intended. When ~1e8 numbers in [0, 1) are accumulated into a single value, the running sum eventually becomes much larger than the next value being added, so most of that value's low-order bits are rounded away. On GPU the Eigen implementation uses more accumulators, so this is mitigated a bit. "Fixing" this probably requires a significant algorithmic revamp in https://bitbucket.org/eigen/eigen/src/caedc0b002df3c53c265b603c0f537511a88b15f/unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h?at=default&fileviewer=file-view-default.
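
A rough illustration of the swamping effect, assuming a single float32 accumulator (which is what a naive sequential reduction amounts to):

import numpy as np

data = np.random.rand(10**6).astype(np.float32)
exact = data.astype(np.float64).sum()  # float64 reference

# Single float32 accumulator: once the running sum reaches ~1e5,
# each incoming value in [0, 1) loses most of its low-order bits
# when it is rounded into the accumulator.
naive = np.float32(0.0)
for x in data:
    naive += x

print(exact, abs(naive - exact))  # the absolute error grows with n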

@aselle (Contributor) commented Feb 13, 2017

@benoitsteiner, are you planning to improve the numerical accuracy of reduce_sum, or should we mark it as contributions welcome? The naive implementation is certainly not "incorrect" or a "bug", but it would be better for it to use multiple accumulators, magnitude-ordered partial sums, or something along those lines.
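
For illustration, one standard mitigation is compensated (Kahan) summation, sketched below; this is not what Eigen does today, just an example of the kind of change:

import numpy as np

def kahan_sum(xs):
    # Compensated summation: carry the low-order bits lost at each
    # addition in a separate correction term.
    total = np.float32(0.0)
    comp = np.float32(0.0)  # running compensation
    for x in xs:
        y = np.float32(x) - comp
        t = total + y            # low-order bits of y are lost here...
        comp = (t - total) - y   # ...and recovered on this line
        total = t
    return total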

@aselle added the stat:awaiting tensorflower and type:feature labels and removed the stat:awaiting response label on Feb 13, 2017
@girving (Contributor) commented Jun 16, 2017

Duplicate of #2625.

@girving closed this as completed on Jun 16, 2017
@OnlyBelter

I've met the same problem. Is there any solution?
