Inaccuracies in tf.reduce_sum #5527
Comments
Met a similar problem recently. The following code:

```python
import numpy as np
import tensorflow as tf

data = np.random.rand(10000, 10000) + np.random.rand(10000, 1)
data = data.astype('float32')
vec = tf.placeholder(tf.float32, data.shape)
s = tf.reduce_sum(vec)
with tf.Session(config=tf.ConfigProto(intra_op_parallelism_threads=1)) as sess:
    np_sum = np.sum(data)
    tf_sum = sess.run(s, feed_dict={vec: data})
    diff = abs(tf_sum - np_sum)
    print(np_sum, diff)
```

sometimes gives an error of 2e5, which is about 2e-3 error per element. This seems a bit large.
NumPy uses pairwise summation in some cases; I'm not sure what TensorFlow does on the CPU.
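For illustration (a sketch, not TensorFlow's code path): NumPy's pairwise summation typically keeps rounding error far smaller than a single sequential float32 running total, which `np.cumsum` models here since it accumulates element by element.

```python
import numpy as np

# Contrast NumPy's reduction (pairwise summation on contiguous float
# arrays) with a single sequential float32 accumulator. The last entry
# of np.cumsum models the naive running-total approach.
rng = np.random.RandomState(0)
data = rng.rand(10**6).astype('float32')

pairwise = np.sum(data)                  # pairwise summation, float32
naive = np.cumsum(data)[-1]              # sequential float32 accumulation
exact = np.sum(data, dtype=np.float64)   # float64 reference

# The naive accumulator's error is typically much larger than pairwise's.
print(abs(pairwise - exact), abs(naive - exact))
```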
@asimshankar could you please have someone look into this? Thanks!
@ppwwyyxx Thanks for the quick responses! Just wanted to mention that the inaccuracy is still much higher than NumPy's when running on the GPU, even though it sounds like it's less severe than on the TF CPU path.
@zergylord @ppwwyyxx I cannot reproduce either of these snippets myself. In the first snippet, both NumPy and TF return 1.0. In the second, the error seems to be of magnitude O(100), not as drastic as reported. Could you give some indication of how you built your program, along with other environment/compiler info?
@zergylord : To add to what @concretevitamin said - if you're seeing this on GPU a lot, do let us know about the CUDA version and the driver version. Mismatches between them often lead to messed up computations. |
@concretevitamin Running on GPU, my snippet got an error of O(100), but on CPU I get O(1e5). I'm using the binary at https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.11.0rc2-cp27-none-linux_x86_64.whl, tested on both an i7-5930K + Ubuntu 16.04 and an i7-6700HQ + Arch Linux. My
One weird thing is that this seems to be reproducible only non-deterministically: sometimes I see negligible error, sometimes I see 1%.

It seems this is working as intended. When ~1e8 numbers are being accumulated, at some point the accumulator value becomes >> the next value being accumulated (which lies in [0, 1]). On GPU the Eigen impl has
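The stalled-accumulator effect described above is easy to demonstrate directly, independent of TensorFlow; a minimal sketch:

```python
import numpy as np

# Once a float32 running total reaches 2**24 (= 16777216), the spacing
# between representable values is 2.0, so adding values in [0, 1) is
# rounded away entirely and the accumulator stops growing.
acc = np.float32(2**24)
for _ in range(1000):
    acc = acc + np.float32(0.5)   # 0.5 is below half an ulp: no effect
print(acc)                         # still 16777216.0
```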
@benoitsteiner, are you planning to improve the numerical accuracy of reduce_sum, or should we mark it contributions welcome? The naive implementation is certainly not "incorrect" or a "bug", but it would be better for it to use multiple accumulators, order partial sums by magnitude, or the like.
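As a sketch of one such remedy (compensated summation; this is not what TensorFlow's kernel does):

```python
import numpy as np

# Kahan compensated summation: a second variable carries the low-order
# bits that each float32 addition would otherwise discard.
def kahan_sum_f32(values):
    total = np.float32(0.0)
    comp = np.float32(0.0)        # running compensation term
    for v in values:
        y = np.float32(v) - comp
        t = total + y
        comp = (t - total) - y    # what the addition just lost
        total = t
    return total

rng = np.random.RandomState(0)
data = rng.rand(10**5).astype('float32')
exact = np.sum(data, dtype=np.float64)
print(abs(kahan_sum_f32(data) - exact))  # within a few float32 ulps of exact
```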
Duplicate of #2625. |
Met the same problem. Any solution?
I've been trying to normalize some vectors by dividing by their sum, but have noticed that this doesn't always result in vectors that sum to one.
For example:
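(The original snippet did not survive in this copy of the thread. A minimal sketch of the pattern being described, written in plain NumPy float32 with hypothetical names; the report ran the TensorFlow equivalent of the division:)

```python
import numpy as np

# Hypothetical reconstruction of the kind of snippet meant here: normalize
# a float32 vector by its sum, then check whether the result sums to 1.
rng = np.random.RandomState(0)
v = rng.rand(1000).astype('float32')
w = v / v.sum()
print(w.sum())  # can differ from exactly 1.0 by a few float32 ulps
```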
While I realize that this is possible with finite-precision floating-point operations, I find it odd how readily it occurs, especially when the same operation in NumPy doesn't encounter this issue. Is this a known issue? I've also noticed that tf.nn.softmax doesn't appear to suffer from this problem; is there a different op I need to use to ensure proper normalization?