New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BF: Make buildbot Pyhon26-32 happy #686

Merged
merged 1 commit into from Jul 29, 2015

Conversation

Projects
None yet
2 participants
@MarcCote
Contributor

MarcCote commented Jul 20, 2015

@@ -174,7 +174,7 @@ def dist(self, v1, v2):
if metric.are_compatible(f1.shape, f2.shape):
distance = metric.dist(f1, f2)
if np.all(f1 == f2):
assert_equal(distance, 0.)
assert_almost_equal(distance, 0.)

This comment has been minimized.

@matthew-brett

matthew-brett Jul 20, 2015

Member

Is it obvious why the distance would not be exactly 0 if the vectors are exactly equal on win32?

This comment has been minimized.

@MarcCote

MarcCote Jul 21, 2015

Contributor

There is no check verifying if the two vectors are equal in the distance function. My guess is that there is a loss of precision at lines 141-143, cos_theta must be less than 1.

v1_normed = v1.astype(np.float64) / norm(v1.astype(np.float64))
v2_normed = v2.astype(np.float64) / norm(v2.astype(np.float64))
cos_theta = np.dot(v1_normed, v2_normed.T)

Surely, I could add an equality test but I am afraid that adding an if in this critical part would slow down the clustering (I didn't do any benchmarks though).

This comment has been minimized.

@matthew-brett

matthew-brett Jul 21, 2015

Member

Thanks for thinking about this. Yes, I guess it must be slight differences in the precision of dot, maybe the order or operation or something. It's a bit surprising to me that a vector dotted with itself in float64 precision would differ by 1e-9 on two platforms, but oh well.

No need to add the special case check in the distance code, I was just curious.

This comment has been minimized.

@MarcCote

MarcCote Jul 21, 2015

Contributor

I agree with you, I thought casting them in float64 would have avoided this precision loss.
Wait! I'm stupid :P the error is probably not with the Python version but rather in the Cython one!

In the following, both features1 and features2 are C float whereas cos_theta is a C double. If I'm not mistaken, the multiplication would be performed in 32 bits instead of 64, right?

cos_theta += features1[0, d] * features2[0, d]

This comment has been minimized.

@matthew-brett

matthew-brett Jul 21, 2015

Member

I did a debug build using try_branch.py : http://nipy.bic.berkeley.edu/builders/dipy-py2.6-32/builds/553/steps/shell_6/logs/stdio

The error comes out of the Cython code, yes. I can't reproduce it with the same code in numpy, I think the same precision. Still not sure where it comes from.

@matthew-brett

This comment has been minimized.

Member

matthew-brett commented Jul 29, 2015

I took a look at this, for a long time, and:

  • I was wrong, the error is for the Python code, not the Cython code;
  • The problem is caused by slightly weird inaccuracy of np.dot on the 32-bit machine.
# check_features.py
import hashlib
import numpy as np

fvec = np.array([[10, 1.5021631121635437e-01, 1.2246063538223773e-16]],
                dtype=np.float32)

dvec = fvec.astype(np.float64)
dvec_n = dvec / np.linalg.norm(dvec)
print(hashlib.sha1(dvec_n.tostring()).hexdigest())
print(repr(np.sum(dvec_n ** 2)))
print(repr(np.dot(dvec_n, dvec_n.T)))

On a 64-bit OSX machine:

In [5]: run check_features.py
335df17574da17f252f7df12998d34b94b072317
1.0
array([[ 1.]])

On the 32-bit machine giving the test failure:

In [114]: run check_features.py
335df17574da17f252f7df12998d34b94b072317
1.0
array([[ 0.99999999999999988898]])

So, this is a 32-bit dot implementation problem, giving fairly small error in a checking function, so I think it's OK to sweep it under the carpet with this PR.

matthew-brett added a commit that referenced this pull request Jul 29, 2015

Merge pull request #686 from MarcCote/fix_cosinemetric_unitest
MRG: Make buildbot Python26-32 happy

`dipy/segment/metricspeed.pyx` calculates the angle between two vectors in a C-ish loop.  The test we use to check the results uses `np.dot` to do the same calculation.   On 32-bit machines, there appear to be small floating point differences in the result of `np.dot` compared to the simplest possible element by element product followed by sum, giving small errors when calculating the angle of a particular vector with itself.  Silence these by allowing small differences between the Cython result and the result of the `np.dot` calculation.

@matthew-brett matthew-brett merged commit 24b7ccc into nipy:master Jul 29, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@MarcCote

This comment has been minimized.

Contributor

MarcCote commented Jul 29, 2015

Thanks for having investigated this @matthew-brett.

@MarcCote MarcCote deleted the MarcCote:fix_cosinemetric_unitest branch Jul 29, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment