Cythonize DirectionGetter and whatnot #1342
BTW, I think the right merge order should be: 1) this PR, 2) PFT (merge master, fix conflicts), 3) parallel local/PFT tracking.
Thank you for this PR. 2X is super!! See my comments, mostly PEP8 comments. It makes sense to recode those small numpy functions here, as they are called at every tracking step (sometimes more than once per step for PFT).
- I don't grasp the problem with def name() versus cdef name_c(). Can we modify the tests here?
- cdf = self._adj_matrix[tuple(*direction)]. Those 3 floats could be changed to a single int. In fact, the direction is always one of the vertices of the sphere. To do that, we would need to keep track of the index of the vertex on the sphere (and the direction), instead of solely the direction. This is fine for all pmf-based models, as they all use a discrete sphere. However, this is not the case for the directions of EuDX (PeaksAndMetricsDirectionGetter). This might need a bit of work to be done nicely. I suggest keeping this for another PR.
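A minimal pure-Python sketch of the index-based lookup suggested above. The six-vertex "sphere", the helper build_adjacency and the variable names are assumptions made for illustration; they are not DIPY APIs. The adjacency rows are built once from dot products, then looked up either by a tuple of three floats or by the vertex index.

```python
import math

# Toy "sphere": six unit vectors along the axes (an assumption for
# illustration; real spheres have many more vertices).
VERTICES = [
    (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
    (0.0, 0.0, 1.0), (0.0, 0.0, -1.0),
]


def build_adjacency(vertices, max_angle_deg):
    """Return one row per vertex; row i flags vertices within max_angle of i.

    The absolute value of the dot product is used, so antipodal vertices
    count as neighbours, like abs(matrix) >= cos_similarity in the diff.
    """
    cos_similarity = math.cos(math.radians(max_angle_deg))
    rows = []
    for u in vertices:
        row = [int(abs(sum(a * b for a, b in zip(u, v))) >= cos_similarity)
               for v in vertices]
        rows.append(row)
    return rows


adjacency_by_index = build_adjacency(VERTICES, max_angle_deg=30.0)

# Lookup keyed by the 3 floats of the direction, as in the current code.
adjacency_by_direction = {
    v: adjacency_by_index[i] for i, v in enumerate(VERTICES)
}

direction = VERTICES[2]
index = 2

# Both lookups return the same row; the integer lookup avoids building and
# hashing a tuple of floats at every tracking step.
row_from_floats = adjacency_by_direction[tuple(direction)]
row_from_index = adjacency_by_index[index]
```

The cost of this design is that the tracker has to carry the vertex index along with the direction, which is why it only fits direction getters that work on a discrete sphere.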
@@ -202,14 +209,15 @@ def _set_adjacency_matrix(self, sphere, cos_similarity):
         each value is a boolean array indicating which directions are less than
         max_angle degrees from the key"""
         matrix = np.dot(sphere.vertices, sphere.vertices.T)
-        matrix = abs(matrix) >= cos_similarity
+        matrix = (abs(matrix) >= cos_similarity).astype('uint8')
Why uint8 here?
Because Cython doesn't recognize bool. NumPy does, so the line was "ok", but there are warnings when we try to use it later in Cython.
ok, thx.
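For illustration, a short NumPy sketch of the conversion discussed above. The three-vertex array and the 30 degree threshold are made-up values; only the abs(...) >= cos_similarity comparison and the .astype('uint8') call come from the diff.

```python
import numpy as np

# Three orthogonal unit vectors as a stand-in for sphere.vertices.
vertices = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
cos_similarity = np.cos(np.deg2rad(30.0))

# The comparison yields a boolean array.
bool_matrix = abs(np.dot(vertices, vertices.T)) >= cos_similarity

# Same truth values stored as 0/1 in an unsigned 8-bit integer array, the
# dtype the PR switches to so the array can be used from Cython.
uint8_matrix = bool_matrix.astype('uint8')

print(bool_matrix.dtype, uint8_matrix.dtype)
```

Both arrays use one byte per element, so the conversion changes the dtype seen by Cython without increasing memory use.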
-cpdef trilinear_interpolate4d(double[:, :, :, :] data, double[:] point,
-                              np.ndarray out=*)
+cpdef trilinear_interpolate4d(
+        double[:, :, :, :] data,
I think the function parameters here should be aligned with the opening (.
I don't think PEP8 says anything on this subject. I can revert the [style] change, but I personally find the aligned version harder to read because of the random number of variables on each line.
The same goes for all other alignment comments.
-cdef int _trilinear_interpolate_c_4d(double[:, :, :, :] data, double[:] point,
-                                     double[::1] result) nogil:
+cdef int trilinear_interpolate4d_c(
+        double[:, :, :, :] data,
alignment here?
     for i in range(3):
         b[i] = a[i]

+def local_tracker(
+        DirectionGetter dg, TissueClassifier tc,
alignment here?
         np.ndarray[np.float_t, ndim=2, mode='c'] streamline,
         double stepsize,
         int fixedstep):
+cdef int _local_tracker(
alignment here?
         double point[3], dir[3], vs[3], voxdir[3]
         double[::1] pview = point, dview = dir
-        void (*step)(double*, double*, double) nogil
+        void (*step)(double * , double*, double) nogil
The original line was OK, no?
Err, yes, sorry, I accidentally added a space :) Will remove.
         cdef:
             double result
             int err

-        err = _trilinear_interpolate_c_4d(self.metric_map[..., None], point,
-                                          self.interp_out_view)
+        err = trilinear_interpolate4d_c(
alignment here?
-        include_err = _trilinear_interpolate_c_4d(self.include_map[..., None],
-                                                  point, self.interp_out_view)
+        include_err = trilinear_interpolate4d_c(
+            self.include_map[..., None],
alignment here?
Thank you for looking at this @gabknight My explanation of
The tests define their own
I think I need to put more emphasis on this. The
OK, thx @nilgoyette. I get a better picture now. I think we will always want to override those methods in Cython anyway. I notice that line 94 in /reconst/peak_direction_getter.pyx should read: I could do a PR on your branch to modify the failing tests and add the missing test for
@gabknight, if you want and have the time to do it, sure! Thank you.
Thanks for this PR @nilgoyette!
I just made some cosmetic comments on your PR.
I removed some checks and I would remove more! ...
totally agree
I removed many calls to numpy and coded the functions myself.
I think for this case, we should put these functions in dipy.utils and document them like we did in dipy.utils.six. It can be quite confusing if we leave them there. Moreover, if for any reason someone wants to use these functions somewhere else, it's quite ugly to call them from this package. What do you think? What is the right way to do that, @arokem? Any rules?
I often have a def name() function that simply calls a cdef name_c() function, which is a fast version without check. This can cause some problems....
I need to think more about it
Using np.random() or random(), .....
I think you should implement your own if you think that it will really improve your performance. Like you, I heard too many bad things about the old C implementation so +1 to avoid it.
@@ -137,7 +137,7 @@ def peak_directions(odf, sphere, relative_peak_threshold=.5,
     elif n == 1:
         return sphere.vertices[indices], values, indices

-    odf_min = odf.min()
+    odf_min = np.min(odf)
I was curious about the difference here, so I made the small test below, and it seems that odf.min() is faster than np.min(odf) in general (NumPy 1.13.1, Python 3.6). I do not know if this is true for other versions of NumPy or Python. Any idea?
import numpy as np
a = np.random.rand(3000,160000)
%timeit a.min()
172 ms ± 2.73 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.min(a)
177 ms ± 14.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Haha, thank you for the test :) My goal, for once, wasn't to be faster. I changed it because this method is called from Cython and odf is not a np.array anymore, it's a memoryview. A memoryview doesn't have the NumPy methods (well, it has some, but not many).
tl;dr: np.func(data) accepts np.array AND memoryview. data.func() doesn't.
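A small sketch of this point. It uses Python's built-in memoryview as a stand-in for a Cython typed memoryview, which is an assumption: the two are different types, but neither has a min() method. The values in odf are made up.

```python
import numpy as np

odf = np.array([0.2, 0.7, 0.1, 0.5])
view = memoryview(odf)  # built-in memoryview, stand-in for a Cython one

# The function form converts its argument through the buffer protocol,
# so it accepts both the array and the memoryview.
min_from_array = np.min(odf)
min_from_view = np.min(view)

# The method form only exists on the ndarray.
array_has_min_method = hasattr(odf, "min")
view_has_min_method = hasattr(view, "min")
```

Calling view.min() here would raise AttributeError, which is the failure the np.min(odf) change avoids.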
Oh ok ! Good to know, Thanks !
dipy/direction/pmf.pyx
Outdated
        trilinear_interpolate4d_c(self.shcoeff, point, self.coeff)
        for i in range(len_pmf):
            _sum = 0
            for j in range(self.B.shape[1]):
As you did above, you should avoid looking up shape on each iteration, so I think it would be better for performance to store self.B.shape[1] in a size_t variable.
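A plain-Python sketch of the suggestion, with the loop bound looked up once before the loops. The snippet under review shows only the loop headers, so the loop body here (a matrix-vector product) and the function name matvec are assumptions for illustration.

```python
def matvec(B, coeff):
    """Multiply matrix B (a list of rows) by the vector coeff."""
    n_rows = len(B)
    n_cols = len(B[0])  # looked up once, outside both loops
    pmf = [0.0] * n_rows
    for i in range(n_rows):
        _sum = 0.0
        for j in range(n_cols):
            _sum += B[i][j] * coeff[j]
        pmf[i] = _sum
    return pmf


B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
coeff = [0.5, 0.25]
pmf = matvec(B, coeff)
```

In Cython the same pattern would store the bound in a typed local variable, so the inner loop compiles to a plain C loop without attribute lookups.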
raise ValueError("%s is not a known basis type." % basis_type)
self._B, m, n = basis(sh_order, sphere.theta, sphere.phi)
import numpy as np
cimport numpy as np
It can be confusing. Can you use cnp as an alias?
Yes, I can, but I'll wait for another request because there's no name collision and imo it's not confusing. The cimport version is only used for variable types, so in cdef and argument definitions. For example, np.float_t clearly uses the cimport, while np.dot doesn't.
As I said, I'll change it to cnp if it confuses the team!
Yes, sure! I put them just above my code and, of course, I shouldn't do that :) I think I even duplicated them 2 times in different files. Having a fast_numpy pyx library file is indeed a good idea.
Well, no, I don't actually want to code my own random! My question was more: do we have actual proof that
Fixed tracking tests and PeaksAndMetricsDirectionGetter
Codecov Report
@@ Coverage Diff @@
## master #1342 +/- ##
==========================================
- Coverage 87.01% 86.97% -0.04%
==========================================
Files 228 227 -1
Lines 29086 28977 -109
Branches 3131 3119 -12
==========================================
- Hits 25309 25204 -105
+ Misses 3068 3067 -1
+ Partials 709 706 -3
@nilgoyette I added an issue for the dictionary keys: #1372.
This looks good to me. I will update PR #1340 once merged.
I just wonder if we should do it now or in a new PR. What do you think @gabknight @nilgoyette? Otherwise, LGTM, you can go ahead @gabknight.
@skoudoro the move is done. You can merge, if you think everything is done.
Thanks @nilgoyette @gabknight!
Cythonize DirectionGetter and whatnot
Disclaimer: there are 2 failing tests.
I discovered after much testing that calling def Python code from Cython is slow. Much slower than one might think! I tested on LocalTracking and PFT tracking, though PFT doesn't appear in this PR. In both cases, the speedup is around 2x, less for LocalTracking, more for PFT. One could do better by removing checks and cythonizing more, but I concentrated on the important parts: DirectionGetter::get_direction, PmfGen::get_pmf, and calling C code as much as possible.
What you might not like:
- I often have a def name() function that simply calls a cdef name_c() function, which is a fast version without checks. This can cause some problems, and it is indeed causing a problem :) 2 LocalTracking tests are failing because of this. My Cython code calls the C version directly and the tests override the Python version, so SimpleTissueClassifier and SimpleDirectionGetter are ignored. Of course, I could call the def or cpdef version, but there's a cost to this! To discuss.
What I don't like:
- cdf = self._adj_matrix[tuple(*direction)] This will always be slow and (!) it's using 3 floats as a dictionary key :( I don't know the code well enough to fix it, but it would be much better/faster if it was a simple integer. Maybe a struct: int index, double[3] direction?
- Using np.random() or random(), not because it's random but because it's the only remaining slow call! I was tempted to use the old C rand(), but I've heard so many bad things about it that I didn't dare. Do you think it's worth it?
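The override problem behind the 2 failing tests can be reproduced in plain Python. This is only a sketch: get_direction_c and the two track_* helpers are hypothetical names, and the real fast path is a cdef method rather than a regular Python method.

```python
class DirectionGetter:
    def get_direction(self, point):
        """Public entry point: checks its input, then delegates."""
        if len(point) != 3:
            raise ValueError("point must have 3 coordinates")
        return self.get_direction_c(point)

    def get_direction_c(self, point):
        """Fast path without checks (a cdef method in the Cython code)."""
        return "base"


class SimpleDirectionGetter(DirectionGetter):
    # Test double that overrides only the public, Python-level method.
    def get_direction(self, point):
        return "override"


def track_via_public(dg, point):
    return dg.get_direction(point)


def track_via_fast_path(dg, point):
    # Mirrors tracking code that calls the _c version directly.
    return dg.get_direction_c(point)


dg = SimpleDirectionGetter()
public_result = track_via_public(dg, (0.0, 0.0, 1.0))
fast_result = track_via_fast_path(dg, (0.0, 0.0, 1.0))
```

Here public_result comes from the override while fast_result comes from the base class, so a test double that only overrides the public method is silently bypassed. The options discussed in the thread are to call the def or cpdef version and pay for the Python call, or to change the tests so they override the fast path.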