-
Notifications
You must be signed in to change notification settings - Fork 429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RecoBundles - recognition of bundles #1443
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Travis complain but I can not reproduce this error on windows... all tests pass
|
||
Parameters | ||
----------- | ||
static : array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name static
does not match with input parameter stat
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
----------- | ||
static : array | ||
Static streamlines | ||
moving : array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name moving
does not match with input parameter mov
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done!
dipy/align/bundlemin.pyx
Outdated
Notes | ||
----- | ||
The difference with ``_bundle_minimum_distance`` is that we sum the | ||
minimum values only for the static. Therefore, this is assymetric. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: asymmetric
Codecov Report
@@ Coverage Diff @@
## master #1443 +/- ##
==========================================
+ Coverage 87.49% 87.52% +0.03%
==========================================
Files 241 244 +3
Lines 30789 31189 +400
Branches 3322 3426 +104
==========================================
+ Hits 26939 27299 +360
- Misses 3074 3079 +5
- Partials 776 811 +35
Continue to review full report at Codecov.
|
…how we set set_printoptions back to default
@kesshijordan it would be great if you can have a look at this PR. Let me know what you think. @MarcCote please have a look at the qbx_merge function. |
Thanks, @Garyfallidis ! I've been looking forward to this. I'm working on pyAFQ (@arokem) and think this approach would be a great way to clean up the results. I made some atlases (calculated centroids from quickbundles and streamline registered a series of manually segmented healthy control tracks) that would probably work well for this. |
Great strategy! Cool @kesshijordan, give it a go and let me know if you have questions. Happy to help :) |
@kesshijordan and @BramshQamar please provide your review. It would be great to have this merged for this release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adde minor suggestions to qbx_merge
. Otherwise, looks good to me.
dipy/segment/clustering.py
Outdated
len_s = len(streamlines) | ||
if select_randomly is None: | ||
select_randomly = len_s | ||
indices = np.random.choice(len_s, min(select_randomly, len_s), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer using a dedicated random generator instead of relying on np.random
. It gives you more control for reproducibility.
I would add a new optional argument rng=None
to the function and do
indices = np.arange(len(streamlines))
if select_randomly:
rng = np.random.RandomState(None) if rng is None else rng
rng.shuffle(indices)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note. I personally almost always replace the pattern x = y if x is None else x
by x = x or y
rng = rng or np.random.RandomState(None)
I know that this could be a trap because False is not the same as None but some programmers prefer the or
trick.
dipy/segment/clustering.py
Outdated
|
||
final_level = len(thresholds) | ||
|
||
qbx_ordering_final = np.random.choice( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could reuse the rng
from above here instead of relying on the global RandomState.
dipy/segment/clustering.py
Outdated
|
||
print(' Duration %0.3f sec. \n' % (time() - t1, )) | ||
|
||
return merged_cluster_map |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing a newline.
@Garyfallidis I think it's worth having a basic tutorial with a quick demo/guide for parameter tuning in the same release as Recobundles. I would imagine that when it's released people will give it a quick try to see if it works on their data as well as the examples in the publication. You want them to know which knobs to turn when they do; the tuning isn't necessarily intuitive (it made a huge difference in how well it worked on my data when I got feedback from you on tuning). It would help make a good first impression. |
dipy/align/streamlinear.py
Outdated
|
||
# In essence whole_brain_slr can be thought as a combination of | ||
# SLF and QuickBundles | ||
whole_brain_slr = slr_with_qb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whole_brain_slr and slr_with_qb basically have same implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the same function can be used for brain tractograms and other tractograms. In the paper I called it whole_brain_slr
but then realized that the function can be more generic. Adding some more documentation to explain.
dipy/segment/clustering.py
Outdated
len_s = len(streamlines) | ||
if select_randomly is None: | ||
select_randomly = len_s | ||
indices = np.random.choice(len_s, min(select_randomly, len_s), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note. I personally almost always replace the pattern x = y if x is None else x
by x = x or y
rng = rng or np.random.RandomState(None)
I know that this could be a trap because False is not the same as None but some programmers prefer the or
trick.
dipy/align/bundlemin.pyx
Outdated
double inf = np.finfo('f8').max | ||
double dist=0 | ||
double * min_j | ||
# double * min_i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clean all the commented code in this function?
dipy/segment/tests/test_rb.py
Outdated
|
||
|
||
def test_rb_no_neighb(): | ||
# what if neighbors are found? No recognition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a typo? What if NO neighbors are found?
dipy/align/streamlinear.py
Outdated
|
||
MAX_DIST = 1e10 | ||
LOG_MAX_DIST = np.log(MAX_DIST) | ||
|
||
DEFAULT_BOUNDS = [(-45, 45), (-45, 45), (-45, 45), | ||
(-30, 30), (-30, 30), (-30, 30), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure it's 45, 30, and not 30, 45? I ask because I have seen another version with translation (-30, 30)
and rotation (-45, 45)
and I believe it makes more sense that way.
EDIT: Oh, I see that you have another set of bounds below :) Can you explain why it's different from this set that you jsut defined?
dipy/segment/bundles.py
Outdated
clustering. | ||
""" | ||
|
||
self.reduction_thr = reduction_thr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been working with this codebase (a really similar one from François Rheault) in the last weeks and please, for the love of gods, don't do that! Stop putting mutable stuff in self like a madman. It's so much harder to debug! I would know, I did it! I took me days to debug instead of hours.
I'm not talking only about reduction_thr
. It wouldn't be a real problem if it was only for this variable. The problem happen because you have reduction_thr
, model_bundle
, pruned_streamlines
, nb_pruned_streamlines
, transf_streamlines
, transf_matrix
, labels
, model_clust_thr
, model_cluster_map
, model_centroids
, nb_model_centroids
, centroid_matrix
, neighb_streamlines
, neighb_clusters
, neighb_centroids
, neighb_indices
, nb_neighb_streamlines
, slr_bmd
, slr_iterations
, slr_initial_matrix
, slr_final_matrix
, slr_xopt
, rtransf_cluster_map
, rtransf_centroids
, nb_rtransf_centroids
, pruning_matrix
, pruned_indices
and labeled_streamlines
that are not in self at creation and are modified at random places and times. I may or may not have missed some mutable variables, so there may be more.
Some of them are only used in one function. Some of them are used in many functions but it could be easily be avoided. Some of them are simply the len of another. Maybe some of them are never used, I didn't check.
I will not write an essay here about why a programmer should never do that, I'll simply say that it's super hard to follow and to debug this kind of code. By the way, sorry for the "tone" of this review, I swear I did it for the good of DiPy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What an old-fashioned / C++ way of thinking :P ... just kidding! I totally agree with you that would definitively clarify the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the alternative that you are proposing @nilgoyette ? Also let's all try to be polite to each other. The tone was not necessary. Please be specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing fancy: return value(s) from functions, add arguments, try to have 0 mutable member.
As a generul rule, try to only have immutable objects. self
should never be modified after construction. I know it's not Python way of thinking and it's maybe "too much", so I wouldn't care if 1 or 2 members are modified but you took it to another level :)
dipy/align/bundlemin.pyx
Outdated
openmp.omp_init_lock(&lock) | ||
|
||
min_j = <double *> malloc(static_size * sizeof(double)) | ||
# min_i = <double *> malloc(moving_size * sizeof(double)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you only keep the min of the first loop, you don't need an array. A simple min
and sum
var will do.
dipy/align/bundlemin.pyx
Outdated
Notes | ||
----- | ||
The difference with ``_bundle_minimum_distance`` is that we sum the | ||
minimum values only for the static. Therefore, this is asymetric. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please explain more what are the consequences, if any? This function is a little less heavy than _bundle_minimum_distance
, ok, but what are we losing? When can we use it?
dipy/segment/bundles.py
Outdated
# return recognized bundle in original streamlines, labels of | ||
# recognized bundle and transformed recognized bundle | ||
return self.streamlines[self.labels], self.labels, \ | ||
self.pruned_streamlines |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be good to have detailed documentation about difference between self.streamlines[self.labels]
and self.pruned_streamlines
returned by recognize function.
self.cluster_map.refdata = self.streamlines | ||
self.centroids = self.cluster_map.centroids | ||
self.nb_centroids = len(self.centroids) | ||
self.indices = [cluster.indices for cluster in self.cluster_map] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the update, it's incredibly better. You can go further by removing self.clust_thr
and self.indices
(never used), and self.nb_centroids
(only used in a print
, so simply use len
). I think I won't complain after that :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am removing self.clust_thr but the others are useful to stay there. I have reduced number of attributes incredibly there is no need to move directly and made _cluster_streamlines a "private" function. So, you should be very happy. Some utility attributes are useful.
dipy/segment/bundles.py
Outdated
|
||
def _cluster_streamlines(self, clust_thr=15, nb_pts=20): | ||
|
||
np.random.seed(42) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a parameter? It would be nice if the user could simply choose a seed to have reproducable results. It's probably more complex than this simple line though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, needs a proper RandomState there. Fixing ...
centroid_matrix[centroid_matrix > reduction_thr] = np.inf | ||
|
||
mins = np.min(centroid_matrix, axis=0) | ||
close_clusters_indices = list(np.where(mins != np.inf)[0]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this strategy. Why not forget the infinity and simply use your reduction_thr
directly in the np.where
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple ways to go to Rome.
|
||
# TODO this can be speeded up by using directly the centroids | ||
static = select_random_set_of_streamlines(model_bundle, | ||
select_model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can fit in 80 characters.
|
||
mins = np.min(pruning_matrix, axis=0) | ||
pruned_indices = [rtransf_cluster_map[i].indices | ||
for i in np.where(mins != np.inf)[0]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem here?
dipy/segment/bundles.py
Outdated
pruned_streamlines = [transf_streamlines[i] | ||
for i in pruned_indices] | ||
|
||
pruned_indices = pruned_indices |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That might be useless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good thanks.
@kesshijordan I will have to make a tutorial on another PR. Here is why. We are currently cleaning and preparing an atlas to be used. And it takes time. I want you to look at this atlas when you visit us next week. Ideally I want to make sure that we have a tutorial that can run with multiple input atlases. But we can start with one first. |
Thank you all @kesshijordan @MarcCote @BramshQamar @nilgoyette and @MarcCote for your reviews. There were all very useful. All your comments have been address. Apologies also for the initial coding style with the many unnecessary attributes. But all this is fixed now. We are going to merge this PR today so we can move with the release in preparation for the big DIPY workshop next week. During https://brainhack.sice.indiana.edu. More updates and tutorials and atlases will be available in other PRs. |
RecoBundles - recognition of bundles
This PR implements the super 🔥🔥🔥🔥 paper:
Garyfallidis et al. Recognition of white matter bundles using local and global streamline-based registration and clustering NeuroImage, 2017
https://www.sciencedirect.com/science/article/pii/S1053811917305839
This PR is ready for reviews.
I would like that first we merge #1379 so I can move from using directly the ArraySequence to using the Streamlines class. #1379 only needs a small coverage boost.
Some comments on the methods of this PR.
RecoBundles extracts bundles from a given tractogram using model bundles from another tractogram.
Therefore it is important to make sure that the two tractograms are registered first. So, what we usually do is apply the remarkably robust streamline registration first (see function whole_brain_slr) and then start RecoBundles. This strategy can be very useful with clinical datasets where image-based registration is uncertain. Of course you are happy to use other methods for registration too.
https://www.ncbi.nlm.nih.gov/pubmed/25987367
After this PR is merged will be working on providing an easy to use workflow and with access to an atlas of many bundles. And build the tutorial. Nonetheless, be happy to start using this and do please cite us gracefully.
Remark; If the two tractograms are already registered well with nonrigid registration make sure you disable the local_slr parameter as it will be probably an overkill.
Oh and Welcome to the Matrix! :) This PR brings infinite possibilities in the game. More cool additions and enhancements coming soon.
@frheault after this is merged make a PR for the centroid-based metric directly to the DIPY master. I am talking about your previous work here https://github.com/Garyfallidis/dipy/pull/50/files