QuickBundlesX #1434

Merged: 36 commits, Mar 3, 2018

Contributor

@Garyfallidis Garyfallidis commented Feb 16, 2018

This PR is replacing #1380
This PR includes the fastest algorithm ever for clustering tractograms!!! 🔥

Garyfallidis E. et al. QuickBundlesX: Sequential clustering of millions of streamlines in multiple levels of detail at record execution time. Proceedings of the International Society for Magnetic Resonance in Medicine (ISMRM). Singapore, 4187, 2016.

Contributor

@MarcCote MarcCote left a comment

In theory, QuickBundles(threshold) and QuickBundlesX(thresholds=[threshold]) should produce the same kind of results. Maybe we could refactor QuickBundles so that it uses the same Cython algorithm/code as QuickBundlesX (possibly in a future PR). In any case, we should add some simple tests that check that the results of QuickBundles(threshold) and QuickBundlesX(thresholds=[threshold]) are similar.
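
A minimal sketch of the kind of equivalence test meant here. To stay self-contained it does not call DIPY: `quickbundles_toy` and `quickbundlesx_toy` are hypothetical pure-Python stand-ins for sequential threshold clustering on 3D points, and the strict `<` comparison against the threshold is an assumption. A real test would make the same comparison between QuickBundles(threshold) and QuickBundlesX(thresholds=[threshold]).

```python
import math


def _centroid(cluster):
    """Mean of the points assigned to a cluster so far."""
    n = len(cluster["indices"])
    return [s / n for s in cluster["sum"]]


def _assign(clusters, idx, point, threshold):
    """Add a point to the nearest cluster, or open a new one (toy rule)."""
    best, best_dist = None, math.inf
    for cluster in clusters:
        centroid = _centroid(cluster)
        dist = math.sqrt(sum((c - x) ** 2 for c, x in zip(centroid, point)))
        if dist < best_dist:
            best, best_dist = cluster, dist
    if best is None or best_dist >= threshold:
        best = {"indices": [], "sum": [0.0] * len(point), "children": []}
        clusters.append(best)
    best["indices"].append(idx)
    best["sum"] = [s + x for s, x in zip(best["sum"], point)]
    return best


def quickbundles_toy(points, threshold):
    """Hypothetical single-level clustering: one flat list of clusters."""
    clusters = []
    for idx, point in enumerate(points):
        _assign(clusters, idx, point, threshold)
    return [c["indices"] for c in clusters]


def quickbundlesx_toy(points, thresholds):
    """Hypothetical multi-level clustering; returns the coarsest level."""
    top = []
    for idx, point in enumerate(points):
        siblings = top
        for threshold in thresholds:
            # Descend one level per threshold, clustering among siblings.
            siblings = _assign(siblings, idx, point, threshold)["children"]
    return [c["indices"] for c in top]


points = [(0, 0, 0), (0.5, 0, 0), (10, 0, 0), (10.2, 0, 0), (0.2, 0.1, 0)]
flat = quickbundles_toy(points, threshold=2)
tree_top = quickbundlesx_toy(points, thresholds=[2])
assert flat == tree_top
```

With a single threshold the tree has one level, so both toy functions apply the same assignment rule in the same order and must agree.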

thresholds=self.thresholds,
ordering=ordering)

#cluster_map.refdata = streamlines
Contributor

Remove comment.

DEF BIGGEST_FLOAT = 3.4028235e+38 # np.finfo('f4').max
DEF SMALLEST_FLOAT = -3.4028235e+38 # np.finfo('f4').min

THRESHOLD_MULTIPLIER = 2.
Contributor

Unused.
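
As a side note on the two DEF constants quoted above: they are meant to be the float32 limits that `np.finfo('f4')` reports. The sketch below checks this with the standard library only; `to_float32` is a hypothetical helper, and the closed form for the largest finite float32 comes from the IEEE 754 single-precision format.

```python
import struct

BIGGEST_FLOAT = 3.4028235e+38
SMALLEST_FLOAT = -3.4028235e+38

# Largest finite IEEE 754 single-precision value: (2 - 2**-23) * 2**127.
FLOAT32_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127


def to_float32(value):
    """Round a Python float to the nearest float32 and widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# The decimal literals are shortened forms that round to the exact limits.
assert to_float32(BIGGEST_FLOAT) == FLOAT32_MAX
assert to_float32(SMALLEST_FLOAT) == -FLOAT32_MAX
```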

@@ -99,7 +402,7 @@ cdef class ClustersCentroid(Clusters):
raise ValueError("'centroid_shape' must be a tuple or a int.")

self._centroid_shape = tuple2shape(centroid_shape)

# self.aabb
Contributor

Remove this line.

@@ -175,6 +478,10 @@ cdef class ClustersCentroid(Clusters):
converged &= fabs(centroid[n, d] - updated_centroid[n, d]) < self.eps
centroid[n, d] = updated_centroid[n, d]

#cdef float * aabb = &self.centroids[id_cluster].aabb[0]
Contributor

Remove this line.

self.metric = metric
self.features_shape = tuple2shape(features_shape)
self.threshold = threshold
self.max_nb_clusters = max_nb_clusters
self.clusters = ClustersCentroid(features_shape)
self.features = np.empty(features_shape, dtype=DTYPE)
self.features_flip = np.empty(features_shape, dtype=DTYPE)
self.bvh = bvh
Contributor

I could not find a QuickBundles instance with bvh=1 anywhere in the code. Maybe we should remove this functionality for now? Otherwise, we could do as for QuickBundlesX and always perform AABB checks?
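
For readers unfamiliar with the term, an AABB check is a cheap axis-aligned bounding box test run before the expensive distance computation. The sketch below shows the general idea in plain Python; `aabb` and `aabb_overlap` are hypothetical helpers, not the functions in this PR, and padding one box by the clustering threshold is an assumption about how such a test would be used.

```python
def aabb(points, margin=0.0):
    """Axis-aligned bounding box of 3D points, optionally padded by a margin."""
    lower = tuple(min(p[d] for p in points) - margin for d in range(3))
    upper = tuple(max(p[d] for p in points) + margin for d in range(3))
    return lower, upper


def aabb_overlap(box_a, box_b):
    """True if the two boxes intersect along every axis."""
    (lower_a, upper_a), (lower_b, upper_b) = box_a, box_b
    return all(
        lower_a[d] <= upper_b[d] and lower_b[d] <= upper_a[d] for d in range(3)
    )


streamline = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
centroid = [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)]

# The boxes are 5 units apart along y: padding by a threshold of 2 leaves
# them disjoint, so the pair can be skipped; padding by 6 makes them overlap.
far = aabb_overlap(aabb(streamline), aabb(centroid, margin=2.0))
near = aabb_overlap(aabb(streamline), aabb(centroid, margin=6.0))
```

Only pairs whose boxes overlap would go on to the full streamline distance.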

return self._build_clustermap()


def evaluate_aabbb_checks():
Contributor

I couldn't find a place where this function is used.

#test_show_qbx()
#test_3D_segments()
test_3D_points()

Contributor

Newline missing?

thresholds = [4, 2, 1]
qbx_class = QuickBundlesX(thresholds)
qbx = qbx_class.cluster(points)
print(qbx)
Contributor

What do we expect? Add assertions.
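
One possible shape for such assertions, sketched without DIPY. `check_hierarchy` is a hypothetical helper and `example_levels` is hand-written illustrative data, not output from QuickBundlesX. The properties checked (every point assigned exactly once per level, finer levels never having fewer clusters, each fine cluster nested in a coarse one) are assumptions about what a tree-structured clustering should satisfy.

```python
def check_hierarchy(levels, n_points):
    """Assert basic invariants of a coarse-to-fine list of clusterings."""
    previous = None
    for clusters in levels:
        # Every point index must appear exactly once at this level.
        flat = sorted(i for cluster in clusters for i in cluster)
        assert flat == list(range(n_points)), "points missing or duplicated"
        if previous is not None:
            # A smaller threshold can split clusters but never merge them.
            assert len(clusters) >= len(previous), "finer level has fewer clusters"
            for cluster in clusters:
                assert any(set(cluster) <= set(parent) for parent in previous), \
                    "cluster is not nested in a coarser cluster"
        previous = clusters
    return True


# Illustrative clusterings of six points for thresholds [4, 2, 1].
example_levels = [
    [[0, 1, 2, 3], [4, 5]],
    [[0, 1], [2, 3], [4, 5]],
    [[0], [1], [2, 3], [4], [5]],
]
ok = check_hierarchy(example_levels, n_points=6)
```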

thresholds = [4, 2, 1]
qbx_class = QuickBundlesX(thresholds)
qbx = qbx_class.cluster(points)
print(qbx)
Contributor

What do we expect? Add assertions.

codecov-io commented Feb 18, 2018

Codecov Report

Merging #1434 into master will increase coverage by 0.02%.
The diff coverage is 91.54%.

@@            Coverage Diff             @@
##           master    #1434      +/-   ##
==========================================
+ Coverage   87.38%   87.41%   +0.02%     
==========================================
  Files         237      238       +1     
  Lines       30069    30282     +213     
  Branches     3232     3253      +21     
==========================================
+ Hits        26277    26472     +195     
- Misses       3043     3057      +14     
- Partials      749      753       +4
Impacted Files                             Coverage Δ
dipy/segment/tests/test_quickbundles.py    98.31% <100%> (+0.36%) ⬆️
dipy/segment/clustering.py                 93.51% <79.41%> (-6.49%) ⬇️
dipy/segment/tests/test_qbx.py             96.77% <96.77%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95a9cca...f81a4da. Read the comment docs.
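
The headline figures in this report can be reproduced from the raw counts. The sketch below assumes that Codecov truncates percentages to two decimals rather than rounding them; that is an inference from these particular numbers, not documented behaviour, and `truncated_percent` is a hypothetical helper. The 91.54% diff coverage is likewise only observed to be consistent with +195 hits over +213 lines.

```python
from fractions import Fraction


def truncated_percent(ratio):
    """Express a ratio as a percentage truncated to two decimals."""
    return int(ratio * 10000) / 100


# Raw counts copied from the coverage diff: hits / lines.
master = Fraction(26277, 30069)
merged = Fraction(26472, 30282)
diff = Fraction(195, 213)  # +195 hits over +213 lines

# Lines are the sum of hits, misses and partials on both sides.
assert 26277 + 3043 + 749 == 30069
assert 26472 + 3057 + 753 == 30282

master_pct = truncated_percent(master)            # 87.38
merged_pct = truncated_percent(merged)            # 87.41
change_pct = truncated_percent(merged - master)   # 0.02
diff_pct = truncated_percent(diff)                # 91.54
```

Rounding instead of truncating would give 87.39, 87.42, 0.03 and 91.55, which is why truncation is assumed here.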

@Garyfallidis
Contributor Author

Hey @MarcCote, I agree on the bvh. It needs some investigation. Any clues on the Travis error? Strangely, only one bot is failing: https://travis-ci.org/nipy/dipy/jobs/343108569

@skoudoro
Member

The last bot is failing because of the new version of Cython. Look at the issue #1435

@Garyfallidis
Contributor Author

@MarcCote there is a change in Cython that will require a large refactoring (RF) of the code. Cython stopped supporting memory views in structs.

@@ -104,3 +111,46 @@ cdef int same_shape(Shape shape1, Shape shape2) nogil:
same_shape &= shape1.dims[i] == shape2.dims[i]

return same_shape


cdef Data2D* create_memview(Py_ssize_t buffer_size, Py_ssize_t dims[MAX_NDIM]) nogil:
Contributor

Do we only want to support 2D memviews? If so, we could rename the function to make it clear that it only handles 2D arrays.

Member

OK, you are right, let's make it more generic.

Member

In the end, I chose the easy path: renaming. The second option needs more thought, and we need to move forward with the release.

"""
free(&(memview[0][0, 0]))
memview[0] = None # Necessary to decrease refcount
free(memview)
Contributor

Newline missing.

Member

Thanks

@Garyfallidis
Contributor Author

@MarcCote I addressed all your comments. If Travis agrees, please merge this PR so that the other PRs can start going green.

@skoudoro
Member

skoudoro commented Mar 3, 2018

Yeah! Looks good and all tests are happy. I think it is ready to merge, can you have a last look @MarcCote? Thanks to you and @Garyfallidis!

Contributor

@MarcCote MarcCote left a comment

After that, LGTM.

def test_circle_parallel_fornix():

circle = streamlines_in_circle(100, step_size=2)

Contributor

PEP8 - too many empty lines.

self.level = levels[level]
self.traverse_postorder(self.root, self._fetch_level)
return self.clusters
# cdef void _fetch_level(self, CentroidNode* node):
Contributor

Let's delete those commented-out lines if everyone is happy with manipulating the TreeClusterMap Python object instead of the QuickBundlesX Cython object.
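
The pattern under discussion, a post-order traversal that hands each node to a visitor which keeps only the nodes of the requested level, can be sketched in plain Python as follows. `Node`, `traverse_postorder` and `fetch_level` here are hypothetical stand-ins that only mirror the names in the quoted code; they are not the Cython implementation.

```python
class Node:
    """Toy tree node: a cluster at a given level with optional children."""

    def __init__(self, level, indices, children=()):
        self.level = level
        self.indices = list(indices)
        self.children = list(children)


def traverse_postorder(node, visit):
    """Visit all children (left to right) before the node itself."""
    for child in node.children:
        traverse_postorder(child, visit)
    visit(node)


def fetch_level(root, level):
    """Collect the clusters stored at one level of the tree."""
    clusters = []

    def visit(node):
        # Keep only the nodes that sit at the requested depth.
        if node.level == level:
            clusters.append(node.indices)

    traverse_postorder(root, visit)
    return clusters


root = Node(0, [0, 1, 2, 3], [
    Node(1, [0, 1], [Node(2, [0]), Node(2, [1])]),
    Node(1, [2, 3], [Node(2, [2, 3])]),
])
level_one = fetch_level(root, 1)
level_two = fetch_level(root, 2)
```

Keeping the traversal generic and putting the filtering in the visitor is what lets the same walk serve both level fetching and node deallocation.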

@@ -110,5 +110,5 @@ cdef class QuickBundlesX(object):
cpdef object insert(self, Data2D datum, int datum_idx)
cdef void traverse_postorder(self, CentroidNode* node, void (*visit)(QuickBundlesX, CentroidNode*))
cdef void _dealloc_node(self, CentroidNode* node)
cdef void _fetch_level(self, CentroidNode* node)
# cdef void _fetch_level(self, CentroidNode* node)
Contributor

delete

@Garyfallidis
Contributor Author

Done @MarcCote !

@MarcCote MarcCote merged commit 5a8ed85 into dipy:master Mar 3, 2018
@MarcCote
Contributor

MarcCote commented Mar 3, 2018

Let's make all the PRs green again!

@Garyfallidis
Contributor Author

Super! @skoudoro please inform everyone to rebase their PRs. We can go green again :) Thank you both for all the help. That was a fast recovery with a sudden Cythonic change! Grrr.....

ShreyasFadnavis pushed a commit to ShreyasFadnavis/dipy that referenced this pull request Sep 20, 2018