Optimize `squareform`'s square matrix construction #91

jakirkham · 2017-10-09T14:14:46Z

Avoid breaking chunks into smaller pieces than necessary by taking advantage of symmetry. Namely note that pieces of rows also correspond to pieces of columns as well. Starting from the right corner where these pieces are simple singleton 1-D arrays, concatenate them together into larger pieces. Use a 2-D singleton zero array as a seed for this concatenation to grow from. By doing this, we avoid further subselection from the 1-D array beyond the necessary selection of pieces.

jakirkham · 2017-10-09T15:04:05Z

dask_distance/__init__.py

+        result = z
+        for i in _pycompat.irange(d - 2, -1, -1):
+            X_tri_i = X_tri[i]
+            result = result.rechunk(2 * X_tri_i.chunks)


Rechunking is a necessary evil in this case. This is needed for a few reasons.

First we are adding singleton zeros arrays to fill the diagonal. However when concatenating anything with these singleton zeros arrays, it results in the overlapping pieces to be rechunked into singleton chunks. As this effect exists from the beginning, the result is singleton chunking throughout the array. However there is no reason for this to impact chunking as generating singleton zeros arrays is really cheap. So the rechunking in part exists to ignore the effect of these singleton zeros arrays.

Second each row/column we concatenate has a singleton dimension. Much like with the zeros, this results in rechunking when concatenating with other arrays. This is another cause for why everything ends up with singleton chunks if we don't manually rechunk. As we know the original array was square and, if from pdist, should have square chunks, we know how these rows/columns should be grouped. By rechunking we can restore this original grouping.

Third if a user is going down this path with squareform, they are probably providing a distance matrix that has been flattened with squareform (e.g. pdist's result). Not to mention we reverse the same procedure exactly in this implementation. As such, we know that all of this data has come from an array that had a chunking matching X_tri[0]. However each subsequent row/column in X_tri was sliced from the diagonal of the original square matrix, meaning that the chunking near the diagonal gets a little messed up due to the slicing. Further this gets worse the further we go through X_tri until we reach X_tri[-1], which is a single element. So this will negatively impact later concatenations by forcing the same singleton chunking on them. To reverse this effect and match the chunking of the original array, we rechunk each result to match the next row/column to be concatenated. The result is the final array matches quite closely to the original arrays chunks (excepting one last zero whose chunking is unknown).

Given all of this, we rechunk to try and restore/preserve the original chunking of the square matrix this vector had come from. The result is a significant boost in performance indicating we are approaching this correctly.

Some very rough profiling of this implementation with and without rechunking shows that this runs 5.75x slower for simply restructuring arrays and ~2.5x slower for computing a square matrix from pdist.

jakirkham · 2017-10-09T15:09:41Z

After some very rough profiling against master, this seems to go ~4x faster for just restructuring arrays and ~2x faster for computing a square matrix from pdist. So a very sizeable speedup over the current implementation.

jakirkham · 2017-10-09T15:23:17Z

dask_distance/__init__.py

@@ -216,27 +216,23 @@ def squareform(X, force="no"):

        X_tri = []
        j1 = 0
-        for j2 in _pycompat.irange(d - 1, -1, -1):
+        for j2 in _pycompat.irange(d - 1, 0, -1):
            X_tri.append(X[j1:j1 + j2])
            j1 += j2


As a side note, we are now able to skip inclusion of this zero length 1-D array. 😄

jakirkham · 2017-10-09T15:26:35Z

dask_distance/__init__.py

+                    dask.array.concatenate([X_tri_i[:, None], result], axis=1)
+                ],
+                axis=0
+            )


The core insight of this implementation is that we can build the square matrix up from the bottom right corner by simply concatenating each row/column with zeros on the diagonal. Hence we are able to avoid slicing these rows/columns further (differing from what was done before).

jakirkham commented Oct 9, 2017

View reviewed changes

jakirkham merged commit 0b36556 into master Oct 9, 2017

jakirkham deleted the opt_squareform branch October 9, 2017 15:31

jakirkham mentioned this pull request Oct 9, 2017

Add squareform #89

Merged

jakirkham changed the title ~~Optimize squareform's square matrix construction~~ Optimize squareform's square matrix construction Oct 9, 2017

jakirkham mentioned this pull request Oct 9, 2017

WIP: Recursively handle pdist's diagonal chunks #90

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize `squareform`'s square matrix construction #91

Optimize `squareform`'s square matrix construction #91

jakirkham commented Oct 9, 2017

jakirkham Oct 9, 2017

jakirkham Oct 9, 2017 •

edited

Loading

jakirkham commented Oct 9, 2017 •

edited

Loading

jakirkham Oct 9, 2017

jakirkham Oct 9, 2017

Optimize squareform's square matrix construction #91

Optimize squareform's square matrix construction #91

Conversation

jakirkham commented Oct 9, 2017

jakirkham Oct 9, 2017

Choose a reason for hiding this comment

jakirkham Oct 9, 2017 • edited Loading

Choose a reason for hiding this comment

jakirkham commented Oct 9, 2017 • edited Loading

jakirkham Oct 9, 2017

Choose a reason for hiding this comment

jakirkham Oct 9, 2017

Choose a reason for hiding this comment

Optimize `squareform`'s square matrix construction #91

Optimize `squareform`'s square matrix construction #91

jakirkham Oct 9, 2017 •

edited

Loading

jakirkham commented Oct 9, 2017 •

edited

Loading