
Optimize squareform's square matrix construction #91

Merged
jakirkham merged 1 commit into master from opt_squareform on Oct 9, 2017

Conversation

jakirkham (Owner)

Avoid breaking chunks into smaller pieces than necessary by taking advantage of symmetry; namely, pieces of rows also correspond to pieces of columns. Starting from the bottom-right corner, where these pieces are simple singleton 1-D arrays, concatenate them together into larger pieces, using a 2-D singleton zero array as a seed for this concatenation to grow from. By doing this, we avoid any subselection from the 1-D array beyond the necessary selection of pieces.

```python
result = z
for i in _pycompat.irange(d - 2, -1, -1):
    X_tri_i = X_tri[i]
    result = result.rechunk(2 * X_tri_i.chunks)
```
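One subtlety worth spelling out in the `2 * X_tri_i.chunks` expression: for a 1-D dask array, `.chunks` is a 1-tuple of block sizes, so multiplying it by 2 repeats the tuple rather than scaling the sizes. A minimal pure-Python sketch (the block sizes here are made up for illustration):

```python
# .chunks on a 1-D dask array is a tuple containing one tuple of block
# sizes, e.g. ((3, 2),) for a length-5 array chunked as 3 + 2.
chunks_1d = ((3, 2),)

# `2 * tuple` is tuple repetition, not numeric scaling: the same block
# sizes are reused for both rows and columns of the 2-D result.
chunks_2d = 2 * chunks_1d
print(chunks_2d)  # ((3, 2), (3, 2))
```

The effect is that the growing 2-D `result` is rechunked to square blocks matching the next 1-D piece to be concatenated.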
jakirkham (Owner Author):

Rechunking is a necessary evil in this case, for a few reasons.

First, we are adding singleton zeros arrays to fill the diagonal. However, concatenating anything with these singleton arrays causes the overlapping pieces to be rechunked into singleton chunks. As this effect compounds from the very beginning, the result would be singleton chunking throughout the array. There is no reason for this to dictate the chunking, since generating singleton zeros arrays is very cheap. So the rechunking exists in part to cancel the effect of these singleton zeros arrays.

Second, each row/column we concatenate has a singleton dimension. Much like with the zeros, this causes rechunking when concatenating with other arrays, and is another reason everything ends up with singleton chunks if we don't rechunk manually. Since we know the original array was square and, if it came from pdist, should have square chunks, we know how these rows/columns should be grouped. By rechunking we can restore this original grouping.

Third, a user going down this path with squareform is probably providing a distance vector that was flattened with squareform (e.g. pdist's result); indeed, this implementation reverses that exact procedure. As such, we know that all of this data came from an array whose chunking matched X_tri[0]. However, each subsequent row/column in X_tri was sliced from closer to the diagonal of the original square matrix, so the chunking near the diagonal gets somewhat mangled by the slicing. This worsens the further we go through X_tri, until we reach X_tri[-1], which is a single element. Left alone, this would force the same singleton chunking onto later concatenations. To reverse this effect and match the chunking of the original array, we rechunk each intermediate result to match the next row/column to be concatenated. The final array's chunks then match the original array's quite closely (except for one last zero whose chunking is unknown).

Given all of this, we rechunk to try to restore/preserve the original chunking of the square matrix this vector came from. The resulting significant performance boost indicates we are approaching this correctly.
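The first point can be sketched roughly with dask.array (the array sizes and chunk choices here are hypothetical, chosen only to show the chunk-splitting effect, and are not the ones used inside squareform):

```python
import numpy as np
import dask.array as da

# a 1-D piece chunked evenly, as it might come from pdist
row = da.from_array(np.arange(4.0), chunks=2)   # chunks: ((2, 2),)
z = da.zeros((1, 1), chunks=1)                  # singleton zero seed

# concatenating with the singleton splits the chunking at the seam;
# repeated concatenations like this degrade toward singleton chunks
grown = da.concatenate([z, row[None]], axis=1)
print(grown.chunks)  # ((1,), (1, 2, 2))

# an explicit rechunk restores a uniform grouping
fixed = grown.rechunk((1, 2))
print(fixed.chunks)  # ((1,), (2, 2, 1))
```

The singleton chunk contributed by `z` leaks into the concatenated result's chunking, which is exactly what the manual rechunk in the loop counteracts.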

jakirkham (Owner Author) commented Oct 9, 2017:

Some very rough profiling of this implementation with and without rechunking shows that, without the rechunking, it runs ~5.75x slower for simply restructuring arrays and ~2.5x slower for computing a square matrix from pdist.

jakirkham commented Oct 9, 2017

After some very rough profiling against master, this seems to go ~4x faster for just restructuring arrays and ~2x faster for computing a square matrix from pdist. So a very sizeable speedup over the current implementation.

```diff
@@ -216,27 +216,23 @@ def squareform(X, force="no"):

     X_tri = []
     j1 = 0
-    for j2 in _pycompat.irange(d - 1, -1, -1):
+    for j2 in _pycompat.irange(d - 1, 0, -1):
         X_tri.append(X[j1:j1 + j2])
         j1 += j2
```
jakirkham (Owner Author):

As a side note, we are now able to skip including this zero-length 1-D array. 😄
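A quick sanity check (pure Python, with a hypothetical d) of what changing the loop bound does: the old range ended at -1 and so emitted a final zero-length slice, while the new range stops at 0 yet still covers all d*(d-1)/2 condensed entries:

```python
d = 5  # hypothetical number of points

# old bound: range(d - 1, -1, -1) includes a final zero-length slice
lengths_old = list(range(d - 1, -1, -1))
print(lengths_old)  # [4, 3, 2, 1, 0]

# new bound: range(d - 1, 0, -1) skips it
lengths_new = list(range(d - 1, 0, -1))
print(lengths_new)  # [4, 3, 2, 1]

# either way, the nonzero slices cover the full condensed vector
assert sum(lengths_new) == d * (d - 1) // 2
```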

```python
        dask.array.concatenate([X_tri_i[:, None], result], axis=1)
    ],
    axis=0
)
```
jakirkham (Owner Author):

The core insight of this implementation is that we can build the square matrix up from the bottom-right corner by simply concatenating each row/column with zeros on the diagonal. This lets us avoid slicing those rows/columns any further (unlike the previous implementation).
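To make the construction concrete, here is a NumPy analogue of the approach described above (np.concatenate stands in for dask.array.concatenate, and the condensed vector is a tiny hand-made example; this is a sketch, not the actual library code):

```python
import numpy as np

d = 4
# condensed distance vector for d = 4 points: d * (d - 1) / 2 = 6 entries
X = np.array([1., 2., 3., 4., 5., 6.])

# split into upper-triangular rows of lengths d-1, d-2, ..., 1
X_tri, j1 = [], 0
for j2 in range(d - 1, 0, -1):
    X_tri.append(X[j1:j1 + j2])
    j1 += j2

# grow the square matrix from the bottom-right corner, seeded with a
# 1x1 zero block for the last diagonal entry
z = np.zeros((1, 1))
result = z
for i in range(d - 2, -1, -1):
    X_tri_i = X_tri[i]
    result = np.concatenate([
        np.concatenate([z, X_tri_i[None]], axis=1),          # row i: diagonal 0, then distances
        np.concatenate([X_tri_i[:, None], result], axis=1),  # column i alongside the grown block
    ], axis=0)

print(result)
```

The result is symmetric with a zero diagonal, and for this input agrees with scipy.spatial.distance.squareform applied to the same vector. Note that each `X_tri[i]` is used whole as both a row piece and a column piece, never sliced again.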

jakirkham merged commit 0b36556 into master on Oct 9, 2017
jakirkham deleted the opt_squareform branch on October 9, 2017 at 15:31
jakirkham mentioned this pull request on Oct 9, 2017