-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize squareform
's square matrix construction
#91
Conversation
Avoid breaking chunks into smaller pieces than necessary by taking advantage of symmetry. Namely note that pieces of rows also correspond to pieces of columns as well. Starting from the right corner where these pieces are simple singleton 1-D arrays, concatenate them together into larger pieces. Use a 2-D singleton zero array as a seed for this concatenation to grow from. By doing this, we avoid further subselection from the 1-D array beyond the necessary selection of pieces.
result = z | ||
for i in _pycompat.irange(d - 2, -1, -1): | ||
X_tri_i = X_tri[i] | ||
result = result.rechunk(2 * X_tri_i.chunks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rechunking is a necessary evil in this case. This is needed for a few reasons.
First we are adding singleton zeros arrays to fill the diagonal. However when concatenating anything with these singleton zeros arrays, it results in the overlapping pieces to be rechunked into singleton chunks. As this effect exists from the beginning, the result is singleton chunking throughout the array. However there is no reason for this to impact chunking as generating singleton zeros arrays is really cheap. So the rechunking in part exists to ignore the effect of these singleton zeros arrays.
Second each row/column we concatenate has a singleton dimension. Much like with the zeros, this results in rechunking when concatenating with other arrays. This is another cause for why everything ends up with singleton chunks if we don't manually rechunk. As we know the original array was square and, if from pdist
, should have square chunks, we know how these rows/columns should be grouped. By rechunking we can restore this original grouping.
Third if a user is going down this path with squareform
, they are probably providing a distance matrix that has been flattened with squareform
(e.g. pdist
's result). Not to mention we reverse the same procedure exactly in this implementation. As such, we know that all of this data has come from an array that had a chunking matching X_tri[0]
. However each subsequent row/column in X_tri
was sliced from the diagonal of the original square matrix, meaning that the chunking near the diagonal gets a little messed up due to the slicing. Further this gets worse the further we go through X_tri
until we reach X_tri[-1]
, which is a single element. So this will negatively impact later concatenations by forcing the same singleton chunking on them. To reverse this effect and match the chunking of the original array, we rechunk each result to match the next row/column to be concatenated. The result is the final array matches quite closely to the original arrays chunks (excepting one last zero whose chunking is unknown).
Given all of this, we rechunk to try and restore/preserve the original chunking of the square matrix this vector had come from. The result is a significant boost in performance indicating we are approaching this correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some very rough profiling of this implementation with and without rechunking shows that this runs 5.75x slower for simply restructuring arrays and ~2.5x slower for computing a square matrix from pdist
.
After some very rough profiling against |
@@ -216,27 +216,23 @@ def squareform(X, force="no"): | |||
|
|||
X_tri = [] | |||
j1 = 0 | |||
for j2 in _pycompat.irange(d - 1, -1, -1): | |||
for j2 in _pycompat.irange(d - 1, 0, -1): | |||
X_tri.append(X[j1:j1 + j2]) | |||
j1 += j2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a side note, we are now able to skip inclusion of this zero length 1-D array. 😄
dask.array.concatenate([X_tri_i[:, None], result], axis=1) | ||
], | ||
axis=0 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core insight of this implementation is that we can build the square matrix up from the bottom right corner by simply concatenating each row/column with zeros on the diagonal. Hence we are able to avoid slicing these rows/columns further (differing from what was done before).
squareform
's square matrix construction
Avoid breaking chunks into smaller pieces than necessary by taking advantage of symmetry. Namely note that pieces of rows also correspond to pieces of columns as well. Starting from the right corner where these pieces are simple singleton 1-D arrays, concatenate them together into larger pieces. Use a 2-D singleton zero array as a seed for this concatenation to grow from. By doing this, we avoid further subselection from the 1-D array beyond the necessary selection of pieces.