-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remerge gh-9619: "FIX: Sparse matrix addition/subtraction eliminates explicit zeros" #9958
base: main
Are you sure you want to change the base?
Conversation
94e8fa0
to
a306d75
Compare
scipy/sparse/sparsetools/csr.h
Outdated
bool addsub = false; | ||
|
||
//checks the type of binary operation to be performed. | ||
if((int)op(8, 4) == 12 || (int)op(8, 4) == 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple check to find if the binary operation is plus or minus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd much prefer defining a separate function for addition/subtraction, rather than special casing logic here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even I was doubtful about this but I wanted to keep the code small that's why I choose this special casing logic here. Anyways I have made three separate functions now . One will handle relational operations and other two will handle the arithmetic operations .
scipy/sparse/sparsetools/csr.h
Outdated
Cj[nnz] = Aj[A_pos]; | ||
Cx[nnz] = result; | ||
Cx[nnz] = Ax[A_pos]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
op(0, Bx[B_pos])
for Bx but just Ax[A_pos]
for Ax because if binary operation is minus Bx has to be multiplied with negative sign but this is not the case with Ax .
Benchmark results:
|
@perimosocordiae I have separated the part of code performing operations of addition and subtraction from the rest so that other operations remain intact from this change. This separation was necessary because adding an extra condition for checking explicit zeros at bottleneck was affecting the performance adversely . In case of addition and subtraction some checks become redundant and can be replaced with the checks for explicit zeros . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the performance hit for doing the more generic explicit zeros check, instead of special-casing the addition/subtraction case?
scipy/sparse/tests/test_base.py
Outdated
data1 = np.array([0, 5, 7, 9]) | ||
data2 = np.array([0, 4, 6, 8]) | ||
m1 = coo_matrix((data1, (row, col)), shape=(4, 4)) | ||
m2 = coo_matrix((data2, (row, col)), shape=(4, 4)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the CSR matrix test class, so these should be CSR matrices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, I will change them to CSR matrices.
scipy/sparse/sparsetools/csr.h
Outdated
bool addsub = false; | ||
|
||
//checks the type of binary operation to be performed. | ||
if((int)op(8, 4) == 12 || (int)op(8, 4) == 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd much prefer defining a separate function for addition/subtraction, rather than special casing logic here.
@@ -776,7 +801,7 @@ void csr_binop_csr_general(const I n_row, const I n_col, | |||
* Note: | |||
* Input: A and B column indices are assumed to be in sorted order | |||
* Output: C column indices will be in sorted order | |||
* Cx will not contain any zero entries | |||
* Cx will not contain any implicit zero entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant "explicit" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cx
initially contained only the non-zero values but after these changes it can also have the explicit zero values. That means Cx
will have the explicit zeros but not the implicit ones.
@perimosocordiae for the case of addition and subtraction if the result of operation is zero and the value in one of the matrices is zero then value in the other one will be definitely zero but this doesn't hold good for multiplication and division. In latter case entries in both the matrices will have to be checked which will add one more condition at the bottleneck. This may increase the time to a factor of 1.20 . |
fa19098
to
d535910
Compare
Benchmark Results:
|
@perimosocordiae I have divided the |
@perimosocordiae did you get time to go through the changes I made ? |
Is there an update on this PR? |
re-merge #9619 .
Binary operations on Sparse matrices removes the explicit zeros. These changes preserve the explicit zeros in the output matrix.