Remerge gh-9619: "FIX: Sparse matrix addition/subtraction eliminates explicit zeros" #9958

Open
wants to merge 7 commits into
base: main

ananyashreyjain
Contributor

Re-merge of #9619.
Binary operations on sparse matrices currently remove explicit zeros. These changes preserve explicit zeros in the output matrix.

ananyashreyjain changed the title from Remerge gh-9619: "FIX: Sparse matrix addition/subtraction eliminates explicit zeros" to [WIP] Remerge gh-9619: "FIX: Sparse matrix addition/subtraction eliminates explicit zeros" on Mar 19, 2019
bool addsub = false;

//checks the type of binary operation to be performed.
if((int)op(8, 4) == 12 || (int)op(8, 4) == 4)
Contributor Author

A simple check to determine whether the binary operation is plus or minus: op(8, 4) evaluates to 12 for addition and to 4 for subtraction.
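For readers following the thread, here is a minimal Python sketch of the probing idea, not the code from this PR. The helper name `is_add_or_sub` is hypothetical, and Python's `operator` functions stand in for the C++ functors passed as `op`.

```python
import operator


def is_add_or_sub(op):
    """Probe a binary callable with the fixed operands (8, 4)."""
    probe = op(8, 4)
    # 8 + 4 == 12 and 8 - 4 == 4; multiplication and division give other values.
    return probe == 12 or probe == 4


print(is_add_or_sub(operator.add))      # True
print(is_add_or_sub(operator.sub))      # True
print(is_add_or_sub(operator.mul))      # False, since 8 * 4 == 32
print(is_add_or_sub(operator.truediv))  # False, since 8 / 4 == 2.0
print(is_add_or_sub(min))               # True, since min(8, 4) == 4
```

The last line shows why a value probe is fragile: any callable that happens to return 12 or 4 for these operands is classified as addition or subtraction.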

Member

I'd much prefer defining a separate function for addition/subtraction, rather than special casing logic here.

Contributor Author

I was doubtful about this too, but I wanted to keep the code small, which is why I chose the special-casing logic here. In any case, I have now made three separate functions: one handles relational operations and the other two handle arithmetic operations.

Cj[nnz] = Aj[A_pos];
-Cx[nnz] = result;
+Cx[nnz] = Ax[A_pos];
Contributor Author

op(0, Bx[B_pos]) is used for Bx but plain Ax[A_pos] for Ax: when the binary operation is minus, Bx has to be negated, whereas Ax is copied unchanged.
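The asymmetry described above can be illustrated with a pure-Python sketch of merging one row of two CSR matrices whose column indices are sorted and free of duplicates. This is not the sparsetools code: `merge_row_addsub` is a hypothetical name, and the rule for coincident entries (keep a zero result only when the stored operand was itself zero) is an assumption drawn from the discussion in this thread.

```python
import operator


def merge_row_addsub(a_cols, a_vals, b_cols, b_vals, op):
    """Merge one sorted CSR row of A and B under op (add or sub only)."""
    c_cols, c_vals = [], []
    i = j = 0
    while i < len(a_cols) and j < len(b_cols):
        if a_cols[i] == b_cols[j]:
            result = op(a_vals[i], b_vals[j])
            # Assumption: drop cancellations such as 5 - 5, keep 0 - 0.
            if result != 0 or a_vals[i] == 0:
                c_cols.append(a_cols[i])
                c_vals.append(result)
            i += 1
            j += 1
        elif a_cols[i] < b_cols[j]:
            # Only A stores this column: a + 0 == a - 0 == a, so copy it.
            c_cols.append(a_cols[i])
            c_vals.append(a_vals[i])
            i += 1
        else:
            # Only B stores this column: op(0, b) negates b for subtraction.
            c_cols.append(b_cols[j])
            c_vals.append(op(0, b_vals[j]))
            j += 1
    # Remaining entries of A are copied, remaining entries of B go through op.
    c_cols.extend(a_cols[i:])
    c_vals.extend(a_vals[i:])
    c_cols.extend(b_cols[j:])
    c_vals.extend(op(0, v) for v in b_vals[j:])
    return c_cols, c_vals


a_cols, a_vals = [0, 2], [0, 5]
b_cols, b_vals = [0, 1, 2, 3], [0, 4, 5, 6]
print(merge_row_addsub(a_cols, a_vals, b_cols, b_vals, operator.sub))
# ([0, 1, 3], [0, -4, -6])
print(merge_row_addsub(a_cols, a_vals, b_cols, b_vals, operator.add))
# ([0, 1, 2, 3], [0, 4, 10, 6])
```

In the subtraction case the explicit zero in column 0 survives, the B-only entries in columns 1 and 3 are negated, and the cancellation in column 2 is dropped under the stated assumption.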

Contributor Author

ananyashreyjain commented Mar 19, 2019

Benchmark results:

before         after          ratio
1.84±0.01ms    2.02±0.01ms    1.10   sparse.Arithmetic.time_arithmetic('csr', 'AA', 'sub')
33.2±0.4ms     35.3±0.07ms    1.06   sparse.Arithmetic.time_arithmetic('csr', 'BB', 'mul')
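The ratio column is the "after" time divided by the "before" time, so values above 1 indicate a slowdown. A small sketch that reproduces the two ratios from the reported means (the helper name `ratio` is hypothetical):

```python
def ratio(before_ms, after_ms):
    """Return after/before formatted to two decimals."""
    return f"{after_ms / before_ms:.2f}"


print(ratio(1.84, 2.02))  # 1.10
print(ratio(33.2, 35.3))  # 1.06
```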

ananyashreyjain changed the title from [WIP] Remerge gh-9619: "FIX: Sparse matrix addition/subtraction eliminates explicit zeros" to Remerge gh-9619: "FIX: Sparse matrix addition/subtraction eliminates explicit zeros" on Mar 19, 2019
Contributor Author

ananyashreyjain commented Mar 19, 2019

@perimosocordiae I have separated the code that performs addition and subtraction from the rest, so that the other operations are unaffected by this change. The separation was necessary because adding an extra condition to check for explicit zeros at the bottleneck hurt performance. For addition and subtraction, some of the existing checks become redundant and can be replaced with checks for explicit zeros.

Member

perimosocordiae left a comment

What's the performance hit for doing the more generic explicit zeros check, instead of special-casing the addition/subtraction case?

data1 = np.array([0, 5, 7, 9])
data2 = np.array([0, 4, 6, 8])
m1 = coo_matrix((data1, (row, col)), shape=(4, 4))
m2 = coo_matrix((data2, (row, col)), shape=(4, 4))
Member

This is the CSR matrix test class, so these should be CSR matrices.

Contributor Author

You are right, I will change them to CSR matrices.

@@ -776,7 +801,7 @@ void csr_binop_csr_general(const I n_row, const I n_col,
* Note:
* Input: A and B column indices are assumed to be in sorted order
* Output: C column indices will be in sorted order
-* Cx will not contain any zero entries
+* Cx will not contain any implicit zero entries
Member

I think you meant "explicit" here.

Contributor Author

Cx originally contained only nonzero values, but after these changes it can also contain explicit zeros. That means Cx will hold explicit zeros but not implicit ones.
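To make the distinction concrete, here is a small pure-Python sketch over raw CSR arrays. The helper `zero_counts` is hypothetical, and the arrays follow the usual CSR layout (`indptr`, `indices`, `data`). An explicit zero is a stored entry whose value is 0; an implicit zero is a position with no stored entry at all.

```python
def zero_counts(n_row, n_col, indptr, data):
    """Count stored entries, explicit zeros, and implicit zeros of a CSR matrix."""
    stored = indptr[-1]
    nonzero = sum(1 for value in data[:stored] if value != 0)
    return {
        "stored": stored,
        "explicit_zeros": stored - nonzero,
        "implicit_zeros": n_row * n_col - stored,
    }


# 2x3 matrix [[0, 5, 0], [7, 0, 0]] in which position (0, 0) is stored as 0.
indptr = [0, 2, 3]
indices = [0, 1, 0]
data = [0, 5, 7]
print(zero_counts(2, 3, indptr, data))
# {'stored': 3, 'explicit_zeros': 1, 'implicit_zeros': 3}
```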

Contributor Author

ananyashreyjain commented Mar 19, 2019

@perimosocordiae For addition and subtraction, if the result of the operation is zero and the value in one of the matrices is zero, then the value in the other matrix is definitely zero as well. This does not hold for multiplication and division. In the latter case, the entries in both matrices have to be checked, which adds one more condition at the bottleneck. This may increase the run time by a factor of 1.20.
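The claim can be checked by brute force over a small range of integers. The sketch below is illustrative only (`one_sided_check_is_enough` is a hypothetical name); division is left out to avoid dividing by zero.

```python
import itertools
import operator


def one_sided_check_is_enough(op, values=range(-3, 4)):
    """True if 'a == 0 and op(a, b) == 0' always implies 'b == 0'."""
    return all(
        b == 0
        for a, b in itertools.product(values, repeat=2)
        if a == 0 and op(a, b) == 0
    )


print(one_sided_check_is_enough(operator.add))  # True
print(one_sided_check_is_enough(operator.sub))  # True
print(one_sided_check_is_enough(operator.mul))  # False, e.g. 0 * 3 == 0
```

For addition and subtraction a single comparison against one operand identifies a zero that came from explicit zeros; for multiplication both operands would have to be inspected.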

ananyashreyjain force-pushed the main_zeros branch 2 times, most recently from fa19098 to d535910, on March 29, 2019 22:01
Contributor Author

ananyashreyjain commented Mar 29, 2019

Benchmark Results:

before         after          ratio
33.6±0.1ms     36.0±0.2ms     1.07   sparse.Arithmetic.time_arithmetic('csr', 'BB', 'mul')
4.86±0.06ms    4.59±0.05ms    0.94   sparse.Arithmetic.time_arithmetic('csr', 'AB', 'multiply')

Contributor Author

ananyashreyjain commented Mar 29, 2019

@perimosocordiae I have split csr_binop_csr_canonical() and csr_binop_csr_general() into three parts that handle relational operations (>, <, <=, >=, etc.), addition/subtraction, and multiplication/division separately. I have modified these functions slightly so that checking for explicit zeros does not incur much of a performance hit. After these changes, the explicit-zero check works for all arithmetic operations.

Contributor Author

@perimosocordiae have you had time to go through the changes I made?

@carldlaird

Is there an update on this PR?

lucascolley added the maintenance (Items related to regular maintenance tasks) label on Dec 23, 2023
Labels
maintenance Items related to regular maintenance tasks scipy.sparse
5 participants