Fix coalesced access checks in matrix_vector_op #372
I noticed this problem while investigating why normalizing the dataset using this op takes a major fraction of the time in the least squares algorithm (lstsq/eig). Just fixing that reduces the typical exec time of
rerun tests
Looks good, just need a couple of small unit tests for the new utilities file.
rerun tests
LGTM.
@gpucibot merge
One of the conditions in `test_aligned_access` in `linalg/matrix_vector_op.cuh` was incorrect (`ptr % elem_size` should be zero, not otherwise). Due to that typo, the `matrixVectorOp` function was never using vectorized load/store instructions.

This PR fixes the problem while also adding a new helper struct to simplify such checks in future.
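
For readers unfamiliar with this kind of check, here is a minimal host-side sketch of the idea. It is not the helper struct added by this PR: the names `is_aligned` and `pick_vector_width` are hypothetical, and the 16-byte maximum vector width is an assumption chosen for illustration. The point is only that a pointer qualifies for vectorized access when `ptr % bytes` is zero, which is the condition the original code had inverted.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the struct from this PR).
// A pointer is aligned to `bytes` when its address modulo `bytes` is ZERO.
inline bool is_aligned(const void* ptr, std::size_t bytes)
{
  return reinterpret_cast<std::uintptr_t>(ptr) % bytes == 0;
}

// Hypothetical dispatcher: returns the widest vector width (in elements)
// that both pointers and the row length allow, assuming vector accesses
// of at most 16 bytes. Falls back to 1 (scalar access) otherwise.
template <typename T>
int pick_vector_width(const T* matrix, const T* vec, std::size_t row_len)
{
  for (int width = static_cast<int>(16 / sizeof(T)); width > 1; width /= 2) {
    const std::size_t bytes = static_cast<std::size_t>(width) * sizeof(T);
    const bool ok = is_aligned(matrix, bytes) && is_aligned(vec, bytes) &&
                    row_len % static_cast<std::size_t>(width) == 0;
    if (ok) { return width; }
  }
  return 1;
}
```

With the comparison inverted (`!= 0` instead of `== 0`), properly aligned buffers would fail the check and the scalar path would typically be selected, which matches the symptom described above.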