Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transform enhancement squashed rebased #135

Merged
merged 5 commits into from
Sep 1, 2015
Merged

Transform enhancement squashed rebased #135

merged 5 commits into from
Sep 1, 2015

Conversation

jpola
Copy link
Contributor

@jpola jpola commented Aug 28, 2015

No description provided.

@@ -360,6 +360,40 @@ clsparseDCsrMatrixfromFile( clsparseCsrMatrix* csrMatx, const char* filePath, cl
const cldenseVector* y,
const clsparseControl control );

// functions for tests of scan transformation, as simple as possible
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comments on PR #133

@kknox
Copy link
Contributor

kknox commented Aug 28, 2015

I have not benchmarked your refactoring of coo2csr, csr2coo, dense2csr, csr2dense, but a phenomenal job simplifying the code and eliminating kernel passes!

@kknox
Copy link
Contributor

kknox commented Sep 1, 2015

@jpola
Let me know if you do not have time to fix my comments; i can push commits to your PR and merge this in too.

1. Added more information to message when failed to read matrix from file.
2. Reorganized functions related to computing rowBlocks they are now in csr-meta.hpp / cpp files instead of clsparse-coo2csr.cpp, rest of functions in clsparse-coo2csr were not used therfore name clsparse-coo2csr-GPU changed to proper one.
3. Csr_matrix_environment allocates matrix in double precision in the first place which is then casted to single precision.
4. Sanity checks in clsparseD/SCsrMatrixFromFile.
5. Added Inclusive and exclusive scans operations + tests.
6. Rewritten reduce by key operation + tests.
7. Rewriten coo2csr and csr2coo which no longer need the use of radix sort. I.e we now using 10 kernel calls instead of 34, simple tests showed 6x speedup.
8. Rewritten dense2csr and csr2dense in more clean way. Improved performance eliminating unnecessary copies.

Minor:
1. When reading matix in coo format directly the data need to be also sorted by (row, col). Otherwise we will have column major format which is default for mtx storage.
…duce by key functions and corresponding tests
@jpola
Copy link
Contributor Author

jpola commented Sep 1, 2015

Hi, sorry for delay.
Your comments are applied.

2015-09-01 6:32 GMT+02:00 Kent Knox notifications@github.com:

@jpola https://github.com/jpola
Let me know if you do not have time to fix my comments; i can push commits
to your PR and merge this in too.


Reply to this email directly or view it on GitHub
#135 (comment)
.

kknox pushed a commit that referenced this pull request Sep 1, 2015
Transform enhancement squashed rebased 👍
@kknox kknox merged commit e389597 into clMathLibraries:develop Sep 1, 2015
@jpola jpola deleted the transformEnhancement_squashed_rebased branch September 8, 2015 19:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants