
Sparse Matrix(compressed) * Dense Matrix product #37

Merged
merged 4 commits on Jul 25, 2013

Conversation

albertzaharovits
Contributor

Note: needs review!!!

@karlrupp
Collaborator

Thanks, Albert! The overall structure looks really good :-) In particular, it's great that you have an implementation for all three backends now. As for the kernels, I'll comment on the commit inline. You can then either commit on top of the existing commit, or force-push an amended commit.

unsigned int row_start = sp_mat_row_indices[row];
unsigned int row_end = sp_mat_row_indices[row+1];

// load work rows to shared memory
Collaborator

I think you should limit the maximum size of shared memory used. If I ramp up the number of columns in the dense matrix to e.g. 3000, the GPU is easily out of shared memory.

for ( unsigned int col = get_local_id(0); col < result_col_size; col += get_local_size(0) ) {

float r = 0;

Collaborator

Nice! You can optimize it a little further by interchanging the loops for 'col' and 'k', because then 'x' needs to be loaded only once. Right now it might get loaded multiple times.

@karlrupp
Collaborator

Cool, almost done. Just a few minor optimizations left. :-)

@ptillet
Collaborator

ptillet commented Jul 19, 2013

Hi Albert! Great job, really!

I have some suggestions for potential performance improvements, but I think it would be wiser to first have benchmarks to measure whether performance improvements are necessary.
However, how could we benchmark it? Should we compare against other libraries, or compute FLOPs, keeping in mind that the number of floating point operations is data-dependent? I don't have enough experience to give an order of magnitude for the sparsity threshold at which the operation is no longer bandwidth-limited... In any case, I think this patch would be even better if you could provide a simple benchmark in addition to the tests ;)

@karlrupp
Collaborator

@PhilippeTillet For a general sparse-dense multiplication this will always be memory bandwidth limited. To make it more compute-limited we would need to introduce some block-CSR format or similar.

karlrupp added a commit that referenced this pull request Jul 25, 2013
Sparse Matrix(compressed) * Dense Matrix product
@karlrupp karlrupp merged commit af257ca into viennacl:master Jul 25, 2013
@karlrupp
Collaborator

Thanks, Albert!

3 participants