Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROCm] Add ROCm support for CSR Sparse Matrix Ops #34800

Conversation

deven-amd
Copy link
Contributor

This PR adds ROCm support for CSR Sparse Matrix Ops.

The PR has 6 commits which organize the changes as per functionality being changed. Please review the commits indidvidually.

The file cuda_sparse.h should ideally be renamed to gpu_sparse.h once this PR is taken. I have not made that change a part of this PR. I can add a commit to make that change, assuming the reviewers are okay with it...please let me know if that is needed.


/cc @whchung @chsigg

@tensorflow-bot tensorflow-bot bot added the size:XL CL Change Size:Extra Large label Dec 3, 2019
@whchung whchung requested a review from chsigg December 3, 2019 21:39
@whchung whchung added the kokoro:force-run Tests on submitted change label Dec 3, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 3, 2019
@rthadur rthadur self-assigned this Dec 4, 2019
@rthadur rthadur added this to Assigned Reviewer in PR Queue via automation Dec 4, 2019
@rthadur rthadur added the comp:gpu GPU related issues label Dec 4, 2019
Copy link
Contributor

@chsigg chsigg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good thanks! I just a some minor nits.

@deven-amd
Copy link
Contributor Author

@chsigg, please re-review.

thanks as always :)

@rthadur rthadur requested a review from chsigg December 5, 2019 16:28
chsigg
chsigg previously approved these changes Dec 9, 2019
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Dec 9, 2019
@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Dec 9, 2019
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Dec 9, 2019
@rthadur rthadur added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Dec 10, 2019
@deven-amd
Copy link
Contributor Author

@rthadur gentle ping

@gbaned
Copy link
Contributor

gbaned commented Dec 13, 2019

@deven-amd Can you please resolve conflicts? Thanks!

@gbaned gbaned added stat:awaiting response Status - Awaiting response from author and removed ready to pull PR ready for merge process labels Dec 13, 2019
The failures are because either
* the subtests require support for complex type (which is not yet supported by ROCm)
* or they require a GPU kernel implementation for the SparseMatrixAdd op
  (which is also not supported by ROCm, because the underlying hipSPARSE API routine - csrgeam - does not exist).

There are also a couple of subtests commented out because hipSPARSE API errors out with an unknown error for them. Those will be looked into and fixed soon
@deven-amd deven-amd force-pushed the google_upstream_rocm_csr_sparse_matrix_support branch from bac4f54 to 5d1ccc1 Compare December 13, 2019 15:29
PR Queue automation moved this from Approved by Reviewer to Reviewer Requested Changes Dec 13, 2019
@deven-amd
Copy link
Contributor Author

@gbaned , rebased the PR to remove the merge-conflict. please merge. thanks

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Dec 14, 2019
@deven-amd
Copy link
Contributor Author

@chsigg, please re-approve

@gbaned, gentle ping

thanks

@ekuznetsov139
Copy link
Contributor

bump.

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Jan 3, 2020
@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jan 3, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 3, 2020
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Jan 3, 2020
@deven-amd
Copy link
Contributor Author

@gbaned, gentle ping

@rthadur
Copy link
Contributor

rthadur commented Jan 7, 2020

@deven-amd here are some internal internal, can you please check once.

ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3479:1: no such target '@local_config_rocm//rocm:hipsparse': target 'hipsparse' not declared in package 'rocm' defined by /tmpfs/bazel_output/_bazel_kbuilder//external/local_config_rocm/rocm/BUILD and referenced by '//tensorflow/core/kernels:cuda_sparse'

@deven-amd
Copy link
Contributor Author

deven-amd commented Jan 7, 2020

@rthadur , can you post the contents of then generated file /tmpfs/bazel_output/_bazel_kbuilder//external/local_config_rocm/rocm/BUILD ? That would help me understand the cause of the error better.

thanks

@deven-amd
Copy link
Contributor Author

also please post the bazel build command that leads to this error

@rthadur
Copy link
Contributor

rthadur commented Jan 7, 2020

@chsigg can you please assist ?

tensorflow-copybara pushed a commit that referenced this pull request Jan 14, 2020
…ocm_csr_sparse_matrix_support

PiperOrigin-RevId: 289617600
Change-Id: Ic1aa3714126d7b867295ae386b6be643c1dc83e4
@tensorflow-copybara tensorflow-copybara merged commit 5d1ccc1 into tensorflow:master Jan 14, 2020
PR Queue automation moved this from Approved by Reviewer to Merged Jan 14, 2020
@deven-amd deven-amd deleted the google_upstream_rocm_csr_sparse_matrix_support branch April 2, 2020 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes comp:gpu GPU related issues ready to pull PR ready for merge process size:XL CL Change Size:Extra Large
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

10 participants