
Add a library node for np.transpose for >2 dims and a cuTENSOR implementation #1303

Draft · wants to merge 4 commits into base: master
Conversation

lamyiowce (Contributor):

Adds:

  • a library node for np.transpose for >2 dims
  • a cuTENSOR implementation
  • a pure implementation (copied from what was previously in replacements.py)
  • a cuTENSOR environment

What I don't know:

  • whether the file structure is correct
  • how to make DaCe find cuTENSOR cleanly. It has to be downloaded separately (it is not bundled with CUDA), and I don't know how that is handled in DaCe.
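On the discovery question, one common pattern for a separately-downloaded library is an opt-in environment variable. This is a hypothetical sketch, not an existing DaCe convention — the name `CUTENSOR_ROOT` and the helper below are assumptions:

```python
import os

def find_cutensor_root() -> str:
    """Locate a separately-downloaded cuTENSOR archive.

    CUTENSOR_ROOT is a hypothetical variable name for this sketch; DaCe may
    have its own convention for discovering third-party libraries.
    """
    root = os.environ.get('CUTENSOR_ROOT')
    if root is None:
        raise RuntimeError(
            'cuTENSOR not found: set CUTENSOR_ROOT to the directory of the '
            'extracted libcutensor archive.')
    return root
```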

@tbennun (Collaborator) left a comment:

Looks good! Some minor modifications.

cmake_variables = {}
cmake_includes = []
cmake_libraries = ["cutensor"]
cmake_compile_flags = ["-L/users/jbazinsk/libcutensor-linux-x86_64-1.7.0.1-archive/lib/11"]
maybe better to set up the LIBRARY_PATH envvar locally.
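The suggestion can be sketched as follows. The archive path is the machine-specific one from the diff above; `LIBRARY_PATH` and `LD_LIBRARY_PATH` are the standard GCC link-time and runtime loader search-path variables:

```shell
# Instead of hardcoding -L<path> in cmake_compile_flags, export the search
# paths in the local environment before building/running.
CUTENSOR_DIR=/users/jbazinsk/libcutensor-linux-x86_64-1.7.0.1-archive
export LIBRARY_PATH="$CUTENSOR_DIR/lib/11${LIBRARY_PATH:+:$LIBRARY_PATH}"
export LD_LIBRARY_PATH="$CUTENSOR_DIR/lib/11${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```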

@@ -0,0 +1,54 @@
# Copyright 2019-2021 ETH Zurich and the DaCe authors. All rights reserved.

Suggested change
# Copyright 2019-2021 ETH Zurich and the DaCe authors. All rights reserved.
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.

@@ -0,0 +1,67 @@
// Copyright 2019-2022 ETH Zurich and the DaCe authors. All rights reserved.

Suggested change
// Copyright 2019-2022 ETH Zurich and the DaCe authors. All rights reserved.
// Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.


static void CheckCutensorError(cutensorStatus_t const& status) {
if (status != CUTENSOR_STATUS_SUCCESS) {
throw std::runtime_error("cuSPARSE failed with error code: " + std::to_string(status));

isn't there a cutensorGetErrorString or something similar?
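There is: cuTENSOR exposes `cutensorGetErrorString(cutensorStatus_t)`, which returns a human-readable string. A sketch of the suggested fix follows; the enum and the `cutensorGetErrorString` definition below are local stand-ins so the snippet compiles without the cuTENSOR headers (in the real code they come from `<cutensor.h>`):

```cpp
#include <stdexcept>
#include <string>

// Stand-ins for the real cuTENSOR declarations (assumption for this sketch).
enum cutensorStatus_t {
  CUTENSOR_STATUS_SUCCESS = 0,
  CUTENSOR_STATUS_NOT_INITIALIZED = 1,
};

const char* cutensorGetErrorString(cutensorStatus_t status) {
  switch (status) {
    case CUTENSOR_STATUS_SUCCESS: return "CUTENSOR_STATUS_SUCCESS";
    case CUTENSOR_STATUS_NOT_INITIALIZED: return "CUTENSOR_STATUS_NOT_INITIALIZED";
  }
  return "unknown cuTENSOR status";
}

static void CheckCutensorError(cutensorStatus_t const& status) {
  if (status != CUTENSOR_STATUS_SUCCESS) {
    // Use the descriptive string instead of the raw numeric code; this also
    // corrects the "cuSPARSE" wording in the original message.
    throw std::runtime_error(std::string("cuTENSOR failed: ") +
                             cutensorGetErrorString(status));
  }
}
```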

@@ -0,0 +1,203 @@
# Copyright 2019-2021 ETH Zurich and the DaCe authors. All rights reserved.

Suggested change
# Copyright 2019-2021 ETH Zurich and the DaCe authors. All rights reserved.
# Copyright 2019-2023 ETH Zurich and the DaCe authors. All rights reserved.

@@ -37,8 +41,39 @@ def test_transpose():
assert rel_error <= 1e-5


@pytest.mark.parametrize('implementation', ['pure', 'cuTENSOR'])

cuTENSOR should be marked with pytest.mark.gpu, because it won't be available on the standard CI machines. See pytest.param and the marks kwarg for more info.
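The suggestion can be sketched like this (the test name and body are placeholders; `gpu` is assumed to be a custom mark registered in the project's pytest configuration):

```python
import pytest

@pytest.mark.parametrize('implementation', [
    'pure',
    # Only the cuTENSOR case carries the gpu mark, so it can be deselected
    # on machines without a GPU (e.g. with `pytest -m "not gpu"`).
    pytest.param('cuTENSOR', marks=pytest.mark.gpu),
])
def test_transpose_nd(implementation):
    ...  # placeholder for the actual transpose test body
```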

@alexnick83 (Contributor):

It would be best if you used the TensorTranspose library node in the ttranspose library, which includes an expansion for the HPTT library. You can also find in the same branch tests and a working cuTENSOR environment.

@BenWeber42 (Contributor):

What's the status here? Is this related to #1309 ("Support for tensor linear algebra (transpose, dot products)")?
