Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added getri_npvt #305

Merged
merged 6 commits into from
Aug 16, 2021
Merged

Added getri_npvt #305

merged 6 commits into from
Aug 16, 2021

Conversation

tfalders
Copy link
Collaborator

@tfalders tfalders commented Aug 6, 2021

Partially addresses SWDEV-297252.

This PR adds getri_npvt and getri_npvt_outofplace routines to the public API, in preparation for calling getri_npvt_outofplace from hipBLAS.

I haven't tuned the optimized block sizes for the npvt routines, as I'm not sure how.

Copy link
Collaborator

@jzuniga-amd jzuniga-amd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I only have a couple of comments...

@@ -10,15 +10,16 @@ rocblas_status rocsolver_getri_impl(rocblas_handle handle,
U A,
const rocblas_int lda,
rocblas_int* ipiv,
rocblas_int* info)
rocblas_int* info,
const bool pivot)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are passing pivot here as an argument. In GETF2/GETRF it is passed as a template parameter. I don't know if it really makes a difference, but if you want consistency, you might want to change here or in GETF2/GETRF.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was a tough choice. Having it as a template parameter allows the if statement to be compiled out, but in exchange the function has to be compiled twice. Since we've had issues with our binary size and the cost of one if statement shouldn't be too expensive, I figured this approach made more sense.

I'm happy to update GETF2/GETRF to maintain consistency, but I'm concerned that might create conflicts with the work you're doing. If you're okay with it, I'll proceed with updating the functions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with either approach... (If you want to change GETF2/GETRF it should be ok; either you or I would have to resolve some conflicts depending on which PR goes in first, but it should be easy to do)

Now, supposing that we eventually apply the explicit instantiation of templates across the library as discussed before (to avoid repeating symbols for normal and strided_batched cases), would you keep PIVOT as a template parameter? In that case, better to leave it for now...

Copy link
Collaborator Author

@tfalders tfalders Aug 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we explicitly instantiate the templates for GETF2/GETRF, I'd probably still want to remove PIVOT as a template parameter. The savings of compiling out one conditional do not seem worth instantiating two versions of the same function.

Since your PR for GETF2/GETRF is planned to be opened soon, I think the easiest approach is for me to open a PR against your PR with the changes to PIVOT. I'll leave this PR as-is for now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me!

library/src/lapack/roclapack_getri.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@cgmb cgmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a couple minor comments / questions / suggestions.

The main thing is just that there should be a change log entry.

@tfalders tfalders merged commit 4b3419a into ROCm:develop Aug 16, 2021
jzuniga-amd pushed a commit to jzuniga-amd/rocSOLVER that referenced this pull request Aug 19, 2021
* Drop gfx803 from default build architectures (ROCm#288)
* Added getri_npvt (ROCm#305)
* Added nullptr checks for ipiv in getri
* Implemented getri_npvt routines
* Added test cases for getri_npvt routines
* Updated changelog
* Minor corrections
* Changed pivot from a template argument to a function argument
* Add pivot argument to template functions
jzuniga-amd pushed a commit to jzuniga-amd/rocSOLVER that referenced this pull request Aug 19, 2021
* Drop gfx803 from default build architectures (ROCm#288)
* Added getri_npvt (ROCm#305)
* Added nullptr checks for ipiv in getri
* Implemented getri_npvt routines
* Added test cases for getri_npvt routines
* Updated changelog
* Minor corrections
* Changed pivot from a template argument to a function argument
* Add pivot argument to template functions
jzuniga-amd pushed a commit to jzuniga-amd/rocSOLVER that referenced this pull request Sep 8, 2021
* Added nullptr checks for ipiv in getri

* Implemented getri_npvt routines

* Added test cases for getri_npvt routines

* Updated changelog

* Minor corrections
jzuniga-amd pushed a commit to jzuniga-amd/rocSOLVER that referenced this pull request Sep 8, 2021
* Added nullptr checks for ipiv in getri

* Implemented getri_npvt routines

* Added test cases for getri_npvt routines

* Updated changelog

* Minor corrections
jzuniga-amd pushed a commit that referenced this pull request Sep 8, 2021
* Added nullptr checks for ipiv in getri

* Implemented getri_npvt routines

* Added test cases for getri_npvt routines

* Updated changelog

* Minor corrections
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants