Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.2.X] TRMM functions do not have correct correspondence in rocBLAS #1265

Closed
emankov opened this issue Sep 19, 2022 · 2 comments · Fixed by ROCm/HIPIFY#640
Closed

[5.2.X] TRMM functions do not have correct correspondence in rocBLAS #1265

emankov opened this issue Sep 19, 2022 · 2 comments · Fixed by ROCm/HIPIFY#640
Assignees

Comments

@emankov
Copy link

emankov commented Sep 19, 2022

rocBLAS TRMM functions rocblas_strmm, rocblas_dtrmm, rocblas_ctrmm, rocblas_ztrmm do not match neither cublas TRMM functions, nor cublas TRMM _v2 functions.

For instance:

cublasStrmm:

void CUBLASWINAPI cublasStrmm(char side,
                              char uplo,
                              char transa,
                              char diag,
                              int m,
                              int n,
                              float alpha,
                              const float* A,
                              int lda,
                              float* B,
                              int ldb);

cublasStrmm_v2:

CUBLASAPI cublasStatus_t CUBLASWINAPI cublasStrmm_v2(cublasHandle_t handle,
                                                     cublasSideMode_t side,
                                                     cublasFillMode_t uplo,
                                                     cublasOperation_t trans,
                                                     cublasDiagType_t diag,
                                                     int m,
                                                     int n,
                                                     const float* alpha, /* host or device pointer */
                                                     const float* A,
                                                     int lda,
                                                     const float* B,
                                                     int ldb,
                                                     float* C,
                                                     int ldc);

rocblas_strmm:

ROCBLAS_EXPORT rocblas_status rocblas_strmm(rocblas_handle    handle,
                                            rocblas_side      side,
                                            rocblas_fill      uplo,
                                            rocblas_operation transA,
                                            rocblas_diagonal  diag,
                                            rocblas_int       m,
                                            rocblas_int       n,
                                            const float*      alpha,
                                            const float*      A,
                                            rocblas_int       lda,
                                            float*            B,
                                            rocblas_int       ldb);

The same goes for hipBLAS analogues hipblasStrmm, hipblasDtrmm, hipblasCtrmm, hipblasZtrmm (ROCm/hipBLAS#524).

So, the above 4 rocBLAS and 4 hipBLAS functions are marked as HIP UNSUPPORTED.

[Solution]
As far as rocBLAS doesn't support v1 BLAS functions, populate rocblas TRMM functions with two missing arguments: float* C and int ldc and revise functions' logic.

@daineAMD
Copy link
Contributor

Hi @emankov,

For trmm functions, cuBLAS differs from the traditional BLAS API as cuBLAS includes two extra parameters, C and ldc (cuBLAS trmm doc, lapack doc). rocBLAS has typically followed the BLAS API as close as possible, resulting in this difference.

We have, however, implemented rocblas_xtrmm_outofplace functions which have the 2 extra parameters being requested. This function supports out-of-place and in-place functionality for trmm. We do not plan on changing the current rocblas_xtrmm API as the rocblas_xtrmm_outofplace API will take care of this functionality.

I hope this helps,
Daine

emankov added a commit to emankov/HIPIFY that referenced this issue Sep 22, 2022
+ Mark rocBLAS TRMM functions rocblas_(s|d|c|z)trmm_outofplace, as supported only for TRMM v2 CUDA analogues
+ Mark hipBLAS TRMM functions hipblas(S|D|C|Z)trmm as HIP_UNSUPPORTED
+ Regenerate and update docs and hipify-perl accordingly

[Reasons]
+ hipBLAS TRMM functions hipblas(S|D|C|Z)trmm, actually, do not match neither cublas TRMM functions, nor cublas TRMM _v2 functions: ROCm/hipBLAS#524
+ There is a correspondence between cuBLAS cublas_(s|d|c|z)trmm and rocBLAS TRMM rocblas_(s|d|c|z)trmm_outofplace, not rocblas_(s|d|c|z)trmm: fixed it

[ToDo]
+ Close ROCm/rocBLAS#1265 as erroneous
+ Remove HIP_UNSUPPORTED mark from hipblas(S|D|C|Z)trmm functions after merging ROCm/hipBLAS#504
+ Add cublas2rocblas and update cublas2hipblas synthetic tests
@emankov
Copy link
Author

emankov commented Sep 22, 2022

Thank you, @daineAMD!

I've renamed rocblas_xtrmm functions to rocblas_xtrmm_outofplace. Now they are always hipified from cublasXtrmm_v2 and from cublasXtrmm if cublas_v2.h is included in the source CUDA being hipified.

So closing this issue as ERRONEOUS.

rkamd pushed a commit to rkamd/rocBLAS that referenced this issue Sep 26, 2022
…rformance further. Related ticket SWDEV-325544 and LWPTENSILE-51. (ROCm#1265)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants