gt-blas: sycl / mkl oneAPI blas uses 64bit index type (ILP64 only) #127
Comments
I'm not really sure how much hassle it involves, but I think the best solution would be to define a GTBLAS_INTEGER_KIND or something, and use that on the Fortran side to create the right kind of index arrays in the first place. PETSc does something like that, I believe.
That is basically what I do now for the C++ layer - there is a
A static global won't work well; it needs to be a compile-time constant so that the Fortran compiler can use the correct type. I.e., it basically requires a macro to switch the type inside of Fortran (I think one might be able to make the kind parameter itself a Fortran compile-time constant, but that'd still need switching based on a macro, so using a macro directly is probably simpler).
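To illustrate the compile-time switch being discussed, here is a minimal C++ sketch of an index type selected by a preprocessor macro. The names `EXAMPLE_BLAS_ILP64` and `example_index_t` are hypothetical and not part of gt-blas; the Fortran side would need an analogous macro-based switch of its own, since it cannot reuse a C++ alias.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical build-time switch (not a real gt-blas macro): define
// EXAMPLE_BLAS_ILP64 when the backend requires 64-bit indices.
#if defined(EXAMPLE_BLAS_ILP64)
using example_index_t = std::int64_t;
#else
using example_index_t = std::int32_t;
#endif

// The choice is a compile-time constant, so it can size pivot arrays
// and be checked with static_assert.
constexpr std::size_t example_index_bytes = sizeof(example_index_t);

static_assert(std::is_integral<example_index_t>::value,
              "index type must be an integer type");
static_assert(example_index_bytes == 4 || example_index_bytes == 8,
              "index type must be 32 or 64 bits wide");
```

Because the alias is fixed at compile time, a mismatch between the caller's index width and the library's is caught by the compiler rather than at run time.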
I was imagining it could be an opaque C_PTR in Fortran, but I guess that may not be the case. I'm inclined to do this via cmake/Makefile hackery in client code based on GTENSOR_DEVICE_X. With include files, it would have to be Fortran only and couldn't reuse the existing type alias in any way I can think of, so it requires ugly duplication no matter what. If it's going to be a hack, I'd rather make it the simplest / laziest version.
I guess for this particular case you're right, it could be handled entirely opaquely on the Fortran side -- but I still don't much like that (e.g., we already had the case where someone wanted to actually look at the pivots). And it doesn't really handle the more general case where indices are actually calculated/used on the Fortran side (I think that's the case for the sparse solver, currently, though cusparse is a bit of a separate problem).
The GPU API and the CPU API are largely separate, so it's possible to use LP64 for CPU and link CPU MKL appropriately, and the GPU API will still use ILP64. This will likely break if using SYCL_DEVICE_FILTER=host, but that is not something we are trying to support anyway, even if it might occasionally be useful for debugging.
MKL oneAPI doesn't currently support LP64. This shows up for getrf/getrs calls for the pivot array. Since this is allocated on device, there is not an easy way around it, especially re the Fortran interface for GENE. Some options:
My inclination is to do (1) for now, and if it becomes a performance bottleneck we can revisit. If it's a big batch, then the computation should dominate anyway, so extra pivot array copies shouldn't be too horrible. Also, in the future LP64 may be supported better in oneAPI, then we can switch and they will all be consistent.
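As a rough illustration of what an extra pivot array copy could look like, here is a host-side C++ sketch that converts between 64-bit and 32-bit pivot arrays. It assumes the copies mentioned above mean width conversion between the ILP64 pivots and a 32-bit array seen by the caller; the option list is not shown here, so this is a guess at the shape of the workaround, not the actual gt-blas implementation. The function names are hypothetical, and real code would run the conversion on the device rather than on `std::vector`.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: narrow 64-bit pivots to 32-bit, refusing values
// that do not fit instead of silently truncating them.
std::vector<std::int32_t> example_narrow_pivots(
  const std::vector<std::int64_t>& piv64)
{
  std::vector<std::int32_t> piv32;
  piv32.reserve(piv64.size());
  for (std::int64_t p : piv64) {
    if (p < std::numeric_limits<std::int32_t>::min() ||
        p > std::numeric_limits<std::int32_t>::max()) {
      throw std::range_error("pivot does not fit in 32 bits");
    }
    piv32.push_back(static_cast<std::int32_t>(p));
  }
  return piv32;
}

// Hypothetical helper: widen 32-bit pivots to 64-bit (always lossless).
std::vector<std::int64_t> example_widen_pivots(
  const std::vector<std::int32_t>& piv32)
{
  return std::vector<std::int64_t>(piv32.begin(), piv32.end());
}
```

Each conversion is a single linear pass over the pivot array, which is small next to the cost of the factorization itself for a large batch.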