Unbalanced stack in x86 amax_sse.S #96
Comments
Hi IanRS, I just checked the code. The samax subroutine uses amax_sse.S. Because samax is not part of Level 1 BLAS, we didn't test this function. I think it is an extension by Goto. I will write a test case for it. Xianyi
Hi Xianyi, I used to use GotoBLAS, but it did not compile for the latest processors. Partway through patching it to treat Nehalem the same as an older model, I found OpenBLAS and switched to that. A number of routines have code available but do not appear in the 'standard' BLAS, or at least not via the cblas_ interface. For example, iXamax (X = s/d/c/z) has a cblas_ interface, but iXamin does not. Neither do Xamax, Xamin, iRmax, iRmin, Rmax, Rmin or Rcabs1 (R = S or D, but not C or Z). Some of these are also supplied by the ATLAS BLAS library, sometimes with the prefix atlas_ instead of cblas_, acknowledging that they are not part of the standard BLAS library but may still be useful to users of that library.
Hi Ian, Thank you for this report. Because we plan to make a new release this week, we don't have enough time to support those functions. I will add a feature request for the next release (0.1.2 or 0.2.0). Xianyi
There is a call to RESTOREREGISTERS in kernel/x86/amax_sse.S but no matching earlier call to SAVEREGISTERS. As a result, the stack pointer is wrong when the results are stored and the registers are popped after label L999, and the ret tries to jump to a data location instead of returning to the calling routine.
No other x86 kernel module calls RESTOREREGISTERS, so it is probably safe to remove. Otherwise, a SAVEREGISTERS should be inserted at line 79.
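A test case for samax needs a reference value to compare the kernel's result against. The following is a minimal sketch in plain C, assuming that samax returns the largest absolute value in the vector and that the index variant reports the first such element with a 0-based index. The names ref_samax and ref_isamax are hypothetical helpers for illustration and are not part of OpenBLAS; returning 0 for an empty vector or non-positive stride is also an assumed convention.

```c
#include <stddef.h>

/* Absolute value without libm, so the sketch links with no extra flags. */
static float abs_f(float v)
{
    return v < 0.0f ? -v : v;
}

/* Hypothetical reference for samax: largest |x[i * incx]| over n elements.
 * Returns 0 when n <= 0 or incx <= 0 (an assumed convention). */
static float ref_samax(int n, const float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0.0f;

    float best = abs_f(x[0]);
    for (int i = 1; i < n; i++) {
        float a = abs_f(x[(size_t)i * (size_t)incx]);
        if (a > best)
            best = a;
    }
    return best;
}

/* Hypothetical reference for isamax: 0-based position (in units of incx)
 * of the first element with the largest absolute value. */
static size_t ref_isamax(int n, const float *x, int incx)
{
    if (n <= 0 || incx <= 0)
        return 0;

    size_t best_i = 0;
    float best = abs_f(x[0]);
    for (int i = 1; i < n; i++) {
        float a = abs_f(x[(size_t)i * (size_t)incx]);
        if (a > best) {          /* strict '>' keeps the first maximum */
            best = a;
            best_i = (size_t)i;
        }
    }
    return best_i;
}
```

A test could call the library routine on the same input and compare its result with ref_samax. With the unbalanced stack described above, such a test would be expected to crash on return rather than report a wrong value.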