aobasis: fallback to dgemm if libxsmm kernel unavailable for contraction #1629
Thank you for catching this and fixing it! I consider this a bug. For all recent LIBXSMM versions, or at least since some point, we decided never to return a NULL pointer for JIT requests as long as the usual preconditions are met (alpha and beta requirements, etc.). It could be a bug related to requesting an SSE kernel. I will probably try to reproduce this to make sure it does not happen in our next release. Which version of LIBXSMM exposed this problem? I guess 1.16.1 ...
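The fallback named in the PR title can be sketched roughly as follows. This is only an illustration of the pattern, not the code in this PR: `request_kernel`, `reference_gemm`, and `contract` are hypothetical names, the dispatcher stand-in always returns NULL to model the failing case, and a plain triple loop stands in for the BLAS dgemm call.

```c
#include <stddef.h>

/* Hypothetical kernel signature: C += A * B on column-major operands. */
typedef void (*gemm_kernel_fn)(const double *a, const double *b, double *c);

/* Hypothetical stand-in for a JIT dispatcher. It always returns NULL here
 * to model the "kernel unavailable" case discussed in this thread. */
static gemm_kernel_fn request_kernel(int m, int n, int k) {
    (void)m; (void)n; (void)k;
    return NULL;
}

/* Plain triple loop standing in for a BLAS dgemm call:
 * C (m x n) += A (m x k) * B (k x n), all column-major. */
static void reference_gemm(int m, int n, int k,
                           const double *a, const double *b, double *c) {
    for (int j = 0; j < n; ++j)
        for (int p = 0; p < k; ++p)
            for (int i = 0; i < m; ++i)
                c[i + j * m] += a[i + p * m] * b[p + j * k];
}

/* Use the specialized kernel when one is available, otherwise fall back.
 * Returns 1 if the kernel was used and 0 if the fallback ran. */
static int contract(int m, int n, int k,
                    const double *a, const double *b, double *c) {
    gemm_kernel_fn kernel = request_kernel(m, n, k);
    if (kernel != NULL) {
        kernel(a, b, c);
        return 1;
    }
    reference_gemm(m, n, k, a, b, c);
    return 0;
}
```

The point of the pattern is that the function pointer is checked before it is called, so a missing kernel costs performance rather than causing a segfault.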
@marci73 this fixes the regtest segfaults on tcopt9. After this change, all regtests except one pass (see below):
7af619c to a7424fd
Ok, so should we rather do For reproducing the issue:
Yes, this is with libxsmm-1.16.1.
Thank you for sharing the reproducer! I am on it (as a side-task). I am doing
I have a hard time reproducing the problem. The output (single rank) looks like:
This is LIBXSMM's termination message showing statistics about generated kernels, which looks fine (apart from terminating after success). This is a debug build of CP2K (PSMP). I can try other builds if you think they would be a better match. I used the GNU compiler to build CP2K/master, etc. I wonder if you could reach out (PM) and help reactivate my access to the UZH Portal?
Sure, account reactivated and mail with information sent :)
I have root-caused the problem, and the result suggests that this PR perhaps should not be merged. Essentially, LIBXSMM detects SSE4 via CPUID, which includes checking whether the OS permits using the extension (for example, saving state with the XSAVE instruction on context switch). The OS does not seem to permit using SSE4 on this specific system (I may take a deeper look at why). However, the current master of LIBXSMM changed this behavior and uses the extension anyway, specifically in the case of SSE4 (I think we came across such situations at least with some VMs). There are now two options:
The former case assumes that, after 20 years of SSE extensions, any OS will support and use XSAVE even if this is not correctly signaled. In the latter case, LIBXSMM takes the requested code path without further moderation (
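For readers unfamiliar with the CPUID bits mentioned above, here is a minimal sketch of how they can be read. It is not LIBXSMM's actual detection logic. It assumes GCC or Clang on x86 (for `<cpuid.h>` and `__get_cpuid`), and the names `leaf1_features`, `decode_leaf1_ecx`, and `query_leaf1_features` are hypothetical. The bit positions are those of CPUID leaf 1, register ECX.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang helper header (assumption: one of these compilers) */
#endif

/* Hypothetical container for the CPUID leaf 1 ECX bits of interest. */
typedef struct {
    int sse4_1;  /* ECX bit 19: CPU supports SSE4.1 */
    int sse4_2;  /* ECX bit 20: CPU supports SSE4.2 */
    int xsave;   /* ECX bit 26: CPU supports XSAVE/XRSTOR */
    int osxsave; /* ECX bit 27: OS has enabled XSAVE */
} leaf1_features;

/* Decode the feature bits from a raw ECX value of CPUID leaf 1. */
static leaf1_features decode_leaf1_ecx(uint32_t ecx) {
    leaf1_features f;
    f.sse4_1  = (int)((ecx >> 19) & 1u);
    f.sse4_2  = (int)((ecx >> 20) & 1u);
    f.xsave   = (int)((ecx >> 26) & 1u);
    f.osxsave = (int)((ecx >> 27) & 1u);
    return f;
}

/* Query the running CPU; every field is 0 on non-x86 targets
 * or if leaf 1 is not available. */
static leaf1_features query_leaf1_features(void) {
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
#if defined(__x86_64__) || defined(__i386__)
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        ecx = 0;
    }
#endif
    (void)eax; (void)ebx; (void)edx;
    return decode_leaf1_ecx((uint32_t)ecx);
}
```

On a system like the one described in this thread, one would expect the SSE4 bits to be set while the OS-related bit is not, which is the combination in which a strict check and a permissive check disagree.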
I have prepared LIBXSMM 1.16.2. The option mentioned above remains a viable workaround as well (
I have released LIBXSMM 1.16.2. So, you can decide which of the above solutions you prefer. Though,
a7424fd to f417d66
@oschuett can you please update the libxsmm tarball on the mirror? It seems port 22 is now closed on sham.cp2k.org.
Voilà: https://www.cp2k.org/static/downloads/libxsmm-1.16.2.tar.gz
You were probably banned by fail2ban - try again.
f417d66 to 8584dfa
this fixes segfaults occurring on Intel Westmere-EP
8584dfa to d846844