Incorrect arch detection #15151
Comments
Can you post the "flags" of both nodes in |
I lost my allocation on my "correct" node (cn123), but I was able to grab another node (cn126) with the same processor (which Spack identifies correctly as a Haswell). On cn126 (correct arch detection):
On cn141 (incorrect arch detection):
Looks like cn126 has the flag "aes", while cn141 does not.
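A comparison like this can be scripted rather than done by eye. The following is a minimal sketch, assuming the flags were read from the `flags` line of Linux's `/proc/cpuinfo` on each node. The helper names `cpu_flags` and `compare_flags` are hypothetical, and the sample text is made up and abbreviated, not the actual output from cn126 or cn141.

```python
def cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found


def compare_flags(text_a, text_b):
    """Return (flags only in A, flags only in B), each sorted."""
    a, b = cpu_flags(text_a), cpu_flags(text_b)
    return sorted(a - b), sorted(b - a)


# Made-up, abbreviated samples (NOT real output from the nodes in this issue).
sample_a = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 aes\n"
sample_b = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2\n"

only_a, only_b = compare_flags(sample_a, sample_b)
print("only in first:", only_a)
print("only in second:", only_b)
```

To use it on real nodes, one could save each node's `/proc/cpuinfo` to a file and pass the file contents to `compare_flags`.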
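Our detection is mainly targeted towards binary compatibility, and as such we started from the instruction sets mentioned in the GCC manual. For `haswell` there's mention of `AES`:
`haswell`: Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI, BMI2 and F16C instruction set support.
as there is for `ivybridge`, `sandy-bridge` and `westmere`. `nehalem` is the first architecture in the hierarchy that does not have support for that. I would say that detection is working correctly for Spack, as it ensures that the binaries that are generated would run on the node.
Now, it would be interesting to understand why there's no `aes` on that node... Do they have all the same kernel and OS?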
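To illustrate why a single missing flag can change the detected target, here is a toy sketch of feature-set-based detection. It is not Spack's actual code or data: the table below is a simplified assumption loosely following the GCC manual's per-architecture instruction lists, the flag spellings are illustrative, and `required_features` and `detect` are hypothetical names.

```python
# Toy feature table: each generation adds features on top of the previous one.
# Simplified and illustrative; NOT Spack's real microarchitecture data.
GENERATIONS = [
    ("nehalem", {"ssse3", "sse4_1", "sse4_2", "popcnt"}),
    ("westmere", {"aes", "pclmulqdq"}),
    ("sandybridge", {"avx"}),
    ("ivybridge", {"fsgsbase", "rdrand", "f16c"}),
    ("haswell", {"movbe", "avx2", "fma", "bmi1", "bmi2"}),
]


def required_features():
    """Map each target name to the full set of features it requires."""
    table = {}
    accumulated = set()
    for name, added in GENERATIONS:
        accumulated = accumulated | added  # new set each time, no aliasing
        table[name] = accumulated
    return table


def detect(host_flags):
    """Return the newest target whose required features the host all has."""
    table = required_features()
    for name, _ in reversed(GENERATIONS):
        if table[name] <= host_flags:  # subset test
            return name
    return "x86_64"  # generic fallback in this toy model


full = required_features()["haswell"]
print(detect(full))            # haswell
print(detect(full - {"aes"}))  # nehalem
```

In this toy model, removing `aes` from an otherwise complete Haswell flag set rules out every target from `westmere` upward, so the result falls back to `nehalem`, the first entry that does not require it.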
Same OS
```
[quellyn@cn126 Scratch]$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
[quellyn@cn141 ~]$ cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
```
Same kernel, too:
```
[quellyn@cn126 Scratch]$ uname -r
3.10.0-1062.9.1.el7.x86_64
[quellyn@cn141 ~]$ uname -r
3.10.0-1062.9.1.el7.x86_64
```
I'll ping the admins and see if there are any differences in BIOS
firmware version or settings for cn141.
Thanks!
Q
On 2/21/20 10:53 AM, Massimiliano Culpo wrote:
Our detection is mainly targeted towards binary compatibility and as
such we started from the instruction set mentioned in GCC manual. For
|haswell| there's mention of |AES|:
|haswell Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE,
SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL,
FSGSBASE, RDRND, FMA, BMI, BMI2 and F16C instruction set support. |
as there is for |ivybridge|, |sandy-bridge| and |westmere|. |nehalem|
is the first architecture in the hierarchy that does not have support
for that. I would say that detection is working correctly for Spack,
as it ensures that the binaries that are generated would run on the node.
Now, it would be interesting to understand why there's no |aes| on
that node... Do they have all the same kernel and OS?
--
Quellyn L Snead
XCP-1, Lagrangian Codes
Los Alamos National Laboratory
Office: 505-667-4185
After talking to Massimiliano I'm pretty convinced this is NOT a Spack problem. I've put in a ticket with our cluster admins to see if there are BIOS differences between the Haswell nodes. Thanks so much for the help in troubleshooting!
@boegel: FYI -- another reason not to use |
@tgamblin I just brought up |
Hi guys,
On our local Frankencluster, I've noticed an odd inconsistency with Spack's arch detection. This cluster is composed of many flavors of x86_64, Power 9, and ARM nodes, all running CentOS Linux release 7.7.1908 (Core). My particular issue is with our x86_64 nodes.
Example 1: On node cn123, with a fresh Spack instance:
The node itself agrees with this assessment:
Example 2: On node cn141, with a fresh Spack instance:
But this node disagrees with Spack; it thinks it's a Haswell also:
I'm afraid I don't understand the magic of Spack's arch detection well enough to even start looking for a root cause. If you could give me a hint as to where to start that would be great.
Thanks!
Quellyn
P.S. This is my first time opening an issue; please let me know if I've left something out.