Conversation

@casparvl
Contributor

Also, add some suggestion for future comparison to a reference (in comments).

This is initial work toward checking that the build node we get allocated supports the expected set of CPU instruction flags.

echo "bot/build.sh: EESSI_ACCELERATOR_TARGET_OVERRIDE='${EESSI_ACCELERATOR_TARGET_OVERRIDE}'"

# check if CPU architecture of the build host matches our expectation
lscpu_flags_line=$(lscpu | grep "Flags:")

Contributor

I would grab the full lscpu output, in a separate file?

Other fields (like Model name), and also additional info like the host OS (/etc/os-release and /etc/redhat-release), would be relevant to grab.

Contributor

Agreed. A first step could just be to write the output of lscpu to a file named _bot_job{job_id}.lscpu
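A minimal sketch of that first step, assuming a hypothetical helper `log_host_info` that takes an output directory and a job id as arguments (how the bot actually exposes its workdir and job id is not shown in this thread). It writes the full lscpu output and the OS release files next to each other, using the `_bot_job{job_id}` prefix suggested above:

```shell
# Hypothetical helper (not part of bot/build.sh): dump host information
# into per-job files. Both arguments are assumptions for illustration.
log_host_info() {
    local outdir="$1"   # directory to write into, e.g. the bot's workdir
    local job_id="$2"   # job id used in the file names

    # Full lscpu output (Model name, Flags, ...), if lscpu is available
    if command -v lscpu >/dev/null 2>&1; then
        lscpu > "${outdir}/_bot_job${job_id}.lscpu"
    fi

    # Host OS details; /etc/redhat-release only exists on RHEL-like systems
    local f
    for f in /etc/os-release /etc/redhat-release; do
        if [[ -r "${f}" ]]; then
            cat "${f}" > "${outdir}/_bot_job${job_id}.$(basename "${f}")"
        fi
    done
    return 0
}
```

It could be called as `log_host_info "${PWD}" "${SLURM_JOB_ID}"`, assuming the job runs under Slurm and the current directory is where the files should land.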

Contributor Author

Well, we could definitely do both. I'd be happy if a file just gets dumped in the workdir of the bot, like Thomas proposes. But to compare the flags against a reference, you definitely want the list of flags extracted separately, without any context. And, as stated in the comment below this code, you can quite easily compare (sorted) bash arrays to spot any difference.
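To illustrate the kind of comparison meant here, a rough sketch follows. It is not the code from this PR: the function name `extract_sorted_flags`, the sample lscpu text, and the contents of `reference_flags` are all made up for the example. The flags are pulled out of lscpu-style text into a sorted bash array, and `comm` then reports which reference flags are absent:

```shell
# Read lscpu-style text on stdin and print the flags, one per line, sorted.
# Leading whitespace is allowed, since newer lscpu versions indent fields.
extract_sorted_flags() {
    grep -m1 -E '^[[:space:]]*Flags:' \
        | sed -E 's/^[[:space:]]*Flags:[[:space:]]*//' \
        | tr -s ' ' '\n' \
        | LC_ALL=C sort -u
}

# Hypothetical reference list; a real one would be defined per CPU target
reference_flags=(avx avx2 fma sse4_2)

# Sample input standing in for the output of `lscpu` on the build host
sample_lscpu=$'Model name:  Example CPU\nFlags:       fpu sse4_2 avx fma'
mapfile -t cpu_flags < <(extract_sorted_flags <<< "${sample_lscpu}")

# comm requires sorted input; -23 keeps lines that appear only in the first
mapfile -t missing < <(LC_ALL=C comm -23 \
    <(printf '%s\n' "${reference_flags[@]}" | LC_ALL=C sort -u) \
    <(printf '%s\n' "${cpu_flags[@]}"))

echo "missing flags: ${missing[*]}"   # prints: missing flags: avx2
```

On a real build host the sample text would be replaced by `extract_sorted_flags < <(lscpu)`.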

Contributor Author

@casparvl Nov 12, 2025

In other words, I think there are two goals here:

  1. Better logging, in which case we want to include as much info as possible (full lscpu and os-release output)
  2. Runtime checking of the supported Flags, and producing a hard abort if they are not as expected

I've implemented the 2nd, you want the first, I propose we do both ;-)

Contributor

I don't see the hard abort. It's anticipated yes, but not yet there, right.

Let's try to move quickly. Log the output of lscpu + os-release into a file or two, keep grabbing the flags, and print them to the job output. By putting this into production quickly, we already start gathering information.

Contributor Author

> I don't see the hard abort. It's anticipated yes, but not yet there, right.

You're right, it's not there. This was just preparation: it only logs the Flags and doesn't do anything with them yet.

Contributor Author

@casparvl Nov 12, 2025

Added, see above.

FYI: I would like to keep what I have in terms of the bash array. Once we have a reference, it'll be a one-line change (loading the reference) + uncommenting the code below to implement the hard fail. No need to reinvent that later...
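For readers following along, one possible shape of such a hard fail is sketched below. This is an assumption-laden illustration, not the commented-out code from the PR: the function name `check_flags_against_reference`, the reference file, and its format (one flag per line) are all hypothetical.

```shell
# Hypothetical check: succeed only if every flag listed in the reference
# file (assumed format: one flag per line) is among the flags passed in.
check_flags_against_reference() {
    local reference_file="$1"
    shift   # the remaining arguments are the flags found on the build host

    local -a missing
    # comm -23 prints lines found only in the (sorted) reference list
    mapfile -t missing < <(LC_ALL=C comm -23 \
        <(sed '/^[[:space:]]*$/d' "${reference_file}" | LC_ALL=C sort -u) \
        <(printf '%s\n' "$@" | LC_ALL=C sort -u))

    if (( ${#missing[@]} > 0 )); then
        echo "ERROR: build host lacks expected CPU flags: ${missing[*]}" >&2
        return 1
    fi
    return 0
}
```

In a build script the call could look like `check_flags_against_reference "${reference_file}" "${cpu_flags[@]}" || exit 1`, where both variable names are placeholders. Extra flags on the host are deliberately ignored here; whether they should also trigger a failure is a policy question this thread leaves open.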

Contributor

@trz42 left a comment

lgtm - let's get this integrated, then iterate to add missing functionality.

@trz42 merged commit 85ead5d into EESSI:main on Nov 12, 2025
64 checks passed
3 participants