Improve error checking in PyTorch easyblock #3085

Merged

Conversation

Flamefire
Contributor

Factor the functions that extract test results from the log out of the test method, so they can be run directly on log files for easier testing/debugging when they fail.

It turns out that in e.g. #3070 (comment) there are some extra newlines and an extra message, which made the test fail. Handle that by a) removing any empty lines and b) checking for that extra line.
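A minimal sketch of that kind of clean-up, for illustration only. It is not the easyblock's code: the function name `normalize_test_output` and the constant name are hypothetical, and only the wording of the extra message is taken from this PR. Note that the sketch drops the extra line, whereas the PR may instead account for it inside its matching pattern.

```python
# Wording of the extra message as quoted in this PR; all names here are hypothetical.
CI_SKIP_INFO = ("If in CI, skip info is located in the xml test reports, "
                "please either go to s3 or the hud to download them")


def normalize_test_output(text):
    """Return the lines of `text` without empty lines and without the CI notice."""
    lines = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and the known extra message so later matching sees
        # only the lines it expects.
        if not stripped or stripped == CI_SKIP_INFO:
            continue
        lines.append(line.rstrip())
    return lines
```

Because the function takes a plain string, it can be tried on a saved log excerpt without running a build, which is the kind of testing the refactoring is meant to allow.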

Allow easier testing by invoking the extraction with a plain string
It also accepts a full build log, in which case it tries to find the test output
Remove empty lines before matching
There is an extra line "If in CI, skip info is located in the xml test reports, please either go to s3 or the hud to download them" which throws off the matching. Account for it when matching.
The built-in generic type hint for lists (e.g. `list[str]`) isn't valid until Python 3.9
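The last commit message refers to a Python language limitation: subscripting the built-in `list` in an annotation only works from Python 3.9 on, and the test reports in this PR ran under Python 3.6.8. How the PR resolved this is not stated here; one common option, sketched below with a hypothetical helper, is `typing.List`.

```python
from typing import List


# On Python < 3.9, writing `-> list[str]` here raises
# "TypeError: 'type' object is not subscriptable" when the function is defined.
# `typing.List[str]` is accepted by older Python 3 versions as well.
def non_empty_lines(text: str) -> List[str]:
    """Hypothetical helper: return the non-blank lines of `text`."""
    return [line for line in text.splitlines() if line.strip()]
```

Dropping the annotation entirely is the other obvious option when very old interpreters have to be supported.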
@boegel boegel changed the title Improve pytorch error checking Improve error checking in PyTorch easyblock Jan 30, 2024
@boegel boegel added this to the 4.x milestone Jan 30, 2024
@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

Build succeeded for 0 out of 1 (1 easyconfigs in total)
bear-pg0208u23a - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), 1 x NVIDIA NVIDIA A100-SXM4-40GB, 520.61.05, Python 3.6.8
See https://gist.github.com/branfosj/c0e2b99663ee495644c28c13464841d3 for a full test report.

@branfosj
Member

Test report by @branfosj

Overview of tested easyconfigs (in order)

  • SUCCESS PyTorch-2.1.2-foss-2023a.eb

Build succeeded for 1 out of 1 (1 easyconfigs in total)
bear-pg0105u03a - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz (icelake), Python 3.6.8
See https://gist.github.com/branfosj/32b81a3ee0eac24b096045ca16f55daa for a full test report.

@branfosj branfosj modified the milestones: 4.x, release after 4.9.0 Feb 13, 2024
@branfosj branfosj merged commit 1938fea into easybuilders:develop Feb 13, 2024
47 checks passed
@Flamefire Flamefire deleted the improve_pytorch_error_checking branch February 14, 2024 16:01

3 participants