Increase rtol in thermoToYaml for ppcle64/aarch64 systems #1271
Can we bisect to find out why this happened? There haven't been a huge number of changes, so I'm surprised this is needed. |
It'll take me a few days to get to it, but sure |
Can you link to the page with the successful runs for 2.6.0b2? I'm curious if there are any differences in the library or compiler versions that could be responsible, besides changes to Cantera. Are you running these builds only when we tag a version, or are these run on every update to the |
The last four builds at https://koji.fedoraproject.org/koji/packageinfo?packageID=35082 (v2.6.0-18) are the successful builds on all supported architectures for the four current Fedoras: 34-36 and rawhide (37). @speth I will try to investigate a CI process with the Fedora build system(s) to improve support for alternate architectures. |
So it looks like things started to break with Cantera-2.6.0-14 in mid April? |
@ischoegl v2.6.0b2 broke my build routines ("spec file"), but that is separate from the current test failures on alternate 64 bit arches. The current failures are not listed on koji, only on copr (first links, not latter in response to @speth) as getting burned last time led to me not formally submitting to Fedora yet. |
There are definitely differences in the system package versions between these builds, at least for FC37. At the very least, @mefuller - can you point to the COPR logs for FC35 and FC36, which you also indicated were failing. Is there a link to a more general status page that would provide access to additional information? |
Looking through the logs makes it pretty clear why this started failing:
PR #1072, which was merged just after the 2.6.0b2 tag was specifically about changing the calculation of the partial molar volumes ( |
@speth logs for complete matrix of Fedora versions and architectures (except s390x) are at https://copr.fedorainfracloud.org/coprs/fuller/cantera-test/build/4357916/ FWIW, the modification I propose here to |
Also, I'm not super excited by the idea of just arbitrarily relaxing the tolerance on this test. On the two platforms that I have easy access to (macOS/arm64 and Ubuntu/x86_64), I can actually tighten these tolerances by quite a bit, since they ought to differ only by a modest roundoff error. Specifically, I can use a tolerance of 2e-15 for the entire I think it would be worth trying to understand why these platforms can only meet tolerances of 1.2e-14. Unfortunately, investigating this seems like it will be quite difficult unless someone has full access to machines with these architectures. |
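For a sense of scale, the two tolerances quoted above can be expressed in units of double-precision machine epsilon. This is only a minimal sketch, assuming IEEE 754 doubles; the variable names are illustrative and are not taken from Cantera's test suite.

```python
import sys

# Machine epsilon for IEEE 754 double precision (2**-52).
eps = sys.float_info.epsilon

# Tolerances quoted in the comment above (names are hypothetical).
tight_rtol = 2e-15     # reported to pass on macOS/arm64 and Ubuntu/x86_64
needed_rtol = 1.2e-14  # reported as the best these other platforms can meet

# Express each tolerance as a multiple of machine epsilon.
tight_in_eps = tight_rtol / eps
needed_in_eps = needed_rtol / eps

print(f"tight:  {tight_in_eps:.1f} eps")
print(f"needed: {needed_in_eps:.1f} eps")
```

The tighter tolerance is about 9 epsilons and the looser one about 54, so the gap is a factor of six in accumulated roundoff, which is more than a single differently rounded operation would usually explain.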
@speth @bryanwweber @ischoegl I am really sorry about kicking the hornet's nest here - I started a draft PR intending for it to be a draft while I investigated this, including establishing what the relaxation would need to be to get tests passing and also seeing if there was anything else out there. I didn't realize it would grab any attention right away. From the perspective of Fedora packaging and (realistic) end-use cases, I don't believe (temporarily) excluding these architectures is a big deal, and I can do that. I can keep building against them routinely to check for improvements, since that's currently my technical level (I know big-endian and little-endian have to do with Gulliver's Travels, less what the practical implications for precision might be, etc.). I'm content to follow all of you ("upstream") on this one - I was just trying to propose a solution/work-around concurrent with the problem to be less of a pest, not more. |
@mefuller No worries, and apologies for the flurry of activity here. I think we're just very eager to make sure that the new release is working everywhere that we expect it to. |
I likewise agree! |
0d548b2 to e226e5e
What is the current status on this? |
e226e5e to 21380ea
Just rebased |
Personally, I’m not opposed to adopting this tweak; perhaps you could be a little more descriptive in the commit comment? |
@mefuller ... is this PR still relevant or can it be closed? |
For Fedora builds, I just patch it. I don't think this needs to be upstreamed especially as I haven't reevaluated whether it's still necessary at this time. |
Changes proposed in this pull request
rtol for thermoToYaml increased by 20% to allow tests to pass on ppc64le and aarch64 builds
Problem was identified while building v2.6.0 for Fedora Linux 35+; no issue was present in v2.6.0b2:
Relevant logs from v2.6.0 builds:
https://download.copr.fedorainfracloud.org/results/fuller/cantera-test/fedora-rawhide-aarch64/04357916-cantera/builder-live.log.gz
https://download.copr.fedorainfracloud.org/results/fuller/cantera-test/fedora-rawhide-ppc64le/04357916-cantera/builder-live.log.gz
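To illustrate what a 20% increase in rtol means for a comparison, here is a small sketch of a relative-tolerance check. The helper `within_rtol` is hypothetical and is not Cantera's test code, and the base value of 1e-14 is an assumption inferred from the 1.2e-14 figure quoted in the conversation; the sample values are made up for illustration.

```python
def within_rtol(actual, expected, rtol):
    """Return True if actual matches expected to within a relative tolerance."""
    return abs(actual - expected) <= rtol * abs(expected)

# Assumed base tolerance (not confirmed by the PR text) and the 20% relaxation.
base_rtol = 1e-14
relaxed_rtol = base_rtol * 1.2

# Hypothetical values: a result that differs from the reference
# by a relative error of roughly 1.1e-14.
expected = 1.0
actual = expected * (1 + 1.1e-14)

print(within_rtol(actual, expected, base_rtol))     # fails the base tolerance
print(within_rtol(actual, expected, relaxed_rtol))  # passes the relaxed one
```

A result like this falls just outside the assumed base tolerance and just inside the relaxed one, which is the kind of marginal failure the change is meant to absorb.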
Checklist
Build passes (scons build & scons test) and unit tests address code coverage