build: Bump fmt minimum 7.0, fix fmt+gcc bugs #3973

Merged
merged 1 commit into from
Sep 5, 2023

Collaborator

@lgritz lgritz commented Sep 2, 2023

Bump fmt minimum to 7.0 for OIIO 2.5. fmt 7.0 is 3 years old, so let's stop doing extra work to support 6.x. It's not that we depend on newer features per se, but I'd like to no longer need separate reference output for some tests whose formatting behavior changed slightly between 6.x and 7.0.
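A minimum like this can also be checked at compile time against fmt's `FMT_VERSION` macro, which encodes the version as major * 10000 + minor * 100 + patch (so 7.0.0 is 70000 and 10.1.0 is 100100). The following is only a sketch: the helper names are hypothetical, and this is not necessarily how OIIO enforces the minimum (it may be done in the build system instead).

```cpp
// Sketch only: helper names are hypothetical, not part of fmt or OIIO.

// Decode fmt's FMT_VERSION integer (major * 10000 + minor * 100 + patch).
constexpr int fmt_major(int v) { return v / 10000; }
constexpr int fmt_minor(int v) { return (v / 100) % 100; }

// True if the encoded version v is at least major.minor.0.
constexpr bool fmt_at_least(int v, int major, int minor)
{
    return v >= major * 10000 + minor * 100;
}

// FMT_VERSION is only defined once an fmt header has been included,
// so the check is skipped when fmt is not visible at this point.
#ifdef FMT_VERSION
static_assert(fmt_at_least(FMT_VERSION, 7, 0), "fmt >= 7.0 is required");
#endif
```

With this in place, building against fmt 6.x fails with a clear message instead of producing subtly different formatting output.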

Rig CI to properly test "latest" and "bleeding edge" fmt. It turns out we had not yet been testing 10.1 or keeping up with their master.

In the process, I discovered that the combination of fmt >= 10.1 and gcc >= 11 results in a heisenbug that mangles AVX math, symptomatic in our simd_test. I struggled with this on and off for many days. The failure is deterministic and happens on every run, but I have not been able to narrow it down to a smaller example. It happens only with gcc, only with gcc >= 11, and only when we use FMT_EXCEPTIONS=0 to disable true exceptions in fmt. The behavior change first appears at fmt commit 9a034b0 (midway between 10.0 and 10.1), which changes the definition of FMT_THROW when exceptions are disabled. Why this should affect SIMD math is a total mystery; I currently suspect either a gcc compiler bug or that the change exposes undefined behavior. I found that redefining FMT_THROW on our side to something innocuous is a good workaround. Leaving fmt exceptions enabled also makes the problem go away, but I don't know what that might break for us or downstream.
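The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `sketch` namespace and its helpers are hypothetical, the macro body is just one example of "something innocuous", and it assumes fmt defines `FMT_THROW` only when it is not already defined, so that a definition placed before the fmt include takes precedence.

```cpp
// Sketch only: the helpers and macro body are assumptions, not OIIO's code.
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

namespace sketch {

// Hypothetical helper: build the diagnostic text for an fmt error.
inline std::string fmt_error_message(const char* what)
{
    assert(what != nullptr);
    return std::string("fmt error: ") + what;
}

// Hypothetical helper: report the error and terminate without throwing.
[[noreturn]] inline void fmt_fatal(const char* what)
{
    std::fputs(fmt_error_message(what).c_str(), stderr);
    std::fputc('\n', stderr);
    std::abort();
}

}  // namespace sketch

// Disable true C++ exceptions inside fmt, as described in the PR.
#ifndef FMT_EXCEPTIONS
#    define FMT_EXCEPTIONS 0
#endif

// Only for real gcc >= 11 with fmt exceptions disabled: pre-define
// FMT_THROW so that fmt's own exceptions-disabled definition is not used.
#if !defined(FMT_THROW) && !FMT_EXCEPTIONS && defined(__GNUC__) \
    && !defined(__clang__) && (__GNUC__ >= 11)
#    define FMT_THROW(x) ::sketch::fmt_fatal((x).what())
#endif

// fmt headers would be included only after this point, for example:
// #include <fmt/format.h>
```

The important design point is ordering: the macro has to be defined before any fmt header is seen, otherwise fmt's own definition is already in effect.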

My report of this to the fmt project is [here](fmtlib/fmt#3620); maybe they'll uncover the true cause.

Bump fmt minimum to 7.0 for OIIO 2.5.  fmt 7.0 is 3 years old, so
let's stop doing extra work to support 6.x. It's not that we depend on
newer features per se, but I'd like to no longer need separate
reference output for some tests whose formatting behavior changed
slightly between 6.x and 7.0.

Rig CI to properly test "latest" and "bleeding edge" fmt. It turns out
we had not yet been testing 10.1 or keeping up with their master.

In the process, I discovered that the combo of fmt >= 10.1 and gcc >=
11 results in a mangled AVX math heisenbug, symptomatic in our
simd_test.  I struggled with this on and off for many days, having
great difficulty reproducing it (though it does fail every time and
deterministically, I just can't narrow it to a smaller example). It
only happens with gcc, only gcc >= 11, only when we use
FMT_EXCEPTIONS=0 to disable true exceptions in fmt. The change
happens specifically at fmt commit 9a034b0 (midway between 10.0 and
10.1), which changes the definition of FMT_THROW when exceptions are
disabled. Why this should affect SIMD math is a total mystery, and
currently I am suspecting either a gcc compiler error or UB. I found
that redefining FMT_THROW on our side to something innocuous is a good
workaround. If I no longer disable fmt exceptions, that also makes it
work, but I don't know what that might break for us or downstream.

Signed-off-by: Larry Gritz <lg@larrygritz.com>
@lgritz lgritz merged commit 4c36a07 into AcademySoftwareFoundation:master Sep 5, 2023
23 checks passed
@lgritz lgritz deleted the lg-fmt7up branch September 5, 2023 00:52
lgritz added a commit to lgritz/OpenImageIO that referenced this pull request Sep 5, 2023
…oftwareFoundation#3973)

My report of this to the fmt project is
[here](fmtlib/fmt#3620), maybe they'll uncover
the true cause.

Signed-off-by: Larry Gritz <lg@larrygritz.com>