New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PR#7452: tweak GCC options to try to work around the Skylake/Kaby lake bug #1228

Merged
merged 4 commits into from Jul 12, 2017

Conversation

Projects
None yet
9 participants
@damiendoligez
Member

damiendoligez commented Jul 7, 2017

Pending benchmarks on the performance impact of disabling some gcc optimizations, this PR tries to work around the infamous processor bug described in PR#7452 and copiously documented on the web:

NATIVECCCOMPOPTS=-DCAML_NAME_SPACE -Wall -Wno-unused
# -fno-tree-vrp is here to try to work around the Skylake/Kaby lake bug,
# and only works on GCC 4.2 and later.
NATIVECCCOMPOPTS=-DCAML_NAME_SPACE -Wall -Wno-unused -fno-tree-vrp

This comment has been minimized.

@dra27

dra27 Jul 7, 2017

Contributor

config/Makefile.mingw64 also needs updating

@dra27

dra27 Jul 7, 2017

Contributor

config/Makefile.mingw64 also needs updating

This comment has been minimized.

@damiendoligez

damiendoligez Jul 7, 2017

Member

Good catch! I've added the update.

@damiendoligez

damiendoligez Jul 7, 2017

Member

Good catch! I've added the update.

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Jul 7, 2017

Member

The option only works on GCC >= 4.2, do we already rule out older GCC versions? I don't see any version-test in the patch.

Member

gasche commented Jul 7, 2017

The option only works on GCC >= 4.2, do we already rule out older GCC versions? I don't see any version-test in the patch.

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Jul 7, 2017

Contributor

I don't see any version-test in the patch.

There is:

if sh ./hasgot -Werror -fno-tree-vrp; then

in configure. For mingw ports, people usually tweak the Makefile themselves I think, the ones provided are just helpers/templates; so I'd say it's ok. And hopefully we will have a configure for Windows at some point.

Contributor

alainfrisch commented Jul 7, 2017

I don't see any version-test in the patch.

There is:

if sh ./hasgot -Werror -fno-tree-vrp; then

in configure. For mingw ports, people usually tweak the Makefile themselves I think, the ones provided are just helpers/templates; so I'd say it's ok. And hopefully we will have a configure for Windows at some point.

@xavierleroy

This is fine, modulo the update to Makefile.mingw64.

Show outdated Hide outdated configure
*gcc*,x86_64-*|*gcc*,i686-*)
if sh ./hasgot -Werror -fno-tree-vrp; then
bytecccompopts="$bytecccompopts -fno-tree-vrp"
inf "adding -fno-tree-vrp option to work around PR#7452"

This comment has been minimized.

@xavierleroy

xavierleroy Jul 7, 2017

Contributor

It would be nicer to start with a capital "A".

@xavierleroy

xavierleroy Jul 7, 2017

Contributor

It would be nicer to start with a capital "A".

This comment has been minimized.

@damiendoligez

damiendoligez Jul 7, 2017

Member

Your wish is my command.

@damiendoligez

damiendoligez Jul 7, 2017

Member

Your wish is my command.

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Jul 7, 2017

Member

Note that this shouldn't be merged before we analyze the benchmark results.

Member

damiendoligez commented Jul 7, 2017

Note that this shouldn't be merged before we analyze the benchmark results.

@hcarty

This comment has been minimized.

Show comment
Hide comment
@hcarty

hcarty Jul 7, 2017

Contributor

If this gets merged for 4.05.0, would it be considered for an additional 4.04.x release as well? Or would the recommended approach be to jump to 4.05.x if affected by the CPU bug?

Contributor

hcarty commented Jul 7, 2017

If this gets merged for 4.05.0, would it be considered for an additional 4.04.x release as well? Or would the recommended approach be to jump to 4.05.x if affected by the CPU bug?

@bluddy

This comment has been minimized.

Show comment
Hide comment
@bluddy

bluddy Jul 7, 2017

Is there any process to add a long-term TODO item to look into removing this once these processors are behind us?

bluddy commented Jul 7, 2017

Is there any process to add a long-term TODO item to look into removing this once these processors are behind us?

@dra27

This comment has been minimized.

Show comment
Hide comment
@dra27

dra27 Jul 8, 2017

Contributor

@hcarty - it also affects 4.03.0. The patch to configure applies cleanly to both 4.03.0 and 4.04.2; I think it would be better for this to back-ported via opam (cc @avsm)

Contributor

dra27 commented Jul 8, 2017

@hcarty - it also affects 4.03.0. The patch to configure applies cleanly to both 4.03.0 and 4.04.2; I think it would be better for this to back-ported via opam (cc @avsm)

@dra27

This comment has been minimized.

Show comment
Hide comment
@dra27

dra27 Jul 8, 2017

Contributor

Two thoughts on how configure determines whether to do this:

  • Is it worth looking at uname -p to restrict when it's enabled?
  • The message could be more descriptive - perhaps mentioning "Intel Skylake/Kaby Lake" rather than than our PR number?
  • Is it worth having a configure option to disable the workaround, e.g.:
diff --git a/configure b/configure
index 67a2654b0..a729f9901 100755
--- a/configure
+++ b/configure
@@ -65,6 +65,7 @@ afl_instrument=false
 max_testsuite_dir_retries=0
 with_cplugins=true
 with_fpic=false
+with_skylake_workaround=true

 # Try to turn internationalization off, can cause config.guess to malfunction!
 unset LANG
@@ -208,6 +209,8 @@ while : ; do
         safe_string=true;;
     -afl-instrument)
         afl_instrument=true;;
+    -no-skylake-workaround)
+        with_skylake_workaround=false;;
     *) if echo "$1" | grep -q -e '^--\?[a-zA-Z0-9-]\+='; then
          err "configure expects arguments of the form '-prefix /foo/bar'," \
              "not '-prefix=/foo/bar' (note the '=')."
@@ -865,8 +868,8 @@ fi


 # Try to work around the Skylake/Kaby Lake processor bug.
-case "$bytecc,$target" in
-    *gcc*,x86_64-*|*gcc*,i686-*)
+case "$with_skylake_workaround,$bytecc,$target" in
+    true,*gcc*,x86_64-*|true,*gcc*,i686-*)
         if sh ./hasgot -Werror -fno-tree-vrp; then
             bytecccompopts="$bytecccompopts -fno-tree-vrp"
             inf "Adding -fno-tree-vrp option to work around PR#7452"
Contributor

dra27 commented Jul 8, 2017

Two thoughts on how configure determines whether to do this:

  • Is it worth looking at uname -p to restrict when it's enabled?
  • The message could be more descriptive - perhaps mentioning "Intel Skylake/Kaby Lake" rather than than our PR number?
  • Is it worth having a configure option to disable the workaround, e.g.:
diff --git a/configure b/configure
index 67a2654b0..a729f9901 100755
--- a/configure
+++ b/configure
@@ -65,6 +65,7 @@ afl_instrument=false
 max_testsuite_dir_retries=0
 with_cplugins=true
 with_fpic=false
+with_skylake_workaround=true

 # Try to turn internationalization off, can cause config.guess to malfunction!
 unset LANG
@@ -208,6 +209,8 @@ while : ; do
         safe_string=true;;
     -afl-instrument)
         afl_instrument=true;;
+    -no-skylake-workaround)
+        with_skylake_workaround=false;;
     *) if echo "$1" | grep -q -e '^--\?[a-zA-Z0-9-]\+='; then
          err "configure expects arguments of the form '-prefix /foo/bar'," \
              "not '-prefix=/foo/bar' (note the '=')."
@@ -865,8 +868,8 @@ fi


 # Try to work around the Skylake/Kaby Lake processor bug.
-case "$bytecc,$target" in
-    *gcc*,x86_64-*|*gcc*,i686-*)
+case "$with_skylake_workaround,$bytecc,$target" in
+    true,*gcc*,x86_64-*|true,*gcc*,i686-*)
         if sh ./hasgot -Werror -fno-tree-vrp; then
             bytecccompopts="$bytecccompopts -fno-tree-vrp"
             inf "Adding -fno-tree-vrp option to work around PR#7452"
@dra27

This comment has been minimized.

Show comment
Hide comment
@dra27

dra27 Jul 8, 2017

Contributor

Finally, this is applied in BYTECCCOMPOPTS in the mingw Makefile's - shouldn't it being applied to $byteccprivatecompopts in configure? We want this option used for building the runtime only, right, not C files compiled by invoking the compiler?

Contributor

dra27 commented Jul 8, 2017

Finally, this is applied in BYTECCCOMPOPTS in the mingw Makefile's - shouldn't it being applied to $byteccprivatecompopts in configure? We want this option used for building the runtime only, right, not C files compiled by invoking the compiler?

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Jul 8, 2017

Contributor

configure doesn't know where compiled programs will run so I think it should not try to decide based on the local CPU.

Contributor

alainfrisch commented Jul 8, 2017

configure doesn't know where compiled programs will run so I think it should not try to decide based on the local CPU.

@dra27

This comment has been minimized.

Show comment
Hide comment
@dra27

dra27 Jul 8, 2017

Contributor

@alainfrisch - good point!

Contributor

dra27 commented Jul 8, 2017

@alainfrisch - good point!

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Jul 8, 2017

Contributor

We want this option used for building the runtime only, right, not C files compiled by invoking the compiler?

Very good point, thanks. Indeed, -fno-tree-vrp appears (empirically) to avoid triggering the hardware bug within the OCaml runtime system, but we have no reason to believe it will help with user C code. So it should be a "private" compiler flag.

Contributor

xavierleroy commented Jul 8, 2017

We want this option used for building the runtime only, right, not C files compiled by invoking the compiler?

Very good point, thanks. Indeed, -fno-tree-vrp appears (empirically) to avoid triggering the hardware bug within the OCaml runtime system, but we have no reason to believe it will help with user C code. So it should be a "private" compiler flag.

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Jul 8, 2017

Contributor

Concerning the -no-skylake-workaround flag: since this is the last thing delaying the 4.05.0 release, I'd suggest to keep things as simple as possible right now, with the workaround being systematically applied. Later we can think of ways to apply it conditionally. And also of a strategy to remove it eventually when unpatched Skylake/Kaby Lake systems have disappeared.

Contributor

xavierleroy commented Jul 8, 2017

Concerning the -no-skylake-workaround flag: since this is the last thing delaying the 4.05.0 release, I'd suggest to keep things as simple as possible right now, with the workaround being systematically applied. Later we can think of ways to apply it conditionally. And also of a strategy to remove it eventually when unpatched Skylake/Kaby Lake systems have disappeared.

@shindere

This comment has been minimized.

Show comment
Hide comment
@shindere

shindere Jul 10, 2017

Contributor
Contributor

shindere commented Jul 10, 2017

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Jul 10, 2017

Contributor

With Damien's latest changes I think this one is good to go. How long must we wait for the benchmark results?

Contributor

xavierleroy commented Jul 10, 2017

With Damien's latest changes I think this one is good to go. How long must we wait for the benchmark results?

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Jul 10, 2017

Member

@dra27

  • I'd rather not use trademarks in the message.
  • Switched to $byteccprivatecompopts after discussing with Xavier. Thanks for the suggestion.
Member

damiendoligez commented Jul 10, 2017

@dra27

  • I'd rather not use trademarks in the message.
  • Switched to $byteccprivatecompopts after discussing with Xavier. Thanks for the suggestion.
@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Jul 10, 2017

Member

@mshinwell Any progress on the benchmarking?

Member

damiendoligez commented Jul 10, 2017

@mshinwell Any progress on the benchmarking?

@shindere

This comment has been minimized.

Show comment
Hide comment
@shindere

shindere Jul 11, 2017

Contributor
Contributor

shindere commented Jul 11, 2017

@alainfrisch alainfrisch added this to the 4.05.0 milestone Jul 11, 2017

@mshinwell

This comment has been minimized.

Show comment
Hide comment
@mshinwell

mshinwell Jul 11, 2017

Contributor

@damiendoligez Within noise I think. I'm not sure why the "mvar" test changed so much though; I wonder if it might not be a good benchmark.

http://bench.flambda.ocamlpro.com/compare?test=2017-07-10-2300%2F4.05.0%2B%2Bpbw%2Bbench&reference=2017-07-10-2300%2F4.05.0%2Bvanilla%2Bbench

Contributor

mshinwell commented Jul 11, 2017

@damiendoligez Within noise I think. I'm not sure why the "mvar" test changed so much though; I wonder if it might not be a good benchmark.

http://bench.flambda.ocamlpro.com/compare?test=2017-07-10-2300%2F4.05.0%2B%2Bpbw%2Bbench&reference=2017-07-10-2300%2F4.05.0%2Bvanilla%2Bbench

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez
Member

damiendoligez commented Jul 12, 2017

Thanks @mshinwell and @shindere .

@damiendoligez damiendoligez merged commit 2d03974 into ocaml:4.05 Jul 12, 2017

0 of 2 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

@damiendoligez damiendoligez deleted the damiendoligez:processor-bug-workaround branch Jul 12, 2017

gasche added a commit to gasche/ocaml that referenced this pull request Jul 13, 2017

@shindere shindere referenced this pull request Jul 17, 2017

Merged

Merge 4.05 into trunk #1244

camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment