Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stdlib: fix re:replace on LTO builds #2194

Merged
merged 2 commits into from
Apr 10, 2019
Merged

stdlib: fix re:replace on LTO builds #2194

merged 2 commits into from
Apr 10, 2019

Conversation

trofi
Copy link
Contributor

@trofi trofi commented Mar 28, 2019

Fabio Coatti reported elixir build failure in https://bugs.gentoo.org/681778.
The minimal reproducer looks like that (from otp git tree):

$ ./configure CFLAGS='-O2 -flto' LDFLAGS='-O2 -flto=8'
$ make
$ ERL_TOP=$PWD \
  PATH=$ERL_TOP/bin:$PATH \
    \
    bin/erl \
    \
    -noshell -eval 're:replace("a","b","c",[{return,list}]).' \
    -s erlang halt

{"init terminating in do_boot",{badarg,[{re,replace,["a","b","c",[{return,list}]],
    [{file,"re.erl"},{line,362}]},
     {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,680}]},
     {init,start_it,1,[]},
     {init,start_em,1,[]},
     {init,do_boot,3,[]}]}}
init terminating in do_boot ({badarg,[{re,replace,[[_],[_],[_],[_]],[{_},{_}]},
    {erl_eval,do_apply,6,[{_},{_}]},{init,start_it,1,[]},{init,start_em,1,[]},{init,do_boot,3,[]}]})
Crash dump is being written to: erl_crash.dump...done

The failure happens in libpcre2 where stack overflow is mis-identified
at function entry of

erts_pcre_compile2()
    compile_regex()
      if (PUBL(stack_guard) != NULL && PUBL(stack_guard)())
      {
          *errorcodeptr= ERR85;
          return FALSE;
      }

The stack "overflow" detection happens in

thr_wrapper()
  ethr_set_stacklimit__()

because the stack usage code relies on the fact that ethr_set_stacklimit__()
and similar functions don't get inlined into callers for stack growth
measurement.

Before the change inlining avoidance was achieved by putting functions
into standalone translation units. LTO makes this technique inefficient.

The change marks functions explicitly as attribute((noinline)) on gcc.

Reported-by: Fabio Coatti
Bug: https://bugs.gentoo.org/681778

@CLAassistant
Copy link

CLAassistant commented Mar 28, 2019

CLA assistant check
All committers have signed the CLA.

@bjorng
Copy link
Contributor

bjorng commented Mar 29, 2019

Please remove the now redundant macro definition in beam_emu.c:

/*
* process_main() is already huge, so we want to avoid inlining
* into it. Especially functions that are seldom used.
*/
#ifdef __GNUC__
# define NOINLINE __attribute__((__noinline__))
#else
# define NOINLINE
#endif

@jhogberg jhogberg added team:VM Assigned to OTP team VM fix labels Mar 29, 2019
@trofi
Copy link
Contributor Author

trofi commented Mar 29, 2019

Please remove the now redundant macro definition in beam_emu.c

Good point! Removed.

@bjorng bjorng added the testing currently being tested, tag is used by OTP internal CI label Mar 29, 2019
@bjorng bjorng self-assigned this Mar 29, 2019
@bjorng bjorng removed the testing currently being tested, tag is used by OTP internal CI label Mar 29, 2019
@jhogberg
Copy link
Contributor

Thanks for the PR!

We've made a few attempts in the past and found that this stack check is not the only thing that goes wrong at runtime. While we think our last one got pretty far we had to put it on the shelf as other more important things got in the way. We'd really appreciate help in this area and it would be great if someone could pick up where we left off.

If you don't feel you have the time for that we'd still welcome your PR as it is. The only changes I'd like to see are that the NOINLINE macro is moved near the other inline macros in sys.h and ethread_inline.h, and given the same prefix (ERTS_ and ETHR_, respectively).

Sergei Trofimovich added 2 commits March 29, 2019 23:24
Fabio Coatti reported elixir build failure in https://bugs.gentoo.org/681778.
The minimal reproducer looks like that (from otp git tree):

    $ ./configure CFLAGS='-O2 -flto' LDFLAGS='-O2 -flto=8'
    $ make
    $ ERL_TOP=$PWD \
      PATH=$ERL_TOP/bin:$PATH \
        \
        bin/erl \
        \
        -noshell -eval 're:replace("a","b","c",[{return,list}]).' \
        -s erlang halt

    {"init terminating in do_boot",{badarg,[{re,replace,["a","b","c",[{return,list}]],
        [{file,"re.erl"},{line,362}]},
         {erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,680}]},
         {init,start_it,1,[]},
         {init,start_em,1,[]},
         {init,do_boot,3,[]}]}}
    init terminating in do_boot ({badarg,[{re,replace,[[_],[_],[_],[_]],[{_},{_}]},
        {erl_eval,do_apply,6,[{_},{_}]},{init,start_it,1,[]},{init,start_em,1,[]},{init,do_boot,3,[]}]})
    Crash dump is being written to: erl_crash.dump...done

The failure happens in libpcre2 where stack overflow is mis-identified
at function entry of

    erts_pcre_compile2()
        compile_regex()
          if (PUBL(stack_guard) != NULL && PUBL(stack_guard)())
          {
              *errorcodeptr= ERR85;
              return FALSE;
          }

The stack "overflow" detection happens in

    thr_wrapper()
        ethr_set_stacklimit__()

because the stack usage code relies on the fact that ethr_set_stacklimit__()
and similar functions don't get inlined into callers for stack growth
measurement.

Before the change inlining avoidance was achieved by putting functions
into standalone translation units. LTO makes this technique inefficient.

The change marks functions explicitly as __attribute__((__noinline__)) on gcc.

Reported-by: Fabio Coatti
Bug: https://bugs.gentoo.org/681778
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
@trofi
Copy link
Contributor Author

trofi commented Mar 29, 2019

We'd really appreciate help in this area and it would be great if someone could pick up where we left
off.

Looks like the changes at https://github.com/garazdawi/otp/commits/lukas/erts/lto-compile handle a few things:

  1. type mismatches across translation units seen by LTO (reported as warnings for me as well)
  2. --enable-lto flag
  3. fix of bug tackled by this pull request (by marking ethr_set_stacklimit__ as ETHR_NOINLINE)

I'll try to send a separate pull request for 1. as it will likely fix.

I'm not sure about 2. usefulness. CFLAGS/LDFLAGS is slightly more flexible way to pass LTO-related flags (and should Just Work on modern GNU-binutils/llvm with plugin support). But it's my personal bias as a package maintainer :)

NOINLINE macro is moved near the other inline macros in sys.h and ethread_inline.h, and given the same prefix (ERTS_ and ETHR_, respectively).

Renamed to ERTS_ and ETHR_ variants. Added a separate commit to convert existing users of NOINLINE.

@jhogberg
Copy link
Contributor

jhogberg commented Apr 1, 2019

We'd really appreciate help in this area and it would be great if someone could pick up where we left
off.

Looks like the changes at https://github.com/garazdawi/otp/commits/lukas/erts/lto-compile handle a few things:

1. type mismatches across translation units seen by LTO (reported as warnings for me as well)

2. `--enable-lto` flag

3. fix of bug tackled by this pull request (by marking `ethr_set_stacklimit__` as `ETHR_NOINLINE`)

I think there's some kind of fix for ethr_x86_cpuid__ as well, I can't recall how that misbehaved though.

I'll try to send a separate pull request for 1. as it will likely fix.

A separate commit in this PR is fine.

I'm not sure about 2. usefulness. CFLAGS/LDFLAGS is slightly more flexible way to pass LTO-related flags (and should Just Work on modern GNU-binutils/llvm with plugin support). But it's my personal bias as a package maintainer :)

We want to enable it by default once we believe it's stable, and for that we need the configure check and a flag to explicitly disable it, so we may as well add it now.

NOINLINE macro is moved near the other inline macros in sys.h and ethread_inline.h, and given the same prefix (ERTS_ and ETHR_, respectively).

Renamed to ERTS_ and ETHR_ variants. Added a separate commit to convert existing users of NOINLINE.

Great!

@bjorng bjorng assigned jhogberg and unassigned bjorng Apr 1, 2019
@jhogberg jhogberg merged commit 2124657 into erlang:master Apr 10, 2019
@jhogberg
Copy link
Contributor

Merged to master, thanks again for the PR!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix team:VM Assigned to OTP team VM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants