Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is call_ext_last opcode argument properly computed? #7152

Open
pguyot opened this issue Apr 24, 2023 · 4 comments
Open

Is call_ext_last opcode argument properly computed? #7152

pguyot opened this issue Apr 24, 2023 · 4 comments
Assignees
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM

Comments

@pguyot
Copy link
Contributor

pguyot commented Apr 24, 2023

Describe the bug
With OTP23 and OTP25 compilers, I do not understand the n_remaining value of call_ext_last opcode with the following code:

-module(call_ext_last).
-export([start/0]).

start() ->
    Bin = erlang:list_to_binary(lists:seq(1, 1024)),
    BinarySize = erlang:byte_size(Bin),
    LargeSubBin = binary:part(Bin, 1, BinarySize - 1),
    Pid = erlang:spawn(fun() -> f:loop(LargeSubBin) end),
    PidHeapSize0 = erlang:process_info(Pid, heap_size),
    LargeSubBin = send(Pid, get),
    ok = send(Pid, free),
    PidHeapSize2 = erlang:process_info(Pid, heap_size),
    case PidHeapSize2 - PidHeapSize0 of
        0 -> ok;
        % should be call_ext_last
        Delta -> throw({heap_delta, Delta})
    end,
    ok = send(Pid, halt),
    ok.

send(Pid, Msg) ->
    Ref = erlang:make_ref(),
    Pid ! {self(), Ref, Msg},
    receive
        {Ref, Reply} -> Reply
    end.

Function start/0 is compiled as follows (OTP25):

{function, start, 0, 2}.
  {label,1}.
    {line,[{location,"call_ext_last.erl",4}]}.
    {func_info,{atom,call_ext_last},{atom,start},0}.
  {label,2}.
    {allocate,3,0}.
    {init_yregs,{list,[{y,0},{y,1},{y,2}]}}.
    {move,{integer,1024},{x,1}}.
    {move,{integer,1},{x,0}}.
    {line,[{location,"call_ext_last.erl",5}]}.
    {call_ext,2,{extfunc,lists,seq,2}}.
    {call_ext,1,{extfunc,erlang,list_to_binary,1}}.
    {line,[{location,"call_ext_last.erl",6}]}.
    {gc_bif,byte_size,{f,0},1,[{tr,{x,0},{t_bitstring,8}}],{x,1}}.
    {line,[{location,"call_ext_last.erl",7}]}.
    {gc_bif,'-',
            {f,0},
            2,
            [{tr,{x,1},{t_integer,{0,288230376151711743}}},{integer,1}],
            {x,2}}.
    {move,{integer,1},{x,1}}.
    {call_ext,3,{extfunc,binary,part,3}}.
    {test_heap,{alloc,[{words,1},{floats,0},{funs,1}]},1}.
    {make_fun3,{f,17},0,0,{x,1},{list,[{x,0}]}}.
    {move,{x,0},{y,2}}.
    {move,{x,1},{x,0}}.
    {line,[{location,"call_ext_last.erl",8}]}.
    {call_ext,1,{extfunc,erlang,spawn,1}}.
    {move,{x,0},{y,1}}.
    {move,{atom,heap_size},{x,1}}.
    {line,[{location,"call_ext_last.erl",9}]}.
    {call_ext,2,{extfunc,erlang,process_info,2}}.
    {move,{x,0},{y,0}}.
    {move,{y,1},{x,0}}.
    {move,{atom,get},{x,1}}.
    {line,[{location,"call_ext_last.erl",10}]}.
    {call,2,{f,8}}. % send/2
    {test,is_eq_exact,{f,6},[{x,0},{y,2}]}.
    {move,{atom,free},{x,1}}.
    {move,{y,0},{y,2}}.
    {trim,1,2}.
    {move,{y,0},{x,0}}.
    {line,[{location,"call_ext_last.erl",11}]}.
    {call,2,{f,8}}. % send/2
    {test,is_eq_exact,{f,5},[{x,0},{atom,ok}]}.
    {move,{atom,heap_size},{x,1}}.
    {move,{y,0},{x,0}}.
    {line,[{location,"call_ext_last.erl",12}]}.
    {call_ext,2,{extfunc,erlang,process_info,2}}.
    {line,[{location,"call_ext_last.erl",13}]}.
    {gc_bif,'-',{f,0},1,[{x,0},{y,1}],{x,0}}.
    {test,is_eq_exact,{f,3},[{tr,{x,0},number},{integer,0}]}.
    {move,{atom,halt},{x,1}}.
    {move,{y,0},{x,0}}.
    {trim,2,0}.
    {line,[{location,"call_ext_last.erl",18}]}.
    {call,2,{f,8}}. % send/2
    {test,is_eq_exact,{f,4},[{x,0},{atom,ok}]}.
    {deallocate,0}.
    return.
  {label,3}.
    {test_heap,3,1}.
    {put_tuple2,{x,0},{list,[{atom,heap_delta},{x,0}]}}.
    {line,[{location,"call_ext_last.erl",16}]}.
    {call_ext_last,1,{extfunc,erlang,throw,1},3}.
  {label,4}.
    {line,[{location,"call_ext_last.erl",18}]}.
    {badmatch,{x,0}}.
  {label,5}.
    {line,[{location,"call_ext_last.erl",11}]}.
    {badmatch,{x,0}}.
  {label,6}.
    {line,[{location,"call_ext_last.erl",10}]}.
    {badmatch,{x,0}}.

There are two execution paths with different stack allocations.

If condition goes to the throw clause (label,3), the path is:

    {allocate,3,0}.
    {trim,1,2}.
    {call_ext_last,1,{extfunc,erlang,throw,1},3}.

If it doesn't, the path is:

    {allocate,3,0}.
    {trim,1,2}.
    {trim,2,0}.
    {deallocate,0}.

Should call_ext_last really have 3 (and not 2) as its last argument?

@pguyot pguyot added the bug Issue is reported as a bug label Apr 24, 2023
@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Apr 24, 2023
@jhogberg
Copy link
Contributor

jhogberg commented Apr 24, 2023

Should call_ext_last really have 3 (and not 2) as its last argument?

No, but we don't care as an exception is thrown right away which makes the argument redundant. This lets us reduce code size by allowing jumps to exit BIFs regardless of how the stack looks, although I suppose the optimization has become less effective since line numbers were added. Perhaps we should revisit it.

We only do this for error/1,2, exit/1, throw/1, and primops that throw exceptions.

@jhogberg jhogberg self-assigned this Apr 24, 2023
@pguyot
Copy link
Contributor Author

pguyot commented Apr 24, 2023

@jhogberg I realize the beam_validator test fails if the ext function in call_last:b is changed from lists:seq/2 to erlang:throw/1.

This is referred to in this comment:

no ->
%% The compiler is allowed to emit garbage values for "Deallocate"
%% as we know that it will not be used in this case.
branch(?EXCEPTION_LABEL, Vst0, fun kill_state/1);

If I change the erlang:throw/1 call by altering the module or the function name, the validator rejects the asm. So I understand the proper asm here should be:

- {call_ext_last,1,{extfunc,erlang,throw,1},3}.
+ {call_ext_last,1,{extfunc,erlang,throw,1},2}.

Do you mean that for some optimization reason (?) erlc does not output a valid asm, and BEAM doesn't care?

@pguyot
Copy link
Contributor Author

pguyot commented Apr 25, 2023

For the record:

  • beam_trim optimization considers the sub-block (label) is safe (regarding stack frame changes) as the external function is known to not return. This happens here. There may be some cases where this allows for trim/2 instructions which wouldn't happen otherwise (not sure). It does not (currently) seem to be related to factorizing handlers and it does not yield a reduction in code size. As a side-effect, beam_trim optimization does not fix the deallocate argument of the call_ext_last/3 instruction.
  • BEAM (at least the emu flavor on master branch) doesn't care because call_ext_last/3 is rewritten to call_light_bif/2 followed by the faulty deallocate/1 which is never executed. Translation is here and here. This explains why it doesn't crash when the value is hand modified to something larger than the stack size.

Strictly speaking I am not sure we have to call this a bug as opcodes are not documented, but it will certainly require some workaround in AtomVM which doesn't rewrite call_ext_last/3 and currently always processes it as a "last" call, deallocating first.

pguyot added a commit to pguyot/AtomVM that referenced this issue Apr 25, 2023
BEAM's compiler and especially beam_trim optimization pass can generate an
incorrect deallocation `n_words` parameter for `call_ext_last/3`.
As a workaround, deallocate after the nif call, and before any error handling
that may need the stack to find catch handlers.

See:  erlang/otp#7152

Add test that currently crashes on master without this change

Signed-off-by: Paul Guyot <pguyot@kallisys.net>
@jhogberg
Copy link
Contributor

jhogberg commented Apr 25, 2023

  • There may be some cases where this allows for trim/2 instructions which wouldn't happen otherwise (not sure). It does not (currently) seem to be related to factorizing handlers and it does not yield a reduction in code size.

I just checked (scripts/diffable) and it significantly reduces stack sizes in some cases (and with that also loaded code size), so it seems the optimization is still worthwhile. It seems to mostly be places where the user has explicitly called throw and friends however which isn't super-common.

Do you mean that for some optimization reason (?) erlc does not output a valid asm, and BEAM doesn't care?

Sort of: we consider it valid in this specific case. :-)

I agree that it's ugly that the pass doesn't fix the stack size argument, but in some cases it can't and it won't be used either way, so we've just let it be.

bettio added a commit to atomvm/AtomVM that referenced this issue Apr 26, 2023
…_base

Add workaround for crash related to `call_ext_last/3`

BEAM's compiler and especially beam_trim optimization pass can generate an
incorrect deallocation `n_words` parameter for `call_ext_last/3`.
As a workaround, deallocate after the nif call, and before any error handling
that may need the stack to find catch handlers.

See:  erlang/otp#7152

Add test that currently crashes on master without this change

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM
Projects
None yet
Development

No branches or pull requests

4 participants