Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process_info/2 doesn't report some large appended binaries. #6573

Closed
markmeeus opened this issue Dec 18, 2022 · 5 comments
Closed

process_info/2 doesn't report some large appended binaries. #6573

markmeeus opened this issue Dec 18, 2022 · 5 comments
Assignees
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM

Comments

@markmeeus
Copy link

markmeeus commented Dec 18, 2022

Describe the bug
process_info/2 doesn't report some large appended binaries. This behavior seems to be dependent on the size of the first binary in the append.
There is also a difference sometimes between calling process_info for self process vs another process. Where calling process_info for self usually has more chance of getting results.

To Reproduce
Compile this module

-module(bin_test).

-behaviour(gen_server).

-export([init/1, handle_call/3, handle_cast/2, test/1]).

init(_) ->
  {ok, #{bin => ""}}.

handle_call({add_binary, Size}, _, State) ->
  RandomBytes = crypto:strong_rand_bytes(Size),
  Bin = <<RandomBytes/binary, "a">>,
  NewState = maps:put(bin, Bin, State),
  timer:sleep(200 + rand:uniform(1000)), % To rule out internal racing
  check_binaries(self, Size, self()),
  {reply, ok, NewState}.

handle_cast(_, S) ->
  {noreply, S}.

test(From) ->
  {ok, Pid} = gen_server:start_link(?MODULE, nil, []),
  To = From + 16,
  lists:foreach(fun(Size) ->
    gen_server:call(Pid, {add_binary, Size}),
    check_binaries(other, Size, Pid)
  end, lists:seq(From, To)).

check_binaries(From, Size, Pid) ->
  {binary, Bins} = process_info(Pid, binary),
  % remove small bins, these are used to call io:fwrite/2
  Bins = lists:filter(fun({_, BinSize, _}) -> BinSize > 100 end, Bins),
  case Bins of
    [] -> io:fwrite("~p~n", [[From, Size, "emtpy!"]]);
    Bins -> io:fwrite("~p~n", [[From, Size, Bins]])
  end.

then run

bin_test:test(1024 * 1024 * 1024).

On my machine I get these results (consistently)

[self,1073741824,"emtpy!"]
[other,1073741824,"emtpy!"]
[self,1073741825,[{6730649640,1073741825,1}]]
[other,1073741825,[{6730649640,1073741825,1}]]
[self,1073741826,[{10381389864,1073741826,1}]]
[other,1073741826,[{10381389864,1073741826,1}]]
[self,1073741827,"emtpy!"]
[other,1073741827,"emtpy!"]
[self,1073741828,[{10381389864,1073741828,1}]]
[other,1073741828,"emtpy!"]
[self,1073741829,"emtpy!"]
[other,1073741829,"emtpy!"]
[self,1073741830,"emtpy!"]
[other,1073741830,"emtpy!"]
[self,1073741831,"emtpy!"]
[other,1073741831,"emtpy!"]
[self,1073741832,[{15105876008,1073741832,1}]]
[other,1073741832,"emtpy!"]
[self,1073741833,[{15105876008,1073741833,1}]]
[other,1073741833,"emtpy!"]
[self,1073741834,[{15105876008,1073741834,1}]]
[other,1073741834,[{15105876008,1073741834,1}]]
[self,1073741835,[{15105876008,1073741835,1}]]
[other,1073741835,[{15105876008,1073741835,1}]]
[self,1073741836,[{6730649640,1073741836,1}]]
[other,1073741836,[{6730649640,1073741836,1}]]
[self,1073741837,[{10381389864,1073741837,1}]]
[other,1073741837,[{10381389864,1073741837,1}]]
[self,1073741838,[{6730649640,1073741838,1}]]
[other,1073741838,[{6730649640,1073741838,1}]]
[self,1073741839,[{10381389864,1073741839,1}]]
[other,1073741839,[{10381389864,1073741839,1}]]
[self,1073741840,[{6730649640,1073741840,1}]]
[other,1073741840,[{6730649640,1073741840,1}]]

This line triggers the issue:
Bin = <<RandomBytes/binary, "a">>,
Replacing it with just
Bin = RandomBytes/binary,
does work as expected.

Expected behavior
To see at least the currently referenced large binaries reported.

Affected versions
Erlang/OTP 25 [erts-13.1.1] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns] [dtrace]
I have seen this issue on older versions too.

Additional context
Running on an Intel Mac with macOS 12.6.

@markmeeus markmeeus added the bug Issue is reported as a bug label Dec 18, 2022
@markmeeus markmeeus changed the title process_info/2 with doesn't report appended some large appended binaries. process_info/2 with doesn't report some appended large appended binaries. Dec 18, 2022
@markmeeus markmeeus changed the title process_info/2 with doesn't report some appended large appended binaries. process_info/2 with doesn't report some large appended binaries. Dec 18, 2022
@markmeeus markmeeus changed the title process_info/2 with doesn't report some large appended binaries. process_info/2 doesn't report some large appended binaries. Dec 18, 2022
max-au added a commit to max-au/otp that referenced this issue Dec 18, 2022
When writable binaries were introduced in erlang#5195, they weren't
added to the output of `process_info(Pid, binary)`. It led to
interesting effects of binaries suddenly disappearing after
running implicit GC caused by BIF calls, including `process_info`
itself. Effectively, running `process_info(self(), binary)` twice
in a row resulted in different lists, because first call triggers
GC clearing out original binary from the off-heap list:

    start() ->
        OffHeapBin = crypto:strong_rand_bytes(1024 * 1024),
        Bin = <<OffHeapBin/binary, "a">>,
        Before = erlang:process_info(self(), [binary]),
        After = erlang:process_info(self(), [binary]),
        Before =/= After andalso erlang:display({wtf, Before, After}),
        Bin.

Fixes erlang#6573
@rickard-green rickard-green added the team:VM Assigned to OTP team VM label Dec 19, 2022
@markmeeus
Copy link
Author

markmeeus commented Dec 19, 2022

I did some investigating / debugging in the emulator today and became a bit wiser.
The append operation does not add the binary to the off_heap at all, but adds it to the wrt_bins instead.
The process_info implementation for binaries only iterates the off_heap, it does not count the wrt_bins.

Would it be correct to collect the wrt_bins in the process_info as well?

As to why it looks like they are reported under some circumstances, I found this comment in elr_bits.c erts_bs_private_append_checked:

     * Also make a new ProcBin as the old one may have been moved
     * from the 'wrt_bins' list to the regular 'off_heap' list by
     * the GC. To move it back would mean traversing the off_heap list
     * from the start. So instead create a new ProcBin for this
     * (hopefully) rare case.

So it might be the case that GC decides to move the no-longer-referenced ProcBin's from the wrt_bins to off_heap? Which is what makes them visible in to the process_info implementation?

@sverker
Copy link
Contributor

sverker commented Dec 20, 2022

Would it be correct to collect the wrt_bins in the process_info as well?

Yes, that is the fix that @max-au have done in #6574.

@sverker
Copy link
Contributor

sverker commented Dec 20, 2022

So it might be the case that GC decides to move the no-longer-referenced ProcBin's from the wrt_bins to off_heap? Which is what makes them visible in to the process_info implementation?

Yes, "inactive" writable binaries are moved from wrt_bins to off_heap by the GC in sweep_off_heap() in erl_gc.c.

The separate wrt_bins list was introduced as an optimization in 25.0, so this bug should not exist before that.

@markmeeus
Copy link
Author

I can confirm that the bug isn't present in 24.

@sverker
Copy link
Contributor

sverker commented Dec 21, 2022

Fixed by #6574.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM
Projects
None yet
Development

No branches or pull requests

3 participants