Major GC crashes on fragmented heap with large number of block size and first_fit policy #7815

vicuna · 2018-06-27T02:33:58Z

Original bug ID: 7815
Reporter: joris
Assigned to: @damiendoligez
Status: resolved (set by @xavierleroy on 2018-07-19T15:56:08Z)
Resolution: fixed
Priority: normal
Severity: crash
Platform: amd64
OS: linux
Version: 4.06.1
Target version: 4.07.1+dev/rc1
Fixed in version: 4.07.1+dev/rc1
Category: runtime system and C interface
Monitored by: @nojb @gasche @ygrek @jmeber

Bug description

Calling Gc.full_major (or simply Gc.major) after allocating thousands of blocks of different word sizes and freeing half of them, with the first_fit policy, makes the runtime crash with no meaningful trace.

Steps to reproduce

Compile the attached file with tcmalloc:

    opam install gperftools # install tcmalloc
    ocamlfind ocamlopt -g -linkpkg -package gperftools minimal.ml

and run it. On my machine it also crashes with jemalloc, with rounds = 3 and nr_blocks = 20000:

    opam install jemalloc
    ocamlfind ocamlopt -g -linkpkg -package jemalloc_ctl minimal.ml

Additional information

Background:
This code is an unnatural example, obviously hand-crafted.

I came across this issue while trying to reproduce a bug in the minor GC and caml_fl_allocate which, under certain conditions, makes the minor GC run forever (or at least in quadratic time, taking half an hour or more even on small heaps).

I was trying to fill the flp to see what would happen in this case. I don't know yet whether those two issues are related.

File attachments

minimal.ml

vicuna · 2018-06-27T13:09:00Z

Comment author: @stedolan

That's a nice reproduction case!

This bug is reproducible by running the bytecode interpreter under valgrind, which should make debugging easier. It seems to be independent of the choice of malloc, although it doesn't seem to actually segfault with the default glibc allocator.

In freelist.c, there's this loop:

    value buf [FLP_MAX];
    int j = 0;
    mlsize_t oldsz = sz;

    prev = flp[i];
    while (prev != flp[i+1]){
      cur = Next (prev);
      sz = Wosize_bp (cur);
      if (sz > prevsz){
        buf[j++] = prev;
        prevsz = sz;
        if (sz >= oldsz){
          CAMLassert (sz == oldsz);
          break;
        }
      }
      prev = cur;
    }

This example causes 'buf[j++] = prev' (line 345) to run more than FLP_MAX times, overflowing the buffer.
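To make the overflow concrete, here is a minimal standalone sketch of the same pattern: record every position where the block size exceeds all earlier sizes, but stop once the fixed-size buffer is full. It is not the runtime code and not necessarily the form of the actual fix. BUF_MAX and collect_increasing are hypothetical names, the free list is modelled as a plain array of sizes, and the real value of FLP_MAX is not given in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for FLP_MAX; the real value is not shown here. */
#define BUF_MAX 8

/* Record the index of every block whose size exceeds all earlier sizes,
   mirroring the role of 'buf[j++] = prev' in the loop quoted above.
   Unlike that loop, the condition 'j < BUF_MAX' stops the scan as soon
   as the buffer is full, so 'out' can never be written past its end.
   Returns the number of entries written (at most BUF_MAX). */
static size_t collect_increasing(const size_t *sizes, size_t n,
                                 size_t out[BUF_MAX])
{
  size_t j = 0;
  size_t prevsz = 0;

  for (size_t i = 0; i < n && j < BUF_MAX; i++) {
    if (sizes[i] > prevsz) {
      out[j++] = i;
      prevsz = sizes[i];
    }
  }
  assert(j <= BUF_MAX);
  return j;
}
```

With a strictly increasing run of 20 sizes, which is the kind of free list the reproduction case builds, this version writes BUF_MAX entries and stops, whereas an unguarded loop would keep writing past the end of the buffer.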

vicuna · 2018-07-11T14:55:46Z

Comment author: @damiendoligez

This is embarrassingly easy to fix, see #1896

@joris: do you want to be credited in the changelog with your real name?
@stedolan: would you like to review the fix?

vicuna · 2018-07-11T23:17:01Z

Comment author: joris

Awesome!

As you wish; I don't know what the usual practice is. In any case, my name is Joris Giovannangeli.

vicuna · 2018-07-19T15:56:08Z

Comment author: @xavierleroy

Commits 802ebbf (trunk) and 1bea41f (4.07 branch)

vicuna closed this as completed Jul 19, 2018

vicuna added the stdlib label Mar 14, 2019

vicuna added this to the 4.07.1 milestone Mar 14, 2019

vicuna assigned damiendoligez Mar 14, 2019

vicuna added the bug label Mar 20, 2019
