Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allocating custom blocks become expensive with large major heaps #7676

Closed
vicuna opened this Issue Nov 22, 2017 · 4 comments

Comments

Projects
None yet
2 participants
@vicuna
Copy link
Collaborator

vicuna commented Nov 22, 2017

Original bug ID: 7676
Reporter: @alainfrisch
Assigned to: @damiendoligez
Status: resolved (set by @damiendoligez on 2018-11-13T12:31:33Z)
Resolution: fixed
Priority: normal
Severity: minor
Fixed in version: 4.08.0+dev/beta1/beta2
Category: runtime system and C interface
Related to: #4108 #6133 #6294 #6962
Monitored by: @ygrek @dbuenzli @yakobowski

Bug description

Custom blocks allocated in or moved to the major heap add some "pressure" to trigger a major GC slice. The current criterion is such that a GC slice is triggered more often if the size of the major heap grows. This can result is noticeable slowdown.

The function implementing this strategy is:

CAMLexport void caml_adjust_gc_speed (mlsize_t res, mlsize_t max)
{
  if (max == 0) max = 1;
  if (res > max) res = max;
  caml_extra_heap_resources += (double) res / (double) max;
  if (caml_extra_heap_resources > 1.0){
    CAML_INSTR_INT ("request_major/adjust_gc_speed_1@", 1);
    caml_extra_heap_resources = 1.0;
    caml_request_major_slice ();
  }
  if (caml_extra_heap_resources
           > (double) caml_minor_heap_wsz / 2.0
             / (double) caml_stat_heap_wsz) {
    CAML_INSTR_INT ("request_major/adjust_gc_speed_2@", 1);
    caml_request_major_slice ();
  }
}

The second criterion becomes easier and easier to satisfy as the size of the major heap grows. It is not clear to me why the ratio of the size of the minor and major heaps is taken into account, nor why the constant 2 was chosen.

See below for a program that illustrates the behavior. It runs two computations starting from different heap sizes. The first one allocates a large block (in the major heap), then does many small allocations (to trigger a minor GC). The second one create an in_channel (which is a custom block, with a "pressure ratio" of 1/1000, i.e. one asks to force a full GC cycle every 1000 allocations). The timings are:

n =     2000 => 0.22s / 0.23s
n =     4000 => 0.22s / 0.35s
n =     6000 => 0.23s / 0.47s
n =     8000 => 0.22s / 0.73s
n =    10000 => 0.26s / 0.88s
n =    12000 => 0.23s / 1.06s
n =    14000 => 0.23s / 1.02s
n =    16000 => 0.24s / 1.19s
n =    18000 => 0.23s / 1.35s
n =    20000 => 0.25s / 1.30s
n =    22000 => 0.28s / 1.55s
n =    24000 => 0.25s / 1.83s
n =    26000 => 0.26s / 1.60s
n =    28000 => 0.23s / 1.88s
n =    30000 => 0.23s / 1.82s
n =    32000 => 0.25s / 2.79s
n =    34000 => 0.25s / 2.12s
n =    36000 => 0.23s / 2.66s
n =    38000 => 0.21s / 2.52s
n =    40000 => 0.25s / 2.38s

n is proportional to the size of the initial heap. One sees clearly that the speed of the first computation does not depend on the heap size, while the speed of the second one is directly impacted.

The marginal cost of opening a channel raises to about 2ms and would continue to grow with larger major heaps.

This specific problem with in_channel could be addressed in part by reducing the ratio 1/1000, but the real issue here might be with the criterion above, which impacts all kinds of custom objects (that add "pressure").

Additional information

let a = ref [||]

let test n =
  (* Force the major heap to grow *)
  Gc.compact ();
  a := Array.init n (fun _ -> Array.make 10000 0);
  Gc.compact ();
  let t0 = Sys.time () in
  (* First, try with allocating an array -> the speed of
     this is independent of the initial size of the major heap. *)
  for i = 1 to 1000 do
    let a = Array.make 4096 0 in
    (* Enough allocation to trigger a minor GC *)
    for _j = 1 to 1024 do ignore (Array.make 256 0) done;
    a.(0) <- i
  done;
  let t1 = Sys.time () in
  Gc.compact ();
  let t2 = Sys.time () in
  (* Second, try with allocating a custom (here, an in_channel) ->
     this becomes slower as the heap grows. *)
  for i = 1 to 1000 do
    let ic = open_in "foobar" in
    (* Enough allocation to trigger a minor GC *)
    for _j = 1 to 1024 do ignore (Array.make 256 0) done;
    close_in ic;
  done;
  let t3 = Sys.time () in
  Printf.printf "n = % 8i => %.02fs / %.02fs\n%!" n (t1 -. t0) (t3 -. t2)


let () =
  let oc = open_out "foobar" in close_out oc;
  for i = 1 to 20 do
    test (i * 2000)
  done
@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 20, 2017

Comment author: @xavierleroy

Is this another incarnation of the familiar "allocation of custom objects mess up the speed of the major GC" problem? Or a distinct, new problem?

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Dec 21, 2017

Comment author: @alainfrisch

Is this another incarnation of the familiar "allocation of custom objects mess up the speed of the major GC" problem? Or a distinct, new problem?

Of course it's in the same family of problems, but I believe it's distinct from other reports. What I find specifically weird here is the second condition:

(caml_extra_heap_resources > (double) caml_minor_heap_wsz / 2.0 / (double) caml_stat_heap_wsz)

which can ultimately lead to a major GC for every allocation of a custom block. I don't understand the rationale behind computing the ratio of the size of the two heaps and comparing it with "caml_extra_heap_resources".

The first condition, OTOH, seems well understood (including its problems), and corresponds to the documented behavior.

Other tickets are about:

  • tracking custom blocks in the minor heap
  • allowing to specify the pressure on the GC as a ratio of the major heap size
  • making it easy for libraries to give control to their client code about the ratio to be used for its custom blocks
@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Nov 9, 2018

Comment author: @alainfrisch

The sample program now exihibits a much nicer behavior thanks to #1738 . That said, I believe it would still be worth revisiting the condition

if (caml_extra_heap_resources
> (double) caml_minor_heap_wsz / 2.0
/ (double) caml_stat_heap_wsz)

which looks super weird to me, so I'm leaving this ticket open.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

vicuna commented Nov 13, 2018

Comment author: @damiendoligez

#2144

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.