Fix spurious major GC slices. #13086

damiendoligez · 2024-04-09T12:57:06Z

Reported to me by @stedolan.

Because of commit e6370d5 (memory.c) and PR #11750, we get spurious major GC slices when a minor GC promotes more than 20% of the minor heap and a large allocation arrives before the next scheduled major slice (the one that happens when the minor heap is half full).

This is because the allocated_words counter will keep the amount of memory promoted by the minor GC until the next major slice takes it into account.

The consequence is that the large allocation will trigger a major slice on the spot, then the next scheduled major slice has very little work to do.

The solution is to keep a separate count of direct major allocations and use it (instead of the count of all major allocations) to trigger unscheduled major slices.
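To make the accounting concrete, here is a minimal toy model of the two counters. It is only a sketch and not the runtime's code: the struct, the function names and the `use_direct_counter` switch are hypothetical, and the only things taken from the description above are the "more than one fifth of the minor heap" threshold and the distinction between promoted and direct major allocations.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical. */
typedef struct {
  size_t minor_heap_wsz;         /* minor heap size, in words */
  size_t allocated_words;        /* all major-heap words since the last slice */
  size_t allocated_words_direct; /* words allocated directly in the major heap */
} gc_model;

/* A minor GC promotes survivors: this is major allocation, but not direct. */
static void model_promote(gc_model *g, size_t words) {
  g->allocated_words += words;
}

/* A direct major allocation. Returns true when an unscheduled major slice
   would be requested. With use_direct_counter == false the check uses the
   total counter (the behaviour described as the problem); with true it uses
   the direct-only counter (the proposed behaviour). */
static bool model_alloc_direct(gc_model *g, size_t words,
                               bool use_direct_counter) {
  g->allocated_words += words;
  g->allocated_words_direct += words;
  size_t trigger = use_direct_counter ? g->allocated_words_direct
                                      : g->allocated_words;
  return trigger > g->minor_heap_wsz / 5;
}

/* A major slice takes the pending allocation into account: reset both. */
static void model_major_slice(gc_model *g) {
  g->allocated_words = 0;
  g->allocated_words_direct = 0;
}
```

In this model, promoting 300 words out of a 1000-word minor heap and then allocating 10 words directly requests a slice when the total counter is checked, but not when the direct counter is checked; the direct counter only trips once direct allocations alone exceed 200 words.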

The problem is illustrated by the program found here: https://gist.github.com/damiendoligez/4d65d0ade50e6d0b2726e812a0eb7a14

Number of major slices (displayed by OCAMLRUNPARAM=v=0x40 ./a.out 2>&1 | grep '^allocated_words =' | wc):

version	large_allocs=true	large_allocs=false
trunk	3638	1879
this PR	1869	1879

Note that the amount of major GC work is not affected by this problem, but we still incur some overhead for starting the extra slices, and the latency profile is changed (the major slice pauses are closer to the minor GC pauses) so it's still worth fixing.

gasche · 2024-04-09T13:29:21Z

This could/should probably be a new flag for caml_shared_try_alloc, "do I come from the minor GC?", which would reduce the risk of getting this accounting wrong in the future if we call that function from more places.

I convinced myself that the code says what it does, but how do we know that what it does is reasonable? (In particular, are there regressions on other workloads?)

stedolan · 2024-04-16T09:59:07Z

> This could/should probably be a new flag for caml_shared_try_alloc, "do I come from the minor GC?", which would reduce the risk of getting this accounting wrong in the future if we call that function from more places.

I disagree. This is a change to GC policy: caml_alloc_shr should trigger extra major GC eventually while allocations from promotions should not. It belongs where the GC policy is defined (major_gc.c and memory.c), not in the allocator (shared_heap.c).
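The layering argument can be sketched as follows. This is an illustration under assumptions, not the actual code in shared_heap.c, memory.c or major_gc.c: `heap_try_alloc`, `policy_alloc_direct`, `policy_alloc_promoted` and `policy_state` are hypothetical names, and `malloc` stands in for the real allocator. The point is only that the allocator stays free of pacing logic while each caller decides how its allocation is counted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the runtime's code. */
typedef struct {
  size_t minor_heap_wsz;         /* minor heap size, in words */
  size_t allocated_words;        /* all major-heap allocation */
  size_t allocated_words_direct; /* direct major allocation only */
  bool slice_requested;          /* set when an unscheduled slice is wanted */
} policy_state;

/* Allocator layer: hands out memory and knows nothing about GC pacing. */
static void *heap_try_alloc(size_t words) {
  return malloc(words * sizeof(void *));
}

/* Policy layer, direct path: counts the allocation in both counters and
   may request an extra major slice. */
static void *policy_alloc_direct(policy_state *st, size_t words) {
  void *p = heap_try_alloc(words);
  if (p == NULL) return NULL;
  st->allocated_words += words;
  st->allocated_words_direct += words;
  if (st->allocated_words_direct > st->minor_heap_wsz / 5)
    st->slice_requested = true;
  return p;
}

/* Policy layer, promotion path: counts the allocation as major-heap
   growth but never requests an extra slice. */
static void *policy_alloc_promoted(policy_state *st, size_t words) {
  void *p = heap_try_alloc(words);
  if (p == NULL) return NULL;
  st->allocated_words += words;
  return p;
}
```

With this split, adding a new caller of the allocator means choosing one of the two policy entry points (or writing a third), rather than threading a "who is calling" flag through the allocator.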

> I convinced myself that the code says what it does, but how do we know that what it does is reasonable? (In particular, are there regressions on other workloads?)

IIRC, this change narrows an earlier fix that made too wide a change: the / 5 condition resolved an issue with a program that was doing a lot of direct major allocation, but the change affected all programs, even ones that did very little direct major allocation. This change means that the fix for such programs is more precisely targeted.

stedolan approved these changes Apr 16, 2024

damiendoligez self-assigned this Apr 17, 2024

damiendoligez force-pushed the fix-spurious-slices branch from 2f8b342 to a2b5834 April 17, 2024 14:07

damiendoligez added 2 commits April 17, 2024 16:12

Adjust accounting of allocations to avoid spurious major slices.

ca9b1af

Add Changes entry.

9bf1ec5

damiendoligez force-pushed the fix-spurious-slices branch from a2b5834 to 9bf1ec5 April 17, 2024 14:13

damiendoligez added the merge-me label Apr 17, 2024

kayceesrk merged commit f37847f into ocaml:trunk Apr 18, 2024
17 checks passed

tmcgilchrist mentioned this pull request Apr 25, 2024

Regression with default GC settings between 4.14.2 and 5.1.1 #13123

Open
