
Fix spurious major GC slices. #13086

Merged

merged 2 commits into ocaml:trunk on Apr 18, 2024

Conversation

damiendoligez
Member

Reported to me by @stedolan.

Because of commit e6370d5 (memory.c) and PR #11750, we get spurious major GC slices when a minor GC promotes more than 20% of the minor heap and a large allocation comes along before the next scheduled major slice (the one that happens when the minor heap is half full).

This happens because the allocated_words counter retains the amount of memory promoted by the minor GC until the next major slice takes it into account.

The consequence is that the large allocation will trigger a major slice on the spot, then the next scheduled major slice has very little work to do.

The solution is to have a separate counter for direct major allocations, and to use it (instead of the counter of all major allocations) to trigger unscheduled major slices.

The problem is illustrated by the program found here: https://gist.github.com/damiendoligez/4d65d0ade50e6d0b2726e812a0eb7a14

Number of major slices (displayed by OCAMLRUNPARAM=v=0x40 ./a.out 2>&1 | grep '^allocated_words =' | wc):

version   large_allocs=true   large_allocs=false
trunk     3638                1879
this PR   1869                1879

Note that the amount of major GC work is not affected by this problem, but we still incur some overhead for starting the extra slices, and the latency profile is changed (the major slice pauses are closer to the minor GC pauses) so it's still worth fixing.

@gasche
Member

gasche commented Apr 9, 2024

This could/should probably be a new flag for caml_shared_try_alloc ("do I come from the minor GC?"), which would reduce the risk of getting this accounting wrong in the future if we call that function from more places.

I convinced myself that the code says what it does, but how do we know that what it does is reasonable? (In particular, are there regressions on other workloads?)

@stedolan
Contributor

> This could/should probably be a new flag for caml_shared_try_alloc, "do I come from the minor GC?", which would reduce the risk of getting this accounting wrong in the future if we call that function from more places

I disagree. This is a change to GC policy: caml_alloc_shr should trigger extra major GC eventually while allocations from promotions should not. It belongs where the GC policy is defined (major_gc.c and memory.c), not in the allocator (shared_heap.c).

> I convinced myself that the code says what it does, but how do we know that what it does is reasonable? (In particular, are there regressions on other workloads?)

IIRC, this change narrows an earlier fix that made too wide a change: the / 5 condition resolved an issue with a program that was doing a lot of direct major allocation, but the change affected all programs, even ones that did very little direct major allocation. This change means that the fix for such programs is more precisely targeted.

@kayceesrk merged commit f37847f into ocaml:trunk on Apr 18, 2024
17 checks passed