Skip to content

Conversation

@sadiqj
Copy link
Contributor

@sadiqj sadiqj commented Dec 22, 2023

Marked as draft because we may not want to merge. If there's consensus I'll add a few tests to the PR.

This PR makes Gc.compact do a full major cycle before compacting and adds a Gc.quick_compact which only does a single one.

The former ensures that everything unreachable before Gc.compact is called will be swept and it's space potentially used to relocate blocks during compaction. In the latter, this is not the case, unreachable blocks before Gc.quick_compact won't be swept before compaction. This obviously results in a bigger heap afterwards. The trade-off is that there's a two thirds reduction in the number of heap traversals required. While it varies substantially by heap composition, I have very sparse heap benchmarks where Gc.quick_compact takes only a third of the time that Gc.compact does.

With this change Gc.compact should work similarly to Gc.compact from 4.x (and matches the current doc on Gc.mli!) - not doing so has already caught out some early users.

If we are going to do this, we should make sure this is part of 5.2 . (@Octachron )

probably of interest to @mshinwell @kayceesrk @damiendoligez

@gasche
Copy link
Member

gasche commented Dec 22, 2023

Backseat naming comment: I don't think that quick_compact is in fact quick by any reasonable metric (it traverses the whole heap), so I would rather call this compact and full_compact.

Obvious beginner question: should people just write Gc.full_major (); Gc.compact () if they know they want the full trip? (Answer from reading the code: that results in 4 GC cycles instead of 3.)

@mshinwell
Copy link
Contributor

I think it would be a significant improvement to leave the semantics of Gc.compact unchanged with regard to the running of finalisers compared to OCaml 4. From various pieces of code I have seen recently that caused failures with the OCaml 5 runtime, there seems to be a widespread expectation that this should be the case, and to my mind that seems completely reasonable. Compaction is expected to be expensive if triggered manually, and it is often triggered manually because there is a specific need to clean up. We shouldn't make that process harder for people.

"quick_compact" seems idiomatic enough to me (although I'm not sure this is an operation I would ever really use).

@gasche
Copy link
Member

gasche commented Dec 22, 2023

Ah, so the concern here is to remain compatible with the 4.x naming convention, which had only compact and was doing a full compaction. Sorry for missing this. In that case I guess Gc.compact is there to stay.

@kayceesrk
Copy link
Contributor

kayceesrk commented Dec 23, 2023

I think it would be a significant improvement to leave the semantics of Gc.compact unchanged with regard to the running of finalisers compared to OCaml 4.

Backwards compatibility is a good reason to have Gc.compact do a full major cycle. Users are surprised by the change in behaviour, and it is not obvious that the semantics has changed.

For Gc.quick_compact, is there a good reason in the wild to use this?

@sadiqj
Copy link
Contributor Author

sadiqj commented Dec 28, 2023

For Gc.quick_compact, is there a good reason in the wild to use this?

I think for long running workloads where you have realistically gone through many major collections it might be reasonable if you can't afford the cost of three synchronous stw collections. Compaction is currently the only way we release memory to the OS.

That said, there is an argument for making Gc.compact do a full major cycle in 5.2 and not adding quick_compact unless there's an explicit request from users.

Either way, I've added a test and some runtime events for gc_compact should we want it.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the discussion, I believe that the proposed change makes sense and that exposing the quick version is reasonable. The implementation now looks very similar to full_major, which makes more sense to me. Approved.

I am still not fond of the word quick for something that is not actually quick. I thought about partial_compact, but this may give the wrong intuition that we only compact a part of the major heap (while in fact we compact everything, but based on some stale information on what is and isn't free). Maybe just compact_without_full_major?

@jberdine
Copy link
Contributor

FWIW I like the change to make the semantics of compact align with 4.x. I also like the compact_without_full_major name, it is clear and won't be used nearly often enough to warrant a short name.

@kayceesrk
Copy link
Contributor

compact_without_full_major sounds clear to me.

@sadiqj
Copy link
Contributor Author

sadiqj commented Dec 29, 2023

compact_without_full_major sounds good to me. I'll get the PR changed and mark it as non-draft.

@wikku
Copy link
Contributor

wikku commented Dec 29, 2023

val compact : ?full_major:bool -> unit -> unit is also a possibility.

@dra27
Copy link
Member

dra27 commented Jan 5, 2024

The optional argument seems good to me (I might possibly suggest with_full_major as the name?)

@sadiqj
Copy link
Contributor Author

sadiqj commented Jan 5, 2024

So one good argument against compact_without_full_major (or an optional argument full_major) is that it exposes the specific details of the GC to a user which means we can't really change functionality later without either breaking things or making the API even more difficult to understand.

For example, we may want to make compact more expensive in order to coalesce smaller mappings in to huge mappings on supported OSes - we may not want to do that for users who want a quick compaction.

So I'm leaning back towards quick_compact or compact: ?quick:bool -> unit -> unit

@gasche
Copy link
Member

gasche commented Jan 5, 2024

Maybe "incomplete" rather than "quick"?

@gasche
Copy link
Member

gasche commented Jan 5, 2024

Or "approx", "best_effort", etc.

@lpw25
Copy link
Contributor

lpw25 commented Jan 5, 2024

I think it is worth noting that the semantics of a "quick_compact" are genuinely different from a compact. Manual calls to compact are often done just after the user has finished a memory hungry phase of a program, and it's probably important to use compact there since they probably only just made the relevant memory dead. When would someone want "quick_compact"? I guess if they just have a spare moment and would like to return some memory to the OS if possible, but they don't have a particular reason to think they've just stopped using a lot of memory.

Maybe we can com up with a name unrelated to compact that captures that use case. Something like release_garbage ?

(Also -1 to all uses of default arguments)

@gasche
Copy link
Member

gasche commented Jan 5, 2024

What I hear is that we have neither a consensual API for "the other compact" nor a strong use-case. I think that the PR could move forward by splitting that part for later.

@gasche
Copy link
Member

gasche commented Jan 5, 2024

@lpw25 there may be programs that have a steady regime made of large iterations, that want to release memory after each iteration. The goal is to ensure that if there is an infrequent memory usage spike, it does not affect the average memory usage. In that case the incomplete compaction is fine -- it may take two iterations to recover from a spike instead of one, but the throughput gains could be worth it.

@sadiqj
Copy link
Contributor Author

sadiqj commented Jan 5, 2024

What I hear is that we have neither a consensual API for "the other compact" nor a strong use-case. I think that the PR could move forward by splitting that part for later.

I'm in agreement here. I'll change the PR so it does a full major for Gc.compact and we can re-visit a quicker variant if there's a request for it from users.

this matches user expectations from 4's Gc.compact
@sadiqj sadiqj marked this pull request as ready for review January 9, 2024 20:46
@sadiqj
Copy link
Contributor Author

sadiqj commented Jan 9, 2024

Right, two things to note here:

https://github.com/ocaml/ocaml/pull/12859/files#diff-2459310f73d3e612f536dfeaade8c2a0f476f8a8a0901b66d34954814e899fd6L268

I'm not convinced we need to empty the minor heap before calling finish_major_cycle as we empty the minor heaps as one of the first things we do in stw_cycle_all_domains:

caml_empty_minor_heap_no_major_slice_from_stw

This logic is subtle though, so it would be good for someone else to double-check. I'm also happy to leave the extra minor collection and defer to another PR.

The second thing to note is https://github.com/ocaml/ocaml/pull/12859/files#diff-dc06e9c510205e393752543cd6f306fc590813435584b4ea7fed68abc3c5a72cR4

Without that, Gc.compact ends up doing 4 major cycles. It seems after the first major cycle, we have a pending action to do another slice which then triggers another cycle. Curiously if I do Gc.major rather than Gc.full_major then we correctly do 3 in Gc.compact in bytecode but not in native. We probably want Gc.major to clear a pending major slice flag but this PR didn't feel like the place to do that.

@gasche gasche merged commit 8e280b3 into ocaml:trunk Jan 9, 2024
gasche pushed a commit that referenced this pull request Jan 9, 2024
ensure Gc.compact does a full major before the compactor runs
this matches user expectations from 4's Gc.compact

(cherry picked from commit 8e280b3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants