Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert MemBalancer GC pacer from 1.10 release #51498

Closed
d-netto opened this issue Sep 28, 2023 · 15 comments
Closed

Revert MemBalancer GC pacer from 1.10 release #51498

d-netto opened this issue Sep 28, 2023 · 15 comments
Assignees
Labels
GC Garbage collector status:priority This should be addressed urgently
Milestone

Comments

@d-netto
Copy link
Member

d-netto commented Sep 28, 2023

There are a few issues which are possibly related to the new GC pacer (e.g. and #50705 and #50956).

Furthermore, the implementation currently in 1.10-beta2 and master seems to deviate from what has been originally implemented in v8 (see discussion in #50909).

Due to these reasons, we discussed in triage that it might be better to de-risk the release by reverting the MemBalancer implementation, but keep the per-page memory accounting introduced in that PR.

CC: @NHDaly, @gbaraldi, @vchuravy

@d-netto d-netto added the GC Garbage collector label Sep 28, 2023
@d-netto d-netto added this to the 1.10 milestone Sep 28, 2023
@vchuravy vchuravy added the status:priority This should be addressed urgently label Sep 28, 2023
@PallHaraldsson
Copy link
Contributor

PallHaraldsson commented Sep 29, 2023

@d-netto it seems you're doing great work on the GC, and even with reverting this the GC in 1.10 will be very different from 1.9 (and also a bit different at least from 1.11).

Do you think this will be fixed and the MemBalancer will make it into 1.10, or some 1.10.x? (1.10 will become LTS I understand, maybe not though 1.10.0, and it seems bad if 1.11 and the LTS different regarding this/GC). I just heard of it and was looking at the paper, and it seems like a great idea for Julia (and GC in general, e.g. the web/V8, and in effect Julia and web browsers would "cooperate", without communication). [Do you know if it's in V8, that is Chrome, and Edge already, not just in a future stable version, since the paper is about a year old already.]

@NHDaly
Copy link
Member

NHDaly commented Sep 30, 2023

At JuliaCon, we talked about generally trying to move towards a more reliable release process for julia, with shorter and more consistent release cycles. Part of achieving that is to be stricter about holding the line on release protocol. Once we feature-freeze a release, we should not be porting new features back to it, we should only be backporting bug fixes. If a feature proves to be too complicated to fix, as in this case, we are going to try to be more diligent about dropping it from the release and trying again for the next release. I believe that is the plan here as well.

So no, i don't think membalancer will make it into 1.10.

Thankfully, as you point out, 1.10 comes with a bunch of great GC improvements! But improving julia's GC triggering heuristics will have to wait til the next release, I think. We still don't know whether membalancer is going to be right for julia or not, and we don't know what its performance characteristics will look like, since it's currently implemented incorrectly. Hopefully it does prove to be an effective choice though! :)


With all of that said; I'm not sure how the LTS figures into all of this. Maybe LTS releases can be more lenient with accepting new features / perf improvements in future point releases? I think this is less clear. Maybe something to reevaluate once a fixed membalancer lands on master.

@d-netto
Copy link
Member Author

d-netto commented Sep 30, 2023

@KristofferC: what is the standard process to revert a feature from a release?

More specifically, I see two possible options to revert #50144:

  • Directly revert from the 1.10 branch and open a PR to the 1.10 branch itself
  • Revert it from master, and then add a "Backport to 1.10" label

Which one is preferable?

@PallHaraldsson
Copy link
Contributor

I did not suggest delaying releasing 1.10.0, or getting the MemBalancer into it. I agree with shorter cycles. I suggest delaying new LTS until MemBalancer is in. It seems very important. When it is in 1.11 and stable, then it can go into 1.10.1 or 1.10.2, whatever we decide will be LTS (if anything...). It just seems bad to have a new LTS and just recently improving the GC and have two new implementation going forward that are not then the same.

I'm not sure about "more lenient" with LTS, I think the policy is only bugfixes and security fixes, for good reason.

@MarisaKirisame
Copy link

the author of MemBalancer paper here. lmk if there is anything I can help.

@vchuravy
Copy link
Sponsor Member

Revert it from master, and then add a "Backport to 1.10" label

Generally this. There are some cases where the former works better, but in this case revert on master, backport that and then reland at a later time.

@KristofferC
Copy link
Sponsor Member

It would be good to do this in a somewhat "timely manner" because we really want to start releasing RCs but it isn't really wise to do so in case the GC implementation is to still significantly change (due to the revert).

@vtjnash
Copy link
Sponsor Member

vtjnash commented Oct 2, 2023

I think we just wanted to undo the backport, but not remove the code from master, right?

@ufechner7
Copy link

Related: #51589

@NHDaly
Copy link
Member

NHDaly commented Oct 12, 2023

@vtjnash that's what I thought / meant when we proposed this, but apparently the 1.9 heuristics are considered broken / a bug, which was why this was backported in the first place.

The GC team is proceeding with two main alternatives:

  • Reverting all the way back to the 1.8 GC heuristics (and therefore fixing the bug on 1.9)
  • Gabriel's proposed new GOGC-style GC. But we're concerned this is too new and may have its own issues, so I much prefer option 1. I think the whole team is on board with that, assuming that we don't find any bad interactions with the more recent code on this release and the older heuristics.

@PallHaraldsson
Copy link
Contributor

"GOGC-style GC"? What is that? I think you mean the MemBalancer, from Google's Blink in Chrome/-ium? Just making sure you meant "Go[ogle] GC", not Go's GC, and Google's Go language has a GOGC env variable...

@oscardssmith
Copy link
Member

no. This is go the language.

@PallHaraldsson
Copy link
Contributor

Ok good to know. I believe Go's GC is good or even excellent. But what is Go GC style? I mean I can look it up, what are the actual main differences? I haven't kept track Go's GC is likely also improving over time. Do you know if it has MemBalancer hysterics (yes)? Are you only meaning copying its or more of its full GC implementation?

@NHDaly
Copy link
Member

NHDaly commented Oct 12, 2023

I was referring to Gabriel's PR here: #51564

Sorry for the lack of details on github, we've been talking about this on slack as well and i was sloppy in my communication.

d-netto added a commit that referenced this issue Oct 20, 2023
The 1.10 GC heuristics introduced in
#50144 have been a source of
concerning issues such as
#50705 and
#51601. The PR also doesn't
correctly implement the paper on which it's based, as discussed in
#51498.

Test whether the 1.8 GC heuristics are a viable option.
@d-netto
Copy link
Member Author

d-netto commented Oct 20, 2023

Fixed by #51661.

We reverted to 1.8 heuristics on the 1.10 branch.

@d-netto d-netto closed this as completed Oct 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GC Garbage collector status:priority This should be addressed urgently
Projects
None yet
Development

No branches or pull requests

9 participants