Fix wrong gc join in BGC mark phase #126389
`background_mark_phase` was using the wrong join struct instance: `gc_t_join`, which is meant for the regular mark phase, instead of `bgc_t_join`. This caused hangs during GC in some edge cases, for example when invoking GCPerfSim with the following command line:

`GCPerfSim.dll -tc 4 -tagb 540 -tlgb 0 -lohar 1000 -pohar 0 -sohsr 100-4000 -lohsr 102400-204800 -pohsr 100-204800 -sohsi 0 -lohsi 0 -pohsi 0 -sohpi 0 -lohpi 0 -sohfi 0 -lohfi 0 -pohfi 0 -allocType reference -testKind time`

The issue was introduced by the change that added the Java GC bridge.

Close dotnet#122996
Tagging subscribers to this area: @agocke, @dotnet/gc
Pull request overview
Fixes a GC hang in gc_heap::background_mark_phase() by using the correct background-GC join structure for the Java GC bridge processing join/restart sequence, aligning BGC synchronization with the rest of background.cpp.
Changes:
- Replace `gc_t_join` with `bgc_t_join` for the `gc_join_bridge_processing` barrier in `background_mark_phase()`.
- Ensure BGC threads are joined/restarted using the BGC-specific join instance during bridge object processing.
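
The join/restart pattern mentioned above can be illustrated with a small model. This is only a sketch under stated assumptions: `simple_join` and `run_bridge_step` are hypothetical names invented for illustration, not the runtime's actual join type or its API. The point it tries to show is that each join instance carries its own arrival count and release state, so every thread taking part in a given step has to use the same instance.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a GC join structure; NOT the runtime's real type.
class simple_join {
public:
    explicit simple_join(int participants) : participants_(participants) {
        assert(participants > 0);
    }

    // Blocks until restart() is called, unless the caller is the last
    // participant to arrive; the last arriver returns true without waiting.
    bool join() {
        std::unique_lock<std::mutex> lock(mutex_);
        const int my_generation = generation_;
        if (++arrived_ == participants_) {
            return true;
        }
        released_.wait(lock, [&] { return generation_ != my_generation; });
        return false;
    }

    // Called by the last arriver to release every waiting thread.
    void restart() {
        std::lock_guard<std::mutex> lock(mutex_);
        arrived_ = 0;
        ++generation_;
        released_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable released_;
    const int participants_;
    int arrived_ = 0;     // arrivals in the current generation
    int generation_ = 0;  // bumped on every restart()
};

// Runs one join/restart step on `workers` threads and returns how many
// times the single-threaded section executed (hypothetical helper).
int run_bridge_step(simple_join& join, int workers) {
    int single_threaded_runs = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        threads.emplace_back([&] {
            if (join.join()) {
                ++single_threaded_runs;  // only the last arriver gets here
                join.restart();
            }
        });
    }
    for (auto& t : threads) {
        t.join();
    }
    return single_threaded_runs;
}
```

With `simple_join bgc_join(4);`, calling `run_bridge_step(bgc_join, 4)` runs the single-threaded section exactly once and all four threads exit. In this model, if some of the threads waited on a different `simple_join` instance, neither instance would reach its participant count and the waiters would never be released, which is the kind of hang that mixing up two join instances can produce.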
Is this a regression in 10?
This is not really a regression, since GC bridge support was added in .NET 10. This code is under `FEATURE_JAVAMARSHAL`, which is enabled only on Android and in debug builds. Still, backporting seems reasonable to me.
I do not see a valid justification for a backport.
You are right. I didn't realize initially that this code is Server GC only.