-
-
Notifications
You must be signed in to change notification settings - Fork 33.1k
GH-139951: Fix major GC performance regression #140262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* Count number of actually tracked objects, instead of trackable objects. This ensures that untracking tuples has the desired effect of reducing GC overhead * Do not track most untrackable tuples during creation. This prevents large numbers of small tuples causing execessive GCs.
|
||
/* Fast, but conservative check if an object maybe tracked | ||
May return true for an object that is not tracked, | ||
Will alwys return true for an object that is tracked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will alwys return true for an object that is tracked. | |
Will always return true for an object that is tracked. |
if (gcstate->young.count > gcstate->young.threshold && | ||
gcstate->enabled && | ||
gcstate->young.threshold && | ||
if (gcstate->enabled && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to check gcstate->young.threshold
here too, because documentation states that setting gcstate->young.threshold
to 0 disables GC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, we need test on this. I've created a separate PR to add it:
#140304
uintptr_t not_visited = 1 ^ gcstate->visited_space; | ||
gc->_gc_next = ((uintptr_t)generation0) | not_visited; | ||
generation0->_gc_prev = (uintptr_t)gc; | ||
gcstate->young.count++; /* number of allocated GC objects */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gcstate->young.count++; /* number of allocated GC objects */ | |
gcstate->young.count++; /* number of tracked GC objects */ |
if (gcstate->young.count > gcstate->young.threshold && | ||
gcstate->enabled && | ||
gcstate->young.threshold && | ||
if (gcstate->enabled && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (gcstate->enabled && | |
if (gcstate->enabled && | |
gcstate->young.threshold && |
if (gcstate->young.count > 0) { | ||
gcstate->young.count--; | ||
} | ||
gcstate->heap_size--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to change heap_size
here? Or maybe it'll be better to rename this to something like all_count
?
if (gcstate->work_to_do < 0) { | ||
return; | ||
} | ||
untrack_tuples(&gcstate->young.head); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, assess_work_to_do
set gc.young.count
to zero and if in untrack_tuples
we untrack some tuples we don't decrease counter in _PyObject_GC_UNTRACK
. On my sight, it is not very correct, but this gives us some room to subsequent trackings.
I'm not sure we should address this in this PR, but I think I should say this.
I'm not sure I get what android (x86_64) fails.
|
I'm not at all familiar with the garbage collector, but one of the ways that Android differs from the other platforms is that it runs all the test suite serially in a single process. |
This PR:
For the example in the original report this makes performance on main a bit better than 3.13.
Benchmarking results show this is about neutral on performance otherwise.