-
-
Notifications
You must be signed in to change notification settings - Fork 29.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-33597: Reduce PyGC_Head size #7043
Conversation
Include/objimpl.h
Outdated
g->gc.gc_prev = _PyGC_generation0->gc.gc_prev; \ | ||
g->gc.gc_prev->gc.gc_next = g; \ | ||
_PyGC_generation0->gc.gc_prev = g; \ | ||
_PyGCHead_SET_PREV(g, _PyGC_generation0->gc.gc_prev); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems _PyGC_generation0->gc.gc_prev
can have the lowest bits set. This will trigger an assertion in _PyGCHead_SET_PREV()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only Python objects use lower bits. Link header (PyGC_Head without Python object) must not have flags.
(Surely, I need to add more comments.)
Include/objimpl.h
Outdated
g->gc.gc_next->gc.gc_prev = g->gc.gc_prev; \ | ||
assert(g->gc.gc_next != NULL); \ | ||
_PyGCHead_PREV(g)->gc.gc_next = g->gc.gc_next; \ | ||
_PyGCHead_SET_PREV(g->gc.gc_next, _PyGCHead_PREV(g)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_PyGCHead_PREV(g)
is called twice. Wouldn't be better to cache the result?
Modules/gcmodule.c
Outdated
next = gc->gc.gc_next; | ||
if (PyTuple_CheckExact(op)) { | ||
_PyTuple_MaybeUntrack(op); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since gc->gc.gc_next->gc.gc_prev
is not real prev pointer, we can't call _PyTuple_MaybeUntrack(op)
here.
I moved it to split function. But it means this pull request adds one more link traversal.
By the way, current code is not idiomatic. While most tuples can be untracked, there are still many tuples
which arn't be untracked. For example, __mro__
or __bases__
.
In final generation, we check all tuples contents. In applications which triggers final generation GC frequently,
and there are many tuples, it can be significant overhead.
Anyway, I need to find "tracked tuple heavy" application to benchmark before optimize here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If final generation GC is triggered frequently, it means our heuristic is inadequate. Full collections can be very expensive.
Modules/gcmodule.c
Outdated
@@ -227,20 +267,38 @@ append_objects(PyObject *py_list, PyGC_Head *gc_list) | |||
return 0; | |||
} | |||
|
|||
#if GC_DEBUG | |||
static void | |||
validate_list(PyGC_Head *head, uintptr_t expected_mask) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pitrou @serhiy-storchaka
How do you think this function? Should I remove these checks and GC_DEBUG flag?
This validation was useful while hacking, and I think it's good document about when MASK flags are set and cleared.
But I can replace them to just a line of comment, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with this function. Have you tried to always enable it in debug mode, or is it too slow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enabled it while I'm debugging. It made test significantly slower.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The performance seems acceptable, and reducing the memory footprint can only be a good idea :-)
Include/objimpl.h
Outdated
Py_FatalError("GC object already tracked"); \ | ||
_PyGCHead_SET_REFS(g, _PyGC_REFS_REACHABLE); \ | ||
assert((g->gc.gc_prev & 6) == 0); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
& 6
? Why?
Include/objimpl.h
Outdated
g->gc.gc_next->gc.gc_prev = g->gc.gc_prev; \ | ||
PyGC_Head *prev = _PyGCHead_PREV(g); \ | ||
assert(g->gc.gc_next != NULL); \ | ||
prev->gc.gc_next = g->gc.gc_next; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... I wonder if the look would less weird if we also had _PyGCHead_NEXT
and _PyGCHead_SET_NEXT
macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skip using _PyGCHead_PREV
and _PyGCHead_SET_PREV
when flags must not be set. (i.e. list header) too.
Include/objimpl.h
Outdated
g->gc.gc_next = NULL; \ | ||
g->gc.gc_prev &= _PyGC_PREV_MASK_FINALIZED; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that the same as _PyGCHead_SET_PREV(g, NULL)
? The latter seems clearer to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit 1 must be kept after untracking.
But bit 2 and 3 must be cleared when untracking, because this macro may be called while GC
(i.e. weakref callback or tp_finalize).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not adding a macro setting the pointer and flags at once? _PyGCHead_SET_PREV_FLAGS(g->gc.gc_next, prev, flags).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's not useful. gc_prev is used as:
- Set prev pointer, without touching any flags. (_PyTrash_deposit_object and _PyObject_GC_TRACK)
- Set prev pointer to list header node. Flags are not used in list header. (_PyObject_GC_TRACK)
- Clear prev pointer and flags, but keep only
_PyGC_PREV_MASK_FINALIZED
. (only here.) - Clear gc_prev entirely.
= (uintptr_t)0
.
Include/objimpl.h
Outdated
union _gc_head *gc_prev; | ||
Py_ssize_t gc_refs; | ||
union _gc_head *gc_next; // NULL means the object is not tracked | ||
uintptr_t gc_prev; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a small comment here explaining this field?
This was impossible... Remaining ideas are:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, here is a real review. When I wrote my first LGTM, I didn't read carefully the code, but the code also changed a lot since I reviewed it.
Include/objimpl.h
Outdated
union _gc_head *gc_prev; | ||
Py_ssize_t gc_refs; | ||
union _gc_head *gc_next; // NULL means the object is not tracked | ||
uintptr_t gc_prev; // Pointer to previous object in the list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I read gc_prev, I expect it to only contain a pointer to the previous entry, but it contains two information. Would it be possible to find a better name? (gc_prev might be fine) Maybe "gc_prev_flags"?
Maybe add "_" prefix to "explain" that the field must be accessed through macros? (ex: _gc_prev_flags)
If you rename the field, it makes sure that applications accessing (read or... write!) directly gc_prev will break which is a good thing IMHO.
Can you also please document which flags are stored in this field? Finalized, collecting and tentatively unreachable. Mention maybe that the two last are only used internally and so are not accessible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also please document which flags are stored in this field? Finalized, collecting and tentatively unreachable. Mention maybe that the two last are only used internally and so are not accessible?
Finalized is documented below. And I don't want document other two bits in this file
because they are tightly coupled with GC algorithm. I added document in gcmodule.c instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed member name to _gc_prev and _gc_next.
I don't want 3rd party touches them directly.
Include/objimpl.h
Outdated
#define _PyObject_GC_TRACK(o) do { \ | ||
PyGC_Head *g = _Py_AS_GC(o); \ | ||
if (_PyGCHead_REFS(g) != _PyGC_REFS_UNTRACKED) \ | ||
if (g->gc.gc_next != NULL) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: you may take this change as an opportunity to add { ... } (PEP 7) :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Include/objimpl.h
Outdated
Py_ssize_t gc_refs; | ||
union _gc_head *gc_next; // NULL means the object is not tracked | ||
uintptr_t gc_prev; // Pointer to previous object in the list. | ||
// Lowest three bits are used for flags. | ||
} gc; | ||
double dummy; /* force worst-case alignment */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hum, I think that alignment using sizeof(double) comes the C standard. If I'm right, it might help to document it here :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is separated issue, see https://bugs.python.org/issue33589
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, it's not separated issue. Since I this pull request removes last Py_ssize_t
member from gc, there are only two pointers (or integer having same size to pointer) in the struct.
It must be aligned 8byte on all 32bit and 64bit platforms. dummy can be removed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed dummy, and stop using union too.
Include/objimpl.h
Outdated
#define _PyGCHead_SET_PREV(g, p) do { \ | ||
assert(((uintptr_t)p & ~_PyGC_PREV_MASK) == 0); \ | ||
(g)->gc.gc_prev = ((g)->gc.gc_prev & ~_PyGC_PREV_MASK) \ | ||
| ((uintptr_t)(p)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a macro for (g->gc.gc_prev & _PyGC_PREV_MASK_INTERNAL)? Maybe #define _PyGCHead_REFS() or #define _PyGCHead_FLAGS()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's worth enough...
Include/objimpl.h
Outdated
g->gc.gc_next = NULL; \ | ||
g->gc.gc_prev &= _PyGC_PREV_MASK_FINALIZED; \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not adding a macro setting the pointer and flags at once? _PyGCHead_SET_PREV_FLAGS(g->gc.gc_next, prev, flags).
Modules/gcmodule.c
Outdated
} | ||
|
||
static inline void | ||
gc_clear_masks(PyGC_Head *g) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't clear a mask but flags, no? Maybe: gc_clear_internal_flags()?
Maybe rename "internal flags" to "collection flags"? I don't know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I removed MASK_TENTATIVELY_UNREACHABLE
from gc_prev, I renamed this to gc_clear_collecting
.
Modules/gcmodule.c
Outdated
|
||
/*** list functions ***/ | ||
|
||
static void | ||
gc_list_init(PyGC_Head *list) | ||
{ | ||
list->gc.gc_prev = list; | ||
list->gc.gc_prev = (uintptr_t)list; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to use a macro here since the field is special. _PyGCHead_SET_PREV_FLAGS(list, 0)?
Ditto below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code doesn't mean "clear flags". List header node (gc_list) doesn't have any flags.
So gc_prev is really pure pointer without any flags here.
I'll add one line comment here too.
Modules/gcmodule.c
Outdated
} | ||
|
||
static inline void | ||
gc_set_refs(PyGC_Head *g, Py_ssize_t v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"v" should be "refs" no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
Modules/gcmodule.c
Outdated
} | ||
|
||
static inline void | ||
gc_reset_refs(PyGC_Head *g, Py_ssize_t v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rename "v" to "refs"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
Modules/gcmodule.c
Outdated
|| gc_refs == GC_UNTRACKED); | ||
} | ||
else { | ||
assert(gc_refs > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why flags are no longer tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because there are no GC_REACHABLE
and GC_UNTRACKED
now.
Instead, there are "early return" above.
+ if (gc->gc.gc_next == NULL || !gc_is_collecting(gc)) {
+ return 0
gc_refs == GC_REACHABLE
meant the object is tracked but not in current gc_list. It is !gc_is_collecting(gc)
now.
gc_refs == GC_UNTRACKED
meant the object is not tracked. It is gc->gc.gc_next == NULL
now.
When you're done making the requested changes, leave the comment: |
I found Now I think keeping two implementation is not worth enough. |
It's not only about Windows, other OSes can do it too (perhaps Linux does?). |
You're right... Linux can use 3GB for user virtual address. And it's default for x86. Then, (a) Use int64_t instead of intpr_t, or (b) keep two implementation... I think I should try (b). |
I found UNREACHABLE flag is only needed in move_unreachable() essentially. Now we require 4bytes aligned pointer, and support 1G-1 refcnt on 32bit (4bytes for each pointer) platform. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
Include/objimpl.h
Outdated
union _gc_head *gc_prev; | ||
Py_ssize_t gc_refs; | ||
union _gc_head *gc_next; // NULL means the object is not tracked | ||
uintptr_t gc_prev; // Pointer to previous object in the list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed member name to _gc_prev and _gc_next.
I don't want 3rd party touches them directly.
Include/objimpl.h
Outdated
Py_ssize_t gc_refs; | ||
union _gc_head *gc_next; // NULL means the object is not tracked | ||
uintptr_t gc_prev; // Pointer to previous object in the list. | ||
// Lowest three bits are used for flags. | ||
} gc; | ||
double dummy; /* force worst-case alignment */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed dummy, and stop using union too.
Include/objimpl.h
Outdated
#define _PyGCHead_SET_PREV(g, p) do { \ | ||
assert(((uintptr_t)p & ~_PyGC_PREV_MASK) == 0); \ | ||
(g)->gc.gc_prev = ((g)->gc.gc_prev & ~_PyGC_PREV_MASK) \ | ||
| ((uintptr_t)(p)); \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's worth enough...
Include/objimpl.h
Outdated
#define _PyObject_GC_TRACK(o) do { \ | ||
PyGC_Head *g = _Py_AS_GC(o); \ | ||
if (_PyGCHead_REFS(g) != _PyGC_REFS_UNTRACKED) \ | ||
if (g->gc.gc_next != NULL) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Modules/gcmodule.c
Outdated
} | ||
|
||
static inline void | ||
gc_clear_masks(PyGC_Head *g) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I removed MASK_TENTATIVELY_UNREACHABLE
from gc_prev, I renamed this to gc_clear_collecting
.
Modules/gcmodule.c
Outdated
} | ||
|
||
static inline void | ||
gc_set_refs(PyGC_Head *g, Py_ssize_t v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
Modules/gcmodule.c
Outdated
} | ||
|
||
static inline void | ||
gc_reset_refs(PyGC_Head *g, Py_ssize_t v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
Modules/gcmodule.c
Outdated
(visitproc)visit_reachable, | ||
(void *)young); | ||
gc_set_prev(gc, prev); | ||
gc->gc.gc_prev &= ~MASK_COLLECTING; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment.
Modules/gcmodule.c
Outdated
} | ||
gc = next; | ||
} | ||
young->gc.gc_prev = (uintptr_t)prev; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
Modules/gcmodule.c
Outdated
if (gc_get_refs(gc) != 0) { | ||
ret = -1; | ||
} | ||
_PyGCHead_SET_PREV(gc, prev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doubly-linked list is broken and restored in this function. So no need to add it in function description.
+ // Use gc_refs and break gc_prev again.
comment is above in this function.
Include/objimpl.h
Outdated
#define _PyGCHead_NEXT(g) ((PyGC_Head*)(g)->_gc_next) | ||
#define _PyGCHead_SET_NEXT(g, p) ((g)->_gc_next = (uintptr_t)(p)) | ||
|
||
// Lowest two bits of _gc_prev is used for flags described below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below? Do you mean above?
Objects/object.c
Outdated
@@ -2121,8 +2122,7 @@ _PyTrash_destroy_chain(void) | |||
PyObject *op = _PyRuntime.gc.trash_delete_later; | |||
destructor dealloc = Py_TYPE(op)->tp_dealloc; | |||
|
|||
_PyRuntime.gc.trash_delete_later = | |||
(PyObject*) _Py_AS_GC(op)->gc.gc_prev; | |||
_PyRuntime.gc.trash_delete_later = (PyObject*) _Py_AS_GC(op)->_gc_prev; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not _PyGCHead_PREV(op)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
Objects/object.c
Outdated
@@ -2159,8 +2159,7 @@ _PyTrash_thread_destroy_chain(void) | |||
PyObject *op = tstate->trash_delete_later; | |||
destructor dealloc = Py_TYPE(op)->tp_dealloc; | |||
|
|||
tstate->trash_delete_later = | |||
(PyObject*) _Py_AS_GC(op)->gc.gc_prev; | |||
tstate->trash_delete_later = (PyObject*) _Py_AS_GC(op)->_gc_prev; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
Modules/gcmodule.c
Outdated
#define GC_NEXT _PyGCHead_NEXT | ||
#define GC_PREV _PyGCHead_PREV | ||
|
||
// Bit 0 of _gc_prev is used for _PyGC_PREV_MASK_FINALIZED in objimpl.h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment doesn't seem to match the line being commented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated.
Modules/gcmodule.c
Outdated
// move_legacy_finalizers() removes this flag instead. | ||
// Between them, unreachable list is not normal list and we can not use | ||
// most gc_list_* functions for it. We should manually tweaking unreachable | ||
// list in these two functions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand "We should manually tweaking unreachable list in these two functions".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed last sentence.
I meant we should manually set _gc_prev and _gc_next, like this:
+ // Manually unlink gc from unreachable list because
+ PyGC_Head *prev = GC_PREV(gc);
+ PyGC_Head *next = (PyGC_Head*)(gc->_gc_next & ~NEXT_MASK_UNREACHABLE);
+ assert(prev->_gc_next & NEXT_MASK_UNREACHABLE);
+ assert(next->_gc_next & NEXT_MASK_UNREACHABLE);
+ prev->_gc_next = gc->_gc_next; // copy NEXT_MASK_UNREACHABLE
If there are no more objections, I want to merge this in next week. |
https://bugs.python.org/issue33597