-
Notifications
You must be signed in to change notification settings - Fork 7.9k
[WIP] POC for dynamic GC buffer resizing #3025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
zend_uchar flags, /* used for strings & objects */ | ||
uint16_t gc_info) /* keeps GC root number (or 0) and color */ | ||
zend_uchar gc_flags, /* GC flags, including GC color */ | ||
uint16_t extra_flags) /* additional type specific flags */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the Idea of mixing unchangeable flags with GC color. I can't say how this should look like, but probably this might be done in some better way.
On one hand, the GC algorithm would benefit if we change color to BLACK on ADDREF. This is not done now, but potentially might be done quite efficient (especially on 64-bit system) if we combine reference-counter and color mutation.
On the other hand, we may move GC color into GC buffer at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZEND_RC_DEBUG compatibility should be fixed.
ZEND_RC_MOD_CHECK() macro should use "u.v.gc_flags" instead of "u.v.flags".
Even better, would switch to macros.
Zend/zend_types.h
Outdated
zend_uchar flags, | ||
zend_uchar _unused, | ||
zend_uchar nIteratorsCount, | ||
zend_uchar consistency) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably may use two low bits of extra_flags to implement consistency checks (now you commented them)
zend_hash_destroy(in_hash); | ||
|
||
in_hash->u.v.flags = out_hash.u.v.flags; | ||
HT_FLAGS(in_hash) = HT_FLAGS(&out_hash); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, you overwrite iterators counter, set few lines above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wrong. Everything was fine.
Previously the GC root was stored in the zend_refcounted header in 14 bits, with an additional 2 bits used for the GC color. This limits us to a root buffer of at most ~16k entries, which is no longer tenable. This commit performs the following changes: * Arrays and objects now use the zend_refcounted_gc header, which is zend_refcounted followed by a u32 gc_root field. * The field in zend_refcounted that previously held the gc_info is now extra_flags. This flag space is freely usable for type-specific flags. * The GC color is now stored as part of the GC flags. * The object flags have been moved from the GC flags to the extra flags, as there is no longer enough space for them. TODO: The array flags need to be moved into extra flags as well, to make sure that these changes are size-neutral on x64.
Introduce HT_FLAGS() macro to ease the migration. This brings array size back to what it was.
This is going to make it easier to resize the GC buffer dynamically, as we will not have to take care of migrating all the pointers. It also is a more compact representation on 64-bit systems. On the other hand, the operations become slightly more expensive. This patch preserves the sentinel-based doubly-linked lists. The sentinels now have to be part of the root buffer proper, rather than being separate, so the first two root buffer entries are reserved for the roots and to_free sentinels. This patch leaves two things in a broken state: * additional_buffer doesn't work. Managing references into separate buffer allocations seems too complicated and inefficient to me, now that we have the ability to reallocate the buffer. I plan to fix this by increasing the buffer size if we need the space to store additional entries during GC. * recursive GC may not work correctly, as the to_free list is no longer stored locally during the actual GC procedure. I plan to fix this by not allowing recursive GC and instead just collecting garbage, as a dynamically resized root buffer can guarantee that we don't lose anything.
If we don't have enough space to store additional garbage, resize the root buffer. The resizing heuristic is just a dummy for now.
Keep track of the number of roots and collect based on that, instead of going by the buffer size.
During the initial root collection phase the GC is marked as "protected". During this phase addition of new roots is prohibited. Otherwise, while the GC is active, addition of new roots is allowed, but a nested GC is prohibited. GC will be delayed until the current one finishes.
Only don't collect roots if gc.enabled=0 at startup, in which case the root buffer is not allocated at all.
5a1dc6a
to
70aec4c
Compare
GC_SET_REFCOUNT(op_array->static_variables, 2); | ||
GC_TYPE_INFO(op_array->static_variables) = IS_ARRAY | (IS_ARRAY_IMMUTABLE << GC_FLAGS_SHIFT); | ||
GC_TYPE(op_array->static_variables) = IS_ARRAY; | ||
GC_FLAGS(op_array->static_variables) = IS_ARRAY_IMMUTABLE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a comment about keeping value of "extra_flags".
Superseded by #3165. |
This PR is based on #2994.
This PR does a couple of things:
From some initial testing, this works well to avoid GC performance issues. If the active working set is greater than 10k objects, the GC buffer is increased until GC is no longer continuously triggered.
Of course, the current implementation can cause the reverse issue: Once the buffer becomes too large, GC may never be triggered again. The collection criterion will need to be improved for that. For example, we might want to add a timer-based GC (which triggers a GC VM interrupt) to make sure that GC runs from time to time.
Another issue is that while the GC threshold may shrink again, the GC buffer currently is never reduced. It may make sense to allow making it smaller, but this would require compacting the roots to be contiguous.