Conversation

@nikic nikic commented Jan 19, 2018

This PR is based on #2994.

This PR does a couple of things:

  • Move the GC buffers from using pointers to using offsets, similar to what we do in hashtables. The idea is that the GC buffer can now be reallocated easily, without having to migrate pointers. Additionally, this reduces the size of the GC buffer on 64-bit systems. I've kept the previous design based on cyclic linked lists. One difference is that the sentinels of the cyclic lists now have to be part of the root buffer itself, so the first two roots are reserved for the sentinels of the roots and to_free lists.
  • Remove the additional buffer mechanism and nested GC runs. If additional space is needed, the GC buffer is simply grown. Nested GCs are forbidden, but additional roots can still be added to the buffer (with potential resizing), so nested runs are no longer necessary.
  • Make sure that we still root everything even if gc_disable() was called. Previously only up to 10k roots were collected; now the buffer is resized to hold all roots. This makes the temporary gc_disable() + gc_enable() pattern guaranteed leak-free.
  • Finally, this implements a very simple dummy heuristic for resizing the root buffer: if a GC run does not collect enough garbage, the GC collection threshold is increased by a factor of two. This prevents the GC from continuously rerunning without collecting anything; a sketch of this heuristic follows the list.
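
A minimal sketch of that heuristic; the function name, the 1/100 "too little garbage" cutoff, and the GC_MAX_THRESHOLD cap are illustrative assumptions, not the PR's exact values:

#include <stdint.h>

#define GC_DEFAULT_THRESHOLD 10000
#define GC_MAX_THRESHOLD     0x40000000

static uint32_t gc_threshold = GC_DEFAULT_THRESHOLD;

/* Called at the end of a GC run with the number of scanned roots and
 * the number of entries actually freed. */
static void gc_adjust_threshold(uint32_t num_roots, uint32_t collected)
{
	/* If a run freed almost nothing, the roots are mostly live working
	 * set: double the threshold so the collector is not immediately
	 * re-triggered on the same data. */
	if (collected < num_roots / 100 && gc_threshold < GC_MAX_THRESHOLD) {
		gc_threshold *= 2;
	}
}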

From some initial testing, this works well to avoid GC performance issues. If the active working set is greater than 10k objects, the GC buffer is increased until GC is no longer continuously triggered.

Of course, the current implementation can cause the reverse issue: Once the buffer becomes too large, GC may never be triggered again. The collection criterion will need to be improved for that. For example, we might want to add a timer-based GC (which triggers a GC VM interrupt) to make sure that GC runs from time to time.

Another issue is that while the GC threshold may shrink again, the GC buffer itself is currently never reduced. It may make sense to allow shrinking it, but this would require compacting the roots so they stay contiguous.
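
To make the offset-based design from the first bullet concrete, here is a sketch; the struct layout, the names, and the use of plain realloc() are assumptions for illustration, not the PR's actual code:

#include <stdint.h>
#include <stdlib.h>

typedef struct _gc_root {
	void     *ref;   /* the possible root (a zend_refcounted* in php-src) */
	uint32_t  prev;  /* slot index of the previous entry in the cyclic list */
	uint32_t  next;  /* slot index of the next entry in the cyclic list */
} gc_root;

#define GC_ROOTS_SENTINEL   0  /* reserved slot: head of the roots list */
#define GC_TO_FREE_SENTINEL 1  /* reserved slot: head of the to_free list */

static gc_root *gc_buf;
static uint32_t gc_buf_size;

static void gc_grow_buffer(void)
{
	/* Doubling requires no pointer fix-up: every link is an index into
	 * gc_buf, so entries stay valid across the reallocation. */
	gc_buf_size *= 2;
	gc_buf = realloc(gc_buf, gc_buf_size * sizeof(gc_root));
}

static void gc_link_root(uint32_t idx)
{
	/* Insert slot idx right after the roots sentinel. */
	gc_buf[idx].prev = GC_ROOTS_SENTINEL;
	gc_buf[idx].next = gc_buf[GC_ROOTS_SENTINEL].next;
	gc_buf[gc_buf[idx].next].prev = idx;
	gc_buf[GC_ROOTS_SENTINEL].next = idx;
}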

-  zend_uchar flags,       /* used for strings & objects */
-  uint16_t   gc_info)     /* keeps GC root number (or 0) and color */
+  zend_uchar gc_flags,    /* GC flags, including GC color */
+  uint16_t   extra_flags) /* additional type specific flags */

Member commented:

I don't like the idea of mixing unchangeable flags with the GC color. I can't say exactly how this should look, but it can probably be done in some better way.

On one hand, the GC algorithm would benefit if we changed the color to BLACK on ADDREF. This is not done now, but it could potentially be done quite efficiently (especially on 64-bit systems) if we combine the reference-counter and color mutation.

On the other hand, we could move the GC color into the GC buffer entirely.
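
For illustration, the ADDREF idea could look roughly like this on a 64-bit system; the combined-word layout and all names here are assumptions, not php-src's actual representation:

#include <stdint.h>

/* Refcount in the low 32 bits, two GC color bits directly above it;
 * BLACK is 0, so clearing the color bits blackens the value. */
#define GC_COLOR_MASK ((uint64_t)0x3 << 32)
#define GC_BLACK      ((uint64_t)0x0 << 32)

typedef struct { uint64_t refcount_and_color; } rc_word;

static inline void rc_addref_blacken(rc_word *p)
{
	/* A single read-modify-write both increments the refcount and
	 * forces the color to BLACK, so a live ADDREF'd value is never
	 * treated as possible garbage by the next GC scan. */
	p->refcount_and_color =
		((p->refcount_and_color + 1) & ~GC_COLOR_MASK) | GC_BLACK;
}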

Member commented:

ZEND_RC_DEBUG compatibility should be fixed: the ZEND_RC_MOD_CHECK() macro should use "u.v.gc_flags" instead of "u.v.flags". Even better would be to switch to accessor macros.
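
A hedged sketch of what the requested fix might look like; only the field rename (u.v.flags to u.v.gc_flags) comes from the comment, while the assertion body and the GC_IMMUTABLE check are illustrative assumptions:

#if ZEND_RC_DEBUG
/* Debug builds verify that immutable values never have their refcount
 * modified; after this PR, those bits live in u.v.gc_flags. */
# define ZEND_RC_MOD_CHECK(p) \
	ZEND_ASSERT(!((p)->u.v.gc_flags & GC_IMMUTABLE))
#else
# define ZEND_RC_MOD_CHECK(p)
#endif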

zend_uchar flags,
zend_uchar _unused,
zend_uchar nIteratorsCount,
zend_uchar consistency)

Member commented:

We could probably use the two low bits of extra_flags to implement the consistency checks (which are currently commented out).
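
One possible shape of that suggestion; the macro names and the gc.u.v.extra_flags access path are assumptions, while the consistency states mirror the existing debug-build ones:

/* Reserve the two low bits of extra_flags for the hashtable
 * consistency state checked in debug builds. */
#define HT_OK               0x0
#define HT_IS_DESTROYING    0x1
#define HT_DESTROYED        0x2
#define HT_CLEANING         0x3
#define HT_CONSISTENCY_MASK 0x3

#define HT_GET_CONSISTENCY(ht) \
	((ht)->gc.u.v.extra_flags & HT_CONSISTENCY_MASK)
#define HT_SET_CONSISTENCY(ht, state) \
	((ht)->gc.u.v.extra_flags = (uint16_t) \
		(((ht)->gc.u.v.extra_flags & ~HT_CONSISTENCY_MASK) | (state)))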

   zend_hash_destroy(in_hash);

-  in_hash->u.v.flags = out_hash.u.v.flags;
+  HT_FLAGS(in_hash) = HT_FLAGS(&out_hash);

Member commented:

Here you overwrite the iterators counter that was set a few lines above.

Member commented:

I was wrong; everything is fine.

@nikic nikic added the Feature label Jan 24, 2018
nikic added 10 commits January 24, 2018 20:59
Previously the GC root was stored in the zend_refcounted header
in 14 bits, with an additional 2 bits used for the GC color.
This limits us to a root buffer of at most ~16k entries, which
is no longer tenable.

This commit performs the following changes:
 * Arrays and objects now use the zend_refcounted_gc header, which
   is zend_refcounted followed by a u32 gc_root field.
 * The field in zend_refcounted that previously held the gc_info
   is now extra_flags. This flag space is freely usable for
   type-specific flags.
 * The GC color is now stored as part of the GC flags.
 * The object flags have been moved from the GC flags to the extra
   flags, as there is no longer enough space for them.

TODO: The array flags need to be moved into extra_flags as well,
to make sure that these changes are size-neutral on x64.
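
A self-contained sketch of the layout this commit describes; the field names follow the commit message and the diff above, but the exact ordering inside the type_info word is an assumption:

#include <stdint.h>

typedef struct _zend_refcounted_h {
	uint32_t refcount;
	union {
		struct {
			uint8_t  type;         /* IS_STRING, IS_ARRAY, IS_OBJECT, ... */
			uint8_t  gc_flags;     /* GC flags, including the GC color */
			uint16_t extra_flags;  /* free for type-specific flags */
		} v;
		uint32_t type_info;
	} u;
} zend_refcounted_h;

/* Arrays and objects, which can participate in cycles, carry a full
 * 32-bit root buffer index instead of the old 14-bit gc_info. */
typedef struct _zend_refcounted_gc {
	zend_refcounted_h gc;
	uint32_t          gc_root;  /* root buffer index, or 0 if not rooted */
} zend_refcounted_gc;
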
Introduce HT_FLAGS() macro to ease the migration.

This brings array size back to what it was.
This is going to make it easier to resize the GC buffer dynamically,
as we will not have to take care of migrating all the pointers. It
is also a more compact representation on 64-bit systems. On the
other hand, the operations become slightly more expensive.

This patch preserves the sentinel-based doubly-linked lists. The
sentinels now have to be part of the root buffer proper, rather than
being separate, so the first two root buffer entries are reserved
for the roots and to_free sentinels.

This patch leaves two things in a broken state:
 * additional_buffer doesn't work. Managing references into
   separate buffer allocations seems too complicated and inefficient
   to me, now that we have the ability to reallocate the buffer. I
   plan to fix this by increasing the buffer size if we need the
   space to store additional entries during GC.
 * recursive GC may not work correctly, as the to_free list is no
   longer stored locally during the actual GC procedure. I plan to
   fix this by not allowing recursive GC and instead just collecting
   garbage, as a dynamically resized root buffer can guarantee that
   we don't lose anything.

If we don't have enough space to store additional garbage, resize
the root buffer. The resizing heuristic is just a dummy for now.

Keep track of the number of roots and collect based on that, instead
of going by the buffer size.

During the initial root collection phase the GC is marked as
"protected"; during this phase the addition of new roots is prohibited.

Otherwise, while the GC is active, the addition of new roots is allowed,
but a nested GC is prohibited; it will be delayed until the current
one finishes.
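
A sketch of the two states these commits describe; the flag names and the simplified control flow are assumptions, not the patch's actual code:

#include <stdbool.h>

static bool gc_protected; /* root scan in progress: no new roots at all */
static bool gc_active;    /* a run is in progress: nested runs are delayed */

static void gc_possible_root_sketch(void *ref)
{
	if (gc_protected) {
		return; /* the buffer must not change while roots are scanned */
	}
	(void)ref;
	/* ... append ref to the root buffer, growing it if necessary ... */
}

static void gc_collect_cycles_sketch(void)
{
	if (gc_active) {
		return; /* nested GC is prohibited; roots stay buffered for later */
	}
	gc_active = true;
	gc_protected = true;
	/* ... collect roots; additions are rejected in this phase ... */
	gc_protected = false;
	/* ... mark, scan, and free; new roots may be buffered meanwhile ... */
	gc_active = false;
}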

Only skip collecting roots if gc.enabled=0 at startup, in which case
the root buffer is not allocated at all.

   GC_SET_REFCOUNT(op_array->static_variables, 2);
-  GC_TYPE_INFO(op_array->static_variables) = IS_ARRAY | (IS_ARRAY_IMMUTABLE << GC_FLAGS_SHIFT);
+  GC_TYPE(op_array->static_variables) = IS_ARRAY;
+  GC_FLAGS(op_array->static_variables) = IS_ARRAY_IMMUTABLE;

Member commented:

Please add a comment about keeping the value of "extra_flags".

@nikic nikic commented Mar 1, 2018

Superseded by #3165.

@nikic nikic closed this Mar 1, 2018