Skip to content

Conversation

@kooldev
Copy link
Contributor

@kooldev kooldev commented Jun 10, 2021

Edit: I played around with this code and I am not sure if this is a good approach...

This PR introduces a new variable EG(fiber_switch_enabled) that controls if fiber context switches are allowed. It is set to true by default and can be switched to false in code sections that must not be interrupted (due to not being reentrant). I disabled context switches during zend_gc_collect_cycles() because a fiber context switch during a destructor call can prevent garbage collection from working:

$fiber = new Fiber(function () {
    // Use a function call to ensure that no ref to $a and $b (other than the cycle) is present upon return.
    call_user_func(function () {
        $a = new class () {};

        $b = new class() {
            public function __destruct()
            {
                // Assuming this switches fibers to wait for network IO, it will prevent any further
                // GC runs because GC_G(gc_active) is only reset when collect cycles finishes.
                // This may block indefinitely (depending on network) and might also be a (hidden) nested call.
                Socket::sendGoodbye();
            }
        };

        $a->next = $b;
        $b->next = $a;
    });

    gc_collect_cycles();
});

// This could be called from the accept() loop of an async server.
$fiber->start();

The GC issue is only relevant in cycles collection, however userland code never knows if a destructor is called as part of a GC run or simply due to refcount reaching 0 (in this case switching fibers could be allowed). I decided to disable fiber context switches during any object destructor / free call to keep things simple and consistent (doing anyting that involves fibers in destructors is a horrible idea in general...).

Due to the nature of fibers (they are basically invisible to users of an API) even a simple internal function call like zval_ptr_dtor() may result in a fiber context switch. This PR prevents this from happening in all GC-related areas. At the same time it can be reused by all internals code and extensions to keep fiber switching in check (signal handling / pcntl will probably also need this).

@kooldev
Copy link
Contributor Author

kooldev commented Jun 10, 2021

@trowski Ping 😄

@bwoebi
Copy link
Member

bwoebi commented Jun 10, 2021

Is it not possible to make the GC re-entrant? So that calling gc_collect_cycles() (or trivial gc triggers) during fibers will just do the proper job?

At least I'm very afraid of this … sounds like a bevahiour which may not happen during testing, but happens with bigger arrays (for example, causing a fuller root buffer and leading to invoking the cycle collector). I.e. random production issues.

Regarding zval_ptr_dtor causing a gc run and fiber switch: yes, but this should not be problematic? … not more problematic than executing destructor code in general within any zval_ptr_dtor, independently of fibers.

... Just imagine: write a simple log line … leading to a log buffer being full and then followed by a http request … ending up in a fiber switch. I really would rather have the GC stuck than this.

@kooldev kooldev changed the title Implement fiber context switch blocking Implement fiber context switch blocking? Jun 10, 2021
@kooldev
Copy link
Contributor Author

kooldev commented Jun 10, 2021

@bwoebi If someone was able to refactor GC collect cycles to be re-entrant that would of course be way better than what this PR does. I am not sure if I am able to achieve that (I have no idea how to approach it). I am not sure if blocking context switches is the way to go eighter...

I did write the code as I was refactoring fibers and decided to leave it out because it also introduces unexpected behavior. I should have put that in the text (edited). There are probably quite a few other places that PHP internals / extensions will struggle wit re-entrancy...

@ez5393ez

This comment was marked as spam.

@trowski
Copy link
Member

trowski commented Jun 14, 2021

Closed via fdc2274.

@trowski trowski closed this Jun 14, 2021
@kooldev kooldev deleted the fiber-switch-block branch June 14, 2021 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants