
Function not working concurrently #162

Open
mpjraaij opened this issue Jul 27, 2018 · 10 comments

@mpjraaij

commented Jul 27, 2018

Hi there,

I'm new to recoil and using it with React. I'm trying to run functions concurrently, not synchronously.

This is my code to test:

    public function execute($json)
    {
        $loop = Factory::create();
        $kernel = ReactKernel::create($loop);

        $kernel->execute(function () {
            yield [
                $this->fetchTrade(1),
                $this->fetchTrade(2),
                $this->fetchTrade(3),
            ];
        });

        $loop->run();
    }

    private function fetchTrade($id) {
        echo $id . PHP_EOL;
        sleep(1);
        yield;
        return;
    }

Unfortunately the code runs synchronously and not concurrently (the sleep is called after each execution). How do I correct this code so it does work concurrently like in the example?

Thanks so much,
Maarten

@jmalloc

Member

commented Jul 27, 2018

Hi, if I understand correctly, the issue is the call to the standard sleep(). Instead, use Recoil::sleep() which is "cooperative", allowing the other strands to execute in the meantime:

use React\EventLoop\Factory;
use Recoil\React\ReactKernel;
use Recoil\Recoil;

public function execute($json)
{
    $loop = Factory::create();
    $kernel = ReactKernel::create($loop);

    $kernel->execute(function () {
        yield [
            $this->fetchTrade(1),
            $this->fetchTrade(2),
            $this->fetchTrade(3),
        ];
    });

    $loop->run();
}

private function fetchTrade($id) {
    echo $id . PHP_EOL;
    yield Recoil::sleep(1);
}

This should produce the same output, but the program will only wait for ~1 second overall.

Let me know if this helps :)

@mpjraaij

Author

commented Jul 27, 2018

That does indeed help a bit..

However I was not aware that the sleep() function is the standard.. I replaced that with a function that takes 2 seconds to execute.. how can I add concurrent behaviour to any function?

Let's say the function would be:

private function fetchTrade($id) {
    // Doing stuff for 2 seconds

    yield;
}

Would this be enough or would I have to translate the fetchTrade function into a concurrent coroutine?

@jmalloc

Member

commented Jul 28, 2018

> would I have to translate the fetchTrade function into a concurrent coroutine?

Yep, this is the key. Tools like Recoil, React, etc can't make code that isn't already asynchronous execute concurrently. Nor do they allow for any kind of parallelism (there's still only ever one actual OS thread).

If you take a look through the Recoil class you'll see that some of the operations are categorised as either COOPERATIVE or NON-COOPERATIVE. These cooperative operations "cooperate" by giving control to the Recoil kernel to do other work while the operation executes. Note that all such operations wait on some event to occur, such as waking from a sleep or for a stream to become readable. All we're really doing is making sure we're able to use CPU to do something useful while we wait for these events.

If fetchTrade() were CPU bound, there'd be no advantage to using an asynchronous framework, because there would be none of these events to wait for. Assuming, however, that fetchTrade() makes an HTTP request, we can allow the CPU to do something while it waits for the result by performing that network IO in a way that Recoil (well, React, really) is aware of, namely by using the Recoil::read() and Recoil::write() operations on a network stream, or more practically by using an HTTP client that's built for React (such as Buzz).

I hope this is helpful. Please keep the questions coming, and forgive me if some explanations are too basic, but this kind of question comes up often enough that I figure this info might be helpful to others, too.


As an aside, if you're going to write some code using Recoil-style coroutines (which of course I encourage!) I'd like to point out the existence of the recoil/dev package. Its primary feature is an instrumentation system that rewrites stack traces inside exceptions to show the "yield points" in coroutines rather than the PHP code inside the Recoil kernel.

This has proven invaluable for dealing with Recoil code in real projects. The instrumentation runs entirely within assert() statements, and so incurs no overhead in production where assertions are disabled.

There is an example that shows the difference between the stack trace with the instrumentation enabled vs disabled. All it requires from your codebase is that you annotate functions that are intended as coroutines like so:

use Generator as Coroutine; // important
use Recoil\Recoil;

function myGenerator(): Coroutine {
    yield Recoil::sleep(5);
}
@mpjraaij

Author

commented Jul 29, 2018

Thanks for the elaborate post! I'm determined to make this work with Recoil.

If I understand correctly, it works like this: (correct me where I'm wrong).

Anything that's CPU bound, like calculations, fetching from DB, etc., is non-cooperative and hence can't be done asynchronously. However, if I make an API call to an external service, this can be done cooperatively.

This means I could start fetching API calls async, and while I'm doing that have the system continue to do some CPU calculations (sync). And then when the calls are finished, save everything to the DB (sync).

For example:

  1. Run a batch of HTTP requests
  2. Do some CPU work
  3. Yield the HTTP requests

If I understood correctly, it will start fetching the requests concurrently. While they are in flight it will continue running the CPU work, and when the CPU work is done it will await/yield the requests.

This leads me to these questions:

  • When is it smart to use a strand?
  • How exactly can I use the cooperate() function in the Kernel?

I also see your point with the stack traces, I'd have to implement it in Phalcon. If I'm not mistaken this gets included after the autoload include, am I right?

@jmalloc

Member

commented Jul 29, 2018

You've pretty much nailed it!

The use of the word "yield" is a bit confusing. I try to use it only when referring to the yield keyword; perhaps "await" is better.

I will point out that DB queries are going to involve some kind of network or disk IO and hence can be cooperative, too. As a gross generalisation, typical web backends spend most of their time waiting on IO so there's usually some tangible benefit to all of this.

There are ReactPHP bindings for MySQL, for example, though I've not used them myself. If you are bound to using some DBAL provided by your framework it's probably NOT going to be cooperative, as you say.

> When is it smart to use a strand?

This is a difficult question to answer well because the correct and unrevealing answer is "always"; Recoil can only do work within a strand.

Strands are analogous to an operating system thread. Strands each have their own call stack, and they are the "unit of work" that is switched between by the kernel. Only when one strand cooperates may others execute.

The original example code you posted used at least 4 strands - one for the call to execute() then one each for the 3 fetchTrade() coroutines.

You might start new strands in order to perform portions of work that all serve a single function of your application, as you did in your example. It's the "orchestration" of these strands together that's powerful.

You might also have multiple strands performing tasks that do not depend on each other directly. For example, at my day job we have an application that syncs documents with a remote provider. It uses one strand to fetch an Atom feed and populate an "updated queue" in a database. Another strand reads that queue and starts yet another strand to fetch each individual document.

> How exactly can I use the cooperate() function in the Kernel?

cooperate() forces the kernel to switch to some other strand. Its use should be uncommon. Typically it will be called inside the loop of some long-running, CPU-bound operation, just to allow the kernel to respond to IO in a timely manner. There is no reason to use cooperate() inside a coroutine that is doing other cooperative operations.

Because this kind of explicit cooperation is necessary, Recoil can be said to perform cooperative multitasking. Contrast this to preemptive multitasking, which is what your OS is doing.
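To make the idea of cooperative multitasking concrete, here is a toy round-robin scheduler in plain PHP. It is only a sketch of the concept and is not how Recoil's kernel is implemented; the names worker(), blockingWorker() and runAll() are hypothetical and exist only for this illustration. A "strand" here is just a generator, and a bare yield plays the role of cooperating.

```php
<?php
// Toy illustration of cooperative multitasking. NOT Recoil's implementation.

// Cooperates after every step, so other strands get a turn in between.
function worker(string $name, int $steps, ArrayObject $log): Generator
{
    for ($i = 1; $i <= $steps; $i++) {
        $log->append("$name step $i");
        yield; // hand control back to the scheduler
    }
}

// Does all of its work before the first yield, like calling sleep() in a coroutine.
function blockingWorker(string $name, int $steps, ArrayObject $log): Generator
{
    for ($i = 1; $i <= $steps; $i++) {
        $log->append("$name step $i");
    }
    yield; // too late: the work is already done
}

// Hypothetical round-robin scheduler: resumes each strand in turn until all finish.
function runAll(Generator ...$strands): void
{
    // current() runs a fresh generator up to its first yield.
    foreach ($strands as $strand) {
        $strand->current();
    }

    $queue = array_filter($strands, function (Generator $s) {
        return $s->valid();
    });

    while ($queue) {
        $strand = array_shift($queue);
        $strand->next(); // resume until the next yield or the end
        if ($strand->valid()) {
            $queue[] = $strand; // not finished: back of the queue
        }
    }
}

$cooperative = new ArrayObject();
runAll(worker('A', 2, $cooperative), worker('B', 2, $cooperative));
echo implode(', ', $cooperative->getArrayCopy()) . PHP_EOL;
// A step 1, B step 1, A step 2, B step 2

$blocking = new ArrayObject();
runAll(blockingWorker('A', 2, $blocking), blockingWorker('B', 2, $blocking));
echo implode(', ', $blocking->getArrayCopy()) . PHP_EOL;
// A step 1, A step 2, B step 1, B step 2
```

The scheduler can only switch strands at a yield, so the blocking variant runs each strand to completion before the next one starts. A preemptive scheduler, by contrast, would interrupt the running code at arbitrary points.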


recoil/dev includes a Composer plugin that literally replaces Composer's own autoloader after installation. It still uses the Composer autoloader for all resolution logic. I admit I know nothing about Phalcon, though I wouldn't expect it to present any particular difficulties.

@mpjraaij

Author

commented Jul 30, 2018

Thank you so much for taking the time to explain this all!

I hope it will help others too.

I have indeed tried the MySQL bindings from ReactPHP, but have found that they're no faster than the Phalcon ORM (it's C-based and caches most), hence I might as well stick to what I know.

You've helped out a lot and I'm confident I can make it work now. Many many thanks.

@mpjraaij

Author

commented Jul 30, 2018

One more question that keeps haunting me. I have for example this piece of code:

    private function getSingleTrade($id): Coroutine
    {
        $trade = Trades::findFirst([
            'conditions' => 'id = {id:int}',
            'bind' => [
                'id' => $id
            ]
        ]);

        yield;
        return $trade;
    }

When I then do var_dump ($this->getSingleTrade($id)) I get this:

object(Generator)#119 (0) {
}

Why is the return not filled in the Generator?

@jmalloc

Member

commented Jul 30, 2018

> I have indeed tried the MySQL bindings from ReactPHP, but have found that they're no faster than the Phalcon ORM

They may not be faster for a given query, but importantly they are cooperative, so you can run several queries concurrently, as well as any other kind of IO. This probably won't matter to you much if you're starting a kernel to perform a single task with a small concurrent aspect, but just remember that everything else stops to wait for those blocking DB queries made by a conventional DBAL.

> Why is the return not filled in the Generator?

The return value of a generator function in PHP is always a \Generator, which is an iterator-like object that steps between the yield statements each time it is advanced. It's Recoil's job to step through that \Generator and interpret the things it yields as instructions. Recoil's terminology for these yielded things is "dispatchable values".

Try this code, and you'll see how to access the values you're expecting directly from the \Generator:

function blah() {
    yield "I am a yielded value!";
    return "I am the return value!";
}

$gen = blah();

foreach ($gen as $value) {
    echo $value . PHP_EOL;
}

echo $gen->getReturn() . PHP_EOL;

When you want to use a \Generator as a coroutine you need to give it to Recoil for execution. The example below would dump the Trade object as you expect.

// $o is the object that defines getSingleTrade(), which must be callable from here.
function run($o, $id) {
    $get = $o->getSingleTrade($id);
    var_dump(yield $get);
}

$run = run($o, $id);
$kernel->execute($run);

I've assigned the generators to the variables $run and $get just so they have names we can refer to. Notice the yield in the var_dump() call above. This is what lets Recoil "see" $get.

The process goes something like this:

  • Recoil starts iterating over $run
  • sees the yielded $get, which it decides to treat as a coroutine
  • pushes $get onto the call-stack for the current strand and starts iterating over it instead
  • (repeats this whole process for the things yielded from $get)
  • reaches the end of $get and pops it from the call stack
  • calls $get->getReturn() to get the Trade, which it sends back to $run
  • continues iterating $run
  • reaches the end of $run and terminates the strand
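The steps above can be sketched in plain PHP as a toy runner for nested generators. This is only an illustration under simplifying assumptions and not Recoil's actual kernel code: runCoroutine() is a hypothetical name, it handles a single strand, and it treats every yielded value that is not a \Generator as "nothing to wait for". The getSingleTrade() here is a stand-in that returns an array instead of querying a database.

```php
<?php
// Toy runner for nested generators. NOT Recoil's kernel, just the idea behind it.

function runCoroutine(Generator $coroutine)
{
    $stack = [];                    // suspended callers (the strand's call stack)
    $current = $coroutine;
    $yielded = $current->current(); // run up to the first yield

    while (true) {
        if (!$current->valid()) {
            // Reached the end of the current generator: collect its return value.
            $result = $current->getReturn();
            if (!$stack) {
                return $result;     // outermost coroutine finished
            }
            // Pop the caller and send the result back in as the value of its yield.
            $current = array_pop($stack);
            $yielded = $current->send($result);
            continue;
        }

        if ($yielded instanceof Generator) {
            // Treat a yielded generator as a coroutine call: push the caller, descend.
            $stack[] = $current;
            $current = $yielded;
            $yielded = $current->current();
            continue;
        }

        // Anything else: this toy runner has nothing to wait for, so just resume.
        $yielded = $current->send(null);
    }
}

// Stand-in for the real method; returns an array instead of a Trade model.
function getSingleTrade(int $id): Generator
{
    yield; // placeholder for some cooperative operation
    return ['id' => $id];
}

function run(int $id): Generator
{
    $trade = yield getSingleTrade($id); // receives getSingleTrade()'s return value
    return $trade;
}

var_dump(runCoroutine(run(7)));
```

The value that run() receives from its yield expression is whatever the inner generator returned, which is why var_dump(yield $get) shows the Trade rather than a \Generator object.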

If you haven't already I'd recommend reading this post by Christopher Pitt. It's a fantastic introduction to the ideas in play here from the very basics of iterators to the use of generators as coroutines.

@mpjraaij

Author

commented Jul 31, 2018

Awesome! That brings me even closer to understanding. So let's say I have this:

    public function blah() {
        yield "I am a yielded value!";
        return "I am the return value!";
    }

    private function run()
    {
        for ($i = 0; $i < 5; $i++) {
            $array[] = $this->blah();
        }

        foreach ($array as $gen) {
            foreach ($gen as $value) {
                echo $value . PHP_EOL;
            }
        }

        echo $gen->getReturn() . PHP_EOL;
    }

My outcome is this (as expected):

I am a yielded value!
I am a yielded value!
I am a yielded value!
I am a yielded value!
I am a yielded value!
I am the return value!

Can I somehow simplify this code or is this how I do it?

@jmalloc

Member

commented Jul 31, 2018

I guess that depends on what you want to achieve there. Recoil is not involved in that code: there is no coroutine, only "normal" generators.

Note that in the example you pasted you're only using the value of getReturn() from the last generator in the array. The code below produces the same output, and is "simpler" insofar as it uses only a single generator.

public function blah() {
    for ($i = 0; $i < 5; ++$i) {
        yield "I am a yielded value!";
    }

    return "I am the return value!";
}

public function run()
{
    $gen = $this->blah();

    foreach ($gen as $value) {
        echo $value . PHP_EOL;
    }

    echo $gen->getReturn() . PHP_EOL;
}

@jmalloc jmalloc added the question label Aug 1, 2018
