
How to implement max_execution_time for swoole http server #3078

Closed
thbley opened this issue Jan 20, 2020 · 16 comments

@thbley

thbley commented Jan 20, 2020

Please answer these questions before submitting your issue. Thanks!

  1. What did you do? If possible, provide a simple script for reproducing the error.

Following the code example at https://www.swoole.co.uk/docs/modules/swoole-http-server-doc, I'd like to ask whether you could provide a code example that enables HTTP workers to stop processing a request after a certain period of time (similar to max_execution_time in php-fpm).

  2. What version of Swoole are you using (show your php --ri swoole)?

swoole 4.4.15

  3. What is your machine environment (including versions of kernel, php, and gcc)?

Docker version 19.03.5
kernel 5.4.7

Thanks!

@healerz

healerz commented Jan 28, 2020

I'd like to set a timeout (max execution time) for workers as well, and I couldn't find a solution to achieve this either.
So an example, if this is possible somehow, would be very much appreciated :)

@kenashkov

kenashkov commented Jan 28, 2020

If you mean forcefully terminating a worker when a coroutine blocks (due to an error, an endless loop, or a blocking I/O operation) and exceeds a certain amount of time, you could implement a watchdog using Swoole\Timer and Swoole\Table. The idea is that with Timer you register a callback that "checks in" every X seconds in shared memory (a Swoole table), and another worker (again with Timer) checks every X seconds when each other worker last checked in. If it was more than X seconds ago (say X+1), you can kill that worker; Swoole will automatically restart it.

P.S.
Any other shared store (like Redis) instead of Swoole\Table would do the job; I'm just suggesting Swoole\Table to avoid additional requirements.

P.P.S.
Here is an example implementation: https://github.com/AzonMedia/watchdog. See Watchdog.php and Backends/SwooleTableBackend.php. In this case every worker serves as a watchdog for every other worker. The watchdog is registered at WorkerStart with Kernel::$Watchdog->checkin() and Kernel::$Watchdog->check() (these two methods register the two Timers: one for the "I'm alive" message and one for checking when the rest of the workers last said "I'm alive").
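For illustration, here is a minimal sketch of the check-in idea described above. This is not the AzonMedia implementation: plain arrays and closures stand in for Swoole\Table and the two Swoole\Timer::tick() callbacks, so the logic can be shown standalone.

```php
<?php
// Minimal watchdog sketch (assumption: in real Swoole code $heartbeats
// would be a Swoole\Table in shared memory, and the two closures would
// run as Swoole\Timer::tick() callbacks inside the workers).

$timeout = 3;      // seconds a worker may stay silent before it is flagged
$heartbeats = [];  // worker_id => timestamp of the last check-in

// Each worker periodically records "I'm alive" (the first Timer callback).
$checkin = function (int $workerId) use (&$heartbeats): void {
    $heartbeats[$workerId] = time();
};

// Any worker can scan the table and flag stalled peers (the second Timer
// callback). A real implementation would kill the flagged worker's pid
// and let Swoole respawn it.
$findStalled = function (int $now) use (&$heartbeats, $timeout): array {
    $stalled = [];
    foreach ($heartbeats as $workerId => $last) {
        if ($now - $last > $timeout) {
            $stalled[] = $workerId;
        }
    }
    return $stalled;
};

$checkin(1);
$checkin(2);
$heartbeats[2] = time() - 10; // simulate worker 2 stuck in an endless loop
```

Calling $findStalled(time()) then reports worker 2 as stalled while worker 1 passes, which is exactly the signal the watchdog acts on.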

@twose
Member

twose commented Feb 3, 2020

We have a watchdog for coroutines (but the English documents are not synchronized yet):

https://github.com/swoole/swoole-src/blob/master/tests/swoole_coroutine_scheduler/preemptive/while.phpt

@twose twose added the question label Feb 3, 2020
@kenashkov

kenashkov commented Feb 3, 2020

Indeed there is the preemptive scheduler, but in my understanding using it will cause issues when accessing and modifying global or static variables (for example, objects from a Dependency Injection container). For example:

// some code running for 10 msec (10 msec was the allocated time slot, right?)
if (some_class::$DI_container->User->current_user_id === 0) {
    // a coroutine switch may occur here,
    // and then the assignment may overwrite a valid value when execution returns to this coroutine
    some_class::$DI_container->User->current_user_id = 1;
}

So any time a variable shared between coroutines is checked and then modified, locking needs to be employed.
Shared variables may not be obvious: the dependencies may be passed as arguments to methods, and the developer may simply not notice that he is modifying a shared object. This can introduce very rare and hard-to-track bugs.
In the cooperative model we know when the switches occur, and because there is actually only a single worker/thread executing, there is no need to lock in these cases.
P.S.
Of course, the best thing is to avoid global state completely, or to make it immutable once the server is started. Also, the example above is contrived, as the User will always be a dependency derived from $Context, not from a global $DI object.
P.P.S.
Actually, the enable_preemptive_scheduler mode does not completely solve the endless-loop issue: in this mode the endless loop will keep slowing down the given worker, because every time execution switches to the problematic coroutine the worker is blocked for 10 msec. And the watchdog described above will not help in this case, as every 10 msec the worker will report "alive"... The watchdog Timer could be set to under 10 msec, but that is too limiting, as a blocking I/O operation may legitimately take longer.

Is there any other mechanism to limit the maximum execution time of a coroutine? Maybe a call like:

Swoole\Coroutine::set('coroutine_max_time', 10); // measured in msec

And then an exception would be thrown if the timeout is exceeded, and this would be valid for all coroutines. Ideally it would also be possible to change/increase the limit at runtime...

@doubaokun
Contributor

doubaokun commented Feb 16, 2020

@thomasbley @healerz please check the following example:

<?php
$server = new Swoole\HTTP\Server("0.0.0.0", 9501);

$server->on("start", function (Swoole\Http\Server $server) {
    echo "Swoole http server is started at http://127.0.0.1:9501\n";
});

$server->on("request", function (Swoole\Http\Request $request, Swoole\Http\Response $response) {
    $max_execution_ms = 500;
    $timer = Swoole\Timer::after($max_execution_ms, function () use ($response, $max_execution_ms) {
        if (!$response->output) {
            $response->output = 1;
            $response->header("Content-Type", "text/plain");
            $response->end("Timeout after {$max_execution_ms}ms\n");
        }
    });

    // your application layer latency
    co::sleep(0.3);
    if (!$response->output) {
        $response->output = 1;
        $response->header("Content-Type", "text/plain");
        $response->end("Hello World\n");
    }

    Swoole\Timer::clear($timer);
});

$server->start();

@doubaokun
Contributor

doubaokun commented Feb 16, 2020

@kenashkov A timer can be used within a coroutine to limit the max execution time.

Indeed, all coroutines modify the same global or static variables, and when I/O is involved the switches happen at unpredictable points in time, which may not be what you expect.

In such cases, at the moment we suggest isolating global or static variables based on the coroutine cid, which can be obtained with Coroutine::getCid(). For example, each global variable can be isolated as static $globals[$cid][$var] = $value.

This may be improved in the future.
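The per-cid isolation suggested above can be sketched as follows. This is only an illustration: in real Swoole code the key would come from Swoole\Coroutine::getCid(), while here the integer ids are passed in explicitly so the idea can be shown without the extension.

```php
<?php
// Per-coroutine storage keyed by coroutine id (cid), so concurrent
// request handlers do not overwrite each other's "global" state.
final class CoroutineLocals
{
    /** @var array<int, array<string, mixed>> */
    private static array $store = [];

    // In real Swoole code the caller would pass Swoole\Coroutine::getCid().
    public static function set(int $cid, string $name, $value): void
    {
        self::$store[$cid][$name] = $value;
    }

    public static function get(int $cid, string $name)
    {
        return self::$store[$cid][$name] ?? null;
    }

    // Call when the coroutine finishes, otherwise entries leak.
    public static function clear(int $cid): void
    {
        unset(self::$store[$cid]);
    }
}
```

Two handlers running under different cids then see independent values for the same variable name, which is the isolation the comment above describes.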

@kenashkov

kenashkov commented Feb 18, 2020

@doubaokun We have already done this with an $arr[$cid] (before getContext() was available), but now I think it is more appropriate to use the coroutine context.

How exactly do you mean to use Swoole\Timer to limit the execution? I imagine we could register a function with the Timer that checks all other coroutines for how long they have been running. Each coroutine could record its start time in its context (or the registered function, if run often enough, could track this as well) and then terminate a coroutine (not the whole worker). But how? Is there currently a way to terminate a coroutine, like Coroutine::kill($cid)? There is the yield() method, but this just suspends it... Indeed, a coroutine interrupted this way will no longer execute unless resume()d, but it will just leak.

@doubaokun
Contributor

@kenashkov please check my example: the onRequest callback is a coroutine with a timer to return/kill itself based on the time limit.

@kenashkov

I was looking on my mobile phone and didn't see your previous answer with the example. I checked it now, and I still think it doesn't completely solve the problem, because we may have sub-coroutines inside the main coroutine handling the request. Let's say a sub-coroutine is reading from the DB, doing file ops, then doing some processing... My point was that there could be an endless loop in one of these, and while your example solves pushing a response to the client, it will not actually terminate that sub-coroutine. And if not terminated or suspended, it will keep eating CPU and block the worker.

As per my previous post, though, I was thinking we could implement a watchdog using the timer and just suspend such a coroutine and send error notifications. It would create a small leak (the stack of the coroutine) but at least would not block the worker until the server is restarted. But in fact this won't work either: if the worker is blocked by an endless loop, the timer does not interrupt it to switch to the registered coroutine. This means that your example will not work either if the script runs for more than $max_execution_ms because of an endless loop (in the main coroutine).

Even using ini_set("swoole.enable_preemptive_scheduler", "1"); does not help in the case of an endless loop, as it seems this scheduler still relies on a yield point somewhere.

I tested this on:
Version => 4.5.0-alpha
Built => Dec 27 2019 09:46:30

@doubaokun
Contributor

@kenashkov the coroutine execution status should be managed by the code; return from the coroutine when necessary, for example:

<?php

Co\run(function() {
	$max_execution_ms = 500;
	$done = 0;
	$timer = Swoole\Timer::after($max_execution_ms , function() use (&$done) {
		$done = 1;
		echo "timeout \n";
	});

	// other codes
	while (1) {
		if($done) {
			Swoole\Timer::clear($timer);
			echo "end\n";
			return;
		}
		co::sleep(0.1);
		echo "1\n";
	}

	// never reached: the loop above only exits via return
	Swoole\Timer::clear($timer);
	echo "end\n";
});

echo "main";

@kenashkov

Thanks for replying! My point was that there may be no yield point in the code, like:

Co\run(function() {
	$max_execution_ms = 500;
	$done = 0;
	$timer = Swoole\Timer::after($max_execution_ms , function() use (&$done) {
		$done = 1;
		echo "timeout \n";
	});

	// other codes
//no function that yields here
	while (1) {
		if($done) {
			Swoole\Timer::clear($timer);
			echo "end\n";
			return;
		}
//		co::sleep(0.1);// assume no yield here
		echo "1\n";
	}

	// never reached: the loop above only exits via return
	Swoole\Timer::clear($timer);
	echo "end\n";
});

echo "main";

Basically, if there is a loop doing data processing (not pulling from a DB, no coroutine yield) that becomes endless under certain initial data conditions because of an error in its break conditions, then in the Swoole context there seems to be no way to interrupt it, unless a watchdog is used and another worker terminates the blocked worker. In a normal PHP context this would be terminated when max_execution_time is reached. The only way to have this in Swoole is to explicitly include code in the loop that yields, to ensure the timer has a chance to run, and to do it as in your example.

@healerz

healerz commented Mar 4, 2020

@doubaokun thanks for your reply and the example.
I tried your example and it works, but unfortunately only if co::sleep() is called within the infinite loop. Therefore I can confirm @kenashkov's statement; this is not the solution to the problem.
Can you reopen this issue? Personally, I believe it's crucial to have a mechanism to kill infinitely running workers after a certain time. It can always happen that one of the developers working on the project introduces an infinite loop under certain circumstances.

@doubaokun doubaokun reopened this Mar 4, 2020
@doubaokun
Contributor

@healerz Can you test your code with this setting:

<?php
ini_set("swoole.enable_preemptive_scheduler","1");

@healerz

healerz commented Mar 4, 2020

@doubaokun
I tried it with the mentioned ini setting; unfortunately, the timer never gets triggered.
Just to prevent any misunderstanding, in case I didn't understand you correctly, the following example shows the boiled-down case I'm talking about:

<?php

use Swoole\Http\Request;
use Swoole\Http\Response;
use Swoole\Http\Server;
use Swoole\Timer;

require __DIR__ . '/../vendor/autoload.php';

ini_set('swoole.enable_preemptive_scheduler', '1');

$server = new Server('0.0.0.0', 9001);

$server->on('start', function (Server $server) {
    echo 'Swoole http server started' . PHP_EOL;
});

$server->on('request', function (Request $request, Response $response) {
    $maxExecutionTimeMs = 500;
    $timer = Timer::after($maxExecutionTimeMs, function () use ($response, $maxExecutionTimeMs) {
        if (!$response->output) {
            $response->output = 1;
            $response->end("Timeout after {$maxExecutionTimeMs}ms\n");
        }
        }
    });

    while (true) {
        echo 'infinite process' . PHP_EOL;
        sleep(1);
    }

    Timer::clear($timer); // never reached: the loop above never exits
});

$server->start();

My expectation would be that the process gets terminated after 500 ms; at least that's what I'm trying to achieve to prevent everlasting processes :)

@kenashkov

kenashkov commented Mar 5, 2020

Yes, this was my point: there may be an infinite loop containing no code that yields (be it co::sleep() or anything else). In this case the preemptive scheduler does not work either, as it seems the internal event loop is never reached because the executor gets stuck in the loop. I have no knowledge of the internals, but I would assume that an interrupt inside the PHP VM would need to be used/implemented to break the endless loop and yield back to the event loop/scheduler (perhaps similar to the current max_execution_time logic). The only current solution is to use a watchdog, restart the worker process, and emit an error message...

P.S. I think this issue is related to #2827. I was looking at how to measure exactly how much time is spent in a coroutine for profiling purposes (thus I needed to know when coroutines switch). Perhaps, if no API for tracking the switching is implemented, at least an API for reading how long a coroutine has been executing could be added, if the current issue (max execution time per coroutine) is considered.
Thanks

@doubaokun
Contributor

doubaokun commented Mar 5, 2020

@healerz

co::sleep() should be used in a coroutine context instead of the sleep() syscall.

@kenashkov

If you enable the preemptive scheduler, the process won't be stuck and will still be able to accept and process new connections, but the loops will still be running.

The concept is to control the timeout and terminate the loop at the application layer:

<?php

use Swoole\Http\Request;
use Swoole\Http\Response;
use Swoole\Http\Server;
use Swoole\Timer;

ini_set('swoole.enable_preemptive_scheduler', '1');

$server = new Server('0.0.0.0', 9501, SWOOLE_BASE);

$server->on('start', function (Server $server) {
    echo 'Swoole http server started' . PHP_EOL;
});

$server->on('request', function (Request $request, Response $response) {
    $maxExecutionTimeMs = 500;
    $die = 0;
    $timer = Timer::after($maxExecutionTimeMs, function () use ($response, $maxExecutionTimeMs, &$die) {
        if (!$die) {
            $die = 1;
            $response->end("Timeout after {$maxExecutionTimeMs}ms\n");
        }
        }
    });

    while (true) {
        // check the coroutine status and exit if $die = 1
        if($die) {
            return;
        }
        co::sleep(1);
        echo "tick\n";
    }

    $response->end("Hi\n"); // never reached: the loop above only exits via return
});

$server->start();

Will have a look at #2827
