Skip to content
This repository has been archived by the owner on Dec 16, 2019. It is now read-only.

start does not return reliably #481

Closed
pfofi opened this issue Sep 4, 2015 · 21 comments
Closed

start does not return reliably #481

pfofi opened this issue Sep 4, 2015 · 21 comments
Labels

Comments

@pfofi
Copy link

pfofi commented Sep 4, 2015

Hello,

I've been experimenting with a769e5b and PHP-CLI 7.0.0 RC2 under Debian unstable.

It seems that synchronized(), when called from a thread, prevents execution in the main thread:

class TestThread extends Thread
{
    public function run()
    {
        self::synchronized(function () {
            sleep(PHP_INT_MAX); // lock indefinitely
        });
    }
}

$thread = new TestThread;
$thread->start();
sleep(1); // give thread time to enter synchronized function
echo 'Main thread done. Waiting...', PHP_EOL;

Expected result:
Main thread done. Waiting... (after a second and then waiting for interruption).

Actual result:
*emptiness*

In PHP 5 it did work.

As always, I'm not sure if this is "by design" or a bug.

@krakjoe
Copy link
Owner

krakjoe commented Sep 4, 2015

Don't use sleep, I think I might override the function so that it can't be used; Sleep is intended to make processes sleep, usleep is nicer, it is intended for threads, however it is still not the best method of forcing a thread to wait.

If you want reliable code, you should only ever wait for something.

The example is strange, in the real world, the task at hand is never wait forever, so I have to break the rules to make the code work.

<?php
class TestThread extends Thread {

    public function run() {
        $this->synchronized(function () {
            while (!$this->done)
                $this->wait();
        });
    }
}

$thread = new TestThread;
$thread->start();

echo 'Main thread done. Waiting...', PHP_EOL;

$thread->synchronized(function() use($thread){
        $thread->wait(3 * 1000000);
        $thread->done = true;
        $thread->notify();
});

    $thread->join();

You must use preconditions; it is not just the programmer that can explicitly notify the thread, it can be done implicitly (by calling join for example), or internals otherwise changing the state of the thread.

@krakjoe krakjoe closed this as completed Sep 4, 2015
@krakjoe
Copy link
Owner

krakjoe commented Sep 4, 2015

Here's some terrible example code, sometimes, it will hang forever and other times will output what seem like impossible results:

<?php
class Test extends Thread {

    public function run() {
        $this->synchronized(function(){
            var_dump($this->wait());
        });
    }
}

$test = new Test();
$test->start();
$test->join();
?>

If it hangs, kill it, if it doesn't it will/might output:

bool(true)

From the call to wait which should only return true if notification was received, but that doesn't make sense, because the main context never notified it, but internals did when you called join.

I don't know if that makes it clearer, or more confusing, I hope clearer ...

Only ever wait for something ...

@pfofi
Copy link
Author

pfofi commented Sep 4, 2015

Wow, thanks for the effort, Joe!

Of course my example is not being used in the real world. It's crap. Nobody would use such a thing (I hope). I was just trying to demonstrate the issue in the shortest possible form and I completely agree that sleep() should virtually never be used.
I actually only use it for testing purposes.

But I'm still concerned how synchronized() is implemented now. Does it really lock all the other threads when called?

Slightly altered example:

class TestThread extends Thread
{
    public function run()
    {
        self::synchronized(function () {
            for (;;); // lock indefinitely (bad idea, I know)
        });
    }
}

$thread = new TestThread;
$thread->start();
for ($i = 0; $i < 10000000; ++$i); // doing some heavy lifting
echo 'Main thread done. Waiting...', PHP_EOL;

Edit: Okay, I did another test and other threads don't seem to be affected.

class CounterThread extends Thread
{
    public function run()
    {
        for ($i = 0;;) {
            echo self::getThreadId(), ' ', ++$i, PHP_EOL;
            usleep(1000000);
        }
    }
}

class TestThread extends Thread
{
    public function run()
    {
        self::synchronized(function () {
            echo 'Entering endless lock...', PHP_EOL;
            usleep(PHP_INT_MAX);
        });
    }
}

$counter = new CounterThread;
$counter->start();

$test = new TestThread;
$test->start();

Conclusion: Only the main thread is affected sometimes. So it's by design I guess.

@krakjoe
Copy link
Owner

krakjoe commented Sep 4, 2015

It works the same as it did before, it acquires a mutex, only in v3, the same mutex is used for state and properties too, which makes it much more powerful and predictable. In v2, there were many mutex and condition used for synchronization, in v3 there is a single monitor (condition, mutex and state) for the each object.

@krakjoe
Copy link
Owner

krakjoe commented Sep 4, 2015

There is still a problem in your code: usleep is intended for threads, but in no way is it designed for synchronization. While it's effective, in that it will make only a single thread sleep, it does not leave the thread in a receptive state, even for testing it's a bad idea to use it. Threaded::wait provides the same functionality (you can provide timeout), but leaves the thread in a receptive (to notification) state.

@pfofi
Copy link
Author

pfofi commented Sep 4, 2015

usleep is intended for threads, but in no way is it designed for synchronization.

Yes, absolutely, I agree 100%.

it does not leave the thread in a receptive state

That was my intention. wait() would not show what I wanted you to see :)

@chibisuke
Copy link
Contributor

chibisuke commented Sep 5, 2015 via email

@pfofi
Copy link
Author

pfofi commented Sep 5, 2015

I just checked by appending flush() and it did not change the outcome.

However, I just realized my first example does sometimes output the expected result. It's a race condition apparently.

Here are two other examples to demonstrate the issue:

2 cores @ 100%:

class TestThread extends Thread
{
    public function run()
    {
        for(;;);
    }
}

$thread = new TestThread;
$thread->start();

for(;;);

Only 1 core @ 100%:

class TestThread extends Thread
{
    public function run()
    {
        self::synchronized(function () {
            for(;;);
        });
    }
}

$thread = new TestThread;
$thread->start();

for(;;);

Notice how the main thread is not touching the threaded object after start(). Is this really how it should behave?

@krakjoe
Copy link
Owner

krakjoe commented Sep 5, 2015

I'm not sure what you are trying to show me, both of those examples consume 2 cores, because that's how you wrote it :s

I'm not sure what you are confused about either ...

@pfofi
Copy link
Author

pfofi commented Sep 5, 2015

OK, thanks for checking. It's a machine-dependent issue then. I'll check a few options.

@krakjoe
Copy link
Owner

krakjoe commented Sep 5, 2015

If you can provide me access to a machine where those examples behave as you suggest, that'd be good ... do you have a stackoverflow account ?

@cottton
Copy link

cottton commented Sep 5, 2015

@krakjoe
Are we still able to use ::wait(timeout) in v3?
from your example:

class Test extends Thread {
    public $timeout;
    public function run() {
        $this->synchronized(function(){
            var_dump($this->wait($this->timeout));
        });
    }
}

$test = new Test();
$test->timeout = 2000000;
$test->start();
$test->join();

@pfofi
Copy link
Author

pfofi commented Sep 5, 2015

access to a machine where those examples behave as you suggest

Sorry, I can't provide direct access. But I think the problem is virtualisation.
I always experiment on a Debian installation in VirtualBox. To be sure I made a clean install of a minimalist Debian 8.1 (just Standard System Utilities + gcc, make and autoconf) - same problem here. Then I tested on bare metal and there my two examples ran fine - both consume 2 cores as expected.

@krakjoe
Copy link
Owner

krakjoe commented Sep 5, 2015

@cotton yes

@pfofi if you can use vagrant to setup a machine where it occurs, I can take a look ... However, there is no guarantee from the OS that the threads will be scheduled as we expect, for all kinds of reasons that might not happen, and it might look like you are not taking up as much CPU as you would like/need too. I would expect a virtual machine to be particularly vulnerable to this kind of thing, but I've never seen it happening, so if it were reproducible it would be worth looking at.

@pfofi
Copy link
Author

pfofi commented Sep 7, 2015

I think I got to the bottom of this. The issue should more precisely be called "start() does not reliably return".

I assume start() cannot return if the thread is holding a lock. Why this occurs so frequently in a VM is a mystery to me. I experienced this using hashicorp/precise64 + PHP 7.0.0 RC2 and b724d37 in approximately 50% of all starts.

When the thread is properly synchronized before it enters an unresponsive state, the problem does not exist.

2 cores consumed, always:

class TestThread extends Thread
{
    public $ready = false;

    public function run()
    {
        self::synchronized(function () {
            while (!$this->ready) {
                self::wait();
            }

            for (;;);
        });
    }
}

$thread = new TestThread;
$thread->start();
$thread->synchronized(function () use ($thread) {
    $thread->ready = true;
    $thread->notify();
});

for (;;);

The question is now, should the programmer be able to assume that start() returns, no matter what the thread is doing in run()?

By the way, using b724d37, I see you have disabled sleep() and usleep(). Is time_nanosleep() a valid option (not for synchronization of course)?

@krakjoe
Copy link
Owner

krakjoe commented Sep 7, 2015

There isn't really an obvious pathway to that, if you look at the code for pthreads_start, it waits for a flag that is set before control enters zend to execute user code. Now, you may not actually see that in user code, for many reasons.

Start should indeed return no matter what the user is doing.

I'll have a go at reproducing with the setup you mention ... are you any good with gdb and can you use stackoverflow chat for comms ?

@krakjoe krakjoe reopened this Sep 7, 2015
@krakjoe
Copy link
Owner

krakjoe commented Sep 7, 2015

I reproduced it ...

@krakjoe krakjoe changed the title Thread::synchronized() locks main thread start does not return reliably Sep 7, 2015
krakjoe added a commit that referenced this issue Sep 7, 2015
@krakjoe
Copy link
Owner

krakjoe commented Sep 7, 2015

Feedback when you can ... off out this evening, but have a play with that and see if better ?

Just to note, I'm certain of the correctness of the original code, but when you search various bug trackers for these kinds of symptoms it becomes apparent that there are versions of glibc etc out in the wild that can have strange behaviour.

krakjoe added a commit that referenced this issue Sep 7, 2015
@krakjoe krakjoe closed this as completed in b889340 Sep 8, 2015
@krakjoe
Copy link
Owner

krakjoe commented Sep 8, 2015

That should do it ... thanks for all the effort it took to debug the problem @pfofi, can you confirm everything is okay ?

@pfofi
Copy link
Author

pfofi commented Sep 8, 2015

Confirmed, all good :)

@krakjoe
Copy link
Owner

krakjoe commented Sep 8, 2015

Awesome, thanks again for taking the time to get to the bottom of it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants