Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Socket 'EPIPE' broken pipe handling #243

Closed
Devristo opened this issue Dec 10, 2013 · 4 comments
Closed

Socket 'EPIPE' broken pipe handling #243

Devristo opened this issue Dec 10, 2013 · 4 comments
Labels

Comments

@Devristo
Copy link

First of all, thanks for bringing cool stuff like this to php!

After a while my event loop takes 100% cpu, however it remains functional. I tried to debug the reason. I am not 100% sure but I think I have a plausible theory of what is causing it.

The Buffer class does not (and cannot) detect broken pipes when writing using fwrite. React keeps it in the loop's 'write' list indefinitely. Causing high cpu usage on my machine.

Here is an strace of the php process with high cpu. The event loop (libevent in my case) is still functioning, but it keeps running into the broken pipe.

17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2406) = 1
17:38:41 clock_gettime(CLOCK_MONOTONIC, {5221949, 315601823}) = 0
17:38:41 gettimeofday({1386693521, 503053}, NULL) = 0
17:38:41 recv(15, "", 4096, 0)          = 0
17:38:41 send(15, "\201~\2a{\"tag\":\"server-52a740b77da5f"..., 8192, 0) = -1 EPIPE (Broken pipe)
17:38:41 --- SIGPIPE (Broken pipe) @ 0 (o) ---
17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2405) = 1
17:38:41 clock_gettime(CLOCK_MONOTONIC, {5221949, 316353584}) = 0
17:38:41 gettimeofday({1386693521, 503805}, NULL) = 0
17:38:41 recv(15, "", 4096, 0)          = 0
17:38:41 send(15, "\201~\2a{\"tag\":\"server-52a740b77da5f"..., 8192, 0) = -1 EPIPE (Broken pipe)
17:38:41 --- SIGPIPE (Broken pipe) @ 0 (o) ---
17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2404) = 1
17:38:41 clock_gettime(CLOCK_MONOTONIC, {5221949, 317030692}) = 0
17:38:41 gettimeofday({1386693521, 504483}, NULL) = 0
17:38:41 recv(15, "", 4096, 0)          = 0
17:38:41 send(15, "\201~\2a{\"tag\":\"server-52a740b77da5f"..., 8192, 0) = -1 EPIPE (Broken pipe)
17:38:41 --- SIGPIPE (Broken pipe) @ 0 (o) ---
17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2404) = 1
17:38:41 clock_gettime(CLOCK_MONOTONIC, {5221949, 317708032}) = 0
17:38:41 gettimeofday({1386693521, 505159}, NULL) = 0
17:38:41 recv(15, "", 4096, 0)          = 0
17:38:41 send(15, "\201~\2a{\"tag\":\"server-52a740b77da5f"..., 8192, 0) = -1 EPIPE (Broken pipe)
17:38:41 --- SIGPIPE (Broken pipe) @ 0 (o) ---
17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2403) = 1
17:38:41 clock_gettime(CLOCK_MONOTONIC, {5221949, 318382954}) = 0
17:38:41 gettimeofday({1386693521, 505833}, NULL) = 0
17:38:41 recv(15, "", 4096, 0)          = 0
17:38:41 send(15, "\201~\2a{\"tag\":\"server-52a740b77da5f"..., 8192, 0) = -1 EPIPE (Broken pipe)
17:38:41 --- SIGPIPE (Broken pipe) @ 0 (o) ---
17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2402) = 1
17:38:41 clock_gettime(CLOCK_MONOTONIC, {5221949, 319065790}) = 0
17:38:41 gettimeofday({1386693521, 506519}, NULL) = 0
17:38:41 recv(15, "", 4096, 0)          = 0
17:38:41 send(15, "\201~\2a{\"tag\":\"server-52a740b77da5f"..., 8192, 0) = -1 EPIPE (Broken pipe)
17:38:41 --- SIGPIPE (Broken pipe) @ 0 (o) ---
17:38:41 epoll_wait(5, {{EPOLLIN|EPOLLOUT|EPOLLHUP, {u32=15, u64=15}}}, 32, 2402) = 1

I think I ran into the following php 'bug':

http://www.php.net/manual/en/function.fwrite.php#96951
https://bugs.php.net/bug.php?id=39598

The post suggest that you cannot depend on the return value of fwrite to detect errors. The first link suggests testing whether the value is 0. Although this not a perfect solution, since 0 is also returned when the socket's internal send buffer is full.

Perhaps a fix could be that the buffer emits an error out after a few consecutive 0-byte writes?

@igorw
Copy link
Contributor

igorw commented Dec 10, 2013

Can you share some more details such as: operating system + version, PHP version, libevent version, libevent ext version, minimal testing script?

@Devristo
Copy link
Author

OS is Ubuntu Server 12.04
PHP 5.5.6-1+debphp.org~precise+2 (cli) (built: Nov 21 2013 14:39:07)
libevent version => 2.0.16-stable

I have installed libevent ext using:
pecl install channel://pecl.php.net/libevent-0.1.0

I dont have a minimal testing script at the time, I will look around to see how to emulate a broken pipe error.

@Devristo
Copy link
Author

I wasnt able to reproduce it, except once while actually yanking out my network connection. I am not sure what causes the problem, my setup or the libevent loop used. But the strace output suggests that broken streams remain in the event loop for some reason.

However since I made two minor changes in Connection.php and Buffer.php I haven't ran into the problems on my production server.

I know you probably cant do much without reproducing it. But if the issue pops up again here is what I changed:

Buffer.php (This solved the recurring epoll_wait on a EPIPE socket)

    public function handleWrite()
    {
        if (!is_resource($this->stream) || ('generic_socket' === $this->meta['stream_type'] && feof($this->stream))) {
            $this->emit('error', array(new \RuntimeException('Tried to write to closed or invalid stream.')));

            return;
        }

        set_error_handler(array($this, 'errorHandler'));

        $sent = fwrite($this->stream, $this->data);

        restore_error_handler();

        if (false === $sent || 0 === $sent) {  // This will break the connection if socket's OS send buffer is full, retry count would be more fancy
            $this->emit('error', array(new \ErrorException(
                $this->lastError['message'],
                0,
                $this->lastError['number'],
                $this->lastError['file'],
                $this->lastError['line']
            )));

            return;
        }

        $len = strlen($this->data);
        if ($len >= $this->softLimit && $len - $sent < $this->softLimit) {
            $this->emit('drain');
        }

        $this->data = (string) substr($this->data, $sent);

        if (0 === strlen($this->data)) {
            $this->loop->removeWriteStream($this->stream);
            $this->listening = false;

            $this->emit('full-drain');
        }
    }

Later I noticed I had the same problem with sockets that I never write too, only read. The strace showed high-pace repeated 0 bytes recv from the same socket raising CPU usage as well. Since there the epoll returns the socket, there should either be bytes read or the socket has been closed. Therefore I changed the Connection.php too:

Connection.php

    public function handleData($stream)
    {
        if (!is_resource($stream) || feof($stream)) {
            $this->end();
                return;
        }

        $data = fread($stream, $this->bufferSize);
        if ('' === $data || false === $data) {
            $this->end();
        } else {
            $this->onData($data);
        }
    }

Above can be simplified, I left it as is since it worked for me. I have fread in it, I know there is some discussion whether to use stream_socket_recv or this, I guess that would work just as well.

@Devristo
Copy link
Author

Seems this commit fixed the issue as well: (the addition of feof helps, the Buffer modification is not necessary)

424e873#diff-ff2311b1d39f83cb5ca5c77f10391d2c

Will close this issue myself.

Edit: seems that I used a way old version. Sorry for all the spamming. Thanks again for this great component!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants