New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHP Fatal error: Uncaught ErrorException: fwrite(): send of 19 bytes failed with errno=32 #424

Closed
Mirocow opened this Issue Jul 17, 2016 · 21 comments

Comments

Projects
None yet
@Mirocow
Copy link

Mirocow commented Jul 17, 2016

Sometimes after send basic_nack

$message->info['channel']->basic_nack($message->info['delivery_tag']);

Error:

PHP Fatal error:  Uncaught ErrorException: fwrite(): send of 19 bytes failed with errno=32 Broken pipe in /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php:281
Stack trace:
#0 [internal function]: PhpAmqpLib\Wire\IO\StreamIO->error_handler(8, 'fwrite(): send ...', '/usr/local/var/...', 281, Array)
#1 /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php(281): fwrite(Resource id #99, '\x01\x00\x01\x00\x00\x00\v\x00\x14\x00(\x00\x00\x00\x00...', 8192)
#2 /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php(327): PhpAmqpLib\Wire\IO\StreamIO->write('\x01\x00\x01\x00\x00\x00\v\x00\x14\x00(\x00\x00\x00\x00...')
#3 /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php(448): PhpAmqpLib\Connection\AbstractConnection->write('\x01\x00\x01\x00\x00\x00\v\x00\x14\x00(\x00\x00\x00\x00...')
#4 /usr/local/var/www/1vse.loc/yotalot/vendor in /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php on line 281
PHP Stack trace:
PHP   1. yii\base\ErrorHandler->handleException($exception = class ErrorException { protected $message = 'fwrite(): send of 21 bytes failed with errno=32 Broken pipe'; private ${Exception}:string = 'ErrorException: fwrite(): send of 21 bytes failed with errno=32 Broken pipe in /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php:281\nStack trace:\n#0 [internal function]: PhpAmqpLib\\Wire\\IO\\StreamIO->error_handler(8, \'fwrite(): send ...\', \'/usr/local/var/...\', 281, Array)\n#1 /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php(281): fwrite(Resource id #99, \'\\x01\\x00\\x01\\x00\\x00\\x00\\r\\x00<\\x00P\\...'; protected $code = 0; protected $file = '/usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php'; protected $line = 281; private ${Exception}:trace = array (0 => array (...), 1 => array (...), 2 => array (...), 3 => array (...), 4 => array (...), 5 => array (...), 6 => array (...), 7 => array (...), 8 => array (...), 9 => array (...), 10 => array (...), 11 => array (...), 12 => array (...), 13 => array (...), 14 => array (...), 15 => array (...), 16 => array (...), 17 => array (...), 18 => array (...), 19 => array (...), 20 => array (...), 21 => array (...), 22 => array (...), 23 => array (...), 24 => array (...), 25 => array (...), 26 => array (...), 27 => array (...)); private ${Exception}:previous = NULL; protected $severity = 8; public $xdebug_message = '\nErrorException: fwrite(): send of 21 bytes failed with errno=32 Broken pipe in /usr/local/var/www/1vse.loc/yotalot/vendor/videlalvaro/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php on line 281\n\nCall Stack:\n    0.0047     372344   1. {main}() /usr/local/var/www/1vse.loc/yotalot/yii:0\n    0.1164    2978784   2. yii\\base\\Application->run() /usr/local/var/www/1vse.loc/yotalot/yii:31\n    0.1168    3016944   3. yii\\console\\Application->handleRequest(class yii\\console\\Request) /usr/local/var/www/1vse.loc...' }) /usr/local/var/www/1vse.loc/yotalot/vendor/yiisoft/yii2/base/ErrorHandler.php:0
@michaelklishin

This comment has been minimized.

Copy link
Collaborator

michaelklishin commented Jul 18, 2016

What's in server logs?

@adamlc

This comment has been minimized.

Copy link

adamlc commented Jul 19, 2016

Weirdly im getting exactly the same. Its been fine for ages. @Mirocow are you on AWS?

@michaelklishin the server logs dont really reveal anything useful :(

PHP Fatal error:  Uncaught exception 'ErrorException' with message 'fwrite(): send of 19 bytes failed with errno=32 Broken pipe' in /root/worker/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php:281
Stack trace:
#0 [internal function]: PhpAmqpLib\Wire\IO\StreamIO->error_handler(8, 'fwrite(): send ...', '/root/worker/ve...', 281, Array)
#1 /root/worker/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php(281): fwrite(Resource id #616, '\x01\x00\x01\x00\x00\x00\v\x00\x14\x00(\x00\x00\x00\x00...', 8192)
#2 /root/worker/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php(327): PhpAmqpLib\Wire\IO\StreamIO->write('\x01\x00\x01\x00\x00\x00\v\x00\x14\x00(\x00\x00\x00\x00...')
#3 /root/worker/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Connection/AbstractConnection.php(448): PhpAmqpLib\Connection\AbstractConnection->write('\x01\x00\x01\x00\x00\x00\v\x00\x14\x00(\x00\x00\x00\x00...')
#4 /root/worker/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Channel/AbstractChannel.php(230): PhpAmqpLib\Connect in /root/worker/vendor/php-amqplib/php-amqplib/PhpAmqpLib/Wire/IO/StreamIO.php on line 281
@adamlc

This comment has been minimized.

Copy link

adamlc commented Jul 20, 2016

I spent a little bit of time debugging this yesterday, I think I've figured it out. From what I can determine it looks like I had a blocking process that was taking longer than the heartbeat. RabbitMQ was then closing the channel (I can see this in the Management UI) hence the broken pipe error.

If im correct then I guess there needs to be a more elegant way of handling when the lib misses a heartbeat and the channel is closed.

@Mirocow

This comment has been minimized.

Copy link

Mirocow commented Jul 23, 2016

php

PHP 7.0.8 (cli) (built: Jun 23 2016 16:30:56) ( NTS )
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies
with Xdebug v2.4.0, Copyright (c) 2002-2016, by Derick Rethans

I check it on Mac OS and Debian OS too.

when i don`t use basic_nack or basic_ack process work perfect

@aldump

This comment has been minimized.

Copy link

aldump commented Aug 26, 2016

I have the same issue

@aldump

This comment has been minimized.

Copy link

aldump commented Aug 26, 2016

rabbitmq_1 | =ERROR REPORT==== 26-Aug-2016::11:59:17 ===
rabbitmq_1 | closing AMQP connection <0.22032.0> (172.18.0.2:58931 -> 172.18.0.4:5672):
rabbitmq_1 | {writer,send_failed,{error,timeout}}

here is log

@vlefort

This comment has been minimized.

Copy link

vlefort commented Sep 8, 2016

Same issue... Some information about resolution ? Many tickets on this problem :/

@keatis

This comment has been minimized.

Copy link

keatis commented Jan 31, 2017

$channel->basic_qos(0, 1, false);
after making new connection worked for us (set the second parameter, prefetch count, to 1).
We consume one message at a time, so you may fit this parameter to your needs.

UPD: in that situation we had 2K of new messages each 30min, and it took us between 0.1 and 30 seconds to process and ack/nack a message. We had 60 as a heartbeat. And we often stuck with php exception after processing ~250 messages.

Look in here also: #91

@Jekis

This comment has been minimized.

Copy link

Jekis commented Jan 31, 2017

Solution is to set hearbeat longer time than consumer could take, or set heartbeat to 0. For example, if your consumer could take 60 seconds set heartbeat to 61.

@andrefigueira

This comment has been minimized.

Copy link

andrefigueira commented Jul 20, 2017

Getting the same issue, going to try @Jekis solution to see if it helps.

@speller

This comment has been minimized.

Copy link

speller commented Sep 26, 2017

I have the same issue. I did nothing but is started to appear. Tried to rise up timeouts and heartbeat values up to maximum values, tried to set keepalive to true but without success. Server says only this in his log:

=INFO REPORT==== 26-Sep-2017::14:33:19 ===
closing AMQP connection <0.13219.456> (153.x.x.x:56468 -> 185.x.x.x:5672)

My message producer sends 4-6 messages to the server and fails with the exception as in the subject.

@michaelklishin

This comment has been minimized.

Copy link
Collaborator

michaelklishin commented Sep 26, 2017

{writer,send_failed,{error,timeout}}

means that a socket write on the RabbitMQ end timed out. There isn't much RabbitMQ or this client can do to avoid them.

closing AMQP connection <0.13219.456> (153.x.x.x:56468 -> 185.x.x.x:5672)

simply means that a client connection was closed (lines before might indicate if the close was server-initiated but all we have here is a single line).

Missed heartbeats are visibly logged by the server. If a consumer can block I/O activity for longer than a heartbeat timeout, the server will indeed consider the peer unavailable but it will log more than just

closing AMQP connection <0.13219.456> (153.x.x.x:56468 -> 185.x.x.x:5672)

In general there are all kinds of things that can result in failed socket writes and not all of them are under client library's control. @adamlc explained one scenario above, which comes down to the fundamental limitation to how this client does I/O and dispatches consumer operations. Plenty of other cases are probably just failed socket writes.

Java and .NET clients ignore all I/O errors when a connection is being closed (because they don't matter any more). If someone has evidence that this client doesn't, please file a new issue with more details.

@michaelklishin

This comment has been minimized.

Copy link
Collaborator

michaelklishin commented Sep 26, 2017

This thread seems to be a honeypot for all fwrite failures and now I'm being accused of "closing issues without providing a workaround". Below are some common scenarios that may lead to failed socket writes in this or any other RabbitMQ client, compiled from from several years of RabbitMQ mailing list discussions. Please investigate if each of them may be applicable to your case first,
before filing a new issue or leaving a comment saying that "I too see a stack trace that looks like this".

Missed (Client) Heartbeats

First common reason is missed heartbeats detected by RabbitMQ. When this happens, RabbitMQ will add a log entry about it and then close connection, per specification requirements.
Here's what a missed client heartbeat looks like in RabbitMQ logs:

2017-09-26 08:04:53.596 [warning] <0.2375.628> closing AMQP connection <0.2375.628> (127.0.0.1:54720 -> 127.0.0.1:5672):
missed heartbeats from client, timeout: 8s

If consumer operations take longer than the heartbeat timeout, without substantial changes to how this client does I/O and operation dispatch, increasing the heartbeat is the only workaround I can think of (I'm hardly a PHP expert but pretty familiar with multiple other clients as well as RabbitMQ itself). Disabling heartbeats entirely is possible but is not something I'd recommend, even
though this client does this by default as far as I can tell. If you disable heartbeats,
make sure your kernel is configured to use sensible TCP keepalive settings, which by default is not the case on all popular
Linux distributions :(

An Intermediary Closes "Inactive" TCP Connections

Second common reason: TCP connection is closed by an intermediary (e.g. a proxy or load balancer).
If you see the following in RabbitMQ log

2017-09-26 08:08:39.659 [warning] <0.23042.628> closing AMQP connection <0.23042.628> (127.0.0.1:54792 -> 127.0.0.1:5672, vhost: '/', user: 'guest'):
client unexpectedly closed TCP connection

it means that client's TCP connection was closed before AMQP 0-9-1 (this client's) connection was. Sometimes this is harmless and means that apps do not close connections before they terminate. Not very nice but has no functional downsides. Such log entries could also indicate a failing client application process (irrelevant in this thread), or, quite commonly, a proxy closing the connection. Proxies and load balancers have TCP connection inactivity timeouts, mentioned in the Heartbeats guide). They often range from 30s to 5m.
While heartbeats and TCP keepalives were not designed to work around unfortunate load balancer or proxy settings, by producing periodic network traffic they happen to do just that.

Other Connection Lifecycle Log Entries

Below entries are not necessarily related to failed socket writes but it's worth explaining them
because they are very useful in troubleshooting connectivity-related issues (and can be
misinterpreted).

If you see just the following in RabbitMQ logs:

closing AMQP connection <0.13219.456> (153.x.x.x:56468 -> 185.x.x.x:5672)

(without any mentions or heartbeats, unexpectedly closed TCP connection, connection errors, or any timed out socket writes on RabbitMQ's end), it means that a client connection was cleanly and successfully closed, and it was the application that initiated it.

This guy

{writer,send_failed,{error,timeout}}

means that RabbitMQ attempted to write to a socket but that operation timed out. If you see this
at roughly the same time as a client write failure, it means that a connection went bad but
heartbeats or TCP keepalives were not enabled and did not detect it quicker.

Other Possible Reasons? TCP Connections Can Fail.

In most other cases, a failed socket write is just that, a failed socket write. Network connections can fail or degrade. This client or RabbitMQ cannot avoid that. This is exactly why messaging protocols such as AMQP 0-9-1, STOMP, MQTT have introduced heartbeats, which in this client doesn't serve its purpose very well.

An Alternative with Unfortunate Up Defaults: TCP Keepalives

TCP keepalives can be (and were meant to be, if it wasn't for Linux defaults that were great in 1990s but not any more) used as an alternative.
They do not guarantee that your connection will never fail — the goal is still to detect such connections quicker — and thus you will still likely see failed socket writes.

Hopefully this explains what may be going on here and why this issue cannot be once and for all addressed by this client, although a functioning concurrent heartbeat implementation would help a lot, as it does in other clients.

@m1ch3lp3r3z

This comment has been minimized.

Copy link

m1ch3lp3r3z commented Oct 3, 2017

I was getting the same issue, calling basic_ack throws that fwrite connection error, not for all messages, but for most of them. It starts consuming messages just fine and after some time I start seeing those errors (in a PHP cli long running script). It's been working like a charm for 2 years until I recently updated 2.4 => 2.6/2.7.

The quick workaround (I guess not the ideal solution tho) is @keatis suggestion, setting prefetch_count to 1:

$channel->basic_qos(0, 1, false);

slower but safe.

@Lara-m

This comment has been minimized.

Copy link

Lara-m commented Oct 26, 2017

I solved it this way:
It seemed that even when I stopped all php services, this issue was still there. So I got a list of processes and noticed so many semi-orphan processes. Killing them all solved the issue.

tl; dr :

  • stop services.
  • ps axjf to make sure there are orphans.
  • killall -9 php (or whatever else is orphan, like supervisord)
  • restart services.
@flc1125

This comment has been minimized.

Copy link

flc1125 commented Dec 8, 2017

我的问题是@michaelklishin说的如下原因,现在正常了

An Intermediary Closes "Inactive" TCP Connections

Second common reason: TCP connection is closed by an intermediary** (e.g. a proxy or load balancer)**.
If you see the following in RabbitMQ log

@kushsharma

This comment has been minimized.

Copy link

kushsharma commented Jun 2, 2018

I had the same issue. Setting prefetch count to 1 is working for me as I don't care little performance tradeoff with stability.
$channel->basic_qos(0, 1, false);
syntax:
basic_qos($prefetch_size, $prefetch_count, $a_global);

@anapsix

This comment has been minimized.

Copy link

anapsix commented Aug 21, 2018

For the record.. had similar issue, suspecting AWS ELB
Environment: AWS, RabbitMQ 3.7.7 in K8s, AMQP exposed via two services: 1 - "LoadBalancer" type K8s Service (AWS Load Balancer provisioned by K8s), 2 - "ClusterIP" type.

Consumer connecting to Service 1 (LoadBalancer) have been getting frequent errors

fwrite(): send of 13 bytes failed with errno=32 Broken pipe

From what I can tell, heartbeat is not set explicitly in PHP, and should default to 60.
ELB is configured with Idle timeout: 90 seconds. Yet, problem remained..

Issue went away when consumer was configured to connect to Service 2 (ClusterIP), instead of connecting to Service 1 (LoadBalancer).

@marek-obuchowicz

This comment has been minimized.

Copy link
Contributor

marek-obuchowicz commented Sep 5, 2018

@anapsix did you try increasing ELB idle timeout to bigger value (like 5mins)?
AFAIK the default heartbeat time is 0, you might want to explicitly set it to sth. like 30s

@marek-obuchowicz

This comment has been minimized.

Copy link
Contributor

marek-obuchowicz commented Sep 21, 2018

@michaelklishin thanks for very detailed response. It helps a lot.

However I still have one case:

  • in rabbitmq server logs I see indeed client unexpectedly closed TCP connection
  • i am connecting through AWS ELB, with idle timeout set to 3600s
  • client crashes instantly after running, no more than one second since starting, so i don't suspect connection being closed by the loadbalancer
  • client recieves error: fwrite(): send of 49 bytes failed with errno=11

Looking on errno=11 for fwrite - E_AGAIN - do you have any suggestions how to workaround this problem? The error happens occasionally (but regularly), description of fwrite syscall says:

The file descriptor fd refers to a file other than a socket
and has been marked nonblocking (O_NONBLOCK), and the write
would block.  See open(2) for further details on the
O_NONBLOCK flag.

Do you have any tips/ideas?

@speller

This comment has been minimized.

Copy link

speller commented Oct 2, 2018

In my case the problem was in the type of the $heartbeat parameter of the AMQPStreamConnection. When I sent null instead of 0, I got few kinds of connection corruption error messages. After fixing my code to send int 0, the problem has gone.

I suggest the authors to add type checking of the input parameters of the AMQPStreamConnection constructor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment