Description
Problem
We have a Fluentd instance running behind an Amazon load balancer. The load balancer has an idle timeout setting (60s by default; for testing I set it to 5s). As per the docs: "If no data has been sent or received by the time that the idle timeout period elapses, the load balancer closes the connection."
When sending data, the FluentSender tries to reuse the connection if possible, and reconnects in case of a socket.error. If we keep sending events one after another (time between sends < timeout), everything works fine. But consider the following example:
- Send event no. 1
- Wait for timeout (e.g. 10s)
- Send event no. 2
Since the connection has been closed by the load balancer, I would expect to receive a socket error when sending event no. 2. But the FluentSender (or specifically, the socket object) does not seem to register that the connection was dropped. There is no error when sending event no. 2 (emit will return True), but the event never reaches the destination.
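A possible explanation, assuming the load balancer closes the idle connection with a normal TCP FIN: the local kernel still accepts the next write on such a socket, so only a read reveals that the peer has gone away. Below is a minimal sketch of such a check. The helper name `peer_closed` is hypothetical and is not part of fluent-logger; if the balancer drops the connection silently (no FIN or RST), this check will not notice it.

```python
import select
import socket


def peer_closed(sock):
    """Return True if the peer has closed or reset the connection.

    Hypothetical helper: it only detects a close that the peer actually
    signalled (FIN or RST), not a connection that was dropped silently.
    """
    # Poll without blocking: a pending FIN makes the socket readable.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending, so no close has been seen so far
    try:
        # MSG_PEEK leaves real data in the buffer; b'' means end of stream.
        return sock.recv(1, socket.MSG_PEEK) == b''
    except socket.error:
        return True  # e.g. connection reset by peer
```

A sender could call `peer_closed(sock)` right before reusing a connection and reconnect when it returns True.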
Debugging info
This was tested on Python 2.7.6 and Python 3.4.3.
I have prepared some examples to demonstrate the issue. For simplicity, we are using the socket directly, without using the FluentSender.
Example 1
In this example we send 1 event, then wait for timeout, and send a few more events. There is no socket error, even though the connection was dropped.
Code:
import socket
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(3.0)
sock.connect(('myhost', 24224))

# send event no. 1
print('Sending event 1')
sock.sendall(b'event 1')

# wait for the load balancer's idle timeout (5s) to expire
print('Waiting for timeout...')
time.sleep(10)

# send more events (events 2 through 10)
for i in range(2, 11):
    print('Sending event {}'.format(i))
    # encode() so the call also works on Python 3, where sendall() needs bytes
    sock.sendall('event {}'.format(i).encode())
Output:
Sending event 1
Waiting for timeout...
Sending event 2
Sending event 3
Sending event 4
Sending event 5
Sending event 6
Sending event 7
Sending event 8
Sending event 9
Sending event 10
Example 2
This example is similar to Example 1, but with a short delay between the additional events. In this case we do get a socket error, but only when trying to send the 4th event after the timeout.
Code:
# ... snip
# send more events
for i in range(2, 11):
    print('Sending event {}'.format(i))
    sock.sendall('event {}'.format(i).encode())
    # add a short delay before sending the next event
    time.sleep(0.1)
Output:
Sending event 1
Waiting for timeout...
Sending event 2
Sending event 3
Sending event 4
Sending event 5
Traceback (most recent call last):
  File "socktst.py", line 18, in <module>
    sock.sendall('event {}'.format(i))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
socket.error: [Errno 32] Broken pipe
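The delayed error can probably be reproduced without a load balancer, assuming the balancer behaves like a peer that reads the data and then closes its end. In the sketch below a local server thread stands in for the balancer; `first_failing_send` and `fake_balancer` are hypothetical names. The first send after the close is still accepted by the local kernel, and an error only shows up on a later send, once the peer's reset has arrived. How many sends succeed before that depends on timing and network latency.

```python
import socket
import threading
import time


def first_failing_send(max_sends=50, delay=0.1):
    """Return the 1-based index of the first send that raises after the
    peer closed the connection, or None if no send failed."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    closed = threading.Event()

    def fake_balancer():
        conn, _ = listener.accept()
        conn.recv(1024)  # read event 1 so close() sends a FIN, not a reset
        conn.close()     # stand-in for the idle timeout
        closed.set()

    thread = threading.Thread(target=fake_balancer)
    thread.daemon = True
    thread.start()

    sock = socket.create_connection(listener.getsockname())
    try:
        sock.sendall(b'event 1')
        closed.wait(5)
        time.sleep(delay)  # give the FIN time to reach the client
        for n in range(1, max_sends + 1):
            try:
                sock.sendall(b'event')
            except socket.error:
                return n
            time.sleep(delay)  # short delay, as in Example 2
        return None
    finally:
        sock.close()
        thread.join(5)
        listener.close()


if __name__ == '__main__':
    print('first failing send: {}'.format(first_failing_send()))
```

This would also be consistent with Example 1: without a delay between sends, all events may be written before the reset comes back, so no error is ever raised.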
Solution?
One way of solving this would be to add an option to always reconnect (False by default) instead of reusing the connection. The problem might also be caused by the way the load balancer drops connections, but I'm not sure how to check this. Any help on this is appreciated.
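A rough sketch of what such an option could look like. This is not the FluentSender implementation; the class `SimpleSender` and the `always_reconnect` parameter are hypothetical and only illustrate the idea.

```python
import socket


class SimpleSender(object):
    """Hypothetical sender used to illustrate an always_reconnect option."""

    def __init__(self, host, port, timeout=3.0, always_reconnect=False):
        self.address = (host, port)
        self.timeout = timeout
        self.always_reconnect = always_reconnect
        self.sock = None

    def close(self):
        if self.sock is not None:
            try:
                self.sock.close()
            finally:
                self.sock = None

    def send(self, payload):
        """Send bytes; return True on success, False on a socket error."""
        if self.always_reconnect:
            self.close()  # never reuse a possibly stale connection
        try:
            if self.sock is None:
                self.sock = socket.create_connection(self.address,
                                                     self.timeout)
            self.sock.sendall(payload)
            return True
        except socket.error:
            self.close()  # force a fresh connection on the next send
            return False
```

The trade-off is one TCP handshake per event, so this is probably only reasonable for low event rates, which is why the option would default to False.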