Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use soTimeout to actively timeout NIO connections [INT-4450] #8392

Closed
spring-operator opened this issue Apr 12, 2018 · 5 comments
Closed
Assignees
Labels
in: TCP/UDP status: invalid Not reproducable or not relevant to the current state of the project

Comments

@spring-operator
Copy link
Contributor

Michel Jung opened INT-4450 and commented

Update: Save your time and read my comment first :)

I have an application with long-running TCP NIO connections.

Assume a client silently drops, e.g. because of loss of power. On the server side, the TCP connection will stay open for 15min until "Connection timed out".

Since this is too long for my use case, I tried setting AbstractConnectionFactory.setSoTimeout(int) to 10s. This value is then used for Socket.setSoTimeout here:

The JavaDoc of Socket.setSoTimeout reads:

[...] With this option set to a non-zero timeout, a read() call on the InputStream associated with this Socket will block for only this amount of time. If the timeout expires, a java.net.SocketTimeoutException is raised, though the Socket is still valid. [...]

So that's exactly what I want.

Next thing that happens is, my client gets dropped every 10s. How so? Well, soTimeout is also used to timeout active connections that haven't sent anything within that time:

else if (this.soTimeout > 0) {
TcpNioConnection connection = connections.get(channel);
if (now - connection.getLastRead() >= this.soTimeout) {
/*
* For client connections, we have to wait for 2 timeouts if the last
* send was within the current timeout.
*/
if (!connection.isServer() &&
now - connection.getLastSend() < this.soTimeout &&
now - connection.getLastRead() < this.soTimeout * 2) {
if (logger.isDebugEnabled()) {
logger.debug("Skipping a connection timeout because we have a recent send " +
connection.getConnectionId());
}
}
else {
if (logger.isWarnEnabled()) {
logger.warn("Timing out TcpNioConnection " + connection.getConnectionId());
}
SocketTimeoutException exception = new SocketTimeoutException("Timing out connection");
connection.publishConnectionExceptionEvent(exception);
connection.timeout();
connection.sendExceptionToListener(exception);
}
}
}

To my understanding, these are two different cases that should be configured differently. With the current implementation, the only way I see to achieve what I need is to implement a PING-PONG mechanism in both, the client and the server. That seems totally unnecessary though since it's TCP.

Am I missing something or should this be fixed in spring-integration?


Affects: 5.0.4

@spring-operator
Copy link
Contributor Author

Michel Jung commented

I just realized that Socket.setSoTimeout does not do what I want, so Spring Integration's use of soTimeout is consistent.
So this issue is invalid and can be closed.

Any way to reduce the TCP timeout when there's not ACK from the other side?

@spring-operator
Copy link
Contributor Author

Gary Russell commented

NIO doesn't have a native soTimeout (unlike Socket ).

I am not sure why it is not doing what you expect. With NIO, we use the soTimeout to approximate the Socket behavior. In the TcpNioServerConnectionFactory we use the soTimeout on the selector.select(). We then invoke the code you linked to processNioSelects.

Therein, if selectionCount == 0 (no events) and we are a server-side connection, and the soTimeout has been exceeded since the last read, we close the connection.

Now, it's an approximation; with a timeout of 10 seconds, it might take up to 19.999 seconds to timeout since the last read is likely to have occurred during the previous select so it might take 2 cycles to detect the timeout.

On the client side, we look at the last send time, and explicitly wait for 2x the timeout since we might be expecting a reply; but I assume you are on the server side.

If you are not seeing this behavior, a DEBUG log showing the unexpected behavior would be useful.

We have a test case for an NIO read timeout here.

@spring-operator
Copy link
Contributor Author

Gary Russell commented

I just wrote a quick app to verify...

@SpringBootApplication
public class Int4450Application implements ApplicationListener<TcpConnectionCloseEvent> {

	public static void main(String[] args) {
		SpringApplication.run(Int4450Application.class, args).close();
	}

	@Override
	public void onApplicationEvent(TcpConnectionCloseEvent event) {
		System.out.println(event);
	}

	@Bean
	public ApplicationRunner runner() {
		return args -> {
			Socket socket = SocketFactory.getDefault().createSocket("localhost", 1234);
			System.out.println("sending foo");
			socket.getOutputStream().write("foo\r\n".getBytes());
			Thread.sleep(30_000);
			socket.close();
		};
	}

	@Bean
	public IntegrationFlow flow(AbstractServerConnectionFactory connectionFactory) {
		return IntegrationFlows.from(Tcp.inboundAdapter(connectionFactory))
				.transform(Transformers.objectToString())
				.handle(System.out::println)
				.get();
	}

	@Bean
	public AbstractServerConnectionFactory cf() {
		TcpNioServerConnectionFactory cf = new TcpNioServerConnectionFactory(1234);
		cf.setSoTimeout(10_000);
		return cf;
	}

}

and

2018-04-12 13:12:35.264  INFO 52939 --- [pool-1-thread-1] .s.i.i.t.c.TcpNioServerConnectionFactory : cf, port=1234 Listening
sending foo
GenericMessage [payload=foo, headers={ip_tcp_remotePort=57547, ip_connectionId=localhost:57547:1234:4185fb1e-a368-4996-9959-02075645ee73, ip_localInetAddress=/127.0.0.1, ip_address=127.0.0.1, id=5407ef79-f6a0-8eb3-3e69-84e81b96382a, ip_hostname=localhost, timestamp=1523553155280}]
2018-04-12 13:12:45.278  WARN 52939 --- [pool-1-thread-1] .s.i.i.t.c.TcpNioServerConnectionFactory : Timing out TcpNioConnection localhost:57547:1234:4185fb1e-a368-4996-9959-02075645ee73
TcpConnectionCloseEvent [source=TcpNioConnection:localhost:57547:1234:4185fb1e-a368-4996-9959-02075645ee73], [factory=cf, connectionId=localhost:57547:1234:4185fb1e-a368-4996-9959-02075645ee73] **CLOSED**

@spring-operator
Copy link
Contributor Author

Gary Russell commented

After re-reading your description, I think you don't want to timeout the socket after 10s but you want to detect that the peer has silently closed the socket (powered off) or a network element (router, firewall etc) has done so.

The solution for that is soKeepAlive - the TCP stack will do the ping/pong for you and detect if the peer socket has gone away.

See the HOW TO

This keep alive time is set at the operating system/tcp stack level and is not configurable per socket. It is often quite a long time by default (2 hours on linux IIRC) but can be reduced.

Socket keepalives are done at a lower layer in the stack and there is no application interaction, which solves your

I need is to implement a PING-PONG mechanism in both, the client and the server. That seems totally unnecessary

comment.

@spring-operator
Copy link
Contributor Author

Gary Russell commented

Closing due to inactivity; if SO_KEEPALIVE doesn't solve your problem, please comment and we'll reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: TCP/UDP status: invalid Not reproducable or not relevant to the current state of the project
Projects
None yet
Development

No branches or pull requests

2 participants