Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection timeout not always enforced in NIO mode for SSL connections #719

Closed
mgrafl opened this issue Jan 12, 2022 · 3 comments
Closed
Assignees
Labels
Milestone

Comments

@mgrafl
Copy link

mgrafl commented Jan 12, 2022

RabbitMQ Java client version 5.14.0 connection creation can hang forever on defective socket in the following scenario:

  • ConnectionFactory configured with useNio(), useSslProtocol(), and non-zero timeouts,
  • socket channel that does not write any data.

Environment:

  • Java: OpenJDK 11
  • OS: Linux CentOS 7 or Windows 10
  • RabbitMQ server: any

The defective socket channel occasionally happened in our production environment (possibly caused a by firewall or proxy).
I was able to reproduce it locally with a socket channel where the write operation always returns 0 bytes written or simply blocks:

int amqpPort = 5671 // assumes RabbitMQ server running on localhost;
int amqpProxyPort = 15671;

int connectionTimeout = 5_000;
int handshakeTimeout = 2_000;

// Simulate a connection via that doesn't send anything 
// using https://github.com/NetCrusherOrg/netcrusher-java
TransformFilterFactory outgoingTransformFilterFactory = clientAddress -> 
	buf -> {
		try {
			TimeUnit.SECONDS.sleep(20);
		} catch (InterruptedException e) {}
	};

try (var reactor = new NioReactor(); 
		var tcpProxy = TcpCrusherBuilder.builder()
			.withReactor(reactor)
			.withBindAddress(new InetSocketAddress(amqpProxyPort))
			.withConnectAddress("localhost", amqpPort)
			.withOutgoingTransformFilterFactory(outgoingTransformFilterFactory)
			.build()) {

	tcpProxy.open();

	ConnectionFactory factory = new ConnectionFactory();
	factory.setHost("localhost");
	factory.setPort(amqpProxyPort);

	factory.useSslProtocol();
	factory.useNio();

	factory.setConnectionTimeout(connectionTimeout);
	factory.setHandshakeTimeout(handshakeTimeout);

	ExecutorService executorService = Executors.newSingleThreadExecutor();
	Future<Connection> futureConnection = executorService
		.submit(() -> factory.newConnection());
	
	futureConnection.get(15, TimeUnit.SECONDS);
	// Expected: Timeout from RabbitMQ after 5+2 seconds
	// Actual: Timeout from Future after 15 seconds
}

On initial connection creation, the workaround is to have a separate thread that interrupts connection creation after a timeout.
But for auto-recovery there is no workaround.

@acogoluegnes acogoluegnes self-assigned this Jan 12, 2022
@acogoluegnes
Copy link
Contributor

Thanks for the snippet to reproduce the issue. In this case the program blocks on ReadableByteChannel#read(ByteBuffer) during the TLS handshake. The TLS is done in blocking mode, so there's nothing we can do, as the read method does not honor the SO_TIMEOUT option.

A solution would be to refactor the TLS handshake in non-blocking mode, but this part is already a snake pit, so I'd prefer to avoid this. I'm investigating another solution that involves temporary channels for the handshake.

acogoluegnes added a commit that referenced this issue Jan 12, 2022
The handshake is in blocking mode and reading from a channel
in this mode does not honor so_timeout but using temporary
channels based on the socket input/output streams offers
a decent workaround for this stage.

Fixes #719

(cherry picked from commit 2751d46)
acogoluegnes added a commit that referenced this issue Jan 12, 2022
The handshake is in blocking mode and reading from a channel
in this mode does not honor so_timeout but using temporary
channels based on the socket input/output streams offers
a decent workaround for this stage.

Fixes #719

(cherry picked from commit 2751d46)

Conflicts:
	pom.xml
@acogoluegnes
Copy link
Contributor

@mgrafl A 5.14.1 snapshot is available with a fix, can you give it a try?

@mgrafl
Copy link
Author

mgrafl commented Jan 12, 2022

Thanks for resolving this so quickly! Looks good to me.

@acogoluegnes acogoluegnes added this to the 5.14.1 milestone Jan 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants