Memory leak related to TCP connection management with use of STOMP broker relay [SPR-11531] #16156

Closed
spring-projects-issues opened this issue Mar 8, 2014 · 20 comments
Labels: in: messaging, in: web, type: bug
Milestone: 4.0.3

Kevin Jordan opened SPR-11531 and commented

This may be more of a bug in Reactor, which Spring WebSocket uses, but it seems to build up a lot of connections: I see almost 800MB of heap used in just a few days. I don't have 65k connections active at any given time, so this shouldn't happen. In my configuration I'm using STOMP with RabbitMQ.

Configuration:

@Configuration
@EnableWebSocketMessageBroker
@EnableScheduling
@ComponentScan(basePackages = "com.practiware.desktop.server")
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer, SchedulingConfigurer {

	Logger logger = LoggerFactory.getLogger(this.getClass());

	private ThreadPoolTaskScheduler scheduler;

	private EntityManager em;

	@Override
	public void configureClientInboundChannel(ChannelRegistration registration) {
		registration.setInterceptors(new ChannelInterceptor() {

			@Override
			public Message<?> postReceive(Message<?> message,
										  MessageChannel channel) {
				WebSocketConfig.this.logger.info("Post-receive on channel of type: " + channel.getClass().getName());
				WebSocketConfig.this.logger.info("Headers: " + message.getHeaders().toString());
				WebSocketConfig.this.logger.info("Payload: " + String.valueOf(message.getPayload()));
				return message;
			}

			@Override
			public void postSend(Message<?> message,
								 MessageChannel channel,
								 boolean sent) {
				WebSocketConfig.this.logger.info("Post-send on channel of type: " + channel.getClass().getName());
				WebSocketConfig.this.logger.info("Headers: " + message.getHeaders().toString());
				WebSocketConfig.this.logger.info("Payload: " + String.valueOf(message.getPayload()));
			}

			@Override
			public boolean preReceive(MessageChannel channel) {
				WebSocketConfig.this.logger.info("Pre-receive on channel of type: " + channel.getClass().getName());
				return true;
			}

			@SuppressWarnings("unchecked")
			@Override
			public Message<?> preSend(Message<?> message,
									  MessageChannel channel) {
				try {
					WebSocketConfig.this.logger.info("Pre-send on channel of type: " + channel.getClass().getName());
					WebSocketConfig.this.logger.info("Headers: " + message.getHeaders().toString());
					WebSocketConfig.this.logger.info("Payload: " + String.valueOf(message.getPayload()));
					/* Security checks for the current user removed */
				} catch (Exception e) {
					WebSocketConfig.this.logger.error("Error checking permissions",
													  e);
				}
				return message;
			}
		});
	}

	@Override
	public void configureClientOutboundChannel(ChannelRegistration registration) {
		registration.setInterceptors(new ChannelInterceptor() {

			@Override
			public Message<?> postReceive(Message<?> message,
										  MessageChannel channel) {
				WebSocketConfig.this.logger.info("Post-receive on channel of type: " + channel.getClass().getName());
				WebSocketConfig.this.logger.info("Headers: " + message.getHeaders().toString());
				WebSocketConfig.this.logger.info("Payload: " + String.valueOf(message.getPayload()));
				return message;
			}

			@Override
			public void postSend(Message<?> message,
								 MessageChannel channel,
								 boolean sent) {
				WebSocketConfig.this.logger.info("Post-send on channel of type: " + channel.getClass().getName());
				WebSocketConfig.this.logger.info("Headers: " + message.getHeaders().toString());
				WebSocketConfig.this.logger.info("Payload: " + String.valueOf(message.getPayload()));
			}

			@Override
			public boolean preReceive(MessageChannel channel) {
				WebSocketConfig.this.logger.info("Pre-receive on channel of type: " + channel.getClass().getName());
				return true;
			}

			@Override
			public Message<?> preSend(Message<?> message,
									  MessageChannel channel) {
				WebSocketConfig.this.logger.info("Pre-send on channel of type: " + channel.getClass().getName());
				WebSocketConfig.this.logger.info("Headers: " + message.getHeaders().toString());
				WebSocketConfig.this.logger.info("Payload: " + String.valueOf(message.getPayload()));
				return message;
			}
		});
	}

	@Override
	public void configureMessageBroker(MessageBrokerRegistry registry) {
		registry.enableStompBrokerRelay("/topic/",
										"/queue/");
		registry.setApplicationDestinationPrefixes("/app");
	}

	@Override
	public void configureTasks(ScheduledTaskRegistrar taskRegistrar) {
		taskRegistrar.setScheduler(this.scheduler);
	}

	@Override
	public void registerStompEndpoints(StompEndpointRegistry registry) {
		registry.addEndpoint("/subscriptions").withSockJS();
	}

	@PersistenceContext
	public void setEntityManager(EntityManager em) {
		this.em = em;
	}

	@Autowired
	@Qualifier("messageBrokerSockJsTaskScheduler")
	public void setScheduler(ThreadPoolTaskScheduler scheduler) {
		this.scheduler = scheduler;
	}
}
@Configuration
@EnableWebMvc
public class WebConfig extends WebMvcConfigurerAdapter {

	@Override
	public void configureDefaultServletHandling(DefaultServletHandlerConfigurer configurer) {
		configurer.enable();
	}
}

Could it somehow be opening connections for non-WebSocket requests? Do I have a misconfiguration? It works fine except for the memory leak.

I've attached a screenshot from MAT (Eclipse Memory Analyzer) of a heap dump of my webapp.


Affects: 4.0 GA, 4.0.2

Rossen Stoyanchev commented Mar 10, 2014

After some testing and looking through the code, I found an issue with closing connections, which should be fixed now.

Despite this issue, due to the heartbeat support, the broker should eventually (10 seconds by default) notice the client is no longer active, and the connections should then be closed. I am wondering whether you're by any chance explicitly setting the heartbeat to 0,0 from the client side?

I've also added logging messages that show the number of active WebSocket sessions and the number of TCP connections to the broker to make it easier to debug. Could you try 4.0.3.BUILD-SNAPSHOT after this build succeeds to confirm if that fixes the issue?

Kevin Jordan commented Mar 10, 2014

I'll try the snapshot. I've got the heartbeat set to the default through stomp.js:
Web Socket Opened...

CONNECT
login:guest
passcode:guest
accept-version:1.1,1.0
heart-beat:10000,10000

I also wonder if it has something to do with going through an EC2 load balancer. Although since it's using websockets, I wouldn't think it should have more than one connection open.
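The `heart-beat:10000,10000` header above follows the negotiation rule defined by the STOMP 1.1/1.2 spec: the client sends "what I can send, what I want to receive", the server replies likewise, and the effective interval in each direction is the larger of the two values, or 0 (disabled) if either side sent 0. A minimal sketch of that rule (the class and method names are illustrative, not Spring or stomp.js API):

```java
// Sketch of STOMP 1.1/1.2 heart-beat negotiation.
// Client CONNECT carries "cx,cy" (cx = what the client can send,
// cy = what it wants to receive); the server's CONNECTED carries "sx,sy".
public class StompHeartBeat {

    /** Effective interval in one direction: 0 if either side opts out, else the larger value. */
    static long negotiate(long desiredSend, long peerDesiredReceive) {
        if (desiredSend == 0 || peerDesiredReceive == 0) {
            return 0; // heart-beats disabled in this direction
        }
        return Math.max(desiredSend, peerDesiredReceive);
    }

    public static void main(String[] args) {
        // stomp.js default from the report: heart-beat:10000,10000;
        // assume the server also advertises 10000,10000
        long clientSend = 10000, serverReceive = 10000;

        System.out.println(negotiate(clientSend, serverReceive)); // 10000
        // If the client had sent 0,0, heart-beats would be disabled:
        System.out.println(negotiate(0, serverReceive));          // 0
    }
}
```

With 10000,10000 on both sides, inactivity detection stays enabled, which is why the broker should eventually notice dead clients unless the client explicitly sends 0,0.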

Rossen Stoyanchev commented Mar 10, 2014

Is the load balancer configured to use sticky sessions? Indeed, for the WebSocket transport it wouldn't matter, but for other transports (for IE < 10 clients, for example) it would. See the comments on "load balancer" in the SockJS protocol test suite.

Do give this change a try. The added log messages could also be helpful for tracking down issues.

Kevin Jordan commented Mar 11, 2014

I'll run a new memory dump at the end of the day but I left a client up overnight and my server didn't run out of memory yet, so that's good so far.

Kevin Jordan commented Mar 11, 2014

I don't think that fixed it. I've attached a new memory analysis after 1 day of running.

Rossen Stoyanchev commented Mar 11, 2014

Alright, we'll keep looking for what's causing this. Could you provide a basic description of the test scenario? Also, I wonder what logging is enabled; in particular, the following log messages appear on every new request. If that logging is enabled, what numbers do you see?

Rossen Stoyanchev commented Mar 11, 2014

Also what browsers are clients executing from? Specifically I wonder if you know what SockJS transports are being used (e.g. IE 7/8/9 would imply HTTP-based transports). Also you mentioned EC2 load balancing, is that configured for sticky sessions?

Rossen Stoyanchev commented Mar 14, 2014

Kevin Jordan, any updates? Given a basic description I can try to reproduce this on my side.

Kevin Jordan commented Mar 14, 2014

The clients are Chrome and Firefox. The EC2 load balancer isn't set up for sticky sessions, but it also only has one instance running currently. It may be related somehow to Amazon EC2 and/or the VPC structure. I don't know whether our local testing box simply gets so many webapp updates pushed to it that I never see the memory problem (Tomcat is restarted several times a day), or whether it's something with Ubuntu 12.04 (the latest version available in OpsWorks) or something inside AWS. I thought it might have to do with the Java version, since my testing box was behind on that (Oracle JDK 7.0.45 locally vs. Oracle JDK 7.0.51 on Amazon), but after updating the testing box it still doesn't show the problem, at least overnight (a jmap dump gives a size of about 300MB, with none of the results showing a lot of NettyTcpConnections). The testing box is also outside of Amazon, with a lot more memory than an m1.small provides. I may set up a duplicate environment and see if I can reproduce it there. If possible, you might try an example under an EC2 environment as well.

Rossen Stoyanchev commented Mar 14, 2014

Okay thanks for that detail. What about the test itself, how many clients are you running, what are those clients doing, and what does the server do? In very general terms of course. I'm mainly interested in getting a sense of the overall message flow, and events such as clients connecting and disconnecting.

Kevin Jordan commented Mar 14, 2014

Not very many clients. Maybe 5-10 at most during the day. It seems like the connections build up regardless of whether someone is connected or not. So I'm assuming those are connections to my RabbitMQ server and not connections to the client.

Rossen Stoyanchev commented Mar 14, 2014

We maintain only one connection to RabbitMQ (it's called "system" connection in the configuration) for sending messages from within the application to the broker for broadcasting purposes. As clients connect we create additional connections so each client can have its own STOMP session to the broker but those connections are opened and closed in tandem with the client connections.

It seems like the connections build up regardless of whether someone is connected or not.

The "system" connection will try to reconnect if it gets closed; I wonder if that's somehow causing an issue. If you are able to confirm that the number of connections grows even without clients, that would be very helpful. Also, #16178 just came in and may be related. Could the RabbitMQ service be becoming unavailable at times, triggering retries in turn? By the way, whenever the broker becomes available/unavailable, an ApplicationContext event is triggered, so you can monitor and log this. For example, see the QuoteService in the portfolio sample.
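The accounting described above — one shared "system" connection plus one broker connection per client session — can be sketched as a small model. This is illustrative code, not Spring internals; the class and method names are invented. The point is that if the per-client connections are not closed when sessions end, the count grows without bound, which is exactly the leak reported here:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative model of the broker relay's connection accounting:
// one shared "system" TCP connection to the broker, plus one
// connection per connected client session.
public class RelayConnectionModel {

    private final Set<String> clientSessions = new HashSet<>();

    /** A client session connects: one more TCP connection to the broker. */
    public void clientConnected(String sessionId) {
        clientSessions.add(sessionId);
    }

    /** A client session ends: its broker connection should be closed in tandem. */
    public void clientDisconnected(String sessionId) {
        clientSessions.remove(sessionId);
    }

    /** Total broker connections: the single system connection + one per client. */
    public int brokerConnectionCount() {
        return 1 + clientSessions.size();
    }
}
```

With 5-10 clients, the expected count is a dozen at most; hundreds of lingering `NettyTcpConnection` instances means connections are being opened without the matching close.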

Kevin Jordan commented Mar 14, 2014

I'd sort of doubt that the RabbitMQ service became unavailable as it's on the same host.

Rossen Stoyanchev commented Mar 14, 2014

I have it reproduced. Thanks.

Rossen Stoyanchev commented Mar 17, 2014

Kevin Jordan, there is a new reactor-tcp 1.0.1.RELEASE available that should address the issue. It does for me locally. You can get it immediately by referencing the http://repo.spring.io/libs-release repository. Later today it should also appear in Maven central. Please try that together with the latest Spring Framework 4.0.3.BUILD-SNAPSHOT and let me know.
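Pulling in the fixed artifact from the repository mentioned above might look like the following Maven fragment. The repository URL, artifact name, and version come from the comment; the `groupId` is an assumption based on Reactor 1.x conventions, so verify it against your dependency tree:

```xml
<!-- Repository URL, artifactId, and version are from the comment above;
     the groupId (org.projectreactor) is assumed from Reactor 1.x conventions. -->
<repositories>
    <repository>
        <id>spring-libs-release</id>
        <url>http://repo.spring.io/libs-release</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>org.projectreactor</groupId>
        <artifactId>reactor-tcp</artifactId>
        <version>1.0.1.RELEASE</version>
    </dependency>
</dependencies>
```

Once the release reaches Maven Central, the extra repository entry is no longer needed.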

Rossen Stoyanchev commented Mar 17, 2014

It's in Maven central now (reactor-tcp 1.0.1.RELEASE).

Kevin Jordan commented Mar 17, 2014

I've got a new version of my app going out and I'll let you know if I still get a memory leak or not.

Kevin Jordan commented Mar 18, 2014

I'll run a jmap on it later tonight, but if there is still a leak, I think it's been cut down quite a bit.

Kevin Jordan commented Mar 20, 2014

Ran another jmap tonight and no signs of a leak. I think this can be closed now.

Rossen Stoyanchev commented Mar 20, 2014

Great to hear, thanks for confirming!

@spring-projects-issues spring-projects-issues added type: bug in: messaging in: web labels Jan 11, 2019
@spring-projects-issues spring-projects-issues added this to the 4.0.3 milestone Jan 11, 2019