Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.util.concurrent.TimeoutException for meta connect #843

Open
princejt opened this Issue Mar 8, 2019 · 16 comments

Comments

Projects
None yet
2 participants
@princejt
Copy link

princejt commented Mar 8, 2019

We are facing an issue where occassionally for meta-connect calls from client,
cometdserver is sending a 500 error.We enabled debug logging and on server logs we could see a concurrent timeout exception.

Logs are attached.

1)What is causing this exception? Is there a way to resolve this ?

2)From stack trace the exception thrown is TimeoutException which should have resulted in error code 408. But client is recieving a 500 internal server error

https://github.com/cometd/cometd/blob/4.0.x/cometd-java/cometd-java-server/src/main/java/org/cometd/server/transport/AsyncJSONTransport.java#L81

Note
AsyncJSONTransportForTomcat seen in log is extenstion of AsyncJSONTransport where asyncContext.setTimeout(0) is changed to asyncContext.setTimeout(90000) . this was done
since AsyncJSONTransport, disables the AsyncContext timer by explicitly setting it to zero (0),
and when running in Tomcat, this causes instances of Http11Processor to be placed in an "async wait queue indefinitely, leading to a memory leak.
logs.txt

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 8, 2019

Calling asyncContext.setTimeout(0) is explicitly supported by the Servlet specification, so it's a Tomcat bug if it leads to a memory leak.

Seems to me that you are hitting one or more Tomcat bugs.
Have you tried Jetty instead?

The logs are too short to be useful. If you attach the full logs, I may be able to interpret better what's going on.

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 8, 2019

Unfortunately we cant change tomcat,

I have added more logs. Thanks for looking into this
AdditionalLogs.txt

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 9, 2019

This is what I found from your logs.

Session 1s1dcv5j9q7x31s2563mtd13612 starts at 16:39:03,218. The server receives the /meta/handshake message, a /meta/subscribe message, a /meta/connect message with id=4, and normal message.

Since the /meta/connect message with id=4 has been written successfully, the client sends another /meta/connect message with id=5 and it is held by the server at 16:39:03,918.

A server event happens, the /meta/connect is woken up, written to the client, which sends another to the server with id=6.
Another few of these until the /meta/connect with id=11 is suspended at 16:39:16,670.

At 16:39:22,295, the client decides to send to the server a /meta/connect with id=4:

2019-03-07 16:39:22,295 DEBUG [DefaultThreadPool-23] org.cometd.server.transport.AsyncJSONTransportForTomcat Processing {clientId=1s1dcv5j9q7x31s2563mtd13612, advice={timeout=0}, channel=/meta/connect, id=4, connectionType=long-polling, timestamp=Fri, 08 Mar 2019 00:39:22 GMT}
2019-03-07 16:39:22,295 DEBUG [DefaultThreadPool-23] org.cometd.server.transport.AsyncJSONTransportForTomcat Cancelling {clientId=1s1dcv5j9q7x31s2563mtd13612, channel=/meta/connect, id=11, connectionType=long-polling, timestamp=Fri, 08 Mar 2019 00:39:16 GMT}

and the server has /meta/connect with id=11 suspended, so it cancels it and this results in the TimeoutException you are seeing.

Now, the message arriving from the client has the timestamp extension, and the time is right (00:39:22), so this message is not a copy that was leftover in the server buffers, it is really a message coming from the client that updated the timestamp but for some reason is a copy of the /meta/connect with id=4.

This seems a bug in the client.

What client are you using?
What exact version of CometD?

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 9, 2019

Client is javascript . Cometd version is "3.1.4"

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 9, 2019

Client is javascript . Cometd version is "3.1.4"

CometD is definitely 4.0.x from your logs, not 3.1.4.

"Client is javascript" means you are using the official CometD JavaScript client, or some other JavaScript client?

What exact version is your JavaScript client (the version is written at the beginning of the cometd.js file itself, so please make sure you report the version that is written in the file)?

Do you have the client-side DEBUG logs?

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 9, 2019

Sorry for the confusion caused by the earlier commet.

The client for our service is using javascript. They are using cometd library version 3.1.4
https://www.npmjs.com/package/cometd/v/3.1.4

The server(which is java) cometd version is 4.0.0.

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 9, 2019

@princejt can you please detail exactly how is your system architected?

I find strange that you have a server 4.0.x, but you serve to the client an old version of the CometD JavaScript library.
Can the client be upgraded to 4.0.x?

Is there any network element (reverse proxies, firewalls, load balancers, etc.) in between the client and the server that may cache or otherwise alter the request body, or delay responses, or change in some way the normal traffic?

Can you get the client DEBUG logs?

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 9, 2019

The client version/upgrade isn't something driven by us. But will check if that can happen and see if that resolves the issue.

I dont have the client side debug logs right now. Will share as soon as I have it.

Architecture:

The request from the browser goes to a node.js server which act as a proxy and forwards the request to the cometd java server. There is a loadbalancer between the node.js server and cometd server.

browser -->
node.js server[proxy] --->
load balancer: [with cookie persistence for stickiness]-->
cometdserver

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 9, 2019

Then the true client for the CometD server is a NodeJS application, not the browser.

Which one is it? Who wrote it?

Has it been written using the CometD NodeJS support?

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 11, 2019

We are working with client team to get the additional information.

Meanwhile trying to figure out why client is recieving 500 error.

From stack trace the exception thrown is TimeoutException which should have resulted in error code 408. But client is recieving a 500 internal server error

https://github.com/cometd/cometd/blob/4.0.x/cometd-java/cometd-java-server/src/main/java/org/cometd/server/transport/AsyncJSONTransport.java#L81

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 11, 2019

What "client" do you refer to? The NodeJS client, or the browser client?

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 11, 2019

Nodejs client

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 11, 2019

Well, we know nothing about your NodeJS client. Perhaps you have a bug there.

@princejt

This comment has been minimized.

Copy link
Author

princejt commented Mar 14, 2019

Figured out the reason for change in response code.

https://github.com/cometd/cometd/blob/4.0.x/cometd-java/cometd-java-server/src/main/java/org/cometd/server/transport/AsyncJSONTransport.java#L82

sendError we set

request.setAttribute(RequestDispatcher.ERROR_EXCEPTION, failure);
response.sendError(code);

Tomcat is searching for corresponding error page of the exception. Since no page is mapped status code is updated to 500.

Also identified this exception is occurring when browser is refreshed.

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 14, 2019

Uhm. I think I should skip sendError() and just call setStatus() then, as I don't want the error page mechanism to kick in for CometD.
Filed #848.

@sbordet

This comment has been minimized.

Copy link
Member

sbordet commented Mar 14, 2019

@princejt this should be fixed now. Are you able to test the latest CometD 4.0.x code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.