Multiple calls to the RpcClient#getConnection method fail to keep the connection persistent #210
Comments
Hi @Synex-wh, we detect non-English characters in the issue. This comment is an auto translation by @sofastack-robot to help other users to understand this issue.
**Describe the bug**

### Phenomenon

- The registry backend needs to keep a connection open between two service nodes. After the connection is broken, because system memory is full or for some other reason, we keep calling com.alipay.remoting.rpc.RpcClient#getConnection(com.alipay.remoting.Url, int) to keep the connection alive.
- The task that repeatedly calls this method keeps running, but the Connection object it gets back is null. In effect the two nodes are disconnected and never reconnect.
- A memory dump shows that com.alipay.remoting.DefaultConnectionManager#connTasks still holds the RunStateRecordedFutureTask keyed by the connection's ip+port. Its outcome, the ConnectionPool, is not null, but the Connection list inside the ConnectionPool is empty. As a result, every call to getConnection returns a null connection and never triggers a reconnect, because task.run is only triggered again after the task for that key has been removed from connTasks.
  ![Image](https://user-images.githubusercontent.com/8018119/70888770-08617800-201c-11ea-816f-54cfdf0adc6d.png)
- Looking at the code, this RunStateRecordedFutureTask is only cleaned up in two ways: by the IO disconnect event, through com.alipay.remoting.DefaultConnectionManager#remove(com.alipay.remoting.Connection), or by RpcTaskScanner triggering com.alipay.remoting.DefaultConnectionManager#scan. This cleanup is suspected to be faulty.
  ![Image](https://user-images.githubusercontent.com/8018119/70889032-9c334400-201c-11ea-9a7a-3d9578778409.png)
- In summary, there is some chance that this task is never removed, and in that case repeated calls to getConnection cannot guarantee that the connection is re-established.
- JVM version (eg
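@Synex-wh Could you provide some bolt logs to help locate this problem? It does look like there is a bug where the connection pool ends up with no connections and cannot reconnect. Under normal circumstances, though, the disconnect event triggers the logic that clears the connection pool, so an empty connection pool should not occur. More logs would make this easier to investigate.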
Background: the user is on bolt version 1.5.2, and the application had experienced a Full GC.
Solution:
Describe the bug
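Phenomenon
The registry backend needs to keep a connection open between two service nodes. After the connection is broken, because system memory is full or for some other reason, we keep calling com.alipay.remoting.rpc.RpcClient#getConnection(com.alipay.remoting.Url, int) to keep the connection alive.
The task that repeatedly calls this method keeps running, but the Connection object it gets back is null. In effect the two nodes are disconnected and never reconnect.
A memory dump shows that com.alipay.remoting.DefaultConnectionManager#connTasks still holds the RunStateRecordedFutureTask keyed by the connection's ip+port. Its outcome, the ConnectionPool, is not null, but the Connection list inside the ConnectionPool is empty. As a result, every call to getConnection returns a null connection and never triggers a reconnect, because task.run is only triggered again after the task for that key has been removed from connTasks.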
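To make the failure mode described above concrete, here is a minimal, self-contained sketch. It is not bolt's code: the class StaleTaskSketch, its Pool type, and all method names are hypothetical stand-ins that only mimic the reported behaviour (a cached creation task per ip+port key whose pool has become empty). The getConnectionWithEviction method shows one possible way to recover, assuming it is acceptable to evict the stale task; it is an illustration, not the project's fix.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical model of the reported behaviour; none of these names come from bolt.
class StaleTaskSketch {
    /** Stand-in for a connection pool: just a list of connection names. */
    static final class Pool {
        final List<String> conns = new CopyOnWriteArrayList<>();
    }

    // Plays the role of connTasks: one cached creation task per "ip:port" key.
    final ConcurrentHashMap<String, FutureTask<Pool>> connTasks = new ConcurrentHashMap<>();
    int created = 0; // how many times a pool has actually been built

    private Pool createPool(String key) {
        created++;
        Pool pool = new Pool();
        pool.conns.add("conn-to-" + key + "#" + created);
        return pool;
    }

    private static Pool await(FutureTask<Pool> task) {
        try {
            return task.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns a connection, or null when the cached pool is empty. */
    String getConnection(String key) {
        FutureTask<Pool> task = connTasks.get(key);
        if (task == null) {
            FutureTask<Pool> fresh = new FutureTask<>(() -> createPool(key));
            task = connTasks.putIfAbsent(key, fresh);
            if (task == null) {
                task = fresh;
                task.run(); // a pool is only built when no task is cached for the key
            }
        }
        Pool pool = await(task);
        return pool.conns.isEmpty() ? null : pool.conns.get(0);
    }

    /** Simulates a broken link whose cleanup never removed the cached task. */
    void dropConnectionsButKeepTask(String key) {
        await(connTasks.get(key)).conns.clear();
    }

    /** Recovery sketch: evict a finished task whose pool is empty, then retry. */
    String getConnectionWithEviction(String key) {
        FutureTask<Pool> task = connTasks.get(key);
        if (task != null && task.isDone() && await(task).conns.isEmpty()) {
            connTasks.remove(key, task);
        }
        return getConnection(key);
    }

    public static void main(String[] args) {
        StaleTaskSketch manager = new StaleTaskSketch();
        String key = "127.0.0.1:12200"; // arbitrary example address
        System.out.println(manager.getConnection(key));             // first call builds the pool
        manager.dropConnectionsButKeepTask(key);
        System.out.println(manager.getConnection(key));             // stale task: prints null
        System.out.println(manager.getConnectionWithEviction(key)); // pool is rebuilt
    }
}
```

In this model the second getConnection call can never succeed, however often it is repeated, because the finished task stays in the map and is never run again. That matches the symptom in the memory dump: a non-null pool with an empty connection list.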
Environment
- JVM version (e.g. java -version):
- uname -a: