Client intermittently reports org.apache.rocketmq.remoting.exception.RemotingTimeoutException: wait response on the channel <:10909> timeout, 1500(ms) #428

Closed
chenld opened this issue Aug 27, 2018 · 6 comments


chenld commented Aug 27, 2018

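Environment: 4.2.0, with MQ deployed as two masters and two slaves. The timeout is set to 1500 ms, and the exception below is thrown intermittently. A normal response takes 1~5 ms.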

 sendKernelImpl exception, resend at once, InvokeID: 1069917335695137265, RT: 1501ms, Broker: MessageQueue [topic=AAA_BUSINESS, brokerName=broker-b, queueId=5]
org.apache.rocketmq.remoting.exception.RemotingTimeoutException: wait response on the channel <1:10909> timeout, 1500(ms)
	at org.apache.rocketmq.remoting.netty.NettyRemotingAbstract.invokeSyncImpl(NettyRemotingAbstract.java:386)
	at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeSync(NettyRemotingClient.java:369)
	at org.apache.rocketmq.client.impl.MQClientAPIImpl.sendMessageSync(MQClientAPIImpl.java:351)
	at org.apache.rocketmq.client.impl.MQClientAPIImpl.sendMessage(MQClientAPIImpl.java:335)
	at org.apache.rocketmq.client.impl.MQClientAPIImpl.sendMessage(MQClientAPIImpl.java:298)

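Read literally, the broker did not respond. When the timeout is set to 3 seconds, quite a few responses take about 2 seconds, which confirms that the broker is responding slowly. How should this kind of problem be tracked down? It is urgent, thanks~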


Later I set vip channel to false, and this was reported as well:
org.apache.rocketmq.remoting.exception.RemotingTimeoutException: wait response on the channel <10.1.2.1:10911> timeout, 1500(ms)
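For readers comparing the two traces: the port changes from 10909 to 10911 once the VIP channel is turned off. A minimal sketch of how the two addresses relate, assuming the VIP channel port is the broker listen port minus 2 (which matches the 10911/10909 pair in this thread). The class and method names are hypothetical and are not part of the RocketMQ API.

```java
/** Illustrative helper only; the names are hypothetical, not RocketMQ's API. */
final class VipChannelAddress {

    /** Assumed offset between the broker listen port and the VIP channel port. */
    static final int VIP_PORT_OFFSET = -2;

    /** Maps a "host:port" broker address to the assumed VIP channel address. */
    static String toVipAddress(String brokerAddr) {
        int idx = brokerAddr.lastIndexOf(':');
        if (idx < 0) {
            throw new IllegalArgumentException("expected host:port, got " + brokerAddr);
        }
        String host = brokerAddr.substring(0, idx);
        int port = Integer.parseInt(brokerAddr.substring(idx + 1));
        return host + ":" + (port + VIP_PORT_OFFSET);
    }

    public static void main(String[] args) {
        System.out.println(toVipAddress("10.1.2.1:10911")); // prints 10.1.2.1:10909
    }
}
```

Since the timeout appears on both ports, switching the channel alone does not seem to address the slow responses.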

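Later I traced the cause further:
image
There is a 20 s gap, after which the master actively closed the connection. During those 20 s the master sent a message to the slave every 5 seconds, but the slave never replied.
The backend log keeps reporting HAClient, processReadEvent read socket < 0. What causes this? The slave then also determined that more than 20 s had passed and closed the connection.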


chenld commented Aug 28, 2018

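A question: is it the case that only the master sends heartbeat data to the slave, and that the slave does not respond if there is no deviation between them? I checked the packet capture and found no heartbeat packets sent from the slave to the master!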


chenld commented Aug 29, 2018

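Found the problem: it is a heartbeat issue. The master and slave heartbeat intervals are both 5 seconds. The master keeps sending heartbeats, and each time the slave receives one it updates the value of lastWriteTime, so the slave never sends a heartbeat of its own. Having received no heartbeat from the slave within 20 s, the master considers the slave disconnected and closes the connection. After changing the master heartbeat to 10 s, leaving the slave unchanged, and changing the interval to 30 s on both, the problem was resolved.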
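The interaction described above can be reproduced with a toy model. This is a minimal sketch under stated assumptions, not RocketMQ's actual HA code. It uses a 1-second tick and no replication traffic. It assumes that a master heartbeat due in the same tick reaches the slave before the slave checks its own timer. It also assumes that receiving a heartbeat resets the slave's lastWriteTime, as reported here. The class, method and parameter names are hypothetical.

```java
/** Toy model of the reported heartbeat starvation; not RocketMQ's implementation. */
final class HaHeartbeatSimulation {

    /**
     * Simulates one master-slave connection in 1-second ticks.
     *
     * @return the second at which the master closes the connection,
     *         or -1 if it stays open for the whole run
     */
    static long simulate(long masterHeartbeatSec, long slaveHeartbeatSec,
                         long housekeepingSec, long durationSec) {
        long masterLastSend = 0;     // when the master last sent a heartbeat
        long slaveLastWriteTime = 0; // the slave's lastWriteTime
        long masterLastRead = 0;     // when the master last heard from the slave

        for (long now = 1; now <= durationSec; now++) {
            // Assumption: the master's heartbeat arrives before the slave's timer check.
            if (now - masterLastSend >= masterHeartbeatSec) {
                masterLastSend = now;
                slaveLastWriteTime = now; // reported behavior: a received heartbeat resets the timer
            }
            // The slave only sends once its own lastWriteTime is old enough.
            if (now - slaveLastWriteTime >= slaveHeartbeatSec) {
                slaveLastWriteTime = now;
                masterLastRead = now;
            }
            // The master drops a connection that has been silent for too long.
            if (now - masterLastRead >= housekeepingSec) {
                return now;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(simulate(5, 5, 20, 120));  // original settings: prints 20
        System.out.println(simulate(10, 5, 30, 120)); // adjusted settings: prints -1
    }
}
```

In this model, equal intervals let the master's heartbeat reset the slave's timer just before it fires, so the slave stays silent until the master's 20 s limit is reached. A longer master interval leaves gaps in which the slave's own timer can expire, which is consistent with the workaround described above.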

@chenld chenld closed this as completed Aug 29, 2018
@chenld chenld reopened this Aug 29, 2018
@chenld chenld closed this as completed Aug 29, 2018
@chenld chenld reopened this Aug 29, 2018
@Hellojungle
Contributor

Improvements are welcome.

@chuenfaiy
Contributor

Are the master and slave set up for synchronous double write?

@chengqipeng

Hello chenld, did you manage to solve the problem above? We are running RocketMQ with two masters and two slaves and frequently see org.apache.rocketmq.remoting.exception.RemotingTimeoutException: wait response on the channel <10.111.xxx.xxx:10911> timeout, 3000(ms)


leo-987 commented Aug 31, 2020

Ran into the same problem on 4.2.0: the client times out after waiting 10 s, and the broker logs show nothing abnormal. Is there a solution?
