The Raft Group [naming_persistent_service] did not find the Leader node (Nacos 1.4.0 cluster, service registration) #4629

Closed
dyiwen opened this issue Jan 5, 2021 · 4 comments

Comments

@dyiwen

dyiwen commented Jan 5, 2021

Describe the bug
A clear and concise description of what the bug is.

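We run Nacos 1.4.0 on Kubernetes as a 3-node cluster. After running for a while, one node starts logging the error shown below.
As a result, two of the three nodes keep synchronizing service registrations normally, but the node logging the error no longer seems to synchronize registration data for the affected service. Repeatedly refreshing the service registration page through the load balancer shows the service flapping between online and offline, and interfaces that depend on the service fail intermittently. For now, the only fix is to restart the node.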

2021-01-05 15:55:43,536 ERROR [NACOS-RAFT] error while notifying listener of key: com.alibaba.nacos.naming.domains.meta.xxxx-dev##DEFAULT_GROUP@@xxx-service

com.alibaba.nacos.api.exception.NacosException: com.alibaba.nacos.consistency.exception.ConsistencyException: com.alibaba.nacos.core.distributed.raft.exception.NoLeaderException: The Raft Group [naming_persistent_service] did not find the Leader node
at com.alibaba.nacos.naming.consistency.persistent.impl.PersistentServiceProcessor.remove(PersistentServiceProcessor.java:270)
at com.alibaba.nacos.naming.consistency.persistent.PersistentConsistencyServiceDelegateImpl.remove(PersistentConsistencyServiceDelegateImpl.java:63)
at com.alibaba.nacos.naming.consistency.DelegateConsistencyServiceImpl.remove(DelegateConsistencyServiceImpl.java:53)
at com.alibaba.nacos.naming.core.ServiceManager.onDelete(ServiceManager.java:240)
at com.alibaba.nacos.naming.consistency.persistent.PersistentNotifier.notify(PersistentNotifier.java:102)
at com.alibaba.nacos.naming.consistency.persistent.PersistentNotifier.onEvent(PersistentNotifier.java:132)
at com.alibaba.nacos.naming.consistency.persistent.PersistentNotifier.onEvent(PersistentNotifier.java:38)
at com.alibaba.nacos.common.notify.DefaultPublisher$1.run(DefaultPublisher.java:198)
at com.alibaba.nacos.common.notify.DefaultPublisher.notifySubscriber(DefaultPublisher.java:208)
at com.alibaba.nacos.common.notify.DefaultPublisher.receiveEvent(DefaultPublisher.java:186)
at com.alibaba.nacos.common.notify.DefaultPublisher.openEventHandler(DefaultPublisher.java:117)
at com.alibaba.nacos.common.notify.DefaultPublisher.run(DefaultPublisher.java:94)

Expected behavior
A clear and concise description of what you expected to happen.

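I would like to understand what this error means and how to avoid it. If a node cannot recover on its own once the error occurs, I would at least like it to fail and exit so that another mechanism can restart it. Because of this problem, we do not dare run Nacos in production.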
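A possible stopgap for the "fail so something else can restart it" request is an external check that scans the naming log for the message quoted above and reports the node as unhealthy when it recurs. The sketch below is not a Nacos feature. It assumes the log is readable by whatever runs the check, and the function names, the 5-minute window, and the threshold of 3 occurrences are all hypothetical values chosen for illustration.

```python
import re
from datetime import datetime, timedelta

# Exact message taken from the stack trace in this issue.
NO_LEADER_MARKER = "did not find the Leader node"
# Timestamp layout of the log line quoted in this issue, e.g. "2021-01-05 15:55:43,536".
TIMESTAMP_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}")


def count_recent_no_leader(lines, now, window=timedelta(minutes=5)):
    """Count no-leader messages whose nearest preceding timestamp is within `window` of `now`."""
    count = 0
    last_ts = None
    for line in lines:
        match = TIMESTAMP_RE.match(line)
        if match:
            # Exception lines carry no timestamp, so remember the latest one seen.
            last_ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if NO_LEADER_MARKER in line and last_ts is not None:
            if timedelta(0) <= now - last_ts <= window:
                count += 1
    return count


def is_unhealthy(lines, now, threshold=3, window=timedelta(minutes=5)):
    """Treat the node as unhealthy once the message repeats `threshold` times in the window."""
    return count_recent_no_leader(lines, now, window) >= threshold


def check_log(path, now=None):
    """Return 1 (unhealthy) or 0 (healthy) for the log at `path`; the path is supplied by the caller."""
    now = now or datetime.now()
    with open(path, encoding="utf-8", errors="replace") as handle:
        return 1 if is_unhealthy(handle, now) else 0
```

A wrapper could pass the log path to `check_log` and use the return value as its exit code, for example from a Kubernetes exec liveness probe. Whether restarting on this signal is safe for a given cluster would need to be verified first.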

Actual behavior
A clear and concise description of what actually happened.

How to Reproduce
Steps to reproduce the behavior:
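There is no known way to reproduce it yet. The Nacos cluster in our current environment has hundreds of services registered, and the problem shows up after every 4 to 5 days of running.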

Desktop (please complete the following information):

  • OS: [e.g. Centos]
  • Version [e.g. nacos-server 1.3.1, nacos-client 1.3.1]
  • Module [e.g. naming/config]
  • SDK [e.g. original, spring-cloud-alibaba-nacos, dubbo]

Additional context
Add any other context about the problem here.

@KomachiSion
Collaborator

Please provide the server-side logs: naming-raft, protocol-raft, alipay-raft, etc.

@dyiwen
Author

dyiwen commented Jan 15, 2021

@dyiwen
Author

dyiwen commented Jan 15, 2021

naming-raft.log
protocol-raft.log
alipay-jraft.log
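Please help take a look; otherwise we really cannot go to production. In particular, how should service registration be recovered when one node goes down?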

@chuntaojun
Collaborator

I'll take a look.
