nacos 1.4.1: when a node in the cluster goes offline, the other nodes do not correctly and reliably detect its state #4925

MajorHe1 · 2021-02-22T09:12:07Z

Describe the bug
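I mentioned this problem in #4877; here I reproduce it again.
The version is 1.4.1, using the official release package.
There are three machines: 172.21.1.134 (referred to as 134), 172.21.1.150 (referred to as 150), and 172.21.1.153 (referred to as 153).

Note the timestamps in the screenshots.

1. Kill the nacos process on machine 134.

2. Check the state of node 134 in the console on machine 153.

3. Switch to Postman and query the interface provided by the OpenAPI.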

MajorHe1 · 2021-02-23T08:12:47Z

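On why B and C did not detect that A had gone down, a possible cause:
The nacos process on A did not exit gracefully, so it did not trigger an IPChangeEvent before exiting. The IP list maintained by the ServerMemberManager class is only written when that event is received (nacos 1.4.2, ServerMemberManager.java line 181). Without the event, there is no way to remove A from the IP list.

As I remember it, shouldn't nacos be implemented so that B and C actively probe A's state, and mark A as down if A is not alive?
Whether B and C detect that A is down should not depend on whether A exited gracefully or triggered an event. If the machine hosting A suffers any hardware failure, network outage, or blocking, B and C should change A's state as soon as their probes of A fail.
Is my recollection wrong, or has the code changed?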
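
To make the expectation above concrete, here is a minimal sketch of state tracking driven only by probe results, so that a peer is marked down whether or not it exited gracefully. It is not the Nacos implementation: the class `MemberProbeTracker`, its `State` values, and the failure threshold are all hypothetical names chosen for illustration, and the actual probing (for example an HTTP call to the peer) is left to the caller.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch, NOT Nacos code: tracks peer state purely from
 * the outcome of active probes made by the local node.
 */
public class MemberProbeTracker {

    /** Illustrative states; the names are assumptions. */
    public enum State { UP, SUSPICIOUS, DOWN }

    private static final class Entry {
        State state = State.UP;
        int consecutiveFailures = 0;
    }

    private final int maxFailures;
    private final Map<String, Entry> members = new ConcurrentHashMap<>();

    /** @param maxFailures consecutive failed probes before a peer is DOWN */
    public MemberProbeTracker(int maxFailures) {
        if (maxFailures < 1) {
            throw new IllegalArgumentException("maxFailures must be >= 1");
        }
        this.maxFailures = maxFailures;
    }

    /** Adds a peer in the UP state if it is not tracked yet. */
    public void register(String address) {
        members.putIfAbsent(address, new Entry());
    }

    /**
     * Records one probe outcome and returns the resulting state.
     * A success resets the failure counter; failures accumulate until
     * the threshold is reached, at which point the peer becomes DOWN.
     */
    public State onProbeResult(String address, boolean success) {
        Entry e = members.computeIfAbsent(address, k -> new Entry());
        synchronized (e) {
            if (success) {
                e.consecutiveFailures = 0;
                e.state = State.UP;
            } else {
                e.consecutiveFailures++;
                e.state = e.consecutiveFailures >= maxFailures
                        ? State.DOWN
                        : State.SUSPICIOUS;
            }
            return e.state;
        }
    }

    /** Returns the current state, or null if the peer is unknown. */
    public State stateOf(String address) {
        Entry e = members.get(address);
        if (e == null) {
            return null;
        }
        synchronized (e) {
            return e.state;
        }
    }
}
```

With a threshold of 3, node 153 would call `onProbeResult("172.21.1.134", false)` after each failed probe of 134, and the third consecutive failure would move 134 to `DOWN` without any event from 134 itself.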

realJackSun · 2021-02-23T11:56:52Z

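@MajorHe1 We cannot reproduce this for now. Could you provide the details of your experiment?
Also, which addressing mode are you using: the address server or the cluster.conf configuration file?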

MajorHe1 · 2021-02-23T13:11:09Z

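> @MajorHe1 We cannot reproduce this for now. Could you provide the details of your experiment?
> Also, which addressing mode are you using: the address server or the cluster.conf configuration file?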

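What kind of detailed information do you still need?
I am using the 1.4.1 release package, deployed on Tencent Cloud virtual machines, with the address-server addressing mode.
The address server is the configuration-center service of a separate nacos process; every nacos node in the cluster fetches this configuration through the OpenAPI.
To rule out interference from the virtual machines, I will try again on physical machines and report the reproduction result shortly.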

MajorHe1 · 2021-02-24T13:40:06Z

@JackSun-Developer The problem has been identified, and I have submitted a PR.

KomachiSion added the kind/research label Feb 23, 2021

KomachiSion assigned realJackSun Feb 23, 2021

MajorHe1 added a commit to MajorHe1/nacos that referenced this issue Feb 24, 2021

[ISSUE alibaba#4925] correct the member's state when the cluster info…

6dad63d

…rmation changed

MajorHe1 mentioned this issue Feb 24, 2021

[ISSUE #4925] correct the member's state when the cluster information… #4948

Merged

5 tasks

KomachiSion pushed a commit that referenced this issue Mar 1, 2021

[ISSUE #4925] correct the member's state when the cluster information…

4191286

… changed (#4948)

KomachiSion added kind/bug Category issues or prs related to bug. and removed kind/research labels Mar 1, 2021

KomachiSion added this to the 1.4.2 milestone Mar 1, 2021

KomachiSion closed this as completed Mar 1, 2021

zrlw mentioned this issue Mar 7, 2021

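nacos deployed as a cluster for the dubbo registry: when some of the registry addresses in the client configuration are wrong, the healthy count of the dubbo service provider keeps changing #5040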

Closed

MajorHe1 mentioned this issue Apr 12, 2021

Nacos 1.4.1 cluster election fails #5339

Closed
