
In a Spring Cloud Alibaba + Dubbo + Nacos environment, the consumer sometimes cannot find the service after the provider is restarted, or when the consumer is started before the provider: problem and solutions #1805

Closed
tianzeyong opened this issue Nov 5, 2020 · 68 comments

tianzeyong commented Nov 5, 2020

1. Direct symptom of the problem:

org.apache.dubbo.rpc.RpcException: No provider available from registry localhost:9090 for service com.hxy.boot.ticket.articles.api.ArticleService on consumer 192.168.137.1 use dubbo version 2.7.8, please check status of providers(disabled, not registered or in blacklist).
    at org.apache.dubbo.registry.integration.RegistryDirectory.doList(RegistryDirectory.java:599)
    at org.apache.dubbo.rpc.cluster.directory.AbstractDirectory.list(AbstractDirectory.java:74)
    at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.list(AbstractClusterInvoker.java:292)
    at org.apache.dubbo.rpc.cluster.support.AbstractClusterInvoker.invoke(AbstractClusterInvoker.java:257)
    at org.apache.dubbo.rpc.cluster.interceptor.ClusterInterceptor.intercept(ClusterInterceptor.java:47)
    at org.apache.dubbo.rpc.cluster.support.wrapper.AbstractCluster$InterceptorInvokerNode.invoke(AbstractCluster.java:92)
    at org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:88)
    at org.apache.dubbo.rpc.proxy.InvokerInvocationHandler.invoke(InvokerInvocationHandler.java:74)

2. Direct cause of the problem:

When the consumer calls the provider, the forbidden property of the consumer-side Dubbo service directory, org.apache.dubbo.registry.integration.RegistryDirectory, is true, as shown in the following figure:

[Figure: direct cause 1]

3. Reproducing the problem:

The problem occurs only occasionally and is hard to catch. After some analysis, I found that it can be reproduced every time: set a breakpoint on line 31 of org.apache.dubbo.config.spring.context.DubboBootstrapApplicationListener#onContextRefreshedEvent(ContextRefreshedEvent event) in the service provider, set the breakpoint's suspend mode to Thread, and then restart the provider. See the following figure:

[Figure: reproducing the problem]

4. Root cause of the problem:

The root cause is that the Spring Cloud Alibaba framework starts Nacos automatic service registration earlier than Dubbo starts its service registration. The former is triggered by the WebServerInitializedEvent (org.springframework.cloud.client.serviceregistry.AbstractAutoServiceRegistration#bind(WebServerInitializedEvent event)); the latter is triggered by the ContextRefreshedEvent (org.apache.dubbo.config.spring.context.DubboBootstrapApplicationListener#onContextRefreshedEvent(ContextRefreshedEvent event)).

In Spring Boot 2.2.x, the ServletWebServerInitializedEvent is published after the ContextRefreshedEvent, as shown in the following figure:

[Figure: Spring Boot 2.2.x]

In Spring Boot 2.3.x, however, it is published before the ContextRefreshedEvent, as shown in the following figure:

[Figure: Spring Boot 2.3.x]

After the Nacos server processes the provider's registration request, it pushes an instance-change notification to subscribers. At that point the provider may not yet have finished exporting its own Dubbo services. The most direct sign is that the allExportedURLs property of the provider's com.alibaba.cloud.dubbo.metadata.repository.DubboServiceMetadataRepository does not yet contain the URLs of the corresponding Dubbo services.

In the reproduction from item 3, when execution stops at the breakpoint, inspecting the heap with JProfiler shows that allExportedURLs does not contain the expected values.

In Spring Cloud Alibaba + Dubbo, Dubbo services are exported only locally, into the allExportedURLs property of com.alibaba.cloud.dubbo.metadata.repository.DubboServiceMetadataRepository; they are not sent to the registry server. So once the export finally completes, the Nacos server has no way of knowing that the Dubbo services are ready, and it cannot notify subscribers. In this situation, when the consumer makes a generic call to the DubboMetadataService interface to fetch the services exported by the provider, what it gets back from allExportedURLs is an empty List<URL>. The consumer then concludes that there is no provider and sets the forbidden property to true in its local Dubbo service directory, RegistryDirectory.

From then on, consumer calls to the provider fail with the error shown in item 1.
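The ordering problem described above can be illustrated with a small, self-contained model. This is only a sketch: the class and method names (StartupRaceModel, ProviderMetadata, ConsumerDirectory) are hypothetical stand-ins and not real Dubbo, Nacos, or Spring Cloud Alibaba types, and the real race depends on timing rather than on a fixed order.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of the startup race.
// None of these types are real Dubbo, Nacos, or Spring Cloud Alibaba classes.
class StartupRaceModel {

    /** Stands in for the provider-side repository that holds the exported Dubbo URLs. */
    static class ProviderMetadata {
        private final List<String> allExportedUrls = new ArrayList<>();

        void exportService(String url) {
            allExportedUrls.add(url);
        }

        List<String> getAllExportedUrls() {
            return new ArrayList<>(allExportedUrls);
        }
    }

    /** Stands in for the consumer-side service directory. */
    static class ConsumerDirectory {
        boolean forbidden = false;

        // Called when the registry pushes an instance-change notification.
        void onInstanceChanged(ProviderMetadata provider) {
            // The consumer asks the provider for its exported URLs at notification time;
            // an empty answer is interpreted as "no provider".
            forbidden = provider.getAllExportedUrls().isEmpty();
        }
    }

    /**
     * Replays a provider startup and returns the consumer's forbidden flag.
     * registerFirst = true models registration happening before the Dubbo export.
     */
    static boolean simulate(boolean registerFirst) {
        ProviderMetadata provider = new ProviderMetadata();
        ConsumerDirectory consumer = new ConsumerDirectory();
        Runnable register = () -> consumer.onInstanceChanged(provider);
        Runnable export = () -> provider.exportService("dubbo://10.0.0.1:20880/com.example.ArticleService");
        if (registerFirst) {
            register.run();
            export.run();
        } else {
            export.run();
            register.run();
        }
        return consumer.forbidden;
    }

    public static void main(String[] args) {
        System.out.println("register first -> forbidden=" + simulate(true));
        System.out.println("export first   -> forbidden=" + simulate(false));
    }
}
```

In this model nothing ever re-notifies the consumer after the export completes, which is why the forbidden flag stays set when registration comes first.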

5.1 Application-side solution:

  • After the application has started, in the run method of the ApplicationRunner interface, call the setStatus method of the NacosServiceRegistry class from the Spring Cloud Alibaba framework to refresh the instance status in the registry:

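```java
// The original snippet omitted its imports. The Nacos and Spring Cloud Alibaba
// package names below are assumed and may differ between versions.
import com.alibaba.cloud.nacos.registry.NacosRegistration;
import com.alibaba.cloud.nacos.registry.NacosServiceRegistry;
import com.alibaba.nacos.api.exception.NacosException;
import com.alibaba.nacos.common.lifecycle.Closeable;
import com.alibaba.nacos.common.utils.ThreadUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

@Component
public class NacosServiceInstanceUpAndDownOperator implements ApplicationRunner, Closeable {
    protected Logger logger = LoggerFactory.getLogger(this.getClass());

    /** Brings the Nacos service instance online. */
    private static final String OPERATOR_UP = "UP";
    /** Takes the Nacos service instance offline. */
    private static final String OPERATOR_DOWN = "DOWN";

    @Autowired
    NacosServiceRegistry nacosServiceRegistry;

    @Autowired
    NacosRegistration nacosRegistration;

    private ScheduledExecutorService executorService;

    @PostConstruct
    public void init() {
        int poolSize = 1;
        this.executorService = new ScheduledThreadPoolExecutor(poolSize, new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread thread = new Thread(r);
                thread.setDaemon(true);
                thread.setName("NacosServiceInstanceUpAndDownOperator");
                return thread;
            }
        });
    }

    @Override
    public void run(ApplicationArguments args) throws Exception {
        long delayDown = 5000L;  // delay before the "offline" task
        long delayUp = 10000L;   // delay before the "online" task
        this.executorService.schedule(new InstanceDownAndUpTask(nacosServiceRegistry, nacosRegistration, OPERATOR_DOWN), delayDown, TimeUnit.MILLISECONDS);
        this.executorService.schedule(new InstanceDownAndUpTask(nacosServiceRegistry, nacosRegistration, OPERATOR_UP), delayUp, TimeUnit.MILLISECONDS);
    }

    @Override
    public void shutdown() throws NacosException {
        ThreadUtils.shutdownThreadPool(executorService, logger);
    }

    /** Task that takes the service instance offline or brings it back online. */
    class InstanceDownAndUpTask implements Runnable {
        private final NacosServiceRegistry nacosServiceRegistry;
        private final NacosRegistration nacosRegistration;
        // Target status of the service instance: UP or DOWN
        private final String nacosServiceInstanceOperator;

        InstanceDownAndUpTask(NacosServiceRegistry nacosServiceRegistry, NacosRegistration nacosRegistration, String nacosServiceInstanceOperator) {
            this.nacosServiceRegistry = nacosServiceRegistry;
            this.nacosRegistration = nacosRegistration;
            this.nacosServiceInstanceOperator = nacosServiceInstanceOperator;
        }

        @Override
        public void run() {
            logger.info("=== updating Nacos service instance status to: {} === start", nacosServiceInstanceOperator);
            this.nacosServiceRegistry.setStatus(nacosRegistration, nacosServiceInstanceOperator);
            logger.info("=== updating Nacos service instance status to: {} === end", nacosServiceInstanceOperator);

            // Once the instance is back online, shut down the thread pool
            if (NacosServiceInstanceUpAndDownOperator.OPERATOR_UP.equals(nacosServiceInstanceOperator)) {
                ThreadUtils.shutdownThreadPool(NacosServiceInstanceUpAndDownOperator.this.executorService, NacosServiceInstanceUpAndDownOperator.this.logger);
            }
        }
    }
}
```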

5.2 Some suggestions for a framework-side solution:

  • a. Swap the trigger points of Spring Cloud automatic service registration and Dubbo service registration

    Make the Dubbo service export start earlier than Spring Cloud automatic service registration. This would require changing the source of both spring cloud commons and the Dubbo framework, and it touches the foundations, which does not feel comfortable.

  • b. In Spring Cloud Alibaba, publish an update notification to the Nacos registry after the Dubbo service export completes

  • c. In Spring Cloud Alibaba, add an aspect whose pointcut is the Spring Cloud service registration entry point, and export the Dubbo services before the Nacos service registration

    Spring Cloud Alibaba already has a suitable aspect, DubboServiceRegistrationEventPublishingAspect#beforeRegister(Registration registration); the Dubbo service export could be added to this before-advice. The Dubbo export process would need some adjustment, though, to avoid repeating work after the ContextRefreshedEvent.
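Option c can be sketched as a before-registration hook combined with an idempotent export, so that the later ContextRefreshedEvent does not export a second time. This is a minimal model under stated assumptions: ExportBeforeRegister and all of its methods are hypothetical names for illustration and do not exist in Dubbo or Spring Cloud Alibaba.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of option c; the names are illustrative only
// and are not part of Dubbo or Spring Cloud Alibaba.
class ExportBeforeRegister {
    private final AtomicBoolean exported = new AtomicBoolean(false);
    int exportRuns = 0;                      // how many times the export work actually ran
    boolean exportedWhenRegistered = false;  // whether the export was done at registration time

    // Safe to call from both hooks; only the first call does the work.
    void exportDubboServicesOnce() {
        if (exported.compareAndSet(false, true)) {
            exportRuns++; // stands in for the real export work
        }
    }

    // Models the before-advice on the service registration entry point.
    void beforeRegister() {
        exportDubboServicesOnce();
    }

    // Models the registration itself: the hook runs first, then the instance is registered.
    void register() {
        beforeRegister();
        exportedWhenRegistered = exported.get();
    }

    // Models the existing listener that exports on context refresh.
    void onContextRefreshed() {
        exportDubboServicesOnce();
    }

    public static void main(String[] args) {
        ExportBeforeRegister provider = new ExportBeforeRegister();
        provider.register();
        provider.onContextRefreshed();
        System.out.println("exportRuns=" + provider.exportRuns
                + ", exportedWhenRegistered=" + provider.exportedWhenRegistered);
    }
}
```

The compare-and-set guard is the "adjustment" this sketch assumes: whichever hook fires first performs the export, and the other becomes a no-op.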

@tianzeyong changed the title on Nov 5, 2020, appending an English translation to the original Chinese title.
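@wghdir

wghdir commented Nov 5, 2020

It looks like quite a few people have run into this. I wonder whether the fix is heading in the wrong direction: when this situation occurs, fetching the data from the registry again and updating would be more appropriate.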

@strugglingbird

The maintainers say it is fixed, but it really is not. I have hit this problem too; it is indeed probabilistic and does not happen every time.

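@wghdir

wghdir commented Nov 5, 2020

There may be many causes. I just hit it once, and after that I got this message no matter what I tried. In the end I took the service offline in Nacos and brought it back online, and then it was reachable. I hope the framework could pull the data from the registry again when this message appears, since the service is clearly available.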
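The suggestion above, refetching from the registry when no provider is found instead of failing immediately, could look roughly like the following. This is a sketch of the idea only: RefreshingInvoker is a hypothetical class and is not how Dubbo's RegistryDirectory is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch of "refresh from the registry once, then retry"; not a Dubbo API.
class RefreshingInvoker {
    private final Supplier<List<String>> registryLookup; // fetches the current provider addresses
    private List<String> cachedProviders;
    int refreshCount = 0;

    RefreshingInvoker(Supplier<List<String>> registryLookup, List<String> initialProviders) {
        this.registryLookup = registryLookup;
        this.cachedProviders = new ArrayList<>(initialProviders);
    }

    <T> T invoke(Function<String, T> call) {
        if (cachedProviders.isEmpty()) {
            // Local view says "no provider": ask the registry again before giving up.
            cachedProviders = new ArrayList<>(registryLookup.get());
            refreshCount++;
        }
        if (cachedProviders.isEmpty()) {
            throw new IllegalStateException("No provider available");
        }
        return call.apply(cachedProviders.get(0));
    }

    public static void main(String[] args) {
        RefreshingInvoker invoker = new RefreshingInvoker(
                () -> Arrays.asList("10.0.0.1:20880"),   // registry already knows the provider
                Collections.<String>emptyList());        // but the local cache is stale and empty
        System.out.println(invoker.invoke(address -> "called " + address));
    }
}
```

A real implementation would also need to bound how often it refreshes, so that a genuinely missing provider does not cause a registry request on every call.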

@zjwon

zjwon commented Nov 6, 2020

In a k8s environment this problem occurs every time. Thanks for the write-up; I am going to try option b.

@liu2811751

I just tried option b and it had no effect. Taking the service offline in Nacos and then bringing it back online does work, though.

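@wghdir

wghdir commented Nov 11, 2020

> I just tried option b and it had no effect. Taking the service offline in Nacos and then bringing it back online does work, though.

Did the code actually run? In my test, taking the provider offline and back online after startup seemed to work. I added the code directly to main, though, rather than using it as shown above.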

@cqyisbug

I have a question: why not handle these problems during heartbeat processing?

Also, once an ApplicationEventMulticaster bean is registered, the Dubbo services almost never get exported.

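@cqyisbug

> I have a question: why not handle these problems during heartbeat processing?
>
> Also, once an ApplicationEventMulticaster bean is registered, the Dubbo services almost never get exported.

Precisely because of this asynchrony, the original poster's first solution does not seem very workable.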

@zjwon

zjwon commented Nov 12, 2020

In a k8s environment, option b has no effect: calls succeed normally, but the consumer still tries to connect to the old IP.

@strugglingbird

@mercyblitz, is this problem not going to be fixed in any later version?

@zjwon

zjwon commented Nov 13, 2020

In a k8s environment with the following versions, the error log is printed only once, so the problem can be considered solved:

  • nacos: 1.3.2
  • spring-boot: 2.2.11.RELEASE
  • spring-cloud: Hoxton.SR9
  • spring-cloud-alibaba: 2.2.3.RELEASE
  • Dubbo configuration:
dubbo:
  scan:
    base-packages: com.xxx # base package scanned for Dubbo service implementations
  protocol:
    name: dubbo  # protocol name
    port: -1  # -1 means auto-increment, starting from 20880
  registry:
    address: nacos://${spring.cloud.nacos.discovery.server-addr}?namespace=${spring.cloud.nacos.discovery.namespace}
    check: false # allows startup even if registration or subscription fails; retries periodically in the background
  cloud:
    subscribed-services: com-xxx # subscribed services
  consumer:
    check: false

The key log lines are:

2020-11-13 17:14:28.369  INFO 6 --- [client.listener] o.a.d.remoting.transport.AbstractClient  :  [DUBBO] Successed connect to server /172.20.2.116:20880 from NettyClient 172.20.3.9 using dubbo version 2.7.8, channel is NettyChannel [channel=[id: 0x1c71191d, L:/172.20.3.9:38964 - R:/172.20.2.116:20880]], dubbo version: 2.7.8, current host: 172.20.3.9
2020-11-13 17:14:28.369  INFO 6 --- [lientWorker-1-1] o.a.d.r.t.netty4.NettyClientHandler      :  [DUBBO] The connection of /172.20.3.9:38964 -> /172.20.2.116:20880 is established., dubbo version: 2.7.8, current host: 172.20.3.9
2020-11-13 17:14:28.377  INFO 6 --- [client.listener] o.a.d.remoting.transport.AbstractClient  :  [DUBBO] Start NettyClient /172.20.3.9 connect to the server /172.20.2.116:20880, dubbo version: 2.7.8, current host: 172.20.3.9

......

2020-11-13 17:14:36.910  INFO 6 --- [client.listener] a.DubboServiceDiscoveryAutoConfiguration : The event of the service instances[name : com-sms-service , size : 2] change is about to be dispatched

......

2020-11-13 17:15:22.840  INFO 6 --- [lientWorker-1-2] o.a.d.r.t.netty4.NettyClientHandler      :  [DUBBO] The connection of /172.20.3.9:45860 -> /172.20.4.200:20880 is disconnected., dubbo version: 2.7.8, current host: 172.20.3.9

......

2020-11-13 17:15:40.420  INFO 6 --- [eCheck-thread-1] o.a.d.r.e.s.header.ReconnectTimerTask    :  [DUBBO] Initial connection to HeaderExchangeClient [channel=org.apache.dubbo.remoting.transport.netty4.NettyClient [/172.20.3.9:45860 -> /172.20.4.200:20880]], dubbo version: 2.7.8, current host: 172.20.3.9
2020-11-13 17:15:40.427  INFO 6 --- [eCheck-thread-1] o.a.d.r.transport.netty4.NettyChannel    :  [DUBBO] Close netty channel [id: 0x8bc3520d, L:/172.20.3.9:45860 ! R:/172.20.4.200:20880], dubbo version: 2.7.8, current host: 172.20.3.9
2020-11-13 17:15:40.444 ERROR 6 --- [eCheck-thread-1] o.a.d.r.e.s.header.ReconnectTimerTask    :  [DUBBO] Fail to connect to HeaderExchangeClient [channel=org.apache.dubbo.remoting.transport.netty4.NettyClient [/172.20.3.9:45860 -> /172.20.4.200:20880]], dubbo version: 2.7.8, current host: 172.20.3.9

org.apache.dubbo.remoting.RemotingException: client(url: dubbo://172.20.4.200:20880/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=true&application=com-message-service&bind.ip=172.20.4.200&bind.port=20880&check=false&codec=dubbo&deprecated=false&dubbo=2.0.2&dynamic=true&generic=true&group=com-sms-service&heartbeat=60000&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&metadata-type=remote&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=6&qos.enable=false&register=true&register.ip=172.20.3.9&release=2.7.3&remote.application=com-sms-service&revision=2.1.1.RELEASE&side=consumer&sticky=false&timestamp=1604298942645&version=1.0.0) failed to connect to server /172.20.4.200:20880, error message is:No route to host: /172.20.4.200:20880
	at org.apache.dubbo.remoting.transport.netty4.NettyClient.doConnect(NettyClient.java:169)
	at org.apache.dubbo.remoting.transport.AbstractClient.connect(AbstractClient.java:191)
	at org.apache.dubbo.remoting.transport.AbstractClient.reconnect(AbstractClient.java:247)
	at org.apache.dubbo.remoting.exchange.support.header.HeaderExchangeClient.reconnect(HeaderExchangeClient.java:166)
	at org.apache.dubbo.remoting.exchange.support.header.ReconnectTimerTask.doTask(ReconnectTimerTask.java:49)
	at org.apache.dubbo.remoting.exchange.support.header.AbstractTimerTask.run(AbstractTimerTask.java:87)
	at org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:648)
	at org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:727)
	at org.apache.dubbo.common.timer.HashedWheelTimer$Worker.run(HashedWheelTimer.java:449)
	at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.AbstractChannel$AnnotatedNoRouteToHostException: No route to host: /172.20.4.200:20880
Caused by: java.net.NoRouteToHostException: No route to host
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)

2020-11-13 17:15:46.939  INFO 6 --- [client.listener] a.DubboServiceDiscoveryAutoConfiguration : The event of the service instances[name : com-sms-service , size : 1] change is about to be dispatched

......

2020-11-13 17:15:56.942  INFO 6 --- [client.listener] a.DubboServiceDiscoveryAutoConfiguration : The event of the service instances[name : com-sms-service , size : 1] change is about to be dispatched

......

2020-11-13 17:15:58.993  INFO 6 --- [client.listener] o.a.d.r.transport.netty4.NettyChannel    :  [DUBBO] Close netty channel [id: 0x8bc3520d, L:/172.20.3.9:45860 ! R:/172.20.4.200:20880], dubbo version: 2.7.8, current host: 172.20.3.9

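@zhaoziji

I found that if the provider is taken offline manually from the Nacos console, the consumer's ReferenceCountExchangeClient#replaceWithLazyClient() method is triggered. After that, as long as the consumer is not restarted, the provider can be restarted any number of times without the problem appearing.

The key seems to be whether the URL in "dubbo.metadata-service.urls" contains the parameter "lazy=true"; if it does, the problem does not occur. (To try this, start the provider first, manually edit that value for the provider in the Nacos console, then start the consumer, and then restart the provider at will.)

[Figure: image]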
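For experimenting with this observation, a small helper that checks for and sets the lazy=true parameter on a metadata-service URL might look like the sketch below. MetadataUrlLazyFlag is a hypothetical utility written only for illustration; it manipulates the URL as a plain string and is not part of Dubbo or Spring Cloud Alibaba. Whether setting the flag is a sound workaround is not confirmed anywhere in this thread.

```java
import java.util.Arrays;

// Hypothetical helper for experimenting with the lazy=true observation;
// it is plain string handling and not part of Dubbo or Spring Cloud Alibaba.
class MetadataUrlLazyFlag {

    /** Returns true if the URL's query string contains the exact parameter lazy=true. */
    static boolean hasLazyTrue(String url) {
        int q = url.indexOf('?');
        if (q < 0) {
            return false;
        }
        return Arrays.asList(url.substring(q + 1).split("&")).contains("lazy=true");
    }

    /** Returns the URL with any existing lazy parameter replaced by lazy=true. */
    static String withLazyTrue(String url) {
        int q = url.indexOf('?');
        String base = q < 0 ? url : url.substring(0, q);
        StringBuilder query = new StringBuilder();
        if (q >= 0) {
            for (String param : url.substring(q + 1).split("&")) {
                if (param.isEmpty() || param.startsWith("lazy=")) {
                    continue; // drop empty fragments and any previous lazy setting
                }
                query.append(param).append('&');
            }
        }
        return base + "?" + query + "lazy=true";
    }

    public static void main(String[] args) {
        String url = "dubbo://192.168.100.2:29011/com.alibaba.cloud.dubbo.service.DubboMetadataService?side=provider&version=1.0.0";
        System.out.println(withLazyTrue(url));
    }
}
```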

@HuangDayu

Hmm... there is also the case where, within the same service, some Dubbo interfaces work and others do not.

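@pangshuqiang

This problem has puzzled our development team for a long time. When two Spring Cloud microservices are each other's provider and consumer, there seems to be no solution! After the team changed the registry prefix from nacos:// to spring-cloud://, things worked for a day or two, and then the problem came back. I hope a new release fixes it soon.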

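@HuangDayu

> This problem has puzzled our development team for a long time. When two Spring Cloud microservices are each other's provider and consumer, there seems to be no solution! After the team changed the registry prefix from nacos:// to spring-cloud://, things worked for a day or two, and then the problem came back. I hope a new release fixes it soon.

Friend, my advice is to abandon this stack.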

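@pangshuqiang

> > This problem has puzzled our development team for a long time. When two Spring Cloud microservices are each other's provider and consumer, there seems to be no solution! After the team changed the registry prefix from nacos:// to spring-cloud://, things worked for a day or two, and then the problem came back. I hope a new release fixes it soon.
>
> Friend, my advice is to abandon this stack.

Got to keep the faith!

In fact, when the provider finishes restarting, the consumer does receive the notification. The consumer's console prints the following: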
[15:35:18:083] [INFO] - com.alibaba.nacos.client.naming.core.PushReceiver.run(PushReceiver.java:86) - received push data: {"type":"dom","data":"{\"hosts\":[{\"ip\":\"192.168.100.2\",\"port\":9011,\"valid\":true,\"healthy\":true,\"marked\":false,\"instanceId\":\"192.168.100.2#9011#DEFAULT#DEFAULT_GROUP@@secret-server\",\"metadata\":{\"dubbo.metadata-service.urls\":\"[ \\\"dubbo://192.168.100.2:29011/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=false&application=secret-server&bind.ip=192.168.100.2&bind.port=29011&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&group=secret-server&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=19812&qos.enable=false&release=2.7.8&revision=2.2.3.RELEASE&side=provider&timestamp=1608190515329&version=1.0.0\\\" ]\",\"dubbo.protocols.dubbo.port\":\"29011\",\"preserved.register.source\":\"SPRING_CLOUD\"},\"enabled\":true,\"weight\":1.0,\"clusterName\":\"DEFAULT\",\"serviceName\":\"DEFAULT_GROUP@@secret-server\",\"ephemeral\":true}],\"dom\":\"DEFAULT_GROUP@@secret-server\",\"name\":\"DEFAULT_GROUP@@secret-server\",\"cacheMillis\":10000,\"lastRefTime\":1608190518452,\"checksum\":\"1971e7cb61623924e7407e8206da46e5\",\"useSpecifiedURL\":false,\"clusters\":\"\",\"env\":\"\",\"metadata\":{}}","lastRefTime":104528484199741} from /192.168.10.4
[15:35:18:083] [INFO] - com.alibaba.nacos.client.naming.core.HostReactor.processServiceJson(HostReactor.java:191) - new ips(1) service: DEFAULT_GROUP@@secret-server -> [{"instanceId":"192.168.100.2#9011#DEFAULT#DEFAULT_GROUP@@secret-server","ip":"192.168.100.2","port":9011,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@secret-server","metadata":{"dubbo.metadata-service.urls":"[ \"dubbo://192.168.100.2:29011/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=false&application=secret-server&bind.ip=192.168.100.2&bind.port=29011&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&group=secret-server&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=19812&qos.enable=false&release=2.7.8&revision=2.2.3.RELEASE&side=provider&timestamp=1608190515329&version=1.0.0\" ]","dubbo.protocols.dubbo.port":"29011","preserved.register.source":"SPRING_CLOUD"},"ipDeleteTimeout":30000,"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000}]
[15:35:18:084] [INFO] - com.alibaba.cloud.dubbo.autoconfigure.DubboServiceDiscoveryAutoConfiguration.dispatchServiceInstancesChangedEvent(DubboServiceDiscoveryAutoConfiguration.java:171) - The event of the service instances[name : secret-server , size : 1] change is about to be dispatched
[15:35:18:087] [INFO] - com.alibaba.nacos.client.naming.core.HostReactor.processServiceJson(HostReactor.java:228) - current ips:(1) service: DEFAULT_GROUP@@secret-server -> [{"instanceId":"192.168.100.2#9011#DEFAULT#DEFAULT_GROUP@@secret-server","ip":"192.168.100.2","port":9011,"weight":1.0,"healthy":true,"enabled":true,"ephemeral":true,"clusterName":"DEFAULT","serviceName":"DEFAULT_GROUP@@secret-server","metadata":{"dubbo.metadata-service.urls":"[ \"dubbo://192.168.100.2:29011/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=false&application=secret-server&bind.ip=192.168.100.2&bind.port=29011&deprecated=false&dubbo=2.0.2&dynamic=true&generic=false&group=secret-server&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=19812&qos.enable=false&release=2.7.8&revision=2.2.3.RELEASE&side=provider&timestamp=1608190515329&version=1.0.0\" ]","dubbo.protocols.dubbo.port":"29011","preserved.register.source":"SPRING_CLOUD"},"ipDeleteTimeout":30000,"instanceHeartBeatTimeOut":15000,"instanceHeartBeatInterval":5000}]
[15:35:18:097] [INFO] - com.alibaba.cloud.dubbo.service.DubboMetadataServiceProxy.createProxy(DubboMetadataServiceProxy.java:187) - The metadata of Dubbo service[name : secret-server] is about to be initialized
[15:35:18:102] [INFO] - org.apache.dubbo.registry.support.AbstractRegistry.register(AbstractRegistry.java:288) - [DUBBO] Register: consumer://192.168.100.2/org.apache.dubbo.rpc.service.GenericService?application=example-server&category=consumers&check=false&dubbo=2.0.2&generic=true&group=secret-server&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&pid=25360&qos.enable=false&release=2.7.8&side=consumer&sticky=false&timestamp=1608190518100&version=1.0.0, dubbo version: 2.7.8, current host: 192.168.100.2
[15:35:18:102] [INFO] - org.apache.dubbo.registry.support.AbstractRegistry.subscribe(AbstractRegistry.java:313) - [DUBBO] Subscribe: consumer://192.168.100.2/org.apache.dubbo.rpc.service.GenericService?application=example-server&category=providers,configurators,routers&check=false&dubbo=2.0.2&generic=true&group=secret-server&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&pid=25360&qos.enable=false&release=2.7.8&side=consumer&sticky=false&timestamp=1608190518100&version=1.0.0, dubbo version: 2.7.8, current host: 192.168.100.2
[15:35:18:103] [INFO] - org.apache.dubbo.config.ReferenceConfig.createProxy(ReferenceConfig.java:392) - [DUBBO] Refer dubbo service org.apache.dubbo.rpc.service.GenericService from url spring-cloud://192.168.10.4:8848/org.apache.dubbo.registry.RegistryService?anyhost=false&application=example-server&bind.ip=192.168.100.2&bind.port=29011&check=false&deprecated=false&dubbo=2.0.2&dynamic=true&generic=true&group=secret-server&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=25360&qos.enable=false&register.ip=192.168.100.2&release=2.7.8&remote.application=secret-server&revision=2.2.3.RELEASE&side=consumer&sticky=false&timestamp=1608190518100&version=1.0.0, dubbo version: 2.7.8, current host: 192.168.100.2
[15:35:18:112] [INFO] - org.apache.dubbo.remoting.transport.netty4.NettyClient.doConnect(NettyClient.java:145) - [DUBBO] Close old netty channel [id: 0x642e871e, L:/192.168.100.2:61113 ! R:/192.168.100.2:29011] on create new netty channel [id: 0x3c6ebc28, L:/192.168.100.2:62816 - R:/192.168.100.2:29011], dubbo version: 2.7.8, current host: 192.168.100.2
[15:35:18:112] [INFO] - org.apache.dubbo.remoting.transport.netty4.NettyClientHandler.channelActive(NettyClientHandler.java:62) - [DUBBO] The connection of /192.168.100.2:62816 -> /192.168.100.2:29011 is established., dubbo version: 2.7.8, current host: 192.168.100.2
[15:35:18:112] [INFO] - org.apache.dubbo.remoting.transport.AbstractClient.connect(AbstractClient.java:200) - [DUBBO] Successed connect to server /192.168.100.2:29011 from NettyClient 192.168.100.2 using dubbo version 2.7.8, channel is NettyChannel [channel=[id: 0x3c6ebc28, L:/192.168.100.2:62816 - R:/192.168.100.2:29011]], dubbo version: 2.7.8, current host: 192.168.100.2

The final log line reads:
[15:35:18:112] [INFO] - org.apache.dubbo.remoting.transport.AbstractClient.connect(AbstractClient.java:200) - [DUBBO] Successed connect to server /192.168.100.2:29011 from NettyClient 192.168.100.2 using dubbo version 2.7.8, channel is NettyChannel [channel=[id: 0x3c6ebc28, L:/192.168.100.2:62816 - R:/192.168.100.2:29011]], dubbo version: 2.7.8, current host: 192.168.100.2

What I do not understand is why the consumer still fails to recognize that the provider is alive, and treats the service as missing as soon as it is called:
[15:36:01:912] [WARN] - org.apache.dubbo.rpc.cluster.support.wrapper.MockClusterInvoker.invoke(MockClusterInvoker.java:116) - [DUBBO] fail-mock: checkKey fail-mock enabled , url : spring-cloud://192.168.10.4:8848/org.apache.dubbo.registry.RegistryService?application=example-server&check=false&cluster=failfast&dubbo=2.0.2&group=WF&init=false&interface=api.IVaultApi&methods=savePwd,saveKey,checkKey,checkPwd&mock=true&pid=25360&qos.enable=false&register.ip=192.168.100.2&release=2.7.8&revision=1.1.0&side=consumer&sticky=false&timestamp=1608182275814&version=1.1.0, dubbo version: 2.7.8, current host: 192.168.100.2
org.apache.dubbo.rpc.RpcException: No provider available from registry 192.168.10.4:8848 for service WF/api.IVaultApi:1.1.0 on consumer 192.168.100.2 use dubbo version 2.7.8, please check status of providers(disabled, not registered or in blacklist).

Development has to continue and so does the project, so for now all we can do is manually take the provider offline and bring it back online, so that the consumer handshakes with the provider again!

@liukp0210

I tried to simulate this locally following the example above, but I cannot seem to reproduce it.

@pangshuqiang

> but I cannot seem to reproduce it

Let me describe our team's environment:
1. Nacos is pulled and deployed in Docker on the virtual machine 192.168.10.4 (CentOS 8.1), with port 8848 mapped for the team to use;
2. Team members develop the microservice provider project A and the consumer project B on their own machines, on a Spring Cloud, Spring Cloud Alibaba, and Dubbo stack;
3. For example, on my machine 192.168.100.2 I start provider A and then project B from IDEA; at this point consumer B can call A's Dubbo interfaces normally;
4. Reproducing the problem:
Restart provider A.
Project B then prints the messages mentioned in https://github.com/alibaba/spring-cloud-alibaba/issues/1805#issuecomment-747277036. But even after A has finished starting and shows as online and enabled in Nacos, B still cannot call A's Dubbo interfaces and reports:
org.apache.dubbo.rpc.RpcException: No provider available from registry 192.168.10.4:8848 for service WF/express.api.IExpressApi:1.1.0 on consumer 192.168.100.2 use dubbo version 2.7.8, please check status of providers(disabled, not registered or in blacklist)., dubbo version: 2.7.8, current host: 192.168.100.2
org.apache.dubbo.rpc.RpcException: No provider available from registry 192.168.10.4:8848 for service WF/express.api.IExpressApi:1.1.0 on consumer 192.168.100.2 use dubbo version 2.7.8, please check status of providers(disabled, not registered or in blacklist).

Services:
Nacos 1.4.0 (deployed in Docker)
Sentinel 1.8.0 (deployed in Docker)
Project:
SpringBoot 2.3.4 + SpringCloud Hoxton.SR8 + spring-cloud-alibaba 2.2.2.RELEASE (bundled Dubbo version 2.7.8)

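@pangshuqiang

pangshuqiang commented Dec 22, 2020

> > This problem has puzzled our development team for a long time. When two Spring Cloud microservices are each other's provider and consumer, there seems to be no solution! After the team changed the registry prefix from nacos:// to spring-cloud://, things worked for a day or two, and then the problem came back. I hope a new release fixes it soon.
>
> Friend, my advice is to abandon this stack.

Here I am flooding the thread again!

Our team has pinned down where the problem comes from. The project previously used:
SpringBoot 2.3.0 + SpringCloud Hoxton.SR4 + spring-cloud-alibaba 2.2.1.RELEASE (bundled Dubbo version 2.7.6)
After the provider restarts, the consumer receives:
``
that is, the consumer does not lose the restarted provider's Dubbo services;

Then we upgraded to the newer versions:
SpringBoot 2.3.4 + SpringCloud Hoxton.SR8 + spring-cloud-alibaba 2.2.2.RELEASE (bundled Dubbo version 2.7.8)
A warning says that Service and Reference are deprecated and that DubboService and DubboReference should be used instead,
Issue:
After this upgrade, the problem of not finding the provider after a restart appears. That, dear readers, is the end of the story!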

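@lgp547

lgp547 commented Dec 28, 2020

In a k8s environment, using the latest graduated versions recommended by the documentation and Nacos 1.3.2,
this error still occurs and keeps being printed repeatedly (the cause is that I had already restarted the .172 service and taken it offline, but the .54 service did not update in time).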

Spring Cloud Version | Spring Cloud Alibaba Version | Spring Boot Version
Spring Cloud Hoxton.SR8 | 2.2.3.RELEASE | 2.3.2.RELEASE

2020-12-28 17:44:06.811 [dubbo-client-idleCheck-thread-1] [] ERROR org.apache.dubbo.remoting.exchange.support.header.ReconnectTimerTask - [DUBBO] Fail to connect to HeaderExchangeClient [channel=org.apache.dubbo.remoting.transport.netty4.NettyClient [/10.xx.xx.54:55378 -> /10.xx.xx.172:20880]], dubbo version: 2.7.8, current host: 10.xx.xx.54
org.apache.dubbo.remoting.RemotingException: client(url: dubbo://10.xx.xx.172:20880/com.alibaba.cloud.dubbo.service.DubboMetadataService?anyhost=true&application=question-service&bind.ip=10.xx.xx.172&bind.port=20880&check=false&codec=dubbo&deprecated=false&dubbo=2.0.2&dynamic=true&generic=true&group=privilege-service&heartbeat=60000&interface=com.alibaba.cloud.dubbo.service.DubboMetadataService&methods=getAllServiceKeys,getServiceRestMetadata,getExportedURLs,getAllExportedURLs&pid=376&qos.enable=false&register.ip=10.xx.xx.54&release=2.7.8&remote.application=privilege-service&revision=2.2.3.RELEASE&side=consumer&sticky=false&timeout=60000&timestamp=1608692297155&version=1.0.0) failed to connect to server /10.xx.xx.172:20880 client-side timeout 3000ms (elapsed: 3001ms) from netty client 10.xx.xx.54 using dubbo version 2.7.8
at org.apache.dubbo.remoting.transport.netty4.NettyClient.doConnect(NettyClient.java:174)
at org.apache.dubbo.remoting.transport.AbstractClient.connect(AbstractClient.java:191)
at org.apache.dubbo.remoting.transport.AbstractClient.reconnect(AbstractClient.java:247)
at org.apache.dubbo.remoting.exchange.support.header.HeaderExchangeClient.reconnect(HeaderExchangeClient.java:166)
at org.apache.dubbo.remoting.exchange.support.header.ReconnectTimerTask.doTask(ReconnectTimerTask.java:49)
at org.apache.dubbo.remoting.exchange.support.header.AbstractTimerTask.run(AbstractTimerTask.java:87)
at org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:648)
at org.apache.dubbo.common.timer.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:727)
at org.apache.dubbo.common.timer.HashedWheelTimer$Worker.run(HashedWheelTimer.java:449)
at java.lang.Thread.run(Thread.java:748)

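Could we please get a release that fixes this?


There is another problem: after deploying a release, once the services have started, service A cannot call service B and keeps reporting: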
org.apache.dubbo.rpc.RpcException: Failed to invoke the method getUserNameById in the service com.xxx.privilege.application.PrivilegeApplication. Tried 3 times of the providers ....

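@cxdhefei

cxdhefei commented Jan 4, 2021

This problem has puzzled our development team for a long time. When two Spring Cloud microservices are each other's provider and consumer, there seems to be no solution! Earlier the team changed the connection prefix from nacos:// to spring-cloud://, and things were fine for a day or two, after which the problem reappeared. I hope a new release fixes it soon.

Friend, my advice is to give up on this stack.

Here I am flooding the thread again!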

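The development team has pinpointed the problem. The project previously used:
SpringBoot 2.3.0 + SpringCloud Hoxton.SR4 + spring-cloud-alibaba 2.2.1.RELEASE (bundled Dubbo version 2.7.6)
After the provider service restarts, the consumer receives:
``
so the consumer does not lose track of the restarted provider's Dubbo interface service.

We then upgraded to the newer versions:
SpringBoot 2.3.4 + SpringCloud Hoxton.SR8 + spring-cloud-alibaba 2.2.2.RELEASE (bundled Dubbo version 2.7.8)
where we got a notice that Service and Reference are deprecated and that DubboService and DubboReference should be used instead.
Issue:
This upgrade is what introduced the problem of not finding the provider after a restart. Everyone, that is the end of the story!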

With the versions below, I still have the problem of the consumer not finding the provider!
nacos-server: 1.3.1
spring-boot: 2.2.6.RELEASE
spring-cloud-alibaba: 2.2.1.RELEASE (dubbo: 2.7.6、nacos-client: 1.2.1)
spring-cloud: Hoxton.SR4

@cxdhefei

cxdhefei commented Jan 4, 2021

@pangshuqiang
WeChat ID: coding4joy

@mostcool
Contributor

mostcool commented Jan 6, 2021

@yuhuangbin @theonefx Do the maintainers plan to fix this issue?

@theonefx
Collaborator

theonefx commented Jan 7, 2021

We are dealing with this issue

@theonefx
Collaborator

theonefx commented Jan 8, 2021

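Spring Cloud's concept of service registration is somewhat different from Dubbo's.
The current idea is to adapt Dubbo registration to Spring Cloud registration by registering twice:

  • The first time, register only the application, without DubboMetadataService;
  • The second time, after Dubbo has finished initializing, update the registration information, this time including DubboMetadataService.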
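
The two-step idea above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: `FakeRegistry`, `registerApplication`, `publishMetadataService`, and the `metadata-service-url` key are hypothetical names made up for illustration, and they are not the spring-cloud-alibaba or Nacos API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TwoPhaseRegistration {

    // Hypothetical metadata key; the key used by the real framework may differ.
    public static final String METADATA_KEY = "metadata-service-url";

    /** Hypothetical in-memory stand-in for a registry such as Nacos. */
    public static class FakeRegistry {
        private final Map<String, Map<String, String>> instances = new ConcurrentHashMap<>();

        public void register(String app, Map<String, String> metadata) {
            instances.put(app, new HashMap<>(metadata));
        }

        public void update(String app, Map<String, String> metadata) {
            if (!instances.containsKey(app)) {
                throw new IllegalStateException(app + " is not registered yet");
            }
            instances.put(app, new HashMap<>(metadata));
        }

        public Map<String, String> lookup(String app) {
            return instances.getOrDefault(app, Collections.emptyMap());
        }
    }

    /** Step 1: register only the application, with no metadata service entry. */
    public static void registerApplication(FakeRegistry registry, String app) {
        registry.register(app, new HashMap<>());
    }

    /** Step 2: once the RPC layer is initialized, update the registration with the metadata service URL. */
    public static void publishMetadataService(FakeRegistry registry, String app, String url) {
        Map<String, String> metadata = new HashMap<>(registry.lookup(app));
        metadata.put(METADATA_KEY, url);
        registry.update(app, metadata); // throws if step 1 has not happened
    }

    public static void main(String[] args) {
        FakeRegistry registry = new FakeRegistry();
        registerApplication(registry, "provider-app");
        System.out.println("after step 1: " + registry.lookup("provider-app"));
        publishMetadataService(registry, "provider-app", "dubbo://127.0.0.1:20880/DubboMetadataService");
        System.out.println("after step 2: " + registry.lookup("provider-app"));
    }
}
```

In this sketch a consumer that looks the application up between the two steps sees an instance without a metadata entry, so a real consumer would presumably need to tolerate that intermediate state and refresh once the update arrives.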

theonefx pushed a commit to theonefx/spring-cloud-alibaba that referenced this issue Jan 8, 2021
@winterallen

2.2.5.RELEASE, 2.1.4.RELEASE, and 2.0.4.RELEASE have resolved this issue.

If you still encounter it, please open a new issue.

The problem still exists in 2.2.5.RELEASE.

@lizhuquan0769

I am using spring cloud alibaba 2.2.5.RELEASE and keep hitting this problem too.

@BuYi-Feng

I am using spring cloud alibaba 2.2.5.RELEASE and keep hitting this problem too.

Take a look at 2079; this has become a bug that the maintainers cannot fix for the time being.

@lgh731

lgh731 commented Jun 29, 2021

I upgraded to the latest versions but the problem persists. Has it not been fixed yet?
spring cloud: 2020.0.0
spring cloud alibaba: 2021.1
Dubbo: 2.7.8
nacos: 1.4.1

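@theonefx
Collaborator

I upgraded to the latest versions but the problem persists. Has it not been fixed yet?
spring cloud: 2020.0.0
spring cloud alibaba: 2021.1
Dubbo: 2.7.8
nacos: 1.4.1

Sorry, the earlier fix did have some problems and did not resolve the issue completely.
We have adopted a new approach to fix it; please look forward to the 2.2.6.RELEASE version. If you would like to try it early, you can use 2.2.6-bugfix5-SNAPSHOT.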

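@theonefx
Collaborator

I am using spring cloud alibaba 2.2.5.RELEASE and keep hitting this problem too.

Take a look at 2079; this has become a bug that the maintainers cannot fix for the time being.

Sorry, the earlier fix did have some problems and did not resolve the issue completely.
We have adopted a new approach to fix it; please look forward to the 2.2.6.RELEASE version. If you would like to try it early, you can use 2.2.6-bugfix5-SNAPSHOT.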

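@theonefx
Collaborator

I am using spring cloud alibaba 2.2.5.RELEASE and keep hitting this problem too.

Sorry, the earlier fix did have some problems and did not resolve the issue completely.
We have adopted a new approach to fix it; please look forward to the 2.2.6.RELEASE version. If you would like to try it early, you can use 2.2.6-bugfix5-SNAPSHOT.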

@lgh731

lgh731 commented Jun 30, 2021

Roughly when will 2.2.6.RELEASE be released?

@theonefx
Collaborator

Roughly when will 2.2.6.RELEASE be released?

It will be released soon. The original plan was this month, but testing may cause a slight delay.

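@lizhuquan0769

I am using spring cloud alibaba 2.2.5.RELEASE and keep hitting this problem too.

Sorry, the earlier fix did have some problems and did not resolve the issue completely.
We have adopted a new approach to fix it; please look forward to the 2.2.6.RELEASE version. If you would like to try it early, you can use 2.2.6-bugfix5-SNAPSHOT.

Thank you, and thanks for the hard work.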

theonefx added a commit to theonefx/spring-cloud-alibaba that referenced this issue Aug 2, 2021
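@hangzhou492

I upgraded to the latest versions but the problem persists. Has it not been fixed yet?
spring cloud: 2020.0.0
spring cloud alibaba: 2021.1
Dubbo: 2.7.8
nacos: 1.4.1

Sorry, the earlier fix did have some problems and did not resolve the issue completely.
We have adopted a new approach to fix it; please look forward to the 2.2.6.RELEASE version. If you would like to try it early, you can use 2.2.6-bugfix5-SNAPSHOT.

spring cloud alibaba 2.2.6.RELEASE
spring boot 2.3.2.RELEASE
nacos 1.4.2
The problem still occurs.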

@yuezhenyu0208

spring cloud alibaba 2.2.6.RELEASE
spring boot 2.2.6.RELEASE
nacos 1.4.2
The problem still occurs.

@tan-zhuo

tan-zhuo commented Aug 25, 2021

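Are you starting several mutually subscribing services at the same time while debugging?

From my debugging: when several services start at the same time, the one that finishes first registers with Nacos, which triggers a service change event, but Nacos does not notify services that are still in the middle of registering. As a result, the services that register later never receive the latest information about the services they subscribe to. If the services are started one after another, the problem does not occur.

To cover the case where a subscribed service registers with Nacos while this service is still registering and no notification reaches it, I proactively trigger a subscribed-service update event once after the service has started. After testing, this resolved the problem.

Environment:
spring-cloud-alibaba 2.2.6.RELEASE
nacos-service 2.0.3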

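import java.util.Set;

import com.alibaba.cloud.dubbo.metadata.repository.DubboServiceMetadataRepository;
import com.alibaba.cloud.dubbo.registry.event.ServiceInstancesChangedEvent;
import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.context.ApplicationListener;
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.stereotype.Component;
import org.springframework.util.ObjectUtils;

// Once the application is ready, re-publish the current instance list of every
// subscribed service so that registrations missed during startup are picked up.
@Slf4j
@Component
public class CustomDubboActiveProbeSubscriptionService implements ApplicationListener<ApplicationReadyEvent> {

    @Override
    public void onApplicationEvent(ApplicationReadyEvent event) {
        ConfigurableApplicationContext context = event.getApplicationContext();
        DubboServiceMetadataRepository repository = context.getBean(DubboServiceMetadataRepository.class);
        Set<String> subscribedServices = repository.getSubscribedServices();
        if (!ObjectUtils.isEmpty(subscribedServices)) {
            DiscoveryClient discoveryClient = context.getBean(DiscoveryClient.class);
            for (String subscribedService : subscribedServices) {
                log.info("Actively probing subscribed service: " + subscribedService);
                // Fetch the latest instances and publish them as a change event.
                ServiceInstancesChangedEvent changedEvent = new ServiceInstancesChangedEvent(subscribedService, discoveryClient.getInstances(subscribedService));
                context.publishEvent(changedEvent);
            }
        }
    }

}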

@NominationP

spring cloud alibaba 2.2.6.RELEASE
spring boot 2.3.2.RELEASE
nacos 1.4.2
The problem still occurs.

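@pangshuqiang

pangshuqiang commented Nov 20, 2021

Let me restate this so that nobody else falls into the same pit!

First: do not, do not, do not use the Dubbo version bundled with spring-cloud-alibaba.

Second: be sure, be sure, be sure to use the Dubbo version managed by Apache, version >= 2.7.10 (versions after 2.7.9 have largely resolved the service-not-found problem).

    <dependency>
        <groupId>org.apache.dubbo</groupId>
        <artifactId>dubbo-spring-boot-starter</artifactId>
        <version>${dubbo.version}</version>
        <scope>compile</scope>
    </dependency>

Third: upgrade Nacos to 2.0. This is not mandatory, but it is recommended.

### (Off topic: the same goes for seata in spring-cloud-alibaba. Do not use the bundled one; pull in the newer seata package separately!!!)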

@lizhuquan0769

@pangshuqiang Regarding your advice above to use the Apache-managed Dubbo (version >= 2.7.10) instead of the bundled one: bro, could you share your dependency versions?

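@seanpoke

Roughly when will 2.2.6.RELEASE be released?

It will be released soon. The original plan was this month, but testing may cause a slight delay.

Hello, will this issue be fixed in 2021.1? Our projects depend on springboot 2.4.2, and we would rather not downgrade.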

@js1688

js1688 commented Mar 7, 2022

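I have one service provider application and one service consumer application.
Provider configuration:

    protocol:
      port: 20880 # Dubbo protocol port, default 20880; -1 means auto-increment, starting from 20880
      name: dubbo # Dubbo protocol name
      host: 192.168.20.168 # use the intranet IP

The consumer did not configure protocol, because it does not expose any Dubbo interface services.
As a result, whenever the provider restarted, the consumer could no longer find the provider and threw:
No provider available from registry localhost:9090 for service segi.open.dm.api.DemoAPI on consumer 192.168.20.168 use dubbo version 2.7.8, please check status of providers(disabled, not registered or in blacklist).
I had to manually take the provider offline in Nacos and bring it back online, after which calls to the provider worked normally again.
Later I added the following to the consumer as well:

    protocol:
      port: 30880 # Dubbo protocol port, default 20880; -1 means auto-increment, starting from 20880
      name: dubbo # Dubbo protocol name
      host: 192.168.20.168 # use the intranet IP

I tested again: whether the consumer starts first or the provider is restarted, the interface call goes through and the exception no longer appears. I do not know whether this is related.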

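@Johnson-Jia

Roughly when will 2.2.6.RELEASE be released?

It will be released soon. The original plan was this month, but testing may cause a slight delay.

spring cloud: 2020.0.5
spring cloud alibaba: 2021.1
spring boot: 2.4.13

I am hitting the same kind of problem and have the same question: when will a new release of spring cloud alibaba 2021.1 be published? We just upgraded dozens of projects to springboot 2.4.13 and do not want to go back down to 2.3.x; that would be too much trouble.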

@ucfjepl

ucfjepl commented Oct 21, 2022

Awesome. @tan-zhuo's CustomDubboActiveProbeSubscriptionService approach above (publishing a ServiceInstancesChangedEvent for each subscribed service once the application is ready) solved the problem for me. Thumbs up!!!

@v2hoping

There is still one case where @tan-zhuo's CustomDubboActiveProbeSubscriptionService approach does not help. When the consumer and the provider start at the same time and both are still registering, the consumer may finish starting first and run CustomDubboActiveProbeSubscriptionService while the provider has not finished registering, so the probe sees zero instances. The provider then registers successfully, but the consumer still never receives the change.

@ucfjepl

ucfjepl commented Mar 14, 2023

Regarding the case @v2hoping describes, where the consumer probes before the provider has finished registering:

2023-03-14 15:54:49.024 26212 [,,] WARN APP member instance changed, size changed zero!!!
2023-03-14 15:54:52.653 26212 [,,] INFO APP member instance changed, size changed to 1

Above are the logs from my local test. Perhaps our services, although started at the same time, take different amounts of time to start, so the consumer and the provider did not register at the same moment and we did not hit this case.

@Andy81135

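The solution in 5.1 solved the problem for me. However, its setStatus method calls updateInstance(serviceId, instance) to update the instance status, and that call fails when a custom groupName is configured. The solution does not handle the exception, so the error is not visible. Rewriting the setStatus method myself fixed it.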

@tan-zhuo

In the extreme scenario @v2hoping describes (consumer and provider both still registering, with the consumer's probe running before the provider is registered), this can indeed happen. One way to address it is to add compensation logic for the subscribed services whose metadata was not obtained (for example, retry the services that have no metadata yet, with a bounded number of retries); that should resolve the problem.
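
The compensation idea could look roughly like the bounded retry loop below. It is a minimal sketch, not code from spring-cloud-alibaba: `probeWithRetry`, the `lookup` function (standing in for something like a discovery-client query), and the `publish` callback (standing in for publishing a change event) are all hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiConsumer;
import java.util.function.Function;

public class SubscriptionRetry {

    /**
     * Probes every subscribed service and retries those that returned no instances,
     * for at most maxAttempts rounds. Returns the services that are still unresolved.
     */
    public static Set<String> probeWithRetry(Set<String> subscribed,
                                             Function<String, List<String>> lookup,
                                             BiConsumer<String, List<String>> publish,
                                             int maxAttempts,
                                             long delayMillis) {
        Set<String> pending = new LinkedHashSet<>(subscribed);
        for (int attempt = 1; attempt <= maxAttempts && !pending.isEmpty(); attempt++) {
            Iterator<String> it = pending.iterator();
            while (it.hasNext()) {
                String service = it.next();
                List<String> instances = lookup.apply(service);
                if (instances != null && !instances.isEmpty()) {
                    publish.accept(service, instances); // hand the fresh instance list to the consumer side
                    it.remove();
                }
            }
            if (!pending.isEmpty() && attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis); // wait before the next round
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop retrying
                    break;
                }
            }
        }
        return pending;
    }

    public static void main(String[] args) {
        // Simulated provider that only becomes visible on the third lookup.
        AtomicInteger calls = new AtomicInteger();
        Function<String, List<String>> lookup = name ->
                calls.incrementAndGet() < 3
                        ? Collections.<String>emptyList()
                        : Collections.singletonList("10.0.0.1:20880");
        Set<String> unresolved = probeWithRetry(
                Collections.singleton("provider-app"), lookup,
                (name, instances) -> System.out.println(name + " -> " + instances),
                5, 100L);
        System.out.println("unresolved: " + unresolved);
    }
}
```

In a real listener this loop would probably run on a scheduled executor rather than block the thread that delivers the startup event, and the services left in the returned set are the ones that would need logging or alerting.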

@Andy81135

Andy81135 commented May 10, 2023 via email

@aillamsun

The problem reported above (a k8s deployment where the consumer keeps trying to reconnect to a provider instance that has already been restarted, and where calls fail with "Tried 3 times of the providers" after a deployment) has caused several production incidents for us... We run on k8s with automatic HPA scaling.

@Andy81135

Andy81135 commented Dec 19, 2023 via email
