How to ensure reliable delivery of rabbitmq? #730

Closed
wdloveyy opened this issue Nov 29, 2020 · 7 comments · Fixed by #749
@wdloveyy

Environment


  • .NET Core: 3.1
  • CAP version: v3.1.1
  • MQ: RabbitMQ
  • DB: SQL Server

Test Code


public IActionResult WithoutTransaction()
{
    var a = 1;
    // Publish 10,001 messages (i = 0..10000) without a surrounding transaction.
    for (int i = 0; i <= 10000; i++)
    {
        _capBus.Publish("publish.services.show.time", a++);
    }
    return Ok();
}

Issue


While the test code above was running, I restarted the RabbitMQ container in Docker several times to simulate network jitter and similar failures:

docker restart 0619  # restart the RabbitMQ service

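After waiting for all messages to be consumed, I checked the message counts in the database:

  • Published table: 10001 messages (as expected)
  • Received table: 9853 messages (unexpected, 148 messages are missing)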

I then looked at the code CAP uses to publish messages.

The core publishing call in the SendAsync method is:

channel.BasicPublish(_exchange, message.GetName(), props, message.Body);

I could not find any of the publisher confirm logic that the RabbitMQ documentation recommends:

RabbitMQ documentation on publisher confirms
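
For reference, a publish with publisher confirms enabled might look like the sketch below. This is only an illustration and not CAP's actual implementation: it assumes the RabbitMQ.Client 6.x API (the 7.x client replaced these calls with an async API), a broker on localhost, and a hypothetical exchange name `cap.sketch`.

```csharp
using System;
using System.IO;
using System.Text;
using RabbitMQ.Client;

public static class ConfirmedPublishSketch
{
    public static void Main()
    {
        // Assumption: a RabbitMQ broker is reachable on localhost with default credentials.
        var factory = new ConnectionFactory { HostName = "localhost" };
        using var connection = factory.CreateConnection();
        using var channel = connection.CreateModel();

        // "cap.sketch" is a hypothetical exchange name used only for this example.
        channel.ExchangeDeclare("cap.sketch", ExchangeType.Topic, durable: true);

        // Put the channel into confirm mode so the broker acks or nacks each publish.
        channel.ConfirmSelect();

        var props = channel.CreateBasicProperties();
        props.Persistent = true; // ask the broker to persist the message

        var body = Encoding.UTF8.GetBytes("1");
        channel.BasicPublish("cap.sketch", "publish.services.show.time", props, body);

        try
        {
            // Block until the broker confirms all outstanding publishes on this channel.
            channel.WaitForConfirmsOrDie(TimeSpan.FromSeconds(5));
            Console.WriteLine("Broker confirmed the message.");
        }
        catch (IOException ex)
        {
            // A nack or a timeout ends up here; the caller can keep the message for a retry.
            Console.WriteLine($"Publish was not confirmed: {ex.Message}");
        }
    }
}
```

Waiting for a confirm after every single message is the simplest variant and also the slowest one. Confirming in batches or handling acks asynchronously trades some simplicity for throughput.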

If there is a problem with my test method or test code, please correct me.

  1. Am I missing something?

  2. What does CAP rely on to guarantee reliable publishing to RabbitMQ?

@yang-xiaodong
Member

yang-xiaodong commented Nov 30, 2020

Hello,

For historical reasons (performance), we did not use publisher confirms when publishing messages.

Ensuring reliable message delivery is one of CAP's core goals, so I think this needs to be improved, even though it may reduce performance considerably. With publisher confirms enabled, I ran a performance test and got up to 86 messages per second, which I think is acceptable.

What do you think? @xiangxiren @wdloveyy

NOTE: Since the test force-restarts RabbitMQ in Docker, RabbitMQ may not be able to recover its state (I am not sure whether file persistence is enabled for the container). I think the results would be better with an installation tested on a physical machine.

@wdloveyy
Author

wdloveyy commented Nov 30, 2020

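Hello,
First, thank you for your contribution to the .NET ecosystem. As far as I can tell, CAP may be the best open-source distributed transaction component for .NET Core.
Second, my English is limited, so please allow me to write in Chinese. :relaxed:
Because channel.BasicPublish(_exchange, message.GetName(), props, message.Body); publishes asynchronously, I believe messages can be lost when a single RabbitMQ node fails.

I only restarted RabbitMQ with Docker and did not delete the container or the image, so the data was fully preserved (I have verified this). I also ran the test on a physical machine, calling only channel.BasicPublish(_exchange, message.GetName(), props, message.Body); (without publisher confirms enabled) and simulating a single-node RabbitMQ failure under high concurrency. Some messages were lost there as well.

Finally, since reliable message publishing is one of CAP's core goals, I suggest adding RabbitMQ publisher confirms to make sure each message has really been received by RabbitMQ.
😉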

@xiangxiren
Contributor

This reminded me of a problem we had in our own system a few days ago. After an unexpected RabbitMQ outage, some messages existed in the Published table but not in the Received table. So I think publishing should be confirmed.

@yang-xiaodong
Member

OK, let's change it in v3.2. @wdloveyy, would you like to submit a PR?

@wdloveyy
Author

wdloveyy commented Dec 3, 2020

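I do not yet understand the project as a whole well enough. Before submitting a PR I need to study it carefully, because I have to be cautious not to introduce bugs. Once I understand it well enough, I will contribute what I can. Thank you, I still have a lot to learn from you. :smiley: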

@yang-xiaodong
Member

@wdloveyy We will improve this feature in a few days. Would you like to submit a PR?

@wdloveyy
Author

Sorry, I have been busy lately and do not have time at the moment. I look forward to your improvement of this feature. 😃
