Registration & Heartbeat Performance Optimization #429

Closed
chuntaojun opened this issue Jun 13, 2022 Discussed in #428 · 1 comment · Fixed by #440
Assignees
Labels
enhancement New feature or request service Service registration discovery, service governance

Comments

@chuntaojun
Member

Discussed in #428

Originally posted by chuntaojun June 13, 2022

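Current performance problems with registration

  • Every instance registration queries the database to check whether the service that the instance belongs to exists and whether it is an alias service.
  • Writes are batched, but each registration request still has to wait for the actual database result before it knows whether the instance registration succeeded.

Optimization plan

  • Registering a service instance now automatically creates its service, so there is no need to check whether the instance's service data exists. This removes a database select.
  • Rework the overall registration flow so that the registration action is decoupled from persisting the data to the database. The main ideas are as follows.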

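Server-side handling

  • For a registration request from a client, if the instance has heartbeat reporting enabled, the server registers the instance asynchronously.
  • The original future.Wait moves into a goroutine pool that consumes the futures asynchronously and logs each future's result.
  • The instance id must be set up front so that it can be returned to the client.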
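
The server-side steps above can be sketched roughly as follows. This is a minimal illustration, not the Polaris implementation: `Instance`, `future`, `resolved`, `Server`, the `store` callback, the queue size of 1024 and the SHA-1 based `instanceID` are all assumptions made for the example. It shows the ID being fixed before the write, the immediate reply for heartbeat-enabled instances, and a small goroutine pool that waits on the futures and logs failures.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"errors"
	"fmt"
	"log"
	"sync"
)

// Instance is a hypothetical, trimmed-down registration request.
type Instance struct {
	ID, Namespace, Service, Host string
	Port                         int
	HeartbeatEnabled             bool
}

// future stands in for the handle returned by the batched database writer.
type future struct{ done chan error }

func (f *future) Wait() error { return <-f.done }

// resolved returns a future that already carries its result (demo helper).
func resolved(err error) *future {
	f := &future{done: make(chan error, 1)}
	f.done <- err
	return f
}

var errStoreUnavailable = errors.New("store unavailable")

// instanceID derives the ID from the request alone, so the server can
// return it before the database write has finished.
func instanceID(ins *Instance) string {
	key := fmt.Sprintf("%s##%s##%s##%d", ins.Namespace, ins.Service, ins.Host, ins.Port)
	sum := sha1.Sum([]byte(key))
	return hex.EncodeToString(sum[:])
}

type Server struct {
	store    func(*Instance) *future // hypothetical batched writer
	pending  chan *future
	wg       sync.WaitGroup
	mu       sync.Mutex
	failures int
}

func NewServer(store func(*Instance) *future, workers int) *Server {
	s := &Server{store: store, pending: make(chan *future, 1024)}
	for i := 0; i < workers; i++ {
		s.wg.Add(1)
		go func() {
			defer s.wg.Done()
			for f := range s.pending {
				if err := f.Wait(); err != nil {
					s.mu.Lock()
					s.failures++
					s.mu.Unlock()
					log.Printf("async register failed: %v", err)
				}
			}
		}()
	}
	return s
}

// Register waits for the write only when the instance does not heartbeat.
func (s *Server) Register(ins *Instance) (string, error) {
	ins.ID = instanceID(ins)
	f := s.store(ins)
	if !ins.HeartbeatEnabled {
		return ins.ID, f.Wait()
	}
	s.pending <- f // blocks when the queue is full, which throttles callers
	return ins.ID, nil
}

// Close drains the queue and stops the workers.
func (s *Server) Close() {
	close(s.pending)
	s.wg.Wait()
}

// Failures reports how many asynchronous writes have failed so far.
func (s *Server) Failures() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.failures
}

func main() {
	s := NewServer(func(*Instance) *future { return resolved(errStoreUnavailable) }, 4)
	id, err := s.Register(&Instance{Namespace: "default", Service: "echo", Host: "10.0.0.1", Port: 8080, HeartbeatEnabled: true})
	s.Close()
	fmt.Println(id, err, s.Failures())
}
```

In this sketch a failed asynchronous write is only logged and counted; the client finds out indirectly, when its heartbeats start reporting that the instance does not exist.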

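Client-side handling

  • When the client registration API is called
    • If the instance has heartbeat reporting enabled and heartbeating is delegated to the SDK, create a heartbeat task from the instance's registration info and heartbeat info
    • Heartbeat tasks are recorded in a map
  • When the client heartbeat API is called to report an instance heartbeat
    • If the heartbeat response says "instance does not exist" twice, the client must re-register the instance on its own initiative, using its cached registration info
  • When the client deregistration API is called
    • Use the instance info to check whether the SDK maintains a heartbeat task for it; if so, stop the heartbeat task
    • Send the deregistration request to the server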
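
The client-side flow above might look like the following sketch. Everything here is hypothetical and is not the SDK's real API: the `API` struct of callbacks, `Client`, `task`, `ErrInstanceNotFound`, the 5-second interval and the string instance key are stand-ins. The sketch counts "instance not found" replies since the last (re-)registration and replays the registration on the second one.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrInstanceNotFound is the hypothetical "instance does not exist" heartbeat reply.
var ErrInstanceNotFound = errors.New("instance not found")

// API is a hypothetical transport to the server, keyed by an instance key.
type API struct {
	Register, Heartbeat, Deregister func(key string) error
}

type task struct {
	stop   chan struct{}
	misses int // "instance not found" replies since the last (re-)registration
}

type Client struct {
	api   API
	mu    sync.Mutex
	tasks map[string]*task // heartbeat tasks owned by the SDK
}

// Register sends the request and, if the SDK owns heartbeats, starts a task.
func (c *Client) Register(key string, sdkHeartbeat bool) error {
	if err := c.api.Register(key); err != nil || !sdkHeartbeat {
		return err
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	if _, ok := c.tasks[key]; !ok {
		t := &task{stop: make(chan struct{})}
		c.tasks[key] = t
		go c.loop(key, t)
	}
	return nil
}

func (c *Client) loop(key string, t *task) {
	ticker := time.NewTicker(5 * time.Second) // assumed interval
	defer ticker.Stop()
	for {
		select {
		case <-t.stop:
			return
		case <-ticker.C:
			c.beat(key)
		}
	}
}

// beat sends one heartbeat; the second "not found" reply triggers re-registration.
func (c *Client) beat(key string) {
	err := c.api.Heartbeat(key)
	c.mu.Lock()
	reRegister := false
	if t := c.tasks[key]; t != nil && errors.Is(err, ErrInstanceNotFound) {
		t.misses++
		if reRegister = t.misses >= 2; reRegister {
			t.misses = 0
		}
	}
	c.mu.Unlock()
	if reRegister {
		_ = c.api.Register(key) // replay from the cached key
	}
}

// Deregister stops the heartbeat task, if any, then notifies the server.
func (c *Client) Deregister(key string) error {
	c.mu.Lock()
	if t, ok := c.tasks[key]; ok {
		close(t.stop)
		delete(c.tasks, key)
	}
	c.mu.Unlock()
	return c.api.Deregister(key)
}

func main() {
	registers := 0
	c := &Client{tasks: map[string]*task{}, api: API{
		Register:   func(string) error { registers++; return nil },
		Heartbeat:  func(string) error { return ErrInstanceNotFound }, // server lost the instance
		Deregister: func(string) error { return nil },
	}}
	_ = c.Register("10.0.0.1:8080", true)
	c.beat("10.0.0.1:8080")
	c.beat("10.0.0.1:8080") // second miss: the SDK registers again
	fmt.Println("registrations:", registers)
	_ = c.Deregister("10.0.0.1:8080")
}
```

The demo calls `beat` directly instead of waiting for the ticker, and its callbacks are not synchronized, so it is only meant to be read and run as a short single-goroutine walkthrough.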

[Screenshot: Screen Shot 2022-06-13 at 15 38 58]

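Current performance problems with heartbeat

  • Currently the server must finish writing the heartbeat data to redis before it can reply to the client. From the client's point of view, it is enough that the server has received the heartbeat packet. How the server processes it is not the client's concern, and the server should own the fallback logic behind it.

Optimization plan

  • Write heartbeats to redis asynchronously; the future returned by each write is consumed asynchronously in a goroutine pool, which logs the heartbeat result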
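
As an illustration of the heartbeat proposal only: the sketch below acknowledges a heartbeat as soon as it has been queued and lets a worker pool perform and log the write. `HeartbeatServer`, `writeFunc` (a stand-in for the redis write) and the inline-write fallback when the queue is full are assumptions of this example and are not taken from the Polaris code base. Note also that the commit list in this thread includes "refactor: remove async report heartbeat".

```go
package main

import (
	"fmt"
	"log"
	"sync"
)

// writeFunc is a hypothetical stand-in for the redis write of one heartbeat.
type writeFunc func(instanceID string) error

type HeartbeatServer struct {
	write writeFunc
	queue chan string
	wg    sync.WaitGroup
}

func NewHeartbeatServer(write writeFunc, workers, queueSize int) *HeartbeatServer {
	h := &HeartbeatServer{write: write, queue: make(chan string, queueSize)}
	for i := 0; i < workers; i++ {
		h.wg.Add(1)
		go func() {
			defer h.wg.Done()
			for id := range h.queue {
				if err := h.write(id); err != nil {
					log.Printf("heartbeat write failed for %s: %v", id, err)
				}
			}
		}()
	}
	return h
}

// Report replies as soon as the heartbeat is queued. When the queue is full
// it falls back to a synchronous write, so the server never drops a beat.
func (h *HeartbeatServer) Report(instanceID string) (queued bool, err error) {
	select {
	case h.queue <- instanceID:
		return true, nil
	default:
		return false, h.write(instanceID)
	}
}

// Close drains the queue and stops the workers.
func (h *HeartbeatServer) Close() {
	close(h.queue)
	h.wg.Wait()
}

func main() {
	h := NewHeartbeatServer(func(id string) error { return nil }, 2, 64)
	queued, err := h.Report("instance-1")
	h.Close()
	fmt.Println(queued, err)
}
```

The synchronous fallback is one possible way for the server to keep responsibility for the write under load; dropping the beat or rejecting the request are alternatives with different trade-offs.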
@chuntaojun chuntaojun added enhancement New feature or request healthcheck service Service registration discovery, service governance labels Jun 13, 2022
@chuntaojun chuntaojun self-assigned this Jun 13, 2022
chuntaojun added a commit that referenced this issue Jun 21, 2022
* feat: none

* perf: perf regis instance to async

* fix: fix duplicate healthcheck plugin init

* chore: remove polaris-server.yaml from docker

* refactor: remove unuseless config

* fix: fix config_center uint test

* refactor: remove async report heartbeat

* feat: update cache refactor

* test: fix unit test

* test: add user-group unit test

* test: fix unit test stability

* test: fix unit test stability

* test: fix unit test stability

* refactor: fix pr issue

* fix: fix pr issue

* fix: fix instanceFuture not set needWait

* test: add unit test for auth
@chuntaojun
Member Author

chuntaojun commented Jun 28, 2022

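Problem

Delayed registration -> registration tasks cannot be processed fast enough -> the heartbeat cache is not updated -> client heartbeats fail -> the failure threshold is reached and the SDK re-registers -> possible re-registration storm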

//TODO

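  • SDK re-registration
    • Option 1: require both a heartbeat failure count and a minimum time since the instance's last registration before re-registration can be triggered
    • Option 2: alternatively, the server provides an instance query API; if several queries fail to find the instance, trigger re-registration
  • Deduplicate instance registration tasks on the server
  • For delayed registration, add a task-timeout fallback (the task timeout is configurable on the server)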
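
Two of the TODO items above can be sketched as follows: Option 1 as a pure check on the SDK side, and server-side task deduplication with a timeout after which a stuck task may be queued again. The names `shouldReRegister` and `pendingTasks` and every threshold value are assumptions chosen for the example, not values from the project.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// All thresholds below are assumed values for illustration.
const (
	failureThreshold      = 2
	minReRegisterInterval = 30 * time.Second
	taskTimeout           = 10 * time.Second
)

// shouldReRegister is Option 1 on the SDK side: enough heartbeat failures
// AND enough time since the last registration, which gives a delayed
// registration task a chance to land before the SDK sends another one.
func shouldReRegister(failures int, sinceLastRegister time.Duration) bool {
	return failures >= failureThreshold && sinceLastRegister >= minReRegisterInterval
}

// pendingTasks is the server side: it deduplicates queued registration
// tasks by instance ID and lets a task be queued again once it timed out.
type pendingTasks struct {
	mu       sync.Mutex
	queuedAt map[string]time.Time
}

func newPendingTasks() *pendingTasks {
	return &pendingTasks{queuedAt: map[string]time.Time{}}
}

// tryAdd reports whether a new task should be queued for id at time now.
func (p *pendingTasks) tryAdd(id string, now time.Time) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if at, ok := p.queuedAt[id]; ok && now.Sub(at) < taskTimeout {
		return false // duplicate of a task that is still in flight
	}
	p.queuedAt[id] = now
	return true
}

// done marks the task for id as finished.
func (p *pendingTasks) done(id string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.queuedAt, id)
}

func main() {
	fmt.Println(shouldReRegister(3, 5*time.Second))
	fmt.Println(shouldReRegister(3, time.Minute))

	p := newPendingTasks()
	start := time.Now()
	fmt.Println(p.tryAdd("id-1", start))
	fmt.Println(p.tryAdd("id-1", start.Add(time.Second)))
	fmt.Println(p.tryAdd("id-1", start.Add(taskTimeout)))
}
```

With deduplication in place, a burst of re-registrations for the same instance collapses into a single queued task, which limits how much a storm can add to the backlog.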

1 participant