
Saturn Job Gray Upgrade Process

Duff Qiu edited this page Dec 16, 2016 · 4 revisions

Description of the Saturn Job Gray Upgrade Process

1. Purpose of the gray upgrade

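The purpose of a gray upgrade is to reduce the impact on existing business during an upgrade: one Executor is upgraded first and verified to be free of problems, and the remaining Executors are then upgraded batch by batch. Throughout the upgrade, jobs running on the other Executors must not be affected.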
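
The "one Executor first, then batch by batch" order can be sketched as follows. This is only an illustration, not Saturn code: the function name `plan_batches` and the `batch_size` parameter are hypothetical, and the source does not say how large each batch should be.

```python
def plan_batches(executors, batch_size):
    """Return upgrade batches: one executor first, then groups of batch_size.

    `executors` is an ordered list of executor names (hypothetical input);
    `batch_size` is an assumed, operator-chosen value.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if not executors:
        return []
    # The first executor is upgraded alone so it can be verified in isolation.
    first, rest = executors[0], executors[1:]
    batches = [[first]]
    # The remaining executors are upgraded in fixed-size groups.
    for i in range(0, len(rest), batch_size):
        batches.append(rest[i:i + batch_size])
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(plan_batches(["e1", "e2", "e3", "e4", "e5"], 2), 1):
        print(f"batch {number}: {batch}")
```

Each batch would be upgraded only after the previous one has been verified.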

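2. Types of gray upgrade

The gray upgrade operations that need to be supported fall into the following categories:

  • The Saturn Executor code or the code of existing jobs has changed, and the old jobs must be verified on the new code
  • A new job implementation must be verified, that is, new jobs are added while the Executor is being upgraded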

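3. Process diagram

Suppose two Executors are running, and both are running jobs. We now need to upgrade to an Executor version that contains a new job implementation. During this process we first verify that the existing jobs run correctly on the new version, then configure the new job and verify that the new job implementation works correctly.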

The flow is shown in the following diagram (image: process).

PlantUML Code
@startuml
title "Job Gray Upgrade"
actor User
participant SaturnConsole as UI
participant ShardingService as sd
participant Zookeeper as zk
participant Executor1 as e1
participant Executor2 as e2
User -> e1: Offline executor
e1 -> e1: Shutdown jobs gracefully
sd -> zk: Detect executor offline
sd -> zk: Re-sharding service
sd -> e2: Move all jobs from Executor1
e2 -> e2: Schedule jobs from Executor1
User -> e1: Upgrade executor with new business code
User -> e1: Online executor
sd -> zk: Detect the executor online
sd -> zk: Re-sharding service
sd -> e1: Rebalance jobs; some old jobs will move to Executor1
e1 -> e1: Schedule jobs from Executor2
User -> UI: Verify old jobs in new executor
User -> e1: Verify old jobs in new executor
User -> UI: Create a new job which needs the new business code and start it
sd -> zk: Detect the new job started
sd -> e1: Shard the new job only onto this executor, which contains the new business code
e1 -> e1: Schedule the new job
User -> UI: Verify the new job in the new executor
User -> e1: Verify the new job in the new executor
@enduml
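
The re-sharding steps in the diagram can be imitated with a small in-memory simulation. This is a sketch under assumptions, not Saturn's actual sharding algorithm: the `reshard` function, the least-loaded placement rule, and the `capable` map (which executors carry a given job's implementation) are hypothetical and introduced only for illustration.

```python
def reshard(jobs, online_executors, capable=None):
    """Assign each job to the least-loaded eligible online executor.

    `capable` (hypothetical) maps a job name to the set of executors that
    contain its implementation; jobs missing from the map may run anywhere.
    """
    capable = capable or {}
    load = {executor: 0 for executor in online_executors}
    assignment = {}
    for job in sorted(jobs):
        allowed = capable.get(job)
        eligible = [e for e in sorted(online_executors)
                    if allowed is None or e in allowed]
        if not eligible:
            continue  # no online executor can run this job; leave it unassigned
        target = min(eligible, key=lambda e: load[e])  # ties go to the first name
        assignment.setdefault(target, []).append(job)
        load[target] += 1
    return assignment


if __name__ == "__main__":
    old_jobs = ["jobA", "jobB"]

    # Executor1 goes offline: every job moves to Executor2.
    print(reshard(old_jobs, ["Executor2"]))

    # Executor1 comes back online with the new code: old jobs are rebalanced.
    print(reshard(old_jobs, ["Executor1", "Executor2"]))

    # A new job is created; only Executor1 contains its implementation.
    print(reshard(old_jobs + ["jobNew"], ["Executor1", "Executor2"],
                  capable={"jobNew": {"Executor1"}}))
```

The restriction in the last step is what keeps the new job away from Executor2, which still runs the old code, until that executor has been upgraded as well.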