Translate failover.en.md & misfire.en.md
Technoboy- committed Jul 18, 2020
1 parent fd95d09 commit 883c410
Showing 2 changed files with 73 additions and 2 deletions.
48 changes: 47 additions & 1 deletion docs/content/features/failover.en.md
@@ -5,4 +5,50 @@ weight = 4
chapter = true
+++

TODO
ElasticJob does not re-shard while a job is executing; it waits until the next scheduled run before starting the re-sharding process.
If a server goes down during job execution, failover allows the unfinished task to be executed on another job node as compensation.

## Concept

Failover is a temporary compensation mechanism for the job execution currently in progress. When the next execution starts, the item allocation is adjusted through re-sharding.
For example, suppose a job is triggered every hour and each execution takes 30 minutes, as shown below.

![Job](https://shardingsphere.apache.org/elasticjob/current/img/failover/job.png)

The figure shows the job being executed at 12:00, 13:00 and 14:00. The current point in time shown in the figure is within the 13:00 execution.

If one of the servers running a shard of the job goes down at 13:10, the remaining 20 minutes of business processing for that shard are not executed, and the shard can only run again at the next trigger at 14:00.
In other words, if failover is not enabled, this shard sits idle for 50 minutes, as shown below.

![Job Crash](https://shardingsphere.apache.org/elasticjob/current/img/failover/job-crash.png)

With failover enabled, the other ElasticJob servers detect that the job server is down and execute its sharding items as compensation, as shown below.

![Job Failover](https://shardingsphere.apache.org/elasticjob/current/img/failover/job-failover.png)

Given sufficient resources, the job can still complete at 13:30.
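
The arithmetic behind this example can be checked with a minimal Python sketch. It is not ElasticJob code. The calendar date is arbitrary, and it assumes that the takeover node starts at the moment of the crash and only has to process the remaining 20 minutes of work, which is what the 13:30 completion time above implies.

```python
from datetime import datetime, timedelta

# Times from the example; the calendar date itself is arbitrary.
trigger = datetime(2020, 7, 18, 13, 0)       # 13:00 execution starts
duration = timedelta(minutes=30)             # each execution takes 30 minutes
crash = datetime(2020, 7, 18, 13, 10)        # the shard's server goes down
next_trigger = datetime(2020, 7, 18, 14, 0)  # next scheduled execution

# Work that was left unprocessed when the server went down.
remaining_work = (trigger + duration) - crash

# Without failover, the shard does nothing until the next trigger.
idle_without_failover = next_trigger - crash

# Assumption: another node takes over at once and only needs the remaining time.
finish_with_failover = crash + remaining_work

print(remaining_work)                           # 0:20:00
print(idle_without_failover)                    # 0:50:00
print(finish_with_failover.strftime("%H:%M"))   # 13:30
```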

## Execution mechanism

When a job execution node goes down, the failover process is triggered. ElasticJob decides when to execute the failover according to the state of the distributed job execution at the moment it is triggered.

### Notification execution

When another job server senses that there are failover items to process and has already completed its own task for the current run, it pulls the pending failover items in real time and starts compensation execution.
This is also called real-time execution.

### Enquiry execution

After finishing its task for the current run, a job server asks the registry whether any failover items are waiting to be executed; if there are, it starts compensation execution.
This is also called asynchronous execution.
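
The difference between the two timings can be illustrated with a small in-memory simulation. This is only a sketch of the behaviour described above, not ElasticJob's implementation: `Registry`, `Node` and all of their methods are hypothetical names, and a plain Python list stands in for the registry center.

```python
class Registry:
    """Hypothetical in-memory stand-in for the registry center."""

    def __init__(self):
        self.failover_items = []  # sharding items waiting for compensation

    def mark_for_failover(self, item):
        if item not in self.failover_items:
            self.failover_items.append(item)

    def claim_failover_item(self):
        # Hand out one pending item, or None when nothing is waiting.
        return self.failover_items.pop(0) if self.failover_items else None


class Node:
    """Hypothetical surviving job server."""

    def __init__(self, name, registry):
        self.name = name
        self.registry = registry
        self.busy = False   # True while the node runs its own items
        self.executed = []  # (item, mode) pairs, kept for inspection

    def on_crash_detected(self):
        # Notification execution: a node that has finished reacts immediately.
        if self.busy:
            return
        item = self.registry.claim_failover_item()
        if item is not None:
            self.executed.append((item, "notification"))

    def finish_own_task(self):
        # Enquiry execution: after finishing, ask the registry for pending items.
        self.busy = False
        while True:
            item = self.registry.claim_failover_item()
            if item is None:
                break
            self.executed.append((item, "enquiry"))


registry = Registry()
node_a = Node("a", registry)
node_b = Node("b", registry)
node_b.busy = True

# Case 1: a server crashes while node_a is already done with its own items.
registry.mark_for_failover(2)
for node in (node_a, node_b):
    node.on_crash_detected()

# Case 2: a server crashes while every surviving node is still busy.
node_a.busy = True
registry.mark_for_failover(5)
for node in (node_a, node_b):
    node.on_crash_detected()  # nobody can react yet
node_b.finish_own_task()      # node_b picks the item up when it finishes

print(node_a.executed)  # [(2, 'notification')]
print(node_b.executed)  # [(5, 'enquiry')]
```

In the first case the item is compensated as soon as the crash is noticed; in the second it waits until a busy node finishes and asks the registry.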

## Scenarios

With failover enabled, ElasticJob monitors the execution status of each shard of the job and writes it to the registry so that other nodes can perceive it.

For jobs that take a long time to run and are triggered at long intervals, failover is an effective way to improve the timeliness of job execution.
For short-interval jobs, it generates a large amount of network communication with the registry, which affects cluster performance.
Moreover, short-interval jobs do not necessarily care about the timeliness of a single execution: the re-sharding performed at the next execution is enough to get all items executed correctly. Enabling failover is therefore not recommended for short-interval jobs.

Also note that failover is only correct if the job itself is idempotent, because a sharding item may be executed again on another node.
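
A minimal sketch of what idempotence means here, in plain Python with hypothetical names (`processed`, `totals`, `process_item`): the job records which records it has already handled, so re-running a sharding item after a crash does not apply the same work twice. In a real deployment the `processed` set would have to live in durable shared storage; an in-process set is used only to keep the example short.

```python
processed = set()         # hypothetical store of completed record ids
totals = {"amount": 0}    # the side effect the job is supposed to produce once


def process_item(records):
    """Process one sharding item; safe to call again after a partial run."""
    for record_id, amount in records:
        if record_id in processed:
            continue  # already handled before the crash, skip it
        totals["amount"] += amount
        processed.add(record_id)


records = [(1, 10), (2, 20), (3, 30)]

process_item(records[:2])  # the original node crashes after two records
process_item(records)      # failover re-runs the whole sharding item

print(totals["amount"])    # 60, not 90
```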

27 changes: 26 additions & 1 deletion docs/content/features/misfire.en.md
@@ -5,4 +5,29 @@ weight = 5
chapter = true
+++

TODO
ElasticJob does not allow executions of the same job to overlap.
When the execution time of a job exceeds its trigger interval, misfire re-execution ensures that the overdue run is executed once the previous run has completed.

## Concept

The misfire function executes an overdue task immediately after the previous task completes.
For example, suppose a job is triggered every hour and each execution takes 30 minutes, as shown below.

![Job](https://shardingsphere.apache.org/elasticjob/current/img/misfire/job.png)

The figure shows the job being executed at 12:00, 13:00 and 14:00. The current point in time shown in the figure is within the 13:00 execution.

If the execution that started at 12:00 does not finish until 13:10, the execution that should have been triggered at 13:00 misses its trigger time and has to wait for the next trigger at 14:00, as shown below.

![Job Missed](https://shardingsphere.apache.org/elasticjob/current/img/misfire/job-missed.png)

With misfire enabled, ElasticJob triggers the missed execution immediately after the previous execution finishes, as shown below.

![Job Misfire](https://shardingsphere.apache.org/elasticjob/current/img/misfire/job-misfire.png)

The execution that was missed between 13:00 and 14:00 is thus run as compensation.
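
The timeline of this example can be reproduced with a small Python sketch. It is a simplified model, not ElasticJob's scheduler: `schedule` and `at` are hypothetical helpers, the calendar date is arbitrary, and the model assumes that a trigger firing while the previous run is still active is either dropped (misfire off) or run once right after the previous run ends (misfire on).

```python
from datetime import datetime, timedelta


def at(hour, minute=0):
    # The date is arbitrary; only the time of day matters here.
    return datetime(2020, 7, 18, hour, minute)


def schedule(triggers, durations, misfire):
    """Return (start, end) pairs for the runs that actually happen."""
    runs = []
    remaining = iter(durations)
    for trigger in triggers:
        previous_end = runs[-1][1] if runs else None
        if previous_end is not None and trigger < previous_end:
            # The previous run is still active, so this trigger is missed.
            if misfire:
                # Compensate right after the previous run finishes.
                runs.append((previous_end, previous_end + next(remaining)))
            continue
        runs.append((trigger, trigger + next(remaining)))
    return runs


triggers = [at(12), at(13), at(14)]
# The 12:00 run overruns to 13:10; later runs take the usual 30 minutes.
durations = [timedelta(minutes=70), timedelta(minutes=30), timedelta(minutes=30)]

without_misfire = schedule(triggers, durations, misfire=False)
with_misfire = schedule(triggers, durations, misfire=True)

print([start.strftime("%H:%M") for start, _ in without_misfire])  # ['12:00', '14:00']
print([start.strftime("%H:%M") for start, _ in with_misfire])     # ['12:00', '13:10', '14:00']
```

With misfire on, the compensating run occupies 13:10 to 13:40 in this model, so the regular 14:00 trigger is unaffected.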

## Scenarios

For jobs that take a long time to run and are triggered at long intervals, misfire is an effective way to improve the timeliness of job execution.
For short-interval jobs, which do not necessarily care about the timeliness of a single execution, there is no need to enable misfire re-execution.
