Remove replication job supervisor by nickva · Pull Request #5036 · apache/couchdb

nickva · 2024-04-24T04:53:50Z

Use the scheduler as the job supervisor, since the scheduler is already a fancy supervisor, with its own backoff logic, process monitoring, etc.

This simplifies the job starting/stopping logic and fixes a bug where the simple_one_for_one supervisor could restart a job while the scheduler still considered it not running, so the scheduler would try to start another job with the same replication ID on the same node. Since jobs register themselves in pg, the second job would keep crashing with a duplicate_job error the first time it tried to checkpoint.
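The failure mode described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not CouchDB code: `Registry` stands in for pg, `Job` for a replication job, and `simulate` for the scheduler loop. All of these names and the "first registered job wins" rule are hypothetical simplifications chosen for illustration.

```python
import itertools


class DuplicateJob(Exception):
    """Raised when a job finds another job registered for the same replication ID."""


class Registry:
    """Hypothetical stand-in for pg: tracks which jobs registered per replication ID."""

    def __init__(self):
        self._members = {}

    def join(self, rep_id, job_id):
        self._members.setdefault(rep_id, []).append(job_id)

    def leave(self, rep_id, job_id):
        self._members[rep_id].remove(job_id)

    def owner(self, rep_id):
        # Assumption for this sketch: the earliest registered job is the owner.
        members = self._members.get(rep_id, [])
        return members[0] if members else None


class Job:
    _ids = itertools.count(1)

    def __init__(self, rep_id, registry):
        self.job_id = next(Job._ids)
        self.rep_id = rep_id
        self.registry = registry
        registry.join(rep_id, self.job_id)

    def stop(self):
        self.registry.leave(self.rep_id, self.job_id)

    def checkpoint(self):
        # A job that is not the registered owner exits with a duplicate error.
        if self.registry.owner(self.rep_id) != self.job_id:
            self.stop()
            raise DuplicateJob(self.rep_id)


def simulate(separate_supervisor, cycles=3):
    """Return how many duplicate-job crashes occur over `cycles` scheduler cycles."""
    registry = Registry()
    rep_id = "rep-1"

    # The scheduler starts a job, the job crashes, and the scheduler
    # records the replication as not running.
    tracked = Job(rep_id, registry)
    tracked.stop()
    tracked = None

    if separate_supervisor:
        # A separate supervisor restarts the job without the scheduler knowing.
        Job(rep_id, registry)

    duplicate_crashes = 0
    for _ in range(cycles):
        if tracked is None:
            candidate = Job(rep_id, registry)
            try:
                candidate.checkpoint()
                tracked = candidate
            except DuplicateJob:
                duplicate_crashes += 1
    return duplicate_crashes
```

In this model, `simulate(separate_supervisor=True)` records one duplicate crash per scheduler cycle, while `simulate(separate_supervisor=False)` records none, because only the scheduler ever starts jobs.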

nickva · 2024-04-24T04:55:10Z

@@ -1,34 +0,0 @@
-% Licensed under the Apache License, Version 2.0 (the "License"); you may not


This is a left-over supervisor from many years ago when we switched to the scheduling replicator and forgot to remove it. Since we're cleaning up supervisors, removing the extra junk as well.

rnewson

a nice reduction in complexity.



rnewson approved these changes Apr 24, 2024


Comment thread src/couch_replicator/src/couch_replicator_scheduler.erl Outdated

nickva force-pushed the prevent-local-duplicate-jobs-error branch from aa5c5c8 to b90a591 Compare April 24, 2024 13:59

nickva force-pushed the prevent-local-duplicate-jobs-error branch from b90a591 to 770faed Compare April 24, 2024 14:26

nickva merged commit 7388b52 into main Apr 24, 2024

nickva deleted the prevent-local-duplicate-jobs-error branch April 24, 2024 15:02
