[YSQL] pg_cron: Use stateful service to assign cron leader #22360

Closed
yugabyte-ci opened this issue May 10, 2024 · 0 comments
Labels: jira-originated, kind/enhancement (This is an enhancement of an existing feature), priority/low (Low priority)

yugabyte-ci commented May 10, 2024

Jira Link: DB-11263

yugabyte-ci added the jira-originated, kind/enhancement, and priority/low labels May 10, 2024
hari90 added a commit that referenced this issue May 15, 2024
Summary:
Use the shared memory field `is_cron_leader` to indicate which node is the cron leader. Only the leader schedules and runs cron jobs. This ensures that cron jobs are run at most once. Distributed job scheduling and execution will happen as part of #22336.

If two or more nodes become leader within the same minute, the task still does not run multiple times, because cron does not run tasks during the first minute of leadership. Interval jobs do not have this guarantee, since their timer starts when the node becomes the leader. Non-Yugabyte pg_cron behaves the same way when pg restarts. See the comments in `YbCheckLeadership` and `StartAllPendingRuns` for more details.
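
The first-minute rule can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the pg_cron code: `CronLeaderSim`, `due_minute`, and `simulate_handover` are hypothetical names, and "first minute of leadership" is modeled as "a leader runs a per-minute task only for minutes that begin strictly after it became leader".

```python
from collections import Counter
from typing import Optional

MINUTE = 60  # seconds


class CronLeaderSim:
    """Hypothetical model of one node's cron launcher."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.leader_since: Optional[float] = None
        self.last_minute_run: Optional[int] = None

    def become_leader(self, now: float) -> None:
        self.leader_since = now
        self.last_minute_run = None

    def step_down(self) -> None:
        self.leader_since = None

    def due_minute(self, now: float) -> Optional[int]:
        """Return the minute whose task should run now, or None."""
        if self.leader_since is None:
            return None  # not the leader
        minute = int(now // MINUTE)
        # Assumption: skip any minute that began before (or exactly when)
        # leadership started, i.e. the first minute of leadership.
        if minute * MINUTE <= self.leader_since:
            return None
        if self.last_minute_run == minute:
            return None  # already ran for this minute
        self.last_minute_run = minute
        return minute


def simulate_handover() -> Counter:
    """Leadership moves from node a to node b inside minute 1."""
    runs: Counter = Counter()
    a, b = CronLeaderSim("a"), CronLeaderSim("b")

    a.become_leader(10)
    for now in range(10, 70, 5):  # a polls until t=65
        minute = a.due_minute(now)
        if minute is not None:
            runs[minute] += 1
    a.step_down()

    b.become_leader(75)  # still inside minute 1
    for now in range(75, 130, 5):  # b polls until t=125
        minute = b.due_minute(now)
        if minute is not None:
            runs[minute] += 1
    return runs
```

In this run, node a executes the task for minute 1, node b skips minute 1 because it became leader partway through it, and b then executes minute 2, so no minute is executed twice. Interval jobs are not modeled here, matching the caveat above.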

For testing purposes, `is_cron_leader` is set via `FLAG_TEST_is_ysql_cron_leader`. `TabletServer` uses a flag callback to set the shared memory to the value of the flag.
#22360 will add a StatefulService that ensures exactly one tserver is picked as the pg_cron leader at any time.
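
The flag-callback wiring might look roughly like the following. This is a minimal sketch, not the TabletServer implementation: `SharedState`, `BoolFlag`, and `register_callback` are hypothetical stand-ins, and only the field name `is_cron_leader` comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SharedState:
    """Hypothetical stand-in for the shared memory segment."""
    is_cron_leader: bool = False


class BoolFlag:
    """Hypothetical runtime flag that notifies callbacks on change."""

    def __init__(self, value: bool = False) -> None:
        self._value = value
        self._callbacks: List[Callable[[bool], None]] = []

    def register_callback(self, callback: Callable[[bool], None]) -> None:
        self._callbacks.append(callback)
        callback(self._value)  # apply the current value right away

    def set(self, value: bool) -> None:
        self._value = value
        for callback in self._callbacks:
            callback(value)


def wire_test_flag(flag: BoolFlag, shared: SharedState) -> None:
    """Mirror every flag change into shared memory."""
    def on_change(value: bool) -> None:
        shared.is_cron_leader = value

    flag.register_callback(on_change)
```

With this wiring, a test that flips the flag on one node makes that node's cron launcher see itself as leader on its next shared memory read.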

- Moved `YBCPgResetCatalogReadTime` inside `RefreshTaskHash`, as we do not need to do this every time.
- Added `YbUpdateCatalogCacheVersion`, since without it we hit `The catalog snapshot used for this transaction has been invalidated`.
- Replaced `YBCDeleteSysCatalogTuple` with `CatalogTupleDelete` so that index entries are also deleted.

Fixes #22260
Jira: DB-11179

Test Plan:
PgCronTest.AtMostOnceTest
PgCronTest.PerMinuteTask
yb_pg_cron-test

Reviewers: tnayak, fizaa

Reviewed By: tnayak

Subscribers: yql, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D34926
hari90 closed this as completed in 560ebd3 May 21, 2024
jharveysmith pushed a commit that referenced this issue May 24, 2024 (Differential Revision: https://phorge.dev.yugabyte.com/D34926)
jharveysmith pushed a commit that referenced this issue May 24, 2024
Summary:
Add the PG_CRON_LEADER Stateful service to ensure that only one cron leader runs in the universe at any time.
Yb-master creates the Stateful service when the `cron.job` table is created. It is never dropped once created. The same Stateful service is reused if the cron extension is dropped and recreated, even on another db.

The Stateful service is activated after its underlying Raft peer becomes the leader, and it is deactivated when the peer loses leadership. Deactivation on one node can overlap with activation on another node. To protect against this, Stateful services check their term around critical sections.
The cron leader Stateful service sets a 60s lease (`FLAGS_pg_cron_leader_lease_sec`). The lease is refreshed every 10s (`cron_leadership_refresh_sec`) after checking that the term is still valid. The service sets the shared memory field `cron_leader_lease_` to indicate how long the lease remains valid. This ensures pg_cron runs only when both the tserver and pg are healthy, and that it stops safely if the Stateful service or Raft gets stuck (for example, under high CPU load).
When the Stateful service is activated on a new node, it first waits out the 60s lease period before setting its local shared memory.
The pg_cron launcher (a pg backend) acts as the leader only while the lease time in shared memory has not expired.
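
The lease handover described above can be sketched as a small simulation. This is an illustration under stated assumptions, not the actual service code: `SharedMemory`, `CronLeaderService`, `launcher_may_run`, and `simulate` are hypothetical names, the 60s and 10s values are taken from the description, and both nodes share one simulated clock.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LEASE_SEC = 60    # lease length from the description above
REFRESH_SEC = 10  # refresh interval from the description above


@dataclass
class SharedMemory:
    # Hypothetical stand-in for the `cron_leader_lease_` field.
    cron_leader_lease_end: float = 0.0


class CronLeaderService:
    """Hypothetical model of the cron leader Stateful service on one node."""

    def __init__(self, shm: SharedMemory) -> None:
        self.shm = shm
        self.term: Optional[int] = None
        self.activated_at = 0.0

    def activate(self, term: int, now: float) -> None:
        self.term = term
        self.activated_at = now

    def refresh(self, raft_term: int, now: float) -> bool:
        # Term check: a stale service must not extend its lease.
        if self.term is None or self.term != raft_term:
            return False
        # Wait out a full lease period after activation so that any lease
        # granted by the previous leader has expired.
        if now < self.activated_at + LEASE_SEC:
            return False
        self.shm.cron_leader_lease_end = now + LEASE_SEC
        return True


def launcher_may_run(shm: SharedMemory, now: float) -> bool:
    """The launcher acts as leader only while the lease is unexpired."""
    return now < shm.cron_leader_lease_end


def simulate() -> List[Tuple[int, bool, bool]]:
    """Node a leads in term 1; Raft leadership moves to node b in term 2."""
    shm_a, shm_b = SharedMemory(), SharedMemory()
    a, b = CronLeaderService(shm_a), CronLeaderService(shm_b)
    a.activate(term=1, now=0)
    timeline = []
    for now in range(0, 260, REFRESH_SEC):
        raft_term = 1 if now < 75 else 2
        if now == 80:
            b.activate(term=2, now=now)
        a.refresh(raft_term, now)  # a is never told it lost leadership
        b.refresh(raft_term, now)
        timeline.append((now,
                         launcher_may_run(shm_a, now),
                         launcher_may_run(shm_b, now)))
    return timeline
```

In this run, node a never processes its deactivation, yet its launcher stops on its own once the last lease expires, and node b starts only after its wait-out period ends. The two launchers are never leader at the same simulated time, at the cost of a short window with no leader.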

- Reset the catalog cache version in the cron background worker to handle cases where the job is scheduled on a different node.
- Converted `enable_pg_cron` to a preview flag.
- Used a generic (id int64, data jsonb) schema for the PG_CRON_LEADER Stateful service tablet. This is not currently used.
- Added gflag `ysql_cron_database_name`, which updates the `cron.database_name` GUC. This is NON_RUNTIME since the change requires a restart of pg_cron to kill the in-flight jobs.
- Cherry-picked commit 19f8ebf9349b6a3642e81a4d19dd0ea967d3f357 from pg_cron.

**Upgrade/Downgrade safety**
New service is only enabled if the flag `enable_pg_cron` is enabled. This flag

Fixes #22360
Jira: DB-11263

Test Plan:
PgCronTest.GracefulLeaderMove
PgCronTest.LeaderCrash
PgCronTest.TaskOnDifferentDB
PgCronTest.ChangeCronDB

Reviewers: tnayak, fizaa

Reviewed By: tnayak

Subscribers: jason, yql, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D35009
svarnau pushed a commit that referenced this issue May 25, 2024 (Differential Revision: https://phorge.dev.yugabyte.com/D34926)
svarnau pushed a commit that referenced this issue May 25, 2024 (Differential Revision: https://phorge.dev.yugabyte.com/D35009)
svarnau pushed a commit that referenced this issue May 25, 2024 (Differential Revision: https://phorge.dev.yugabyte.com/D34926)
svarnau pushed a commit that referenced this issue May 25, 2024 (Differential Revision: https://phorge.dev.yugabyte.com/D35009)