Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: Add the CREATE SCHEDULE FOR BACKUP grammar #50631

Merged
merged 1 commit into from
Jul 7, 2020

Conversation

miretskiy
Copy link
Contributor

@miretskiy miretskiy commented Jun 24, 2020

Informs #28266

Add a new CREATE SCHEDULE FOR BACKUP statement grammar which allows
the specification of the BACKUP that needs to be executed periodically.

Perform full cluster backup daily:

CREATE SCHEDULE 'full cluster'
FOR BACKUP OF CLUSTER TO 'scheme://uri/for/backup'
RECURRING '@daily'

The frequency and the time of the schedule can be specified using
crontab format.

The RECURRING expression specifies when we backup. By default these are
incremental backups that capture changes since the last backup,
writing to the "current" backup. We will choose a resonable default for
the frequency of full backups. However, the users may also specify
when and how often to run full backups explicitly.

For example, the following will perform incremental backups daily,
and full backup on Sundays, at 4 am (EST)

CREATE SCHEDULE 'full cluster'
FOR BACKUP OF CLUSTER TO 'scheme://uri/for/backup'
RECURRING '@daily'
FULL BACKUP 'TZ=America/New_York Sun * * 4 0'

The full list of options supported by this statement:

  CREATE SCHEDULE [<description>] FOR BACKUP [OF <targets>] TO  <location...>
  [WITH <backup_option> = [<value>] [, ...]]
  RECURRING [crontab|NEVER] [FULL BACKUP <crontab|ALWAYS>]
  [WITH EXPERIMENTAL SCHEDULE OPTIONS <schedule_option>[= <value>] [, ...] ]

Release Note (sql change): Add CREATE SCHEDULE FOR BACKUP grammar.
This statement can be used to specify the recurring backup schedules.

@miretskiy miretskiy requested review from dt, pbardea and rohany June 24, 2020 22:38
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there is a good amount of extra logic/validation happening in the parser that should be done in later validation passes over the statement.

// Options:
// See BACKUP Options.
// %SeeAlso: BACKUP
create_schedule_for_backup_stmt:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All these changes need corresponding tests in parse_test.go

| backup_schedule_options APPENDING CHANGES EVERY interval_value
{
opts := $1.backupScheduleOptions()
if opts.ChangeCapturePeriod != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't feel like the right place to do this. Either the grammar shouldn't allow it, or later validation code should reject it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you made this rule schedule_options APPENDING CHANGES EVERY interval_value, then it wouldn't be possible to have multiple APPENDING CHANGES options. But why is this a separate case than from the case below?

resultsCh <- tree.Datums{
tree.NewDInt(tree.DInt(0)),
tree.DNull,
tree.NewDString(fmtCtx.String()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit]: You can just do tree.AsString(schedule) here.

| backup_schedule_options APPENDING CHANGES EVERY interval_value
{
opts := $1.backupScheduleOptions()
if opts.ChangeCapturePeriod != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you made this rule schedule_options APPENDING CHANGES EVERY interval_value, then it wouldn't be possible to have multiple APPENDING CHANGES options. But why is this a separate case than from the case below?

Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There we many changes done to this PR, mostly due to the discussion in https://docs.google.com/document/d/1TeYYEGKomTRiWIZf_R0tYqnOEsnC45HM_2p5CxHXF_0/edit?ts=5ef4e10e#

Parsing tests added.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @pbardea, and @rohany)


pkg/ccl/jobsccl/create_schedule_stmt.go, line 39 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

[nit]: You can just do tree.AsString(schedule) here.

Thanks. Done.


pkg/sql/parser/sql.y, line 2107 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

All these changes need corresponding tests in parse_test.go

Done.


pkg/sql/parser/sql.y, line 2146 at r1 (raw file):

Previously, rohany (Rohan Yadav) wrote…

If you made this rule schedule_options APPENDING CHANGES EVERY interval_value, then it wouldn't be possible to have multiple APPENDING CHANGES options. But why is this a separate case than from the case below?

Done.


pkg/sql/parser/sql.y, line 2163 at r1 (raw file):

Quoted 5 lines of code…
    opts := $1.scheduleOptions()
    if opts.FirstRun != nil {
      return setErr(sqllex,  errors.New("STARTING option specified multiple times"))
    }
    opts.FirstRun = $3.expr()

per our discussion: moved this validation to schedule.go

Copy link
Contributor

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, @pbardea, and @rohany)


pkg/ccl/jobsccl/create_schedule_stmt_test.go, line 23 at r2 (raw file):

)

func TestCreateNoopSchedule(t *testing.T) {

If this is just a parsing test, then we have those already -- do we need this test?


pkg/sql/parser/sql.y, line 2245 at r2 (raw file):

  }

sconst_or_placeholder:

Why do you need this rule? Wouldn't just taking an expr where you are using it be fine? I'm guessing you don't want to validate it later

@miretskiy miretskiy changed the title sql: Add the CREATE SCHEDULE FOR BACKUP statement sql: Add the CREATE SCHEDULE FOR BACKUP grammar Jul 2, 2020
@miretskiy miretskiy requested a review from rohany July 2, 2020 18:16
Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @miretskiy, @pbardea, and @rohany)


pkg/sql/parser/sql.y, line 2245 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

Why do you need this rule? Wouldn't just taking an expr where you are using it be fine? I'm guessing you don't want to validate it later

I wanted to use string_or_placeholder (in cron_expr) rule. Alas, string_or_placeholder also matches unreserved keywords -- which results in reduce/reduce conflict.

@miretskiy miretskiy force-pushed the crond branch 2 times, most recently from e21bd4b to f03d0f3 Compare July 2, 2020 21:25
@rohany
Copy link
Contributor

rohany commented Jul 4, 2020

If this is just a parsing test, then we have those already -- do we need this test?

LGTM, but bumping this question:

Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @pbardea, and @rohany)


pkg/ccl/jobsccl/create_schedule_stmt_test.go, line 23 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

If this is just a parsing test, then we have those already -- do we need this test?

Good point. Removing

{`CREATE SCHEDULE FOR BACKUP TABLE foo, bar, buz TO 'bar' RECURRING '@daily' FULL BACKUP ALWAYS`},
{`CREATE SCHEDULE FOR BACKUP TABLE foo, bar, buz TO 'bar' RECURRING '@daily' FULL BACKUP '@weekly'`},
{`CREATE SCHEDULE FOR BACKUP TABLE foo, bar, buz TO 'bar' WITH revision_history RECURRING '@daily' FULL BACKUP '@weekly'`},
{`CREATE SCHEDULE FOR BACKUP TO 'bar' WITH revision_history RECURRING '@daily' FULL BACKUP '@weekly' WITH EXPERIMENTAL SCHEDULE OPTIONS foo = 'bar'`},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well also add a schedule description in here too?

Copy link
Contributor Author

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @pbardea, and @rohany)


pkg/sql/parser/parse_test.go, line 1439 at r3 (raw file):

Previously, pbardea (Paul Bardea) wrote…

Might as well also add a schedule description in here too?

Ooops -- Just noticed; adding

Add a new `CREATE SCHEDULE FOR BACKUP` statement which allows
the specification of the BACKUP that needs to be executed periodically.

Perform full cluster backup daily:
```
CREATE SCHEDULE 'full cluster'
FOR BACKUP OF CLUSTER TO 'scheme://uri/for/backup'
RECURRING '@daily'
```

The frequency and the time of the schedule can be specified using
crontab format.

It is possible to request both full as well as delta backups.
The following will perform full table backup daily (midnight), but
produce delta backups every 15 minutes:

```
CREATE SCHEDULE 'table user_data backup'
FOR BACKUP OF TABLE user_data TO 'scheme://uri/for/backup'
RECURRING '@daily'
APPENDING CHANGES EVERY INTERVAL '15 minutes'
```

The full list of options supported by this statement:
```
  CREATE SCHEDULE [<description>] FOR BACKUP [OF <targets>] TO  <location...>
    [WITH <backup_option> = [<value>] [, ...]]
    [STARTING <time_expr>] [RECURRING <crontab>]
    [ON EXECUTION FAILURE <PAUSE|RETRY|CONTINUE>]
    [APPENDING CHANGES EVERY INTERVAL <interval>]
    [ON PREVIOUS RUNNING (START|SKIP|WAIT)]
```

Release Note (sql change): Add `CREATE SCHEDULE FOR BACKUP` statement.
This statement can be used to specify the recurring backup schedules.
@miretskiy
Copy link
Contributor Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 7, 2020

Build succeeded

@craig craig bot merged commit 708df8b into cockroachdb:master Jul 7, 2020
aayushshah15 pushed a commit to aayushshah15/cockroach that referenced this pull request Jul 8, 2020
50631: sql: Add the `CREATE SCHEDULE FOR BACKUP` grammar r=miretskiy a=miretskiy

Informs cockroachdb#28266

Add a new CREATE SCHEDULE FOR BACKUP statement grammar which allows
the specification of the BACKUP that needs to be executed periodically.

Perform full cluster backup daily:

```
CREATE SCHEDULE 'full cluster'
FOR BACKUP OF CLUSTER TO 'scheme://uri/for/backup'
RECURRING '@daily'
```

The frequency and the time of the schedule can be specified using
crontab format.

The RECURRING expression specifies when we backup.  By default these are
incremental backups that capture changes since the last backup,
writing to the "current" backup.  We will choose a resonable default for
the frequency of full backups.  However, the users may also specify
when and how often to run full backups explicitly.

For example, the following will perform incremental backups daily,
and full backup on Sundays, at 4 am (EST)

```
CREATE SCHEDULE 'full cluster'
FOR BACKUP OF CLUSTER TO 'scheme://uri/for/backup'
RECURRING '@daily'
FULL BACKUP 'TZ=America/New_York Sun * * 4 0'
```

The full list of options supported by this statement:
```
  CREATE SCHEDULE [<description>] FOR BACKUP [OF <targets>] TO  <location...>
  [WITH <backup_option> = [<value>] [, ...]]
  RECURRING [crontab|NEVER] [FULL BACKUP <crontab|ALWAYS>]
  [WITH EXPERIMENTAL SCHEDULE OPTIONS <schedule_option>[= <value>] [, ...] ]
```

Release Note (sql change): Add CREATE SCHEDULE FOR BACKUP grammar.
This statement can be used to specify the recurring backup schedules.

Co-authored-by: Yevgeniy Miretskiy <yevgeniy@cockroachlabs.com>

Release note (<category, see below>): <what> <show> <why>
@miretskiy miretskiy deleted the crond branch July 12, 2020 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants