
add backup api that can be run incrementally #1116

Merged
kewde merged 1 commit into TryGhost:master from paulfitz:backup on Feb 21, 2019

@paulfitz
Contributor

This exposes the sqlite3 backup api as described at https://sqlite.org/backup.html.

This implementation draws on #883, extending it to create a backup object that can be used in the background, without leaving the database locked for an extended period of time. This is crucial for making backups of large live databases in a non-disruptive manner. Example usage:

var db = new sqlite3.Database('live.db');
var backup = db.backup('backup.db');
...
// in event loop, move backup forward when we have time.
if (backup.idle) { backup.step(NPAGES); }
if (backup.completed) { /* success! backup made */  }
if (backup.failed)    { /* sadness! backup broke */ }
// do other work in event loop - fine to modify live.db
...
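
A minimal sketch of how the loop above might be driven, relying only on the step(pages, [callback]) method and the completed/failed flags described in this PR. The helper name runBackup, its schedule parameter, and the error-first shape of the step callback are illustrative assumptions, not part of the patch.

```javascript
// Hypothetical helper: advance `backup` a few pages at a time until it
// completes or fails, yielding to the event loop between steps.
function runBackup(backup, pagesPerStep, onDone, schedule = setImmediate) {
  function tick() {
    backup.step(pagesPerStep, function (err) {
      if (backup.completed) return onDone(null);
      if (backup.failed) return onDone(err || new Error('backup failed'));
      // Not finished yet (or a retryable BUSY/LOCKED error): try again later.
      schedule(tick);
    });
  }
  tick();
}
```

With a real database this could be called as runBackup(db.backup('backup.db'), NPAGES, cb). Passing a schedule built on setTimeout would leave more room between steps for other work on the live database.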

Here is how sqlite's backup api is exposed:

  • sqlite3_backup_init: This is implemented as db.backup(filename, [callback])
    or db.backup(filename, destDbName, sourceDbName, filenameIsDest, [callback]).
  • sqlite3_backup_step: This is implemented as backup.step(pages, [callback]).
  • sqlite3_backup_finish: This is implemented as backup.finish([callback]).
  • sqlite3_backup_remaining: This is implemented as a backup.remaining getter.
  • sqlite3_backup_pagecount: This is implemented as a backup.pageCount getter.

Some conveniences are added in the node api.

There are the following read-only properties:

  • backup.completed is set to true when the backup succeeds.
  • backup.failed is set to true when the backup has a fatal error.
  • backup.idle is set to true when no operation is currently in progress or
    queued for the backup.
  • backup.remaining is an integer with the remaining number of pages after the
    last call to backup.step (-1 if step not yet called).
  • backup.pageCount is an integer with the total number of pages measured during
    the last call to backup.step (-1 if step not yet called).
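
As a rough illustration of the remaining and pageCount properties, progress can be derived from the two values. The function name backupProgress is hypothetical, and the check for values that are not yet usable follows the -1 convention described above.

```javascript
// Hypothetical helper: fraction of pages copied so far, or null when no
// step has run yet (both properties are -1) or the page count is not positive.
function backupProgress(backup) {
  const total = backup.pageCount;
  const left = backup.remaining;
  if (total <= 0 || left < 0) return null;
  return (total - left) / total;
}
```

Since both values are measured during the last call to backup.step, the result is a snapshot: the total can change between steps if the live database grows.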

There is the following writable property:

  • backup.retryErrors: an array of sqlite3 error codes that are treated as non-fatal - meaning, if they occur, backup.failed is not set, and the backup may continue. By default, this is [sqlite3.BUSY, sqlite3.LOCKED].

The db.backup(filename, [callback]) shorthand is sufficient for making a backup of a database opened by node-sqlite3. If using attached or temporary databases, or moving data in the opposite direction, the more complete (but daunting) db.backup(filename, destDbName, sourceDbName, filenameIsDest, [callback]) signature is provided.
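
A sketch of the longer signature used in the opposite direction, copying a file into an open database. It assumes that filenameIsDest = false makes the file the source (as the parameter name suggests) and that 'main' names the primary database on both sides. The helper name restoreFrom is hypothetical.

```javascript
// Hypothetical helper: copy the contents of `filename` into db's main database.
function restoreFrom(db, filename, callback) {
  // Arguments: filename, destDbName, sourceDbName, filenameIsDest.
  // filenameIsDest = false is assumed to mean pages flow from the file into db.
  const backup = db.backup(filename, 'main', 'main', false);
  backup.step(-1);         // a negative page count copies all remaining pages
  backup.finish(callback); // queued behind the step by the internal call queue
  return backup;
}
```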

A backup will finish automatically when it succeeds or a fatal error occurs, meaning it is not necessary to call backup.finish(). By default, SQLITE_LOCKED and SQLITE_BUSY errors are not treated as failures, and the backup will continue if they occur. The set of errors that are tolerated can be controlled by setting backup.retryErrors. To disable automatic finishing and stick strictly to sqlite's raw api, set backup.retryErrors to []. In that case, it is necessary to call backup.finish().
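
A sketch of the strict mode described above, where nothing is retried and finish() is always called explicitly. The helper name strictBackup is hypothetical, and the code assumes the step callback receives an error as its first argument in the usual Node style, which this description does not spell out.

```javascript
// Hypothetical helper: raw-api style backup with no retries and an explicit finish().
function strictBackup(backup, pagesPerStep, done) {
  backup.retryErrors = []; // disables automatic finishing, per the description above
  function next() {
    backup.step(pagesPerStep, function (err) {
      if (!err && !backup.completed) return next(); // more pages to copy
      // Either the copy is complete or a step reported an error:
      // release the backup ourselves.
      backup.finish(function () {
        done(err || null);
      });
    });
  }
  next();
}
```

In this mode a single SQLITE_BUSY ends the backup, so a caller that wants its own retry policy would inspect the error before deciding to finish.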

In the same way as node-sqlite3 databases and statements, backup methods can be called safely without callbacks, due to an internal call queue. So for example this naive code will correctly back up a db, if there are no errors:

var backup = db.backup('backup.db');
backup.step(-1);
backup.finish();
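
Since the naive version only works when there are no errors, a variant with callbacks might look like the sketch below. The helper name backupTo is hypothetical, and error-first callbacks for step and finish are assumed. Calling finish() after an automatic finish is taken to be harmless, as the naive example implies.

```javascript
// Hypothetical helper: one-shot backup that reports the first error seen.
function backupTo(db, filename, done) {
  const backup = db.backup(filename);
  backup.step(-1, function (stepErr) {
    // Explicit finish, mirroring the naive example; assumed safe even if
    // the backup already finished automatically.
    backup.finish(function (finishErr) {
      done(stepErr || finishErr || null);
    });
  });
  return backup;
}
```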

@kewde
Collaborator

kewde commented Feb 1, 2019

Hi @paulfitz,

Thank you for the pull request, it is appreciated.
I'm okay with merging without docs, we'll hash that out in another PR as it's extensively detailed already.

A quick remark, the CI reported a segfault in the test for Node V5 with SQLite3 (v3.24). I've restarted it.
Another CI issue is unrelated and I will be investigating that.

I'll take a more extensive look this weekend, again thank you.

@dasilva-bruno

Really hope this gets merged soon. Would be a massive help.

@kewde
Collaborator

kewde commented Feb 9, 2019

cc @mapsam @springmeyer any outright objections?

I'll need to read the SQLite spec regarding backups a second time, but the described behavior does seem to conform to the spec (internally).

@mapsam
Contributor

mapsam commented Feb 12, 2019

@kewde @paulfitz wow, super impressive. This looks really great - thank you for putting it together. I'm not too familiar with the backup API myself but am 👍 on getting this into a release so folks can start using/experimenting with it.

@paulfitz
Contributor Author

Thanks for the feedback @mapsam, @kewde! Appreciate your time looking it over. Happy to add/change anything needed, just let me know.

@kewde kewde merged commit 2bd051d into TryGhost:master Feb 21, 2019
@kewde
Collaborator

kewde commented Feb 21, 2019

We'll give it a trial by fire. I'll add the docs in a second PR.
Will be included in 4.0.7!

@mapsam
Contributor

mapsam commented Feb 21, 2019

Thanks @kewde!

@k-fish

k-fish commented Mar 1, 2019

This is awesome! Any idea when 4.0.7 might be published? 😄

@kewde
Collaborator

kewde commented Mar 1, 2019

@k-fish I'll give it a spin this weekend.

@BrendanBall

As far as I can see, this API is much more limiting than the native C API. It doesn't allow both source and destination to be :memory: dbs, whereas all the C API cares about is that you give it a sqlite3 handle, so it's much more flexible. I'm investigating using this functionality to improve test performance. It would be much nicer if this API took a sqlite3.Database instance, something like:

let srcDB = new sqlite3.Database(':memory:')
// bunch of inserts
// ...
let destDB =  new sqlite3.Database(':memory:')
let backup = srcDB.backup(destDB)
backup.step(-1);
backup.finish();

I would also personally be happy if the backup method were just a plain function that took 2 db instances, i.e. something closer to the C API.

@jeroenvollenbrock

@kewde We'd be very interested in using this functionality, but when referencing sqlite3 by git repo, node-pre-gyp pulls an older precompiled binary, which would require us to make a temporary change in our build pipelines. Is there an estimated date for an NPM release (maybe alpha tagged)?

@paulfitz
Contributor Author

paulfitz commented Mar 27, 2019

@BrendanBall I think that should be doable, I'll take a look. Edit: I think paulfitz@24d8781 does what you want. Kind of handy for making a copy of an in-memory db during unit tests as you say. Will clean this up and make a pull request when I have a chance.

@BrendanBall

@paulfitz wow awesome dude, thanks for the quick response 😄

@BrendanBall

BrendanBall commented Mar 28, 2019

Tried testing your branch, but I think master is broken (related to #1130), because I get TypeError: Cannot read property 'prototype' of undefined just trying to import sqlite3 from 'sqlite3'

EDIT: @paulfitz I temporarily fixed that locally according to the issue.
I'm currently getting TypeError: Backup is not a constructor when trying to run srcDB.backup(destDB)

@SmartArray

@BrendanBall I think that should be doable, I'll take a look. Edit: I think paulfitz@24d8781 does what you want. Kind of handy for making a copy of an in-memory db during unit tests as you say. Will clean this up and make a pull request when I have a chance.

Hey @paulfitz,

What was the reason your work on the Backup function was never integrated?
