add backup api that can be run incrementally #1116
This exposes the sqlite3 backup api as described at https://sqlite.org/backup.html.

This implementation draws on TryGhost#883, extending it to create a backup object that can be used in the background, without leaving the database locked for an extended period of time. This is crucial for making backups of large live databases in a non-disruptive manner. Example usage:

```
var db = new sqlite3.Database('live.db');
var backup = db.backup('backup.db');
...
// in event loop, move backup forward when we have time.
if (backup.idle) { backup.step(NPAGES); }
if (backup.completed) { /* success! backup made */ }
if (backup.failed) { /* sadness! backup broke */ }
// do other work in event loop - fine to modify live.db
...
```

Here is how sqlite's backup api is exposed:

* `sqlite3_backup_init`: This is implemented as `db.backup(filename, [callback])` or `db.backup(filename, destDbName, sourceDbName, filenameIsDest, [callback])`.
* `sqlite3_backup_step`: This is implemented as `backup.step(pages, [callback])`.
* `sqlite3_backup_finish`: This is implemented as `backup.finish([callback])`.
* `sqlite3_backup_remaining`: This is implemented as a `backup.remaining` getter.
* `sqlite3_backup_pagecount`: This is implemented as a `backup.pageCount` getter.

Some conveniences are added in the node api. There are the following read-only properties:

* `backup.completed` is set to `true` when the backup succeeds.
* `backup.failed` is set to `true` when the backup has a fatal error.
* `backup.idle` is set to `true` when no operation is currently in progress or queued for the backup.
* `backup.remaining` is an integer with the remaining number of pages after the last call to `backup.step` (-1 if `step` not yet called).
* `backup.pageCount` is an integer with the total number of pages measured during the last call to `backup.step` (-1 if `step` not yet called).

There is the following writable property:

* `backup.retryErrors`: an array of sqlite3 error codes that are treated as non-fatal - meaning, if they occur, `backup.failed` is not set, and the backup may continue. By default, this is `[sqlite3.BUSY, sqlite3.LOCKED]`.

The `db.backup(filename, [callback])` shorthand is sufficient for making a backup of a database opened by node-sqlite3. If using attached or temporary databases, or moving data in the opposite direction, the more complete (but daunting) `db.backup(filename, destDbName, sourceDbName, filenameIsDest, [callback])` signature is provided.

A backup will finish automatically when it succeeds or a fatal error occurs, meaning it is not necessary to call `backup.finish()`. By default, SQLITE_LOCKED and SQLITE_BUSY errors are not treated as failures, and the backup will continue if they occur. The set of errors that are tolerated can be controlled by setting `backup.retryErrors`. To disable automatic finishing and stick strictly to sqlite's raw api, set `backup.retryErrors` to `[]`. In that case, it is necessary to call `backup.finish()`.

In the same way as node-sqlite3 databases and statements, backup methods can be called safely without callbacks, due to an internal call queue. So for example this naive code will correctly back up a db, if there are no errors:

```
var backup = db.backup('backup.db');
backup.step(-1);
backup.finish();
```
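To make the incremental pattern above a little more concrete, here is a rough sketch of a driver loop built only on the properties and methods listed in this description (`idle`, `completed`, `failed`, `remaining`, `pageCount`, `step`). The names `drainBackup`, `progress`, `pagesPerStep` and `pauseMs` are hypothetical and not part of this PR, and the sketch assumes that the optional `step` callback fires once that step has finished.

```javascript
// Sketch only: `backup` is assumed to behave like the object returned by
// db.backup() as described above. drainBackup/progress are hypothetical helpers.
function drainBackup(backup, pagesPerStep, pauseMs, done) {
  function tick() {
    // Stop as soon as the backup reports a terminal state.
    if (backup.completed) { return done(null); }
    if (backup.failed) { return done(new Error('backup failed')); }
    // An operation is still queued or running: check again later.
    if (!backup.idle) { return setTimeout(tick, pauseMs); }
    // Assumption: the callback is invoked after this step has finished.
    backup.step(pagesPerStep, function () {
      // Yield to the event loop so other work can touch the live db.
      setTimeout(tick, pauseMs);
    });
  }
  tick();
}

// Fraction of pages copied so far, or null before the first step
// (remaining and pageCount are -1 until step has been called).
function progress(backup) {
  if (backup.pageCount <= 0 || backup.remaining < 0) { return null; }
  return (backup.pageCount - backup.remaining) / backup.pageCount;
}
```

With a real database this might be called as `drainBackup(db.backup('backup.db'), 100, 50, function (err) { ... })`. The pause between steps is the knob that trades backup speed against how often the source database is touched.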

Hi @paulfitz, Thank you for the pull request, it is appreciated. A quick remark, the CI reported a segfault in the test for Node V5 with SQLite3 (v3.24). I've restarted it. I'll take a more extensive look this weekend, again thank you.

Really hope this gets merged soon. Would be a massive help.

cc @mapsam @springmeyer any outright objections? I'll need to read the SQLite spec regarding backups a second time, but the described behavior does seem to conform to the spec (internally).

We'll give it a trial by fire. I'll add the docs in a second PR.

Thanks @kewde!

This is awesome! Any idea when 4.0.7 might be published? 😄

@k-fish I'll give it a spin this weekend.

So as far as I can see this API is much more limiting than the native C API. This API doesn't allow you to have both source and destination be I would also personally be happy if the backup method was just a plain function that took 2 db instances, ie. it's closer to the C API.

@kewde We'd be very interested in using this functionality, but when referencing sqlite3 by git repo, node-pre-gyp pulls an older precompiled binary which would require us to make a temporary change in build pipelines. Is there any estimation regarding a (maybe alpha tagged) NPM release date?

@BrendanBall I think that should be doable, I'll take a look. Edit: I think paulfitz@24d8781 does what you want. Kind of handy for making a copy of an in-memory db during unit tests as you say. Will clean this up and make a pull request when I have a chance.

@paulfitz wow awesome dude, thanks for the quick response 😄

tried testing your branch, but I think master is broken related to this: #1130 because I get EDIT: @paulfitz I temporarily fixed that locally according to the issue.

Hey @paulfitz, What was the reason your work on the Backup function was never integrated?