tags | |||
---|---|---|---|
|
import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';
Since ScalarDB provides transaction capabilities on top of non-transactional or transactional databases non-invasively, you need to take special care to back up and restore the databases in a transactionally consistent way.
This guide describes how to back up and restore the databases that ScalarDB supports.
How you create a backup depends on which database you're using and whether or not you're using multiple databases. The following decision tree shows which approach you should take.
flowchart TD
A[Are you using a single database with ScalarDB?]
A -->|Yes| B[Does the database have transaction support?]
B -->|Yes| C[Perform <a href='#back-up-without-explicit-pausing'>back up without explicit pausing</a>]
B ---->|No| D[Perform <a href='#back-up-with-explicit-pausing'>back up with explicit pausing</a>]
A ---->|No| D
If you're using ScalarDB with a single database with support for transactions, you can create a backup of the database even while ScalarDB continues to accept transactions.
:::warning
Before creating a backup, you should consider the safest way to create a transactionally consistent backup of your databases and understand any risks that are associated with the backup process.
:::
One requirement for creating a backup in ScalarDB is that backups for all the ScalarDB-managed tables (including the Coordinator table) need to be transactionally consistent or automatically recoverable to a transactionally consistent state. That means that you need to create a consistent backup by dumping all tables in a single transaction.
How you create a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB.
:::note
The backup methods by database listed below are just examples of some of the databases that ScalarDB supports.
:::
You can restore to any point within the backup retention period by using the automated backup feature. Use the `mysqldump` command with the `--single-transaction` option. Use the `pg_dump` command. Use the `.backup` command with the `.timeout` command as specified in [Special commands to sqlite3 (dot-commands)](https://www.sqlite.org/cli.html#special_commands_to_sqlite3_dot_commands_)For an example, see [BASH: SQLite3 .backup command](https://stackoverflow.com/questions/23164445/bash-sqlite3-backup-command).
Another way to create a transactionally consistent backup is to create a backup while a cluster of ScalarDB instances does not have any outstanding transactions. Creating the backup depends on the following:
- If the underlying database has a point-in-time snapshot or backup feature, you can create a backup during the period when no outstanding transactions exist.
- If the underlying database has a point-in-time restore or recovery (PITR) feature, you can set a restore point to a time (preferably the mid-time) in the pause duration period when no outstanding transactions exist.
:::note
When using a PITR feature, you should minimize the clock drifts between clients and servers by using clock synchronization, such as NTP. Otherwise, the time you get as the paused duration might be too different from the time in which the pause was actually conducted, which could restore the backup to a point where ongoing transactions exist.
In addition, you should pause for a sufficient amount of time (for example, five seconds) and use the mid-time of the paused duration as a restore point since clock synchronization cannot perfectly synchronize clocks between nodes.
:::
To make ScalarDB drain outstanding requests and stop accepting new requests so that a pause duration can be created, you should implement the Scalar Admin interface properly in your application that uses ScalarDB or use ScalarDB Cluster, which implements the Scalar Admin interface.
By using the Scalar Admin client tool, you can pause nodes, servers, or applications that implement the Scalar Admin interface without losing ongoing transactions.
How you create a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB.
:::note
The backup methods by database listed below are just examples of some of the databases that ScalarDB supports.
:::
You must enable the PITR feature for DynamoDB tables. If you're using [ScalarDB Schema Loader](schema-loader.mdx) to create schemas, the tool enables the PITR feature for tables by default.To specify a transactionally consistent restore point, pause your application that is using ScalarDB with DynamoDB as described in [Back up with explicit pausing](#back-up-with-explicit-pausing).
To specify a transactionally consistent restore point, pause your application that is using ScalarDB with Cosmos DB for NoSQL as described in [Back up with explicit pausing](#back-up-with-explicit-pausing).
However, if the quorum of cluster nodes loses their data, you will need a transactionally consistent backup (snapshot) to restore the cluster to a certain transactionally consistent point.
To create a transactionally consistent cluster-wide backup (snapshot), pause the application that is using ScalarDB or [ScalarDB Cluster](scalardb-cluster/index.mdx) and create backups (snapshots) of the nodes as described in [Back up with explicit pausing](#back-up-with-explicit-pausing) or stop the Cassandra cluster, take copies of all the data in the nodes, and start the cluster.
How you restore a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB.
:::note
The restore methods by database listed below are just examples of some of the databases that ScalarDB supports.
:::
You can restore to any point within the backup retention period by using the automated backup feature. First, stop all the nodes of the Cassandra cluster. Then, clean the `data`, `commitlog`, and `hints` directories, and place the backups (snapshots) in each node.After placing the backups (snapshots) in each node, start all the nodes of the Cassandra Cluster.
ScalarDB implements the Cosmos DB adapter by using its stored procedures, which are installed when creating schemas by using ScalarDB Schema Loader. However, the PITR feature of Cosmos DB doesn't restore stored procedures. Because of this, you need to re-install the required stored procedures for all tables after restoration. You can do this by using ScalarDB Schema Loader with the `--repair-all` option. For details, see [Repair tables](schema-loader.mdx#repair-tables).
To do this procedure:
1. Create a backup.
1. Select the mid-time of the paused duration as the restore point.
2. Restore by using the PITR of table A to table B.
3. Create a backup of the restored table B (assuming that the backup is named backup B).
4. Remove table B.
2. Restore the backup.
1. Remove table A.
2. Create a table named A by using backup B.
:::note
- You must do the steps mentioned above for each table because tables can only be restored one at a time.
- Configurations such as PITR and auto-scaling policies are reset to the default values for restored tables, so you must manually configure the required settings. For details, see the official AWS documentation for How to restore DynamoDB tables with DynamoDB.
:::
If you used mysqldump
to create the backup file, use the mysql
command to restore the backup as specified in Reloading SQL-Format Backups.
If you used pg_dump
to create the backup file, use the psql
command to restore the backup as specified in Restoring the Dump.
Use the .restore
command as specified in Special commands to sqlite3 (dot-commands).