diff --git a/docs/configurations.md b/docs/configurations.md index 123360f13a..4fab54e9d7 100644 --- a/docs/configurations.md +++ b/docs/configurations.md @@ -1,38 +1,46 @@ # ScalarDB Configurations -This document describes the configurations for ScalarDB. +This page describes the available configurations for ScalarDB. -## Transaction manager configurations +## Transaction managers -ScalarDB has several transaction manager implementations, such as Consensus Commit, gRPC, and JDBC, and you can specify one of the implementations with the `scalar.db.transaction_manager` property. -This section describes the configurations for each transaction manager. +Implemented within ScalarDB are two transaction managers: Consensus Commit and gRPC. You can specify one of the transaction managers by using the `scalar.db.transaction_manager` property and then specify the configurations for the transaction manager. ### Consensus Commit -Consensus Commit is the default transaction manager in ScalarDB. -If you do not specify the `scalar.db.transaction_manager` property, or if you specify `consensus-commit` for the property, Consensus Commit will be used. +Consensus Commit is the default transaction manager type in ScalarDB. To use the Consensus Commit transaction manager, add the following to the ScalarDB properties file: -This section describes the configurations for Consensus Commit. +```properties +scalar.db.transaction_manager=consensus-commit +``` + +{% capture notice--info %} +**Note** + +If you don't specify the `scalar.db.transaction_manager` property, `consensus-commit` will be the default value. +{% endcapture %} + +
{{ notice--info | markdownify }}
#### Basic configurations -The basic configurations for Consensus Commit are as follows: +The following basic configurations are available for the Consensus Commit transaction manager: -| Name | Description | Default | +| Name | Description | Default | |-------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------| | `scalar.db.transaction_manager` | `consensus-commit` should be specified. | - | | `scalar.db.consensus_commit.isolation_level` | Isolation level used for Consensus Commit. Either `SNAPSHOT` or `SERIALIZABLE` can be specified. | `SNAPSHOT` | -| `scalar.db.consensus_commit.serializable_strategy` | Serializable strategy used for Consensus Commit. Either `EXTRA_READ` or `EXTRA_WRITE` can be specified. If `SNAPSHOT` is specified in the property `scalar.db.consensus_commit.isolation_level`, this is ignored. | `EXTRA_READ` | -| `scalar.db.consensus_commit.coordinator.namespace` | Namespace name of coordinator tables. | `coordinator` | -| `scalar.db.consensus_commit.include_metadata.enabled` | If set to `true`, Get and Scan operations results will contain transaction metadata. To see the transaction metadata columns details for a given table, you can use the `DistributedTransactionAdmin.getTableMetadata()` method which will return the table metadata augmented with the transaction metadata columns. Using this configuration can be useful to investigate transaction related issues. | `false` | +| `scalar.db.consensus_commit.serializable_strategy` | Serializable strategy used for Consensus Commit. Either `EXTRA_READ` or `EXTRA_WRITE` can be specified. If `SNAPSHOT` is specified in the property `scalar.db.consensus_commit.isolation_level`, this configuration will be ignored. | `EXTRA_READ` | +| `scalar.db.consensus_commit.coordinator.namespace` | Namespace name of Coordinator tables. | `coordinator` | +| `scalar.db.consensus_commit.include_metadata.enabled` | If set to `true`, `Get` and `Scan` operations results will contain transaction metadata. To see the transaction metadata columns details for a given table, you can use the `DistributedTransactionAdmin.getTableMetadata()` method, which will return the table metadata augmented with the transaction metadata columns. Using this configuration can be useful to investigate transaction-related issues. | `false` | #### Performance-related configurations -The performance-related configurations for Consensus Commit are as follows: +The following performance-related configurations are available for the Consensus Commit transaction manager: -| Name | Description | Default | +| Name | Description | Default | |-----------------------------------------------------------|--------------------------------------------------------------------------------|-------------------------------------------------------------------| -| `scalar.db.consensus_commit.parallel_executor_count` | The number of the executors (threads) for the parallel execution. | `128` | +| `scalar.db.consensus_commit.parallel_executor_count` | Number of executors (threads) for parallel execution. 
| `128` | | `scalar.db.consensus_commit.parallel_preparation.enabled` | Whether or not the preparation phase is executed in parallel. | `true` | | `scalar.db.consensus_commit.parallel_validation.enabled` | Whether or not the validation phase (in `EXTRA_READ`) is executed in parallel. | The value of `scalar.db.consensus_commit.parallel_commit.enabled` | | `scalar.db.consensus_commit.parallel_commit.enabled` | Whether or not the commit phase is executed in parallel. | `true` | @@ -40,14 +48,23 @@ The performance-related configurations for Consensus Commit are as follows: | `scalar.db.consensus_commit.async_commit.enabled` | Whether or not the commit phase is executed asynchronously. | `false` | | `scalar.db.consensus_commit.async_rollback.enabled` | Whether or not the rollback phase is executed asynchronously. | The value of `scalar.db.consensus_commit.async_commit.enabled` | -#### Underlying storage/database configurations +#### Underlying storage or database configurations + +Consensus Commit has a storage abstraction layer and supports multiple underlying storages. You can specify the storage implementation by using the `scalar.db.storage` property. -Consensus Commit has a storage abstraction layer and supports multiple underlying storages. -You can specify the storage implementation by using the `scalar.db.storage` property. +Select a database to see the configurations available for each storage. -The following describes the configurations available for each storage. +
+
+ + + + +
-- For Cassandra, the following configurations are available: +
+ +The following configurations are available for Cassandra: | Name | Description | Default | |-----------------------------------------|-----------------------------------------------------------------------|------------| @@ -58,7 +75,10 @@ The following describes the configurations available for each storage. | `scalar.db.password` | Password to access the database. | | | `scalar.db.cassandra.metadata.keyspace` | Keyspace name for the namespace and table metadata used for ScalarDB. | `scalardb` | -- For Cosmos DB for NoSQL, the following configurations are available: +
+
+ +The following configurations are available for CosmosDB for NoSQL: | Name | Description | Default | |--------------------------------------|----------------------------------------------------------------------------------------------------------|------------| @@ -67,7 +87,10 @@ The following describes the configurations available for each storage. | `scalar.db.password` | Either a master or read-only key used to perform authentication for accessing Azure Cosmos DB for NoSQL. | | | `scalar.db.cosmos.metadata.database` | Database name for the namespace and table metadata used for ScalarDB. | `scalardb` | -- For DynamoDB, the following configurations are available: +
+
+ +The following configurations are available for DynamoDB: | Name | Description | Default | |---------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------| @@ -79,7 +102,10 @@ The following describes the configurations available for each storage. | `scalar.db.dynamo.metadata.namespace` | Namespace name for the namespace and table metadata used for ScalarDB. | `scalardb` | | `scalar.db.dynamo.namespace.prefix` | Prefix for the user namespaces and metadata namespace names. Since AWS requires having unique tables names in a single AWS region, this is useful if you want to use multiple ScalarDB environments (development, production, etc.) in a single AWS region. | | -- For JDBC databases, the following configurations are available: +
+
+ +The following configurations are available for JDBC databases: | Name | Description | Default | |-----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------| @@ -90,8 +116,8 @@ The following describes the configurations available for each storage. | `scalar.db.jdbc.connection_pool.min_idle` | Minimum number of idle connections in the connection pool. | `20` | | `scalar.db.jdbc.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool. | `50` | | `scalar.db.jdbc.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool. Use a negative value for no limit. | `100` | -| `scalar.db.jdbc.prepared_statements_pool.enabled` | Setting this property to `true` enables prepared statement pooling. | `false` | -| `scalar.db.jdbc.prepared_statements_pool.max_open` | Maximum number of open statements that can be allocated from the statement pool at the same time, or negative for no limit. | `-1` | +| `scalar.db.jdbc.prepared_statements_pool.enabled` | Setting this property to `true` enables prepared-statement pooling. | `false` | +| `scalar.db.jdbc.prepared_statements_pool.max_open` | Maximum number of open statements that can be allocated from the statement pool at the same time. Use a negative value for no limit. | `-1` | | `scalar.db.jdbc.isolation_level` | Isolation level for JDBC. `READ_UNCOMMITTED`, `READ_COMMITTED`, `REPEATABLE_READ`, or `SERIALIZABLE` can be specified. | Underlying-database specific | | `scalar.db.jdbc.metadata.schema` | Schema name for the namespace and table metadata used for ScalarDB. | `scalardb` | | `scalar.db.jdbc.table_metadata.connection_pool.min_idle` | Minimum number of idle connections in the connection pool for the table metadata. | `5` | @@ -101,164 +127,102 @@ The following describes the configurations available for each storage. | `scalar.db.jdbc.admin.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool for admin. | `10` | | `scalar.db.jdbc.admin.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool for admin. Use a negative value for no limit. | `25` | -If you use SQLite3 as a JDBC database, you must set `scalar.db.contact_points` as follows. +{% capture notice--info %} +**Note** + +If you use SQLite3 as a JDBC database, you must set `scalar.db.contact_points` as follows, replacing `YOUR_DB` with the URL of your SQLite3 database: ```properties scalar.db.contact_points=jdbc:sqlite:.sqlite3?busy_timeout=10000 ``` -Unlike other JDBC databases, [SQLite3 does not fully support concurrent access](https://www.sqlite.org/lang_transaction.html). -To avoid frequent errors caused internally by [`SQLITE_BUSY`](https://www.sqlite.org/rescode.html#busy), we recommend setting a [`busy_timeout`](https://www.sqlite.org/c3ref/busy_timeout.html) parameter. +In addition, unlike other JDBC databases, [SQLite3 does not fully support concurrent access](https://www.sqlite.org/lang_transaction.html). To avoid frequent errors caused internally by [`SQLITE_BUSY`](https://www.sqlite.org/rescode.html#busy), we recommend setting a [`busy_timeout`](https://www.sqlite.org/c3ref/busy_timeout.html) parameter. +{% endcapture %} + +
{{ notice--info | markdownify }}
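+
+For reference, the following is a minimal sketch of a JDBC storage configuration. The connection URL and credentials are hypothetical values for a local MySQL database, so replace them with the values for your own environment:
+
+```properties
+# JDBC storage implementation.
+scalar.db.storage=jdbc
+
+# JDBC connection URL and credentials (hypothetical values for a local MySQL database).
+scalar.db.contact_points=jdbc:mysql://localhost:3306/
+scalar.db.username=root
+scalar.db.password=mysql
+
+# Optional connection pool tuning.
+scalar.db.jdbc.connection_pool.min_idle=5
+scalar.db.jdbc.connection_pool.max_idle=10
+scalar.db.jdbc.connection_pool.max_total=25
+```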
+
+
##### Multi-storage support -ScalarDB supports using multiple storage implementations at the same time. -You can use multiple storages by specifying `multi-storage` for the `scalar.db.storage` property. +ScalarDB supports using multiple storage implementations simultaneously. You can use multiple storages by specifying `multi-storage` as the value for the `scalar.db.storage` property. -For details about using multiple storages, see [Multi-storage Transactions](multi-storage-transactions.md). +For details about using multiple storages, see [Multi-Storage Transactions](multi-storage-transactions.md). ### ScalarDB Server (gRPC) -[ScalarDB Server](scalardb-server.md) is a standalone server that provides a gRPC interface to ScalarDB. -To interact with ScalarDB Server, you must specify `grpc` for the `scalar.db.transaction_manager` property. +[ScalarDB Server](scalardb-server.md) is a standalone server that provides a gRPC interface to ScalarDB. To interact with ScalarDB Server, you must specify `grpc` as the value for the `scalar.db.transaction_manager` property. -The following configurations are available for ScalarDB Server: +The following configurations are available for the gRPC transaction manager for ScalarDB Server: -| Name | Description | Default | -|--------------------------------------------|-------------------------------------------------------------|------------------------| -| `scalar.db.transaction_manager` | `grpc` should be specified. | - | -| `scalar.db.contact_points` | ScalarDB Server host. | | -| `scalar.db.contact_port` | Port number for ScalarDB Server. | `60051` | -| `scalar.db.grpc.deadline_duration_millis` | The deadline duration for gRPC connections in milliseconds. | `60000` (60 seconds) | -| `scalar.db.grpc.max_inbound_message_size` | The maximum message size allowed for a single gRPC frame. | The gRPC default value | -| `scalar.db.grpc.max_inbound_metadata_size` | The maximum size of metadata allowed to be received. | The gRPC default value | +| Name | Description | Default | +|--------------------------------------------|-------------------------------------------------------------|-------------------------| +| `scalar.db.transaction_manager` | `grpc` should be specified. | - | +| `scalar.db.contact_points` | ScalarDB Server host. | | +| `scalar.db.contact_port` | Port number for ScalarDB Server. | `60051` | +| `scalar.db.grpc.deadline_duration_millis` | The deadline duration for gRPC connections in milliseconds. | `60000` (60 seconds) | +| `scalar.db.grpc.max_inbound_message_size` | The maximum message size allowed for a single gRPC frame. | The gRPC default value. | +| `scalar.db.grpc.max_inbound_metadata_size` | The maximum size of metadata allowed to be received. | The gRPC default value. | For details about ScalarDB Server, see [ScalarDB Server](scalardb-server.md). -### JDBC transactions - -You can also use native JDBC transactions through ScalarDB when you only interact with one JDBC database. -However, you cannot use most of ScalarDB features when you use JDBC transactions, which might defeat the purpose of using ScalarDB. -So, please carefully consider your use case. - -To use JDBC transactions, you must specify `jdbc` for the `scalar.db.transaction_manager` property. 
- -The following configurations are available for JDBC transactions: - -| Name | Description | Default | -|-----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------| -| `scalar.db.transaction_manager` | `jdbc` must be specified. | - | -| `scalar.db.contact_points` | JDBC connection URL. | | -| `scalar.db.username` | Username to access the database. | | -| `scalar.db.password` | Password to access the database. | | -| `scalar.db.jdbc.connection_pool.min_idle` | Minimum number of idle connections in the connection pool. | `20` | -| `scalar.db.jdbc.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool. | `50` | -| `scalar.db.jdbc.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool. Use a negative value for no limit. | `100` | -| `scalar.db.jdbc.prepared_statements_pool.enabled` | Setting true to this property enables prepared statement pooling. | `false` | -| `scalar.db.jdbc.prepared_statements_pool.max_open` | Maximum number of open statements that can be allocated from the statement pool at the same time, or negative for no limit. | `-1` | -| `scalar.db.jdbc.isolation_level` | Isolation level for JDBC. `READ_UNCOMMITTED`, `READ_COMMITTED`, `REPEATABLE_READ`, or `SERIALIZABLE` can be specified. | Underlying-database specific | -| `scalar.db.jdbc.table_metadata.schema` | Schema name for the table metadata used for ScalarDB. | `scalardb` | -| `scalar.db.jdbc.table_metadata.connection_pool.min_idle` | Minimum number of idle connections in the connection pool for the table metadata. | `5` | -| `scalar.db.jdbc.table_metadata.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool for the table metadata. | `10` | -| `scalar.db.jdbc.table_metadata.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool for the table metadata. Use a negative value for no limit. | `25` | -| `scalar.db.jdbc.admin.connection_pool.min_idle` | Minimum number of idle connections in the connection pool for admin. | `5` | -| `scalar.db.jdbc.admin.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool for admin. | `10` | -| `scalar.db.jdbc.admin.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool for admin. Use a negative value for no limit. | `25` | - -If you use SQLite3 as a JDBC database, you must set `scalar.db.contact_points` as follows. - -```properties -scalar.db.contact_points=jdbc:sqlite:.sqlite3?busy_timeout=10000 -``` - -Unlike other JDBC databases, [SQLite3 does not fully support concurrent access](https://www.sqlite.org/lang_transaction.html). -To avoid frequent errors caused internally by [`SQLITE_BUSY`](https://www.sqlite.org/rescode.html#busy), we recommend setting a [`busy_timeout`](https://www.sqlite.org/c3ref/busy_timeout.html) parameter. - ## ScalarDB Server configurations -[ScalarDB Server](scalardb-server.md) is a standalone server that provides a gRPC interface to ScalarDB. -This section explains ScalarDB Server configurations. 
+[ScalarDB Server](scalardb-server.md) is a standalone server that provides a gRPC interface to ScalarDB. This section explains ScalarDB Server configurations. -In addition to [transaction manager configurations](#transaction-manager-configurations) and [other configurations](#other-configurations), the following configurations are available for ScalarDB Server: +In addition to [transaction managers](#transaction-managers) and [other configurations](#other-configurations), the following configurations are available for ScalarDB Server: -| Name | Description | Default | -|---------------------------------------------------|--------------------------------------------------------------------------------------------------|------------------------| -| `scalar.db.server.port` | Port number for ScalarDB Server. | `60051` | -| `scalar.db.server.prometheus_exporter_port` | Prometheus exporter port. Prometheus exporter will not be started if a negative number is given. | `8080` | -| `scalar.db.server.grpc.max_inbound_message_size` | The maximum message size allowed to be received. | The gRPC default value | -| `scalar.db.server.grpc.max_inbound_metadata_size` | The maximum size of metadata allowed to be received. | The gRPC default value | -| `scalar.db.server.decommissioning_duration_secs` | The decommissioning duration in seconds. | `30` | +| Name | Description | Default | +|---------------------------------------------------|--------------------------------------------------------------------------------------------------|-------------------------| +| `scalar.db.server.port` | Port number for ScalarDB Server. | `60051` | +| `scalar.db.server.prometheus_exporter_port` | Prometheus exporter port. Prometheus exporter will not be started if a negative number is given. | `8080` | +| `scalar.db.server.grpc.max_inbound_message_size` | The maximum message size allowed to be received. | The gRPC default value. | +| `scalar.db.server.grpc.max_inbound_metadata_size` | The maximum size of metadata allowed to be received. | The gRPC default value. | +| `scalar.db.server.decommissioning_duration_secs` | The decommissioning duration in seconds. | `30` | For details about ScalarDB Server, see [ScalarDB Server](scalardb-server.md). -## Other configurations - -This section explains other configurations. +## Other ScalarDB configurations -Other configurations are available for ScalarDB: +The following are additional configurations available for ScalarDB: | Name | Description | Default | |------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------| | `scalar.db.metadata.cache_expiration_time_secs` | ScalarDB has a metadata cache to reduce the number of requests to the database. This setting specifies the expiration time of the cache in seconds. | `-1` (no expiration) | | `scalar.db.active_transaction_management.expiration_time_millis` | ScalarDB maintains ongoing transactions, which can be resumed by using a transaction ID. This setting specifies the expiration time of this transaction management feature in milliseconds. | `-1` (no expiration) | -| `scalar.db.default_namespace_name` | The given namespace name will be used by operations that do not already specify a namespace. 
If you would like to use this setting with ScalarDB Server, configure this setting on the client-side configuration. | | +| `scalar.db.default_namespace_name` | The given namespace name will be used by operations that do not already specify a namespace. If you would like to use this setting with ScalarDB Server, configure this setting in the client-side configuration. | | -## Configuration examples +### Two-phase commit support -This section shows several configuration examples. +ScalarDB supports transactions with a two-phase commit interface. With transactions with a two-phase commit interface, you can execute a transaction that spans multiple processes or applications, like in a microservice architecture. -### Example 1 +For details about using two-phase commit, see [Transactions with a Two-Phase Commit Interface](two-phase-commit-transactions.md). -``` -[App (ScalarDB Library with Consensus Commit)] ---> [Underlying storage/database] -``` +## Configuration example - App and database -In this configuration, the app (ScalarDB Library with Consensus Commit) connects to the underlying storage/database (in this case, Cassandra) directly. -Note that this configuration exists only for development purposes and is not recommended for production use. -This is because the app needs to implement the [scalar-admin](https://github.com/scalar-labs/scalar-admin) interface to take transactionally consistent backups for ScalarDB, which requires an extra burden for users. - -In this case, an example of the configurations in the app is as follows: - -```properties -# Transaction manager implementation. -scalar.db.transaction_manager=consensus-commit - -# Storage implementation. -scalar.db.storage=cassandra - -# Comma-separated contact points. -scalar.db.contact_points= - -# Credential information to access the database. -scalar.db.username= -scalar.db.password= +```mermaid +flowchart LR + app["App
(ScalarDB library with
Consensus Commit)"] + db[(Underlying storage or database)] + app --> db ``` -### Example 2 +In this example configuration, the app (ScalarDB library with Consensus Commit) connects to an underlying storage or database (in this case, Cassandra) directly. -``` -[App (ScalarDB Library with gRPC)] ---> [ScalarDB Server (ScalarDB Library with Consensus Commit)] ---> [Underlying storage/database] -``` +{% capture notice--warning %} +**Attention** + +This configuration exists only for development purposes and isn’t suitable for a production environment. This is because the app needs to implement the [scalar-admin](https://github.com/scalar-labs/scalar-admin) interface to take transactionally consistent backups for ScalarDB, which requires additional configurations. +{% endcapture %} -In this configuration, the app (ScalarDB Library with gRPC) connects to an underlying storage/database through ScalarDB Server. -This configuration is recommended for production use because ScalarDB Server implements the [scalar-admin](https://github.com/scalar-labs/scalar-admin) interface, which enables you to take transactionally consistent backups for ScalarDB by pausing ScalarDB Server. +
{{ notice--warning | markdownify }}
-In this case, an example of configurations for the app is as follows: +The following is an example of the configuration for connecting the app to the underlying database through ScalarDB: ```properties # Transaction manager implementation. -scalar.db.transaction_manager=grpc - -# ScalarDB Server host. -scalar.db.contact_points= - -# ScalarDB Server port. -scalar.db.contact_port= -``` - -And an example of configurations for ScalarDB Server is as follows: +scalar.db.transaction_manager=consensus-commit -```properties # Storage implementation. scalar.db.storage=cassandra diff --git a/docs/multi-storage-transactions.md b/docs/multi-storage-transactions.md index 15b3f1757b..018c808b78 100644 --- a/docs/multi-storage-transactions.md +++ b/docs/multi-storage-transactions.md @@ -1,56 +1,60 @@ -# Multi-storage Transactions +# Multi-Storage Transactions -ScalarDB transactions can span multiple storages/databases while preserving ACID property with a -feature called *Multi-storage Transactions*. This documentation explains the feature briefly. +ScalarDB transactions can span multiple storages or databases while maintaining ACID compliance by using a feature called *multi-storage transactions*. -## How Multi-storage Transactions works +This page explains how multi-storage transactions work and how to configure the feature in ScalarDB. -Internally, the `multi-storage` implementation holds multiple storage instances and has mappings -from a namespace name to a proper storage instance. When an operation is executed, it chooses a -proper storage instance from the specified namespace by using the namespace-storage mappings and -uses it. +## How multi-storage transactions work in ScalarDB -## Configuration +In ScalarDB, the `multi-storage` implementation holds multiple storage instances and has mappings from a namespace name to a proper storage instance. When an operation is executed, the multi-storage transactions feature chooses a proper storage instance from the specified namespace by using the namespace-storage mapping and uses that storage instance. -You can use Multi-storage transactions in the same way as the other storages/databases at the code -level as long as the configuration is properly set for `multi-storage`. An example of the -configuration is shown as follows: +## How to configure ScalarDB to support multi-storage transactions + +To enable multi-storage transactions, you need to specify `consensus-commit` as the value for `scalar.db.transaction_manager`, `multi-storage` as the value for `scalar.db.storage`, and configure your databases in the ScalarDB properties file. + +The following is an example of configurations for multi-storage transactions: ```properties -# Consensus commit is required to use Multi-storage Transactions. +# Consensus Commit is required to support multi-storage transactions. scalar.db.transaction_manager=consensus-commit # Multi-storage implementation is used for Consensus Commit. scalar.db.storage=multi-storage -# Define storage names, comma-separated format. In this case, "cassandra" and "mysql". +# Define storage names by using a comma-separated format. +# In this case, "cassandra" and "mysql" are used. scalar.db.multi_storage.storages=cassandra,mysql -# Define the "cassandra" storage. You can set the storage properties (storage, contact_points, username, etc.) with the property name "scalar.db.multi_storage.storages..". 
For example, if you want to specify the "scalar.db.contact_points" property for the "cassandra" storage, you can specify "scalar.db.multi_storage.storages.cassandra.contact_points". +# Define the "cassandra" storage. +# When setting storage properties, such as `storage`, `contact_points`, `username`, and `password`, for multi-storage transactions, the format is `scalar.db.multi_storage.storages..`. +# For example, to configure the `scalar.db.contact_points` property for Cassandra, specify `scalar.db.multi_storage.storages.cassandra.contact_point`. scalar.db.multi_storage.storages.cassandra.storage=cassandra scalar.db.multi_storage.storages.cassandra.contact_points=localhost scalar.db.multi_storage.storages.cassandra.username=cassandra scalar.db.multi_storage.storages.cassandra.password=cassandra # Define the "mysql" storage. +# When defining JDBC-specific configurations for multi-storage transactions, you can follow a similar format of `scalar.db.multi_storage.storages..`. +# For example, to configure the `scalar.db.jdbc.connection_pool.min_idle` property for MySQL, specify `scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.min_idle`. scalar.db.multi_storage.storages.mysql.storage=jdbc scalar.db.multi_storage.storages.mysql.contact_points=jdbc:mysql://localhost:3306/ scalar.db.multi_storage.storages.mysql.username=root scalar.db.multi_storage.storages.mysql.password=mysql -# JDBC specific configurations for the "mysql" storage. As mentioned before, the format is "scalar.db.multi_storage.storages..". So for example, if you want to specify the "scalar.db.jdbc.connection_pool.min_idle" property for the "mysql" storage, you can specify "scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.min_idle". +# Define the JDBC-specific configurations for the "mysql" storage. scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.min_idle=5 scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.max_idle=10 scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.max_total=25 -# Define namespace mappings from a namespace name to a storage. The format is ":,...". +# Define namespace mapping from a namespace name to a storage. +# The format is ":,...". scalar.db.multi_storage.namespace_mapping=user:cassandra,coordinator:mysql # Define the default storage that's used if a specified table doesn't have any mapping. scalar.db.multi_storage.default_storage=cassandra ``` -## Further reading +For additional configurations, see [ScalarDB Configurations](configurations.md). -Please see the following sample to learn Multi-storage Transactions further: +## Hands-on tutorial -- [Multi-storage Transaction Sample](https://github.com/scalar-labs/scalardb-samples/tree/main/multi-storage-transaction-sample) +For a hands-on tutorial, see [Create a Sample Application That Supports Multi-Storage Transactions](https://github.com/scalar-labs/scalardb-samples/tree/main/multi-storage-transaction-sample). diff --git a/docs/two-phase-commit-transactions.md b/docs/two-phase-commit-transactions.md index 801a0d3262..2765698781 100644 --- a/docs/two-phase-commit-transactions.md +++ b/docs/two-phase-commit-transactions.md @@ -1,33 +1,27 @@ -# Two-phase Commit Transactions +# Transactions with a Two-Phase Commit Interface -ScalarDB also supports two-phase commit style transactions called *Two-phase Commit Transactions*. -With Two-phase Commit Transactions, you can execute a transaction that spans multiple processes/applications (e.g., Microservices). 
+ScalarDB supports executing transactions with a two-phase commit interface. With the two-phase commit interface, you can execute a transaction that spans multiple processes or applications, like in a microservice architecture. -This document briefly explains how to execute Two-phase Commit Transactions in ScalarDB. +This page explains how transactions with a two-phase commit interface work in ScalarDB and how to configure and execute them in ScalarDB. -## Overview +## How transactions with a two-phase commit interface work in ScalarDB -ScalarDB normally executes transactions in a single transaction manager instance with a one-phase commit interface, which we call normal transactions. -In that case, you begin a transaction, execute CRUD operations, and commit the transaction in the same transaction manager instance. +ScalarDB normally executes transactions in a single transaction manager instance with a one-phase commit interface. In transactions with a one-phase commit interface, you begin a transaction, execute CRUD operations, and commit the transaction in the same transaction manager instance. -In addition to normal transactions, ScalarDB also supports *Two-phase Commit Transactions*, which execute transactions with a two-phase interface. -Two-phase Commit Transactions execute a transaction that spans multiple transaction manager instances. -The transaction manager instances can be in the same process/application or in different processes/applications. -For example, if you have transaction manager instances in multiple microservices, you can execute a transaction that spans multiple microservices. +In ScalarDB, you can execute transactions with a two-phase commit interface that span multiple transaction manager instances. The transaction manager instances can be in the same process or application, or the instances can be in different processes or applications. For example, if you have transaction manager instances in multiple microservices, you can execute a transaction that spans multiple microservices. -In Two-phase Commit Transactions, there are two roles, a coordinator and a participant, that collaboratively execute a single transaction. -A coordinator process and participant processes all have different transaction manager instances. -The coordinator process first begins a transaction, and the participant processes join the transaction. -After executing CRUD operations, the coordinator process and the participant processes commit the transaction by using the two-phase interface. +In transactions with a two-phase commit interface, there are two roles—Coordinator and a participant—that collaboratively execute a single transaction. -## Configuration +The Coordinator process and the participant processes all have different transaction manager instances. The Coordinator process first begins or starts a transaction, and the participant processes join the transaction. After executing CRUD operations, the Coordinator process and the participant processes commit the transaction by using the two-phase interface. -The configuration for Two-phase Commit Transactions is the same as the one for the normal transaction. +## How to configure ScalarDB to support transactions with a two-phase commit interface -For example, you can set the following configuration when you use Cassandra: +To enable transactions with a two-phase commit interface, you need to specify `consensus-commit` as the value for `scalar.db.transaction_manager` in the ScalarDB properties file. 
+ +The following is an example of a configuration for transactions with a two-phase commit interface when using Cassandra: ```properties -# Consensus commit is required to use Two-phase Commit Transactions. +# Consensus Commit is required to support transactions with a two-phase commit interface. scalar.db.transaction_manager=consensus-commit # Storage implementation. @@ -44,82 +38,83 @@ scalar.db.username=cassandra scalar.db.password=cassandra ``` -For details about configurations, see [ScalarDB Configurations](configurations.md). - -## How to execute Two-phase Commit Transactions +For additional configurations, see [ScalarDB Configurations](configurations.md). -This section explains how to execute Two-phase Commit Transactions. +## How to execute transactions with a two-phase commit interface -Like a well-known two-phase commit protocol, there are two roles, a coordinator and a participant, that collaboratively execute a single transaction. -The coordinator process first begins a transaction, and the participant processes join the transaction. +To execute a two-phase commit transaction, you must get the transaction manager instance. Then, the Coordinator process can begin or start the transaction, and the participant can process the transaction. ### Get a `TwoPhaseCommitTransactionManager` instance -First, you need to get a `TwoPhaseCommitTransactionManager` instance to execute Two-phase Commit Transactions. +You first need to get a `TwoPhaseCommitTransactionManager` instance to execute transactions with a two-phase commit interface. -You can use `TransactionFactory` to get a `TwoPhaseCommitTransactionManager` instance as follows: +To get a `TwoPhaseCommitTransactionManager` instance, you can use `TransactionFactory` as follows: ```java -TransactionFactory factory = TransactionFactory.create(""); +TransactionFactory factory = TransactionFactory.create(""); TwoPhaseCommitTransactionManager transactionManager = factory.getTwoPhaseCommitTransactionManager(); ``` -### Begin/Start a transaction (for coordinator) +### Begin or start a transaction (for Coordinator) -You can begin/start a transaction as follows: +For the process or application that begins the transaction to act as Coordinator, you should use the following `begin` method: ```java -// Begin a transaction +// Begin a transaction. TwoPhaseCommitTransaction tx = transactionManager.begin(); +``` -Or +Or, for the process or application that begins the transaction to act as Coordinator, you should use the following `start` method: -// Start a transaction +```java +// Start a transaction. TwoPhaseCommitTransaction tx = transactionManager.start(); ``` -The process/application that begins the transaction acts as a coordinator, as mentioned. - -You can also begin/start a transaction by specifying a transaction ID as follows: +Alternatively, you can use the `begin` method for a transaction by specifying a transaction ID as follows: ```java -// Begin a transaction with specifying a transaction ID -TwoPhaseCommitTransaction tx = transactionManager.begin(""); +// Begin a transaction by specifying a transaction ID. +TwoPhaseCommitTransaction tx = transactionManager.begin(""); +``` -Or +Or, you can use the `start` method for a transaction by specifying a transaction ID as follows: -// Start a transaction with specifying a transaction ID -TwoPhaseCommitTransaction tx = transactionManager.start(""); +```java +// Start a transaction by specifying a transaction ID. 
+TwoPhaseCommitTransaction tx = transactionManager.start(""); ``` -Note that you must guarantee uniqueness of the transaction ID in this case. +### Join a transaction (for participants) + +For participants, you can join a transaction by specifying the transaction ID associated with the transaction that Coordinator has started or begun as follows: -And, you can get the transaction ID with `getId()` as follows: ```java -tx.getId(); +TwoPhaseCommitTransaction tx = transactionManager.join("") ``` -### Join the transaction (for participants) +{% capture notice--info %} +**Note** -If you are a participant, you can join the transaction that has been begun by the coordinator as follows: +To get the transaction ID with `getId()`, you can specify the following: ```java -TwoPhaseCommitTransaction tx = transactionManager.join("") +tx.getId(); ``` +{% endcapture %} -You need to specify the transaction ID associated with the transaction that the coordinator has begun. +
{{ notice--info | markdownify }}
### CRUD operations for the transaction -The CRUD operations of `TwoPhaseCommitTransacton` are the same as the ones of `DistributedTransaction`. -So please see also [Java API Guide - CRUD operations](api-guide.md#crud-operations) for the details. +The CRUD operations for `TwoPhaseCommitTransacton` are the same as the operations for `DistributedTransaction`. For details, see [CRUD operations](api-guide.md#crud-operations). -This is an example code for CRUD operations in Two-phase Commit Transactions: +The following is example code for CRUD operations in transactions with a two-phase commit interface: ```java TwoPhaseCommitTransaction tx = ... -// Retrieve the current balances for ids +// Retrieve the current balances by ID. Get fromGet = Get.newBuilder() .namespace(NAMESPACE) @@ -137,11 +132,11 @@ Get toGet = Optional fromResult = tx.get(fromGet); Optional toResult = tx.get(toGet); -// Calculate the balances (it assumes that both accounts exist) +// Calculate the balances (assuming that both accounts exist). int newFromBalance = fromResult.get().getInt(BALANCE) - amount; int newToBalance = toResult.get().getInt(BALANCE) + amount; -// Update the balances +// Update the balances. Put fromPut = Put.newBuilder() .namespace(NAMESPACE) @@ -162,68 +157,70 @@ tx.put(fromPut); tx.put(toPut); ``` -### Prepare/Commit/Rollback the transaction +### Prepare, commit, or roll back a transaction + +After finishing CRUD operations, you need to commit the transaction. As with the standard two-phase commit protocol, there are two phases: prepare and commit. -After finishing CRUD operations, you need to commit the transaction. -Like a well-known two-phase commit protocol, there are two phases: prepare and commit phases. -You first need to prepare the transaction in all the coordinator/participant processes, and then you need to commit the transaction in all the coordinator/participant processes as follows: +In all the Coordinator and participant processes, you need to prepare and then commit the transaction as follows: ```java TwoPhaseCommitTransaction tx = ... try { - // Execute CRUD operations in the coordinator/participant processes + // Execute CRUD operations in the Coordinator and participant processes. ... - // Prepare phase: Prepare the transaction in all the coordinator/participant processes + // Prepare phase: Prepare the transaction in all the Coordinator and participant processes. tx.prepare(); ... - // Commit phase: Commit the transaction in all the coordinator/participant processes + // Commit phase: Commit the transaction in all the Coordinator and participant processes. tx.commit(); ... } catch (TransactionException e) { - // When an error happans, you need to rollback the transaction in all the coordinator/participant processes + // If an error happens, you will need to roll back the transaction in all the Coordinator and participant processes. tx.rollback(); ... } ``` -For `prepare()`, if any of the coordinator or participant processes fails to prepare the transaction, you will need to call `rollback()` (or `abort()`) in all the coordinator/participant processes. +For `prepare()`, if any of the Coordinator or participant processes fail to prepare the transaction, you will need to call `rollback()` (or `abort()`) in all the Coordinator and participant processes. -For `commit()`, if any of the coordinator or participant processes succeed in committing the transaction, you can consider the transaction as committed. 
-In other words, in that situation, you can ignore the errors in the other coordinator/participant processes. -If all the coordinator/participant processes fail to commit the transaction, you need to call `rollback()` (or `abort()`) in all the coordinator/participant processes. +For `commit()`, if any of the Coordinator or participant processes successfully commit the transaction, you can consider the transaction as committed. When a transaction has been committed, you can ignore any errors in the other Coordinator and participant processes. If all the Coordinator and participant processes fail to commit the transaction, you will need to call `rollback()` (or `abort()`) in all the Coordinator and participant processes. -For better performance, you can call `prepare()`, `commit()`, `rollback()` in the coordinator/participant processes in parallel, respectively. +For better performance, you can call `prepare()`, `commit()`, and `rollback()` in the Coordinator and participant processes in parallel, respectively. #### Validate the transaction -Depending on the concurrency control protocol, you need to call `validate()` in all the coordinator/participant processes after `prepare()` and before `commit()`: +Depending on the concurrency control protocol, you need to call `validate()` in all the Coordinator and participant processes after `prepare()` and before `commit()`, as shown below: ```java -// Prepare phase 1: Prepare the transaction in all the coordinator/participant processes +// Prepare phase 1: Prepare the transaction in all the Coordinator and participant processes. tx.prepare(); ... -// Prepare phase 2: Validate the transaction in all the coordinator/participant processes +// Prepare phase 2: Validate the transaction in all the Coordinator and participant processes. tx.validate(); ... -// Commit phase: Commit the transaction in all the coordinator/participant processes +// Commit phase: Commit the transaction in all the Coordinator and participant processes. tx.commit(); ... ``` -Similar to `prepare()`, if any of the coordinator or participant processes fails to validate the transaction, you will need to call `rollback()` (or `abort()`) in all the coordinator/participant processes. -Also, you can call `validate()` in the coordinator/participant processes in parallel for better performance. +Similar to `prepare()`, if any of the Coordinator or participant processes fail to validate the transaction, you will need to call `rollback()` (or `abort()`) in all the Coordinator and participant processes. In addition, you can call `validate()` in the Coordinator and participant processes in parallel for better performance. + +{% capture notice--info %} +**Note** -Currently, you need to call `validate()` when you use the `Consensus Commit` transaction manager with `EXTRA_READ` serializable strategy in `SERIALIZABLE` isolation level. -In other cases, `validate()` does nothing. +When using the [Consensus Commit](configurations/#consensus-commit) transaction manager with `EXTRA_READ` set as the value for `scalar.db.consensus_commit.serializable_strategy` and `SERIALIZABLE` set as the value for `scalar.db.consensus_commit.isolation_level`, you need to call `validate()`. However, if you are not using Consensus Commit, specifying `validate()` will not have any effect. +{% endcapture %} -### Execute a transaction with multiple transaction manager instances +
{{ notice--info | markdownify }}
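+
+For reference, the following is a minimal sketch of the Consensus Commit properties that correspond to the case described in the note above, in which `validate()` must be called:
+
+```properties
+# Consensus Commit with the EXTRA_READ strategy under SERIALIZABLE isolation.
+scalar.db.transaction_manager=consensus-commit
+scalar.db.consensus_commit.isolation_level=SERIALIZABLE
+scalar.db.consensus_commit.serializable_strategy=EXTRA_READ
+```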
-By using the APIs described above, you can execute a transaction with multiple transaction manager instances as follows: +### Execute a transaction by using multiple transaction manager instances + +By using the APIs described above, you can execute a transaction by using multiple transaction manager instances as follows: ```java TransactionFactory factory1 = @@ -239,28 +236,28 @@ TwoPhaseCommitTransactionManager transactionManager2 = TwoPhaseCommitTransaction transaction1 = null; TwoPhaseCommitTransaction transaction2 = null; try { - // Begin a transaction + // Begin a transaction. transaction1 = transactionManager1.begin(); - // Join the transaction begun by transactionManager1 with the transaction ID + // Join the transaction begun by `transactionManager1` by getting the transaction ID. transaction2 = transactionManager2.join(transaction1.getId()); - // Execute CRUD operations in the transaction + // Execute CRUD operations in the transaction. Optional result = transaction1.get(...); List results = transaction2.scan(...); transaction1.put(...); transaction2.delete(...); - // Prepare the transaction + // Prepare the transaction. transaction1.prepare(); transaction2.prepare(); - // Validate the transaction + // Validate the transaction. transaction1.validate(); transaction2.validate(); - // Commit the transaction. If any of the transactions succeeds to commit, you can regard the - // transaction as committed + // Commit the transaction. If any of the transactions successfully commit, + // you can regard the transaction as committed. AtomicReference exception = new AtomicReference<>(); boolean anyMatch = Stream.of(transaction1, transaction2) @@ -275,51 +272,62 @@ try { } }); - // If all the transactions fail to commit, throw the exception and rollback the transaction + // If all the transactions fail to commit, throw the exception and roll back the transaction. if (!anyMatch) { throw exception.get(); } } catch (TransactionException e) { - // Rollback the transaction + // Roll back the transaction. if (transaction1 != null) { try { transaction1.rollback(); } catch (RollbackException e1) { - // Handle the exception + // Handle the exception. } } if (transaction2 != null) { try { transaction2.rollback(); } catch (RollbackException e1) { - // Handle the exception + // Handle the exception. } } } ``` -For simplicity, the above example code doesn't handle the exceptions that can be thrown by the APIs. -For more details, see [Handle exceptions](#handle-exceptions). +For simplicity, the above example code doesn't handle the exceptions that the APIs may throw. For details about handling exceptions, see [How to handle exceptions](#handle-exceptions). -As previously mentioned, for `commit()`, if any of the coordinator or participant processes succeed in committing the transaction, you can regard the transaction as committed. -Also, for better performance, you can execute `prepare()`, `validate()`, and `commit()` in parallel, respectively. +As previously mentioned, for `commit()`, if any of the Coordinator or participant processes succeed in committing the transaction, you can consider the transaction as committed. Also, for better performance, you can execute `prepare()`, `validate()`, and `commit()` in parallel, respectively. ### Resume a transaction -Given that processes or applications using Two-phase Commit Transactions usually involve multiple request/response exchanges, you might need to execute a transaction across various endpoints or APIs. 
-For such scenarios, you can use `resume()` to resume a transaction object (an instance of `TwoPhaseCommitTransaction`) that you previously began or joined. The following shows how `resume()` works: +Given that processes or applications that use transactions with a two-phase commit interface usually involve multiple request and response exchanges, you might need to execute a transaction across various endpoints or APIs. For such scenarios, you can use `resume()` to resume a transaction object (an instance of `TwoPhaseCommitTransaction`) that you previously began or joined. + +The following shows how `resume()` works: ```java -// Join (or begin) the transaction -TwoPhaseCommitTransaction tx = transactionManager.join(""); +// Join (or begin) the transaction. +TwoPhaseCommitTransaction tx = transactionManager.join(""); ... -// Resume the transaction by the trnasaction ID -TwoPhaseCommitTransaction tx1 = transactionManager.resume("") +// Resume the transaction by using the transaction ID. +TwoPhaseCommitTransaction tx1 = transactionManager.resume("") +``` + +{% capture notice--info %} +**Note** + +To get the transaction ID with `getId()`, you can specify the following: + +```java +tx.getId(); ``` +{% endcapture %} + +
{{ notice--info | markdownify }}
-For example, let's say you have two services that have the following endpoints: +The following is an example of two services that have multiple endpoints: ```java interface ServiceA { @@ -339,7 +347,7 @@ interface ServiceB { } ``` -And, let's say a client calls `ServiceA.facadeEndpoint()` that begins a transaction that spans the two services (`ServiceA` and `ServiceB`) as follows: +The following is an example of a client calling `ServiceA.facadeEndpoint()` that begins a transaction that spans the two services (`ServiceA` and `ServiceB`): ```java public class ServiceAImpl implements ServiceA { @@ -356,25 +364,25 @@ public class ServiceAImpl implements ServiceA { try { ... - // Call ServiceB endpoint1 + // Call `ServiceB` `endpoint1`. serviceB.endpoint1(tx.getId()); ... - // Call ServiceB endpoint2 + // Call `ServiceB` `endpoint2`. serviceB.endpoint2(tx.getId()); ... - // Prepare + // Prepare. tx.prepare(); serviceB.prepare(tx.getId()); - // Commit + // Commit. tx.commit(); serviceB.commit(tx.getId()); } catch (Exception e) { - // Rollback + // Roll back. tx.rollback(); serviceB.rollback(tx.getId()); } @@ -382,10 +390,9 @@ public class ServiceAImpl implements ServiceA { } ``` -This facade endpoint in `ServiceA` calls multiple endpoints (`endpoint1()`, `endpoint2()`, `prepare()`, `commit()`, and `rollback()`) of `ServiceB`. -And in Two-phase Commit Transactions, you need to use the same transaction object across the endpoints. -For this situation, you can resume the transaction. -The implementation of `ServiceB` is as follows: +As shown above, the facade endpoint in `ServiceA` calls multiple endpoints (`endpoint1()`, `endpoint2()`, `prepare()`, `commit()`, and `rollback()`) of `ServiceB`. In addition, in transactions with a two-phase commit interface, you need to use the same transaction object across the endpoints. + +In this situation, you can resume the transaction. The implementation of `ServiceB` is as follows: ```java public class ServiceBImpl implements ServiceB { @@ -396,64 +403,68 @@ public class ServiceBImpl implements ServiceB { @Override public void endpoint1(String txId) throws Exception { - // First, you need to join the transaction + // Join the transaction. TwoPhaseCommitTransaction tx = transactionManager.join(txId); } @Override public void endpoint2(String txId) throws Exception { - // You can resume the transaction that you joined in endpoint1() + // Resume the transaction that you joined in `endpoint1()`. TwoPhaseCommitTransaction tx = transactionManager.resume(txId); } @Override public void prepare(String txId) throws Exception { - // You can resume the transaction + // Resume the transaction. TwoPhaseCommitTransaction tx = transactionManager.resume(txId); ... - // Prepare + // Prepare. tx.prepare(); } @Override public void commit(String txId) throws Exception { - // You can resume the transaction + // Resume the transaction. TwoPhaseCommitTransaction tx = transactionManager.resume(txId); ... - // Commit + // Commit. tx.commit(); } @Override public void rollback(String txId) throws Exception { - // You can resume the transaction + // Resume the transaction. TwoPhaseCommitTransaction tx = transactionManager.resume(txId); ... - // Rollback + // Roll back. tx.rollback(); } } ``` -As you can see, by resuming the transaction, you can share the same transaction object across multiple endpoints in `ServiceB`. +As shown above, by resuming the transaction, you can share the same transaction object across multiple endpoints in `ServiceB`. 
+ +## How to handle exceptions + +When executing a transaction by using multiple transaction manager instances, you will also need to handle exceptions properly. -### Handle exceptions +{% capture notice--warning %} +**Attention** -In the previous section, you saw [how to execute a transaction with multiple transaction manager instances](#execute-a-transaction-with-multiple-transaction-manager-instances). -However, you may also need to handle exceptions properly. If you don't handle exceptions properly, you may face anomalies or data inconsistency. -This section describes how to handle exceptions in Two-phase Commit Transactions. +{% endcapture %} -Two-phase Commit Transactions are basically executed by multiple processes/applications (a coordinator and participants). -However, in this example code, we use multiple transaction managers (`transactionManager1` and `transactionManager2`) in a single process for ease of explanation. +
+<div class="notice--warning">{{ notice--warning | markdownify }}</div>
 
-The following example code shows how to handle exceptions in Two-phase Commit Transactions:
+For instance, in the example code in [Execute a transaction by using multiple transaction manager instances](#execute-a-transaction-by-using-multiple-transaction-manager-instances), multiple transaction managers (`transactionManager1` and `transactionManager2`) are used in a single process for ease of explanation. However, that example code doesn't include a way to handle exceptions.
+
+The following example code shows how to handle exceptions in transactions with a two-phase commit interface:
 
```java
public class Sample {
@@ -473,64 +484,62 @@ public class Sample {
 
    while (true) {
      if (retryCount++ > 0) {
-        // Retry the transaction three times maximum in this sample code
+        // Retry the transaction three times maximum in this sample code.
        if (retryCount >= 3) {
-          // Throw the last exception if the number of retries exceeds the maximum
+          // Throw the last exception if the number of retries exceeds the maximum.
          throw lastException;
        }
 
-        // Sleep 100 milliseconds before retrying the transaction in this sample code
+        // Sleep 100 milliseconds before retrying the transaction in this sample code.
        TimeUnit.MILLISECONDS.sleep(100);
      }
 
      TwoPhaseCommitTransaction transaction1 = null;
      TwoPhaseCommitTransaction transaction2 = null;
      try {
-        // Begin a transaction
+        // Begin a transaction.
        transaction1 = transactionManager1.begin();
 
-        // Join the transaction begun by transactionManager1 with the transaction ID
+        // Join the transaction that `transactionManager1` began by using the transaction ID.
        transaction2 = transactionManager2.join(transaction1.getId());
 
-        // Execute CRUD operations in the transaction
+        // Execute CRUD operations in the transaction.
        Optional<Result> result = transaction1.get(...);
        List<Result> results = transaction2.scan(...);
        transaction1.put(...);
        transaction2.delete(...);
 
-        // Prepare the transaction
+        // Prepare the transaction.
        prepare(transaction1, transaction2);
 
-        // Validate the transaction
+        // Validate the transaction.
        validate(transaction1, transaction2);
 
-        // Commit the transaction
+        // Commit the transaction.
        commit(transaction1, transaction2);
      } catch (UnsatisfiedConditionException e) {
-        // You need to handle `UnsatisfiedConditionException` only if a mutation operation specifies
-        // a condition. This exception indicates the condition for the mutation operation is not met
+        // You need to handle `UnsatisfiedConditionException` only if a mutation operation specifies
+        // a condition. This exception indicates the condition for the mutation operation is not met.
 
        rollback(transaction1, transaction2);
 
-        // You can handle the exception here, according to your application requirements
+        // You can handle the exception here, according to your application requirements.
        return;
      } catch (UnknownTransactionStatusException e) {
-        // If you catch `UnknownTransactionStatusException` when committing the transaction, it
-        // indicates that the status of the transaction, whether it has succeeded or not, is
-        // unknown. In such a case, you need to check if the transaction is committed successfully
-        // or not and retry it if it failed. How to identify a transaction status is delegated to
-        // users
+        // If you catch `UnknownTransactionStatusException` when committing the transaction,
+        // it indicates that the status of the transaction, whether it was successful or not, is unknown.
+        // In such a case, you need to check if the transaction is committed successfully or not and
+        // retry the transaction if it failed. How to identify a transaction status is delegated to users.
 
        return;
      } catch (TransactionException e) {
        // For other exceptions, you can try retrying the transaction.
 
-        // For `CrudConflictException`, `PreparationConflictException`,
-        // `ValidationConflictException`, `CommitConflictException` and
-        // `TransactionNotFoundException`, you can basically retry the transaction. However, for the
-        // other exceptions, the transaction may still fail if the cause of the exception is
-        // nontransient. In such a case, you will exhaust the number of retries and throw the last
-        // exception
+        // For `CrudConflictException`, `PreparationConflictException`, `ValidationConflictException`,
+        // `CommitConflictException`, and `TransactionNotFoundException`, you can basically retry the
+        // transaction. However, for the other exceptions, the transaction will still fail if the cause of
+        // the exception is non-transient. In such a case, you will exhaust the number of retries and
+        // throw the last exception.
 
        rollback(transaction1, transaction2);
 
@@ -541,7 +550,7 @@ public class Sample {
 
  private static void prepare(TwoPhaseCommitTransaction... transactions) throws TransactionException {
-    // You can execute `prepare()` in parallel
+    // You can execute `prepare()` in parallel.
 
    List<TransactionException> exceptions =
        Stream.of(transactions)
            .parallel()
@@ -557,7 +566,7 @@ public class Sample {
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
 
-    // If any of the transactions failed to prepare, throw the exception
+    // If any of the transactions failed to prepare, throw the exception.
    if (!exceptions.isEmpty()) {
      throw exceptions.get(0);
    }
@@ -565,7 +574,7 @@ public class Sample {
 
  private static void validate(TwoPhaseCommitTransaction... transactions) throws TransactionException {
-    // You can execute `validate()` in parallel
+    // You can execute `validate()` in parallel.
 
    List<TransactionException> exceptions =
        Stream.of(transactions)
            .parallel()
@@ -581,7 +590,7 @@ public class Sample {
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
 
-    // If any of the transactions failed to validate, throw the exception
+    // If any of the transactions failed to validate, throw the exception.
    if (!exceptions.isEmpty()) {
      throw exceptions.get(0);
    }
@@ -589,7 +598,7 @@ public class Sample {
 
  private static void commit(TwoPhaseCommitTransaction... transactions) throws TransactionException {
-    // You can execute `commit()` in parallel
+    // You can execute `commit()` in parallel.
 
    List<TransactionException> exceptions =
        Stream.of(transactions)
            .parallel()
@@ -605,13 +614,13 @@ public class Sample {
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
 
-    // If any of the transactions succeeded to commit, you can regard the transaction as committed
+    // If any of the transactions successfully committed, you can regard the transaction as committed.
    if (exceptions.size() < transactions.length) {
      if (!exceptions.isEmpty()) {
-        // You can log the exceptions here if you want
+        // You can log the exceptions here if you want.
      }
 
-      return; // Succeeded to commit
+      return; // Commit was successful.
    }
 
    //
@@ -619,14 +628,14 @@ public class Sample {
    //
 
    // If any of the transactions failed to commit due to `UnknownTransactionStatusException`, throw
-    // it because you should not retry the transaction in such a case
+    // it because you should not retry the transaction in such a case.
    Optional<TransactionException> unknownTransactionStatusException =
        exceptions.stream().filter(e -> e instanceof UnknownTransactionStatusException).findFirst();
    if (unknownTransactionStatusException.isPresent()) {
      throw unknownTransactionStatusException.get();
    }
 
-    // Otherwise, throw the first exception
+    // Otherwise, throw the first exception.
    throw exceptions.get(0);
  }
 
@@ -639,97 +648,108 @@ public class Sample {
              try {
                t.rollback();
              } catch (RollbackException e) {
-                // Rolling back the transaction failed. As the transaction should eventually
-                // recover, you don't need to do anything further. You can simply log the occurrence
-                // here
+                // Rolling back the transaction failed. The transaction should eventually recover,
+                // so you don't need to do anything further. You can simply log the occurrence here.
              }
            });
  }
}
```
 
-The `begin()` API could throw `TransactionException` or `TransactionNotFoundException`.
-If you catch `TransactionException`, it indicates that the transaction has failed to begin due to transient or nontransient faults. You can try retrying the transaction, but you may not be able to begin the transaction due to nontransient faults.
-If you catch `TransactionNotFoundException`, it indicates that the transaction has failed to begin due to transient faults. You can retry the transaction.
+### `TransactionException` and `TransactionNotFoundException`
+
+The `begin()` API could throw `TransactionException` or `TransactionNotFoundException`:
+
+- If you catch `TransactionException`, this exception indicates that the transaction has failed to begin due to transient or non-transient faults. You can try retrying the transaction, but you may not be able to begin the transaction due to non-transient faults.
+- If you catch `TransactionNotFoundException`, this exception indicates that the transaction has failed to begin due to transient faults. In this case, you can retry the transaction.
+
+The `join()` API could also throw `TransactionNotFoundException`. You can handle this exception in the same way that you handle the exceptions for the `begin()` API.
+
+### `CrudException` and `CrudConflictException`
+
+The APIs for CRUD operations (`get()`, `scan()`, `put()`, `delete()`, and `mutate()`) could throw `CrudException` or `CrudConflictException`:
 
-The `join()` API could also throw `TransactionException` or `TransactionNotFoundException`.
-You can handle these exceptions in the same way that you handle the exceptions for the `begin()` API.
+- If you catch `CrudException`, this exception indicates that the transaction CRUD operation has failed due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient.
+- If you catch `CrudConflictException`, this exception indicates that the transaction CRUD operation has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning.
 
-The APIs for CRUD operations (`get()`, `scan()`, `put()`, `delete()`, and `mutate()`) could throw `CrudException` or `CrudConflictException`.
-If you catch `CrudException`, it indicates that the transaction CRUD operation has failed due to transient or nontransient faults. You can try retrying the transaction from the beginning, but the transaction may still fail if the cause is nontransient.
-If you catch `CrudConflictException`, it indicates that the transaction CRUD operation has failed due to transient faults (e.g., a conflict error).
You can retry the transaction from the beginning. +### `UnsatisfiedConditionException` The APIs for mutation operations (`put()`, `delete()`, and `mutate()`) could also throw `UnsatisfiedConditionException`. -If you catch this exception, it indicates that the condition for the mutation operation is not met. -You can handle this exception according to your application requirements. -The `prepare()` API could throw `PreparationException` or `PreparationConflictException`. -If you catch `PreparationException`, it indicates that preparing the transaction fails due to transient or nontransient faults. You can try retrying the transaction from the beginning, but the transaction may still fail if the cause is nontransient. -If you catch `PreparationConflictException`, it indicates that preparing the transaction has failed due to transient faults (e.g., a conflict error). You can retry the transaction from the beginning. +If you catch `UnsatisfiedConditionException`, this exception indicates that the condition for the mutation operation is not met. You can handle this exception according to your application requirements. -The `validate()` API could throw `ValidationException` or `ValidationConflictException`. -If you catch `ValidationException`, it indicates that validating the transaction fails due to transient or nontransient faults. You can try retrying the transaction from the beginning, but the transaction may still fail if the cause is nontransient. -If you catch `ValidationConflictException`, it indicates that validating the transaction has failed due to transient faults (e.g., a conflict error). You can retry the transaction from the beginning. +### `PreparationException` and `PreparationConflictException` -Also, the `commit()` API could throw `CommitException`, `CommitConflictException`, or `UnknownTransactionStatusException`. -If you catch `CommitException`, it indicates that committing the transaction fails due to transient or nontransient faults. You can try retrying the transaction from the beginning, but the transaction may still fail if the cause is nontransient. -If you catch `CommitConflictException`, it indicates that committing the transaction has failed due to transient faults (e.g., a conflict error). You can retry the transaction from the beginning. -If you catch `UnknownTransactionStatusException`, it indicates that the status of the transaction, whether it has succeeded or not, is unknown. -In such a case, you need to check if the transaction is committed successfully and retry the transaction if it has failed. -How to identify a transaction status is delegated to users. -You may want to create a transaction status table and update it transactionally with other application data so that you can get the status of a transaction from the status table. +The `prepare()` API could throw `PreparationException` or `PreparationConflictException`: -Although not illustrated in the sample code, the `resume()` API could also throw `TransactionNotFoundException`. -This exception indicates that the transaction associated with the specified ID was not found and/or the transaction might have expired. -In either case, you can retry the transaction from the beginning since the cause of this exception is basically transient. +- If you catch `PreparationException`, this exception indicates that preparing the transaction fails due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient. 
+- If you catch `PreparationConflictException`, this exception indicates that preparing the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning. -In the sample code, for `UnknownTransactionStatusException`, the transaction is not retried because the cause of the exception is nontransient. -Also, for `UnsatisfiedConditionException`, the transaction is not retried because how to handle this exception depends on your application requirements. -For other exceptions, the transaction is retried because the cause of the exception is transient or nontransient. -If the cause of the exception is transient, the transaction may succeed if you retry it. -However, if the cause of the exception is nontransient, the transaction may still fail even if you retry it. -In such a case, you will exhaust the number of retries. +### `ValidationException` and `ValidationConflictException` -Please note that if you begin a transaction by specifying a transaction ID, you must use a different ID when you retry the transaction. -And, in the sample code, the transaction is retried three times maximum and sleeps for 100 milliseconds before it is retried. -But you can choose a retry policy, such as exponential backoff, according to your application requirements. +The `validate()` API could throw `ValidationException` or `ValidationConflictException`: -## Request routing in Two-phase Commit Transactions +- If you catch `ValidationException`, this exception indicates that validating the transaction fails due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient. +- If you catch `ValidationConflictException`, this exception indicates that validating the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning. -Services using Two-phase Commit Transactions usually execute a transaction by exchanging multiple requests and responses as follows: +### `CommitException`, `CommitConflictException`, and `UnknownTransactionStatusException` -![](images/two_phase_commit_sequence_diagram.png) +The `commit()` API could throw `CommitException`, `CommitConflictException`, or `UnknownTransactionStatusException`: -Also, each service typically has multiple servers (or hosts) for scalability and availability and uses server-side (proxy) or client-side load balancing to distribute requests to the servers. -In such a case, since a transaction processing in Two-phase Commit Transactions is stateful, requests in a transaction must be routed to the same servers while different transactions need to be distributed to balance the load. +- If you catch `CommitException`, this exception indicates that committing the transaction fails due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient. +- If you catch `CommitConflictException`, this exception indicates that committing the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning. +- If you catch `UnknownTransactionStatusException`, this exception indicates that the status of the transaction, whether it was successful or not, is unknown. 
In this case, you need to check if the transaction is committed successfully and retry the transaction if it has failed.
 
-![](images/two_phase_commit_load_balancing.png)
+How to identify a transaction status is delegated to users. You may want to create a transaction status table and update it transactionally with other application data so that you can get the status of a transaction from the status table.
 
-There are several approaches to achieve it depending on the protocol between the services. The next section introduces some approaches for gRPC and HTTP/1.1.
+### Notes about some exceptions
 
-### gPRC
+Although not illustrated in the example code, the `resume()` API could also throw `TransactionNotFoundException`. This exception indicates that the transaction associated with the specified ID was not found and/or the transaction might have expired. In either case, you can retry the transaction from the beginning since the cause of this exception is basically transient.
+
+In the sample code, for `UnknownTransactionStatusException`, the transaction is not retried because the application must check if the transaction was successful to avoid potential duplicate operations. Also, for `UnsatisfiedConditionException`, the transaction is not retried because how to handle this exception depends on your application requirements. For other exceptions, the transaction is retried because the cause of the exception may be transient or non-transient. If the cause of the exception is transient, the transaction may succeed if you retry it. However, if the cause of the exception is non-transient, the transaction will still fail even if you retry it. In such a case, you will exhaust the number of retries.
+
+{% capture notice--info %}
+**Note**
+
+If you begin a transaction by specifying a transaction ID, you must use a different ID when you retry the transaction.
+
+In addition, in the sample code, the transaction is retried three times maximum and sleeps for 100 milliseconds before it is retried. But you can choose a retry policy, such as exponential backoff, according to your application requirements.
+{% endcapture %}
+
+<div class="notice--info">{{ notice--info | markdownify }}</div>
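+
+For example, the fixed 100-millisecond sleep in the sample code above could be replaced with an exponentially growing wait time. The following is a minimal sketch of that idea and is not part of the ScalarDB API: `executeTransactionOnce()` is a hypothetical helper that performs one attempt of the transaction shown above (begin or join, CRUD operations, prepare, validate, and commit), and the snippet is assumed to be the body of a method that declares `throws Exception`:
+
+```java
+int maxRetries = 3;
+long backoffMillis = 100;
+TransactionException lastException = null;
+
+for (int attempt = 0; ; attempt++) {
+  if (attempt > 0) {
+    // Give up and throw the last exception once the retries are exhausted.
+    if (attempt > maxRetries) {
+      throw lastException;
+    }
+
+    // Instead of a fixed 100 milliseconds, wait for an exponentially growing time
+    // (100 ms, 200 ms, 400 ms, and so on) before retrying the transaction.
+    TimeUnit.MILLISECONDS.sleep(backoffMillis);
+    backoffMillis *= 2;
+  }
+
+  try {
+    // `executeTransactionOnce()` is a hypothetical helper that performs one attempt of the
+    // transaction shown above. It is assumed to handle non-retryable cases (such as
+    // `UnknownTransactionStatusException` and `UnsatisfiedConditionException`) itself, as in
+    // the sample code above, and to throw `TransactionException` only when the attempt can be retried.
+    executeTransactionOnce();
+    return; // The transaction was committed successfully.
+  } catch (TransactionException e) {
+    lastException = e;
+  }
+}
+```
+
+Capping the wait time and adding jitter are common refinements, but the appropriate retry policy depends on your application requirements.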
-For details about load balancing in gRPC, see [gRPC Load Balancing](https://grpc.io/blog/grpc-load-balancing/).
+## Request routing in transactions with a two-phase commit interface
+
+Services that use transactions with a two-phase commit interface usually execute a transaction by exchanging multiple requests and responses, as shown in the following diagram:
+
+![Sequence diagram for transactions with a two-phase commit interface](images/two_phase_commit_sequence_diagram.png)
+
+In addition, each service typically has multiple servers (or hosts) for scalability and availability and uses server-side (proxy) or client-side load balancing to distribute requests to the servers. In such a case, since transaction processing in transactions with a two-phase commit interface is stateful, requests in a transaction must be routed to the same servers while different transactions need to be distributed to balance the load, as shown in the following diagram:
+
+![Load balancing for transactions with a two-phase commit interface](images/two_phase_commit_load_balancing.png)
+
+There are several approaches to achieve load balancing for transactions with a two-phase commit interface depending on the protocol between the services. Some approaches for this include using gRPC and HTTP/1.1.
+
+### gRPC
 
When you use a client-side load balancer, you can use the same gRPC connection to send requests in a transaction, which guarantees that the requests go to the same servers.
 
-When you use a server-side (proxy) load balancer, solutions are different between an L3/L4 (transport level) load balancer and an L7 (application level) load balancer.
-When using an L3/L4 load balancer, you can use the same gRPC connection to send requests in a transaction, similar to when you use a client-side load balancer.
-Requests in the same gRPC connection always go to the same server in L3/L4 load balancing.
-When using an L7 load balancer, since requests in the same gRPC connection don't necessarily go to the same server, you need to use cookies or similar method to route requests to the correct server.
-For example, if you use [Envoy](https://www.envoyproxy.io/), you can use session affinity (sticky session) for gRPC.
-Alternatively, you can use [bidirectional streaming RPC in gRPC](https://grpc.io/docs/what-is-grpc/core-concepts/#bidirectional-streaming-rpc) since the L7 load balancer distributes requests in the same stream to the same server.
+When you use a server-side (proxy) load balancer, solutions are different between an L3/L4 (transport-level) load balancer and an L7 (application-level) load balancer:
+
+- When using an L3/L4 load balancer, you can use the same gRPC connection to send requests in a transaction, similar to when you use a client-side load balancer. In this case, requests in the same gRPC connection always go to the same server.
+- When using an L7 load balancer, since requests in the same gRPC connection don't necessarily go to the same server, you need to use cookies or a similar method to route requests to the correct server.
+  - For example, if you use [Envoy](https://www.envoyproxy.io/), you can use session affinity (sticky session) for gRPC. Alternatively, you can use [bidirectional streaming RPC in gRPC](https://grpc.io/docs/what-is-grpc/core-concepts/#bidirectional-streaming-rpc) since the L7 load balancer distributes requests in the same stream to the same server.
+
+For more details about load balancing in gRPC, see [gRPC Load Balancing](https://grpc.io/blog/grpc-load-balancing/).
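+
+As an illustration of keeping the requests of a transaction on one connection, the following is a minimal sketch that is not part of the ScalarDB API. It assumes that `ServiceBGrpc` is a hypothetical stub generated from a proto definition that mirrors the `ServiceB` interface shown earlier, that `service-b.example.com:8080` is an example address, and that `ManagedChannel` and `ManagedChannelBuilder` come from the `io.grpc` package:
+
+```java
+// Build a channel and reuse it for every request that belongs to the same transaction,
+// so that a client-side or L3/L4 load balancer keeps those requests on the same server.
+ManagedChannel channel =
+    ManagedChannelBuilder.forAddress("service-b.example.com", 8080).usePlaintext().build();
+try {
+  // `ServiceBGrpc` is a hypothetical generated stub for the `ServiceB` interface shown earlier.
+  ServiceBGrpc.ServiceBBlockingStub serviceB = ServiceBGrpc.newBlockingStub(channel);
+
+  // Send every request of this transaction (endpoint1, endpoint2, prepare, commit, or
+  // rollback) through `serviceB`, which is bound to the single channel created above.
+} finally {
+  channel.shutdown();
+}
+```
+
+Whether you create a channel per transaction or reuse channels from a pool is a trade-off between connection setup cost and how evenly different transactions are spread across servers.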
### HTTP/1.1
 
-Typically, you use a server-side (proxy) load balancer with HTTP/1.1.
-When using an L3/L4 load balancer, you can use the same HTTP connection to send requests in a transaction, which guarantees the requests go to the same server.
-When using an L7 load balancer, since requests in the same HTTP connection don't necessarily go to the same server, you need to use cookies or similar method to route requests to the correct server.
-You can use session affinity (sticky session) in that case.
+Typically, you use a server-side (proxy) load balancer with HTTP/1.1:
 
-## Further reading
+- When using an L3/L4 load balancer, you can use the same HTTP connection to send requests in a transaction, which guarantees the requests go to the same server.
+- When using an L7 load balancer, since requests in the same HTTP connection don't necessarily go to the same server, you need to use cookies or a similar method to route requests to the correct server. In that case, you can use session affinity (sticky session).
 
-One of the use cases for Two-phase Commit Transactions is Microservice Transaction.
-Please see the following sample to learn Two-phase Commit Transactions further:
+## Hands-on tutorial
 
-- [Microservice Transaction Sample](https://github.com/scalar-labs/scalardb-samples/tree/main/microservice-transaction-sample)
+One of the use cases for transactions with a two-phase commit interface is microservice transactions. For a hands-on tutorial, see [Create a Sample Application That Supports Microservice Transactions](https://github.com/scalar-labs/scalardb-samples/tree/main/microservice-transaction-sample).