diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 31d57c3fd4..31bf9d6c49 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -112,52 +112,6 @@ jobs: with: arguments: ':schema-loader:dockerfileLint' - build-check-example-project: - name: Build check for 'Getting Started' example project - runs-on: ubuntu-latest - - defaults: - run: - working-directory: docs/getting-started - - steps: - - uses: actions/checkout@v4 - - - name: Set up JDK ${{ env.JAVA_VERSION }} (${{ env.JAVA_VENDOR }}) - uses: actions/setup-java@v4 - with: - java-version: ${{ env.JAVA_VERSION }} - distribution: ${{ env.JAVA_VENDOR }} - - - name: Setup Gradle - uses: gradle/actions/setup-gradle@v3 - - - name: Build Getting Started project - run: ./gradlew assemble - - build-check-example-project-for-kotlin: - name: Build check for 'Getting Started' example project for Kotlin - runs-on: ubuntu-latest - - defaults: - run: - working-directory: docs/getting-started-kotlin - - steps: - - uses: actions/checkout@v4 - - - name: Set up JDK ${{ env.JAVA_VERSION }} (${{ env.JAVA_VENDOR }}) - uses: actions/setup-java@v4 - with: - java-version: ${{ env.JAVA_VERSION }} - distribution: ${{ env.JAVA_VENDOR }} - - - name: Setup Gradle - uses: gradle/actions/setup-gradle@v3 - - - name: Build Getting Started project - run: ./gradlew assemble - integration-test-for-cassandra-3-0: name: Cassandra 3.0 integration test (${{ matrix.mode.label }}) runs-on: ubuntu-latest diff --git a/README.md b/README.md index 9ea86b8831..42a0d45667 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,9 @@ # ScalarDB -ScalarDB is a universal transaction manager that achieves: -- database/storage-agnostic ACID transactions in a scalable manner even if an underlying database or storage is not ACID-compliant. -- multi-storage/database/service ACID transactions that can span multiple (possibly different) databases, storages, and services. +ScalarDB is a cross-database HTAP engine. It achieves ACID transactions and real-time analytics across diverse databases to simplify the complexity of managing multiple databases. ## Install + The library is available on [maven central repository](https://mvnrepository.com/artifact/com.scalar-labs/scalardb). You can install it in your application using your build tool such as Gradle and Maven. 
@@ -26,35 +25,10 @@ To add a dependency using Maven: ## Docs -* [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/latest/) - * [ScalarDB Overview](https://scalardb.scalar-labs.com/docs/latest/overview) - * [ScalarDB Design Document](https://scalardb.scalar-labs.com/docs/latest/design/) - * [Getting Started with ScalarDB](https://scalardb.scalar-labs.com/docs/latest/getting-started-with-scalardb/) - * [Getting Started with ScalarDB by Using Kotlin](https://scalardb.scalar-labs.com/docs/latest/getting-started-with-scalardb-by-using-kotlin/) - * [Add ScalarDB to Your Build](https://scalardb.scalar-labs.com/docs/latest/add-scalardb-to-your-build/) - * [ScalarDB Java API Guide](https://scalardb.scalar-labs.com/docs/latest/api-guide/) - * [Multi-Storage Transactions](https://scalardb.scalar-labs.com/docs/latest/multi-storage-transactions/) - * [Transactions with a Two-Phase Commit Interface](https://scalardb.scalar-labs.com/docs/latest/two-phase-commit-transactions/) - * [ScalarDB Schema Loader](https://scalardb.scalar-labs.com/docs/latest/schema-loader/) - * [Importing Existing Tables to ScalarDB by Using ScalarDB Schema Loader](https://scalardb.scalar-labs.com/docs/latest/schema-loader-import/) - * [Requirements and Recommendations for the Underlying Databases of ScalarDB](https://scalardb.scalar-labs.com/docs/latest/requirements/) - * [How to Back Up and Restore Databases Used Through ScalarDB](https://scalardb.scalar-labs.com/docs/latest/backup-restore/) - * [ScalarDB Supported Databases](https://scalardb.scalar-labs.com/docs/latest/scalardb-supported-databases/) - * [ScalarDB Configurations](https://scalardb.scalar-labs.com/docs/latest/configurations/) - * [Storage Abstraction and API Guide](https://scalardb.scalar-labs.com/docs/latest/storage-abstraction/) - * [ScalarDB Error Codes](https://scalardb.scalar-labs.com/docs/latest/scalardb-core-status-codes/) -* Slides - * [Making Cassandra more capable, faster, and more reliable](https://speakerdeck.com/scalar/making-cassandra-more-capable-faster-and-more-reliable-at-apachecon-at-home-2020) at ApacheCon@Home 2020 - * [Scalar DB: A library that makes non-ACID databases ACID-compliant](https://speakerdeck.com/scalar/scalar-db-a-library-that-makes-non-acid-databases-acid-compliant) at Database Lounge Tokyo #6 2020 - * [Transaction Management on Cassandra](https://speakerdeck.com/scalar/transaction-management-on-cassandra) at Next Generation Cassandra Conference / ApacheCon NA 2019 -* Javadoc - * [scalardb](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) - ScalarDB: A universal transaction manager that achieves database-agnostic transactions and distributed transactions that span multiple databases - * [scalardb-rpc](https://javadoc.io/doc/com.scalar-labs/scalardb-rpc/latest/index.html) - ScalarDB RPC libraries - * [scalardb-schema-loader](https://javadoc.io/doc/com.scalar-labs/scalardb-schema-loader/latest/index.html) - ScalarDB Schema Loader: A tool for schema creation and schema deletion in ScalarDB -* [Jepsen tests](https://github.com/scalar-labs/scalar-jepsen) -* [TLA+](tla+/consensus-commit/README.md) +See our [User Documentation](https://scalardb.scalar-labs.com/docs/latest/). ## Contributing + This library is mainly maintained by the Scalar Engineering Team, but of course we appreciate any help. * For asking questions, finding answers and helping other users, please go to [stackoverflow](https://stackoverflow.com/) and use [scalardb](https://stackoverflow.com/questions/tagged/scalardb) tag. 
@@ -86,7 +60,8 @@ All the exception and log messages in this project are consistent with the follo When contributing to this project, please follow these guidelines. ## License + ScalarDB is dual-licensed under both the Apache 2.0 License (found in the LICENSE file in the root directory) and a commercial license. You may select, at your option, one of the above-listed licenses. -The commercial license includes several enterprise-grade features such as ScalarDB Server, management tools, and declarative query interfaces like GraphQL and SQL interfaces. -Regarding the commercial license, please [contact us](https://scalar-labs.com/contact_us/) for more information. +The commercial license includes several enterprise-grade features such as ScalarDB Cluster, management tools, and declarative query interfaces like GraphQL and SQL interfaces. +For more information about the commercial license, please [contact us](https://www.scalar-labs.com/contact). diff --git a/docs/_config.yml b/docs/_config.yml deleted file mode 100644 index a35bff9b41..0000000000 --- a/docs/_config.yml +++ /dev/null @@ -1,9 +0,0 @@ -# Note: This file is used only for redirecting visitors from the old ScalarDB docs site hosted in this repository to the new ScalarDB docs site. This file can be deleted after the old ScalarDB docs site hosted in this repository is no longer deployed. -google_analytics: "G-Q4TKS77KCP" - -defaults: - - scope: - path: "" # Specifies where the docs are located. - type: "default" # Defines the type for docs. - values: - layout: default # Defines the template type used for docs (templates are located in the "_layouts" folder). diff --git a/docs/_includes/analytics.html b/docs/_includes/analytics.html deleted file mode 100644 index f485f91a98..0000000000 --- a/docs/_includes/analytics.html +++ /dev/null @@ -1,11 +0,0 @@ - - - diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html deleted file mode 100644 index 74d2a52692..0000000000 --- a/docs/_layouts/default.html +++ /dev/null @@ -1,13 +0,0 @@ - ---- ---- - - - - {% if site.google_analytics and jekyll.environment == 'production' %} - {% include analytics.html %} - {% endif %} - - diff --git a/docs/add-scalardb-to-your-build.md b/docs/add-scalardb-to-your-build.md deleted file mode 100644 index d9935341e0..0000000000 --- a/docs/add-scalardb-to-your-build.md +++ /dev/null @@ -1,43 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Add ScalarDB to Your Build - -The ScalarDB library is available on the [Maven Central Repository](https://mvnrepository.com/artifact/com.scalar-labs/scalardb). You can add the library as a build dependency to your application by using Gradle or Maven. - -## Configure your application based on your build tool - -Select your build tool, and follow the instructions to add the build dependency for ScalarDB to your application. - -
- -To add the build dependency for ScalarDB by using Gradle, add the following to `build.gradle` in your application, replacing `` with the version of ScalarDB that you want to use: - -```gradle -dependencies { - implementation 'com.scalar-labs:scalardb:' -} -``` -
- -To add the build dependency for ScalarDB by using Maven, add the following to `pom.xml` in your application, replacing `` with the version of ScalarDB that you want to use: - -```xml - - com.scalar-labs - scalardb - - -``` -
diff --git a/docs/api-guide.md b/docs/api-guide.md deleted file mode 100644 index 68e82345d4..0000000000 --- a/docs/api-guide.md +++ /dev/null @@ -1,1264 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB Java API Guide - -The ScalarDB Java API is mainly composed of the Administrative API and Transactional API. This guide briefly explains what kinds of APIs exist, how to use them, and related topics like how to handle exceptions. - -## Administrative API - -This section explains how to execute administrative operations programmatically by using the Administrative API in ScalarDB. - -{% capture notice--info %} -**Note** - -Another method for executing administrative operations is to use [Schema Loader](schema-loader.md). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Get a `DistributedTransactionAdmin` instance - -You first need to get a `DistributedTransactionAdmin` instance to execute administrative operations. - -To get a `DistributedTransactionAdmin` instance, you can use `TransactionFactory` as follows: - -```java -TransactionFactory transactionFactory = TransactionFactory.create(""); -DistributedTransactionAdmin admin = transactionFactory.getTransactionAdmin(); -``` - -For details about configurations, see [ScalarDB Configurations](configurations.md). - -After you have executed all administrative operations, you should close the `DistributedTransactionAdmin` instance as follows: - -```java -admin.close(); -``` - -### Create a namespace - -Before creating tables, namespaces must be created since a table belongs to one namespace. - -You can create a namespace as follows: - -```java -// Create the namespace "ns". If the namespace already exists, an exception will be thrown. -admin.createNamespace("ns"); - -// Create the namespace only if it does not already exist. -boolean ifNotExists = true; -admin.createNamespace("ns", ifNotExists); - -// Create the namespace with options. -Map options = ...; -admin.createNamespace("ns", options); -``` - -#### Creation options - -In the creation operations, like creating a namespace and creating a table, you can specify options that are maps of option names and values (`Map`). By using the options, you can set storage adapter–specific configurations. - -Select your database to see the options available: - -
- -| Name | Description | Default | -|----------------------|------------------------------------------------------------------------------------------|------------------| -| replication-strategy | Cassandra replication strategy. Must be `SimpleStrategy` or `NetworkTopologyStrategy`. | `SimpleStrategy` | -| compaction-strategy | Cassandra compaction strategy. Must be `LCS`, `STCS`, or `TWCS`. | `STCS` | -| replication-factor | Cassandra replication factor. | 3 | - -
- -| Name | Description | Default | -|------------|-----------------------------------------------------|---------| -| ru | Base resource unit. | 400 | -| no-scaling | Disable auto-scaling for Cosmos DB for NoSQL. | false | - -
- -| Name | Description | Default | -|------------|-----------------------------------------|---------| -| no-scaling | Disable auto-scaling for DynamoDB. | false | -| no-backup | Disable continuous backup for DynamoDB. | false | -| ru | Base resource unit. | 10 | - -
- -No options are available for JDBC databases. - -
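To illustrate how these creation options are passed to the Administrative API, the following is a minimal sketch that builds an options map with the Cassandra options listed above and supplies it to `createNamespace()`. The option values are illustrative assumptions, not recommendations.

```java
import java.util.Map;

// Build storage-specific creation options (the Cassandra options shown above).
Map<String, String> options =
    Map.of(
        "replication-strategy", "SimpleStrategy",
        "replication-factor", "1");

// Pass the options when creating the namespace. The same kind of map is accepted
// by createTable() and createIndex(), as shown later in this guide.
admin.createNamespace("ns", options);
```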
- -### Create a table - -When creating a table, you should define the table metadata and then create the table. - -To define the table metadata, you can use `TableMetadata`. The following shows how to define the columns, partition key, clustering key including clustering orders, and secondary indexes of a table: - -```java -// Define the table metadata. -TableMetadata tableMetadata = - TableMetadata.newBuilder() - .addColumn("c1", DataType.INT) - .addColumn("c2", DataType.TEXT) - .addColumn("c3", DataType.BIGINT) - .addColumn("c4", DataType.FLOAT) - .addColumn("c5", DataType.DOUBLE) - .addPartitionKey("c1") - .addClusteringKey("c2", Scan.Ordering.Order.DESC) - .addClusteringKey("c3", Scan.Ordering.Order.ASC) - .addSecondaryIndex("c4") - .build(); -``` - -For details about the data model of ScalarDB, see [Data Model](design.md#data-model). - -Then, create a table as follows: - -```java -// Create the table "ns.tbl". If the table already exists, an exception will be thrown. -admin.createTable("ns", "tbl", tableMetadata); - -// Create the table only if it does not already exist. -boolean ifNotExists = true; -admin.createTable("ns", "tbl", tableMetadata, ifNotExists); - -// Create the table with options. -Map options = ...; -admin.createTable("ns", "tbl", tableMetadata, options); -``` - -### Create a secondary index - -You can create a secondary index as follows: - -```java -// Create a secondary index on column "c5" for table "ns.tbl". If a secondary index already exists, an exception will be thrown. -admin.createIndex("ns", "tbl", "c5"); - -// Create the secondary index only if it does not already exist. -boolean ifNotExists = true; -admin.createIndex("ns", "tbl", "c5", ifNotExists); - -// Create the secondary index with options. -Map options = ...; -admin.createIndex("ns", "tbl", "c5", options); -``` - -### Add a new column to a table - -You can add a new, non-partition key column to a table as follows: - -```java -// Add a new column "c6" with the INT data type to the table "ns.tbl". -admin.addNewColumnToTable("ns", "tbl", "c6", DataType.INT) -``` - -{% capture notice--warning %} -**Attention** - -You should carefully consider adding a new column to a table because the execution time may vary greatly depending on the underlying storage. Please plan accordingly and consider the following, especially if the database runs in production: - -- **For Cosmos DB for NoSQL and DynamoDB:** Adding a column is almost instantaneous as the table schema is not modified. Only the table metadata stored in a separate table is updated. -- **For Cassandra:** Adding a column will only update the schema metadata and will not modify the existing schema records. The cluster topology is the main factor for the execution time. Changes to the schema metadata are shared to each cluster node via a gossip protocol. Because of this, the larger the cluster, the longer it will take for all nodes to be updated. -- **For relational databases (MySQL, Oracle, etc.):** Adding a column shouldn't take a long time to execute. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -### Truncate a table - -You can truncate a table as follows: - -```java -// Truncate the table "ns.tbl". -admin.truncateTable("ns", "tbl"); -``` - -### Drop a secondary index - -You can drop a secondary index as follows: - -```java -// Drop the secondary index on column "c5" from table "ns.tbl". If the secondary index does not exist, an exception will be thrown. -admin.dropIndex("ns", "tbl", "c5"); - -// Drop the secondary index only if it exists. -boolean ifExists = true; -admin.dropIndex("ns", "tbl", "c5", ifExists); -``` - -### Drop a table - -You can drop a table as follows: - -```java -// Drop the table "ns.tbl". If the table does not exist, an exception will be thrown. -admin.dropTable("ns", "tbl"); - -// Drop the table only if it exists. -boolean ifExists = true; -admin.dropTable("ns", "tbl", ifExists); -``` - -### Drop a namespace - -You can drop a namespace as follows: - -```java -// Drop the namespace "ns". If the namespace does not exist, an exception will be thrown. -admin.dropNamespace("ns"); - -// Drop the namespace only if it exists. -boolean ifExists = true; -admin.dropNamespace("ns", ifExists); -``` - -### Get existing namespaces - -You can get the existing namespaces as follows: - -```java -Set namespaces = admin.getNamespaceNames(); -``` - -### Get the tables of a namespace - -You can get the tables of a namespace as follows: - -```java -// Get the tables of the namespace "ns". -Set tables = admin.getNamespaceTableNames("ns"); -``` - -### Get table metadata - -You can get table metadata as follows: - -```java -// Get the table metadata for "ns.tbl". -TableMetadata tableMetadata = admin.getTableMetadata("ns", "tbl"); -``` - -### Repair a namespace - -If a namespace is in an unknown state, such as the namespace exists in the underlying storage but not its ScalarDB metadata or vice versa, this method will re-create the namespace and its metadata if necessary. - -You can repair the namespace as follows: - -```java -// Repair the namespace "ns" with options. -Map options = ...; -admin.repairNamespace("ns", options); -``` - -### Repair a table - -If a table is in an unknown state, such as the table exists in the underlying storage but not its ScalarDB metadata or vice versa, this method will re-create the table, its secondary indexes, and their metadata if necessary. - -You can repair the table as follows: - -```java -// Repair the table "ns.tbl" with options. -TableMetadata tableMetadata = - TableMetadata.newBuilder() - ... - .build(); -Map options = ...; -admin.repairTable("ns", "tbl", tableMetadata, options); -``` - -### Upgrade the environment to support the latest ScalarDB API - -You can upgrade the ScalarDB environment to support the latest version of the ScalarDB API. Typically, as indicated in the release notes, you will need to run this method after updating the ScalarDB version that your application environment uses. - -```java -// Upgrade the ScalarDB environment. -Map options = ...; -admin.upgrade(options); -``` - -### Specify operations for the Coordinator table - -The Coordinator table is used by the [Transactional API](#transactional-api) to track the statuses of transactions. - -When using a transaction manager, you must create the Coordinator table to execute transactions. In addition to creating the table, you can truncate and drop the Coordinator table. - -#### Create the Coordinator table - -You can create the Coordinator table as follows: - -```java -// Create the Coordinator table. 
-admin.createCoordinatorTables(); - -// Create the Coordinator table only if one does not already exist. -boolean ifNotExist = true; -admin.createCoordinatorTables(ifNotExist); - -// Create the Coordinator table with options. -Map options = ...; -admin.createCoordinatorTables(options); -``` - -#### Truncate the Coordinator table - -You can truncate the Coordinator table as follows: - -```java -// Truncate the Coordinator table. -admin.truncateCoordinatorTables(); -``` - -#### Drop the Coordinator table - -You can drop the Coordinator table as follows: - -```java -// Drop the Coordinator table. -admin.dropCoordinatorTables(); - -// Drop the Coordinator table if one exist. -boolean ifExist = true; -admin.dropCoordinatorTables(ifExist); -``` - -### Import a table - -You can import an existing table to ScalarDB as follows: - -```java -// Import the table "ns.tbl". If the table is already managed by ScalarDB, the target table does not -// exist, or the table does not meet the requirements of the ScalarDB table, an exception will be thrown. -admin.importTable("ns", "tbl", options); -``` - -{% capture notice--warning %} -**Attention** - -You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and the ScalarDB metadata tables. In this case, there would also be several differences between your database and ScalarDB, as well as some limitations. For details, see [Importing Existing Tables to ScalarDB by Using ScalarDB Schema Loader](./schema-loader-import.md). - -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -## Transactional API - -This section explains how to execute transactional operations by using the Transactional API in ScalarDB. - -### Get a `DistributedTransactionManager` instance - -You first need to get a `DistributedTransactionManager` instance to execute transactional operations. - -To get a `DistributedTransactionManager` instance, you can use `TransactionFactory` as follows: - -```java -TransactionFactory transactionFactory = TransactionFactory.create(""); -DistributedTransactionManager transactionManager = transactionFactory.getTransactionManager(); -``` - -After you have executed all transactional operations, you should close the `DistributedTransactionManager` instance as follows: - -```java -transactionManager.close(); -``` - -### Begin or start a transaction - -Before executing transactional CRUD operations, you need to begin or start a transaction. - -You can begin a transaction as follows: - -```java -// Begin a transaction. -DistributedTransaction transaction = transactionManager.begin(); -``` - -Or, you can start a transaction as follows: - -```java -// Start a transaction. -DistributedTransaction transaction = transactionManager.start(); -``` - -Alternatively, you can use the `begin` method for a transaction by specifying a transaction ID as follows: - -```java -// Begin a transaction with specifying a transaction ID. -DistributedTransaction transaction = transactionManager.begin(""); -``` - -Or, you can use the `start` method for a transaction by specifying a transaction ID as follows: - -```java -// Start a transaction with specifying a transaction ID. -DistributedTransaction transaction = transactionManager.start(""); -``` - -{% capture notice--info %} -**Note** - -Specifying a transaction ID is useful when you want to link external systems to ScalarDB. Otherwise, you should use the `begin()` method or the `start()` method. - -When you specify a transaction ID, make sure you specify a unique ID (for example, UUID v4) throughout the system since ScalarDB depends on the uniqueness of transaction IDs for correctness. -{% endcapture %} - -
{{ notice--info | markdownify }}
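For example, a UUID v4 can be used as the transaction ID. The following is a minimal sketch, assuming the `transactionManager` instance obtained above:

```java
import java.util.UUID;

// Use a UUID v4 as a unique transaction ID.
String txId = UUID.randomUUID().toString();
DistributedTransaction transaction = transactionManager.begin(txId);
```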
- -### Join a transaction - -Joining a transaction is particularly useful in a stateful application where a transaction spans multiple client requests. In such a scenario, the application can start a transaction during the first client request. Then, in subsequent client requests, the application can join the ongoing transaction by using the `join()` method. - -You can join an ongoing transaction that has already begun by specifying the transaction ID as follows: - -```java -// Join a transaction. -DistributedTransaction transaction = transactionManager.join(""); -``` - -{% capture notice--info %} -**Note** - -To get the transaction ID with `getId()`, you can specify the following: - -```java -tx.getId(); -``` -{% endcapture %} - -
{{ notice--info | markdownify }}
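To make the request-spanning pattern concrete, the following is a minimal sketch. The handler method names are hypothetical, and `transactionManager` is assumed to be a shared `DistributedTransactionManager` instance.

```java
// First client request: begin a transaction and hand its ID back to the client.
public String handleFirstRequest() throws TransactionException {
  DistributedTransaction transaction = transactionManager.begin();
  // ... execute CRUD operations ...
  return transaction.getId();
}

// Subsequent client request: join the ongoing transaction by its ID and continue it.
public void handleNextRequest(String transactionId) throws TransactionException {
  DistributedTransaction transaction = transactionManager.join(transactionId);
  // ... execute more CRUD operations ...
  transaction.commit();
}
```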
- -### Resume a transaction - -Resuming a transaction is particularly useful in a stateful application where a transaction spans multiple client requests. In such a scenario, the application can start a transaction during the first client request. Then, in subsequent client requests, the application can resume the ongoing transaction by using the `resume()` method. - -You can resume an ongoing transaction that you have already begun by specifying a transaction ID as follows: - -```java -// Resume a transaction. -DistributedTransaction transaction = transactionManager.resume(""); -``` - -{% capture notice--info %} -**Note** - -To get the transaction ID with `getId()`, you can specify the following: - -```java -tx.getId(); -``` -{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Implement CRUD operations - -The following sections describe key construction and CRUD operations. - -{% capture notice--info %} -**Note** - -Although all the builders of the CRUD operations can specify consistency by using the `consistency()` methods, those methods are ignored. Instead, the `LINEARIZABLE` consistency level is always used in transactions. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### Key construction - -Most CRUD operations need to specify `Key` objects (partition-key, clustering-key, etc.). So, before moving on to CRUD operations, the following explains how to construct a `Key` object. - -For a single column key, you can use `Key.of()` methods to construct the key as follows: - -```java -// For a key that consists of a single column of INT. -Key key1 = Key.ofInt("col1", 1); - -// For a key that consists of a single column of BIGINT. -Key key2 = Key.ofBigInt("col1", 100L); - -// For a key that consists of a single column of DOUBLE. -Key key3 = Key.ofDouble("col1", 1.3d); - -// For a key that consists of a single column of TEXT. -Key key4 = Key.ofText("col1", "value"); -``` - -For a key that consists of two to five columns, you can use the `Key.of()` method to construct the key as follows. Similar to `ImmutableMap.of()` in Guava, you need to specify column names and values in turns: - -```java -// For a key that consists of two to five columns. -Key key1 = Key.of("col1", 1, "col2", 100L); -Key key2 = Key.of("col1", 1, "col2", 100L, "col3", 1.3d); -Key key3 = Key.of("col1", 1, "col2", 100L, "col3", 1.3d, "col4", "value"); -Key key4 = Key.of("col1", 1, "col2", 100L, "col3", 1.3d, "col4", "value", "col5", false); -``` - -For a key that consists of more than five columns, we can use the builder to construct the key as follows: - -```java -// For a key that consists of more than five columns. -Key key = Key.newBuilder() - .addInt("col1", 1) - .addBigInt("col2", 100L) - .addDouble("col3", 1.3d) - .addText("col4", "value") - .addBoolean("col5", false) - .addInt("col6", 100) - .build(); -``` - -#### `Get` operation - -`Get` is an operation to retrieve a single record specified by a primary key. - -You need to create a `Get` object first, and then you can execute the object by using the `transaction.get()` method as follows: - -```java -// Create a `Get` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key clusteringKey = Key.of("c2", "aaa", "c3", 100L); - -Get get = - Get.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .projections("c1", "c2", "c3", "c4") - .build(); - -// Execute the `Get` operation. -Optional result = transaction.get(get); -``` - -You can also specify projections to choose which columns are returned. - -##### Handle `Result` objects - -The `Get` operation and `Scan` operation return `Result` objects. The following shows how to handle `Result` objects. - -You can get a column value of a result by using `get("")` methods as follows: - -```java -// Get the BOOLEAN value of a column. -boolean booleanValue = result.getBoolean(""); - -// Get the INT value of a column. -int intValue = result.getInt(""); - -// Get the BIGINT value of a column. -long bigIntValue = result.getBigInt(""); - -// Get the FLOAT value of a column. -float floatValue = result.getFloat(""); - -// Get the DOUBLE value of a column. -double doubleValue = result.getDouble(""); - -// Get the TEXT value of a column. -String textValue = result.getText(""); - -// Get the BLOB value of a column as a `ByteBuffer`. -ByteBuffer blobValue = result.getBlob(""); - -// Get the BLOB value of a column as a `byte` array. -byte[] blobValueAsBytes = result.getBlobAsBytes(""); -``` - -And if you need to check if a value of a column is null, you can use the `isNull("")` method. - -``` java -// Check if a value of a column is null. 
-boolean isNull = result.isNull(""); -``` - -For more details, see the `Result` page in the [Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) of the version of ScalarDB that you're using. - -##### Execute `Get` by using a secondary index - -You can execute a `Get` operation by using a secondary index. - -Instead of specifying a partition key, you can specify an index key (indexed column) to use a secondary index as follows: - -```java -// Create a `Get` operation by using a secondary index. -Key indexKey = Key.ofFloat("c4", 1.23F); - -Get get = - Get.newBuilder() - .namespace("ns") - .table("tbl") - .indexKey(indexKey) - .projections("c1", "c2", "c3", "c4") - .build(); - -// Execute the `Get` operation. -Optional result = transaction.get(get); -``` - -{% capture notice--info %} -**Note** - -If the result has more than one record, `transaction.get()` will throw an exception. If you want to handle multiple results, see [Execute `Scan` by using a secondary index](#execute-scan-by-using-a-secondary-index). - -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### `Scan` operation - -`Scan` is an operation to retrieve multiple records within a partition. You can specify clustering-key boundaries and orderings for clustering-key columns in `Scan` operations. - -You need to create a `Scan` object first, and then you can execute the object by using the `transaction.scan()` method as follows: - -```java -// Create a `Scan` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key startClusteringKey = Key.of("c2", "aaa", "c3", 100L); -Key endClusteringKey = Key.of("c2", "aaa", "c3", 300L); - -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .start(startClusteringKey, true) // Include startClusteringKey - .end(endClusteringKey, false) // Exclude endClusteringKey - .projections("c1", "c2", "c3", "c4") - .orderings(Scan.Ordering.desc("c2"), Scan.Ordering.asc("c3")) - .limit(10) - .build(); - -// Execute the `Scan` operation. -List results = transaction.scan(scan); -``` - -You can omit the clustering-key boundaries or specify either a `start` boundary or an `end` boundary. If you don't specify `orderings`, you will get results ordered by the clustering order that you defined when creating the table. - -In addition, you can specify `projections` to choose which columns are returned and use `limit` to specify the number of records to return in `Scan` operations. - -##### Execute `Scan` by using a secondary index - -You can execute a `Scan` operation by using a secondary index. - -Instead of specifying a partition key, you can specify an index key (indexed column) to use a secondary index as follows: - -```java -// Create a `Scan` operation by using a secondary index. -Key indexKey = Key.ofFloat("c4", 1.23F); - -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .indexKey(indexKey) - .projections("c1", "c2", "c3", "c4") - .limit(10) - .build(); - -// Execute the `Scan` operation. -List results = transaction.scan(scan); -``` - -{% capture notice--info %} -**Note** - -You can't specify clustering-key boundaries and orderings in `Scan` by using a secondary index. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -##### Execute cross-partition `Scan` without specifying a partition key to retrieve all the records of a table - -You can execute a `Scan` operation across all partitions, which we call *cross-partition scan*, without specifying a partition key by enabling the following configuration in the ScalarDB properties file. - -```properties -scalar.db.cross_partition_scan.enabled=true -``` - -{% capture notice--warning %} -**Attention** - -For non-JDBC databases, we do not recommend enabling cross-partition scan with the `SERIALIZABLE` isolation level because transactions could be executed at a lower isolation level (that is, `SNAPSHOT`). When using non-JDBC databases, use cross-partition scan at your own risk only if consistency does not matter for your transactions. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -Instead of calling the `partitionKey()` method in the builder, you can call the `all()` method to scan a table without specifying a partition key as follows: - -```java -// Create a `Scan` operation without specifying a partition key. -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .all() - .projections("c1", "c2", "c3", "c4") - .limit(10) - .build(); - -// Execute the `Scan` operation. -List results = transaction.scan(scan); -``` - -{% capture notice--info %} -**Note** - -You can't specify any orderings in cross-partition `Scan` when using non-JDBC databases. For details on how to use cross-partition `Scan` with filtering or ordering, see [Execute cross-partition `Scan` with filtering and ordering](#execute-cross-partition-scan-with-filtering-and-ordering). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -##### Execute cross-partition `Scan` with filtering and ordering - -By enabling the cross-partition scan option with filtering and ordering as follows, you can execute a cross-partition `Scan` operation with flexible conditions and orderings: - -```properties -scalar.db.cross_partition_scan.enabled=true -scalar.db.cross_partition_scan.filtering.enabled=true -scalar.db.cross_partition_scan.ordering.enabled=true -``` - -You can call the `where()` and `ordering()` methods after calling the `all()` method to specify arbitrary conditions and orderings as follows: - -```java -// Create a `Scan` operation with arbitrary conditions and orderings. -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .all() - .where(ConditionBuilder.column("c1").isNotEqualToInt(10)) - .projections("c1", "c2", "c3", "c4") - .orderings(Scan.Ordering.desc("c3"), Scan.Ordering.asc("c4")) - .limit(10) - .build(); - -// Execute the `Scan` operation. -List results = transaction.scan(scan); -``` - -As an argument of the `where()` method, you can specify a condition, an and-wise condition set, or an or-wise condition set. After calling the `where()` method, you can add more conditions or condition sets by using the `and()` method or `or()` method as follows: - -```java -// Create a `Scan` operation with condition sets. -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .all() - .where( - ConditionSetBuilder.condition(ConditionBuilder.column("c1").isLessThanInt(10)) - .or(ConditionBuilder.column("c1").isGreaterThanInt(20)) - .build()) - .and( - ConditionSetBuilder.condition(ConditionBuilder.column("c2").isLikeText("a%")) - .or(ConditionBuilder.column("c2").isLikeText("b%")) - .build()) - .limit(10) - .build(); -``` - -{% capture notice--info %} -**Note** - -In the `where()` condition method chain, the conditions must be an and-wise junction of `ConditionalExpression` or `OrConditionSet` (known as conjunctive normal form) like the above example or an or-wise junction of `ConditionalExpression` or `AndConditionSet` (known as disjunctive normal form). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -For more details about available conditions and condition sets, see the `ConditionBuilder` and `ConditionSetBuilder` page in the [Javadoc](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) of the version of ScalarDB that you're using. - -#### `Put` operation - -`Put` is an operation to put a record specified by a primary key. The operation behaves as an upsert operation for a record, in which the operation updates the record if the record exists or inserts the record if the record does not exist. - -{% capture notice--info %} -**Note** - -When you update an existing record, you need to read the record by using `Get` or `Scan` before using a `Put` operation. Otherwise, the operation will fail due to a conflict. This occurs because of the specification of ScalarDB to manage transactions properly. Instead of reading the record explicitly, you can enable implicit pre-read. For details, see [Enable implicit pre-read for `Put` operations](#enable-implicit-pre-read-for-put-operations). -{% endcapture %} - -
{{ notice--info | markdownify }}
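The following is a minimal sketch of the read-before-write pattern described in the note above: the transaction reads the record with `Get` and then updates it with `Put`, without implicit pre-read. The key and column values are the same illustrative ones used elsewhere in this guide.

```java
// Read the record first so that the subsequent Put does not fail due to a conflict.
Get get =
    Get.newBuilder()
        .namespace("ns")
        .table("tbl")
        .partitionKey(partitionKey)
        .clusteringKey(clusteringKey)
        .build();
Optional<Result> result = transaction.get(get);

// Then update the same record in the same transaction.
Put put =
    Put.newBuilder()
        .namespace("ns")
        .table("tbl")
        .partitionKey(partitionKey)
        .clusteringKey(clusteringKey)
        .floatValue("c4", 1.23F)
        .build();
transaction.put(put);
```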
- -You need to create a `Put` object first, and then you can execute the object by using the `transaction.put()` method as follows: - -```java -// Create a `Put` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key clusteringKey = Key.of("c2", "aaa", "c3", 100L); - -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .build(); - -// Execute the `Put` operation. -transaction.put(put); -``` - -You can also put a record with `null` values as follows: - -```java -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", null) - .doubleValue("c5", null) - .build(); -``` - -##### Enable implicit pre-read for `Put` operations - -In Consensus Commit, an application must read a record before mutating the record with `Put` and `Delete` operations to obtain the latest states of the record if the record exists. Instead of reading the record explicitly, you can enable *implicit pre-read*. By enabling implicit pre-read, if an application does not read the record explicitly in a transaction, ScalarDB will read the record on behalf of the application before committing the transaction. - -You can enable implicit pre-read for a `Put` operation by specifying `enableImplicitPreRead()` in the `Put` operation builder as follows: - -```java -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .enableImplicitPreRead() - .build(); -``` - -{% capture notice--info %} -**Note** - -If you are certain that a record you are trying to mutate does not exist, you should not enable implicit pre-read for the `Put` operation for better performance. For example, if you load initial data, you should not enable implicit pre-read. A `Put` operation without implicit pre-read is faster than `Put` operation with implicit pre-read because the operation skips an unnecessary read. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### `Delete` operation - -`Delete` is an operation to delete a record specified by a primary key. - -{% capture notice--info %} -**Note** - -When you delete a record, you don't have to read the record beforehand because implicit pre-read is always enabled for `Delete` operations. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -You need to create a `Delete` object first, and then you can execute the object by using the `transaction.delete()` method as follows: - -```java -// Create a `Delete` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key clusteringKey = Key.of("c2", "aaa", "c3", 100L); - -Delete delete = - Delete.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .build(); - -// Execute the `Delete` operation. -transaction.delete(delete); -``` - -#### `Put` and `Delete` with a condition - -You can write arbitrary conditions (for example, a bank account balance must be equal to or more than zero) that you require a transaction to meet before being committed by implementing logic that checks the conditions in the transaction. Alternatively, you can write simple conditions in a mutation operation, such as `Put` and `Delete`. - -When a `Put` or `Delete` operation includes a condition, the operation is executed only if the specified condition is met. If the condition is not met when the operation is executed, an exception called `UnsatisfiedConditionException` will be thrown. - -{% capture notice--info %} -**Note** - -When you specify a condition in a `Put` operation, you need to read the record beforehand or enable implicit pre-read. -{% endcapture %} - -
{{ notice--info | markdownify }}
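For example, if the record has not been read earlier in the transaction, a conditional `Put` can enable implicit pre-read so that the condition can still be evaluated. The following is a minimal sketch combining `condition()` and `enableImplicitPreRead()`, both of which are described in this guide:

```java
// Build a condition (the same form as the putIf example below).
MutationCondition condition =
    ConditionBuilder.putIf(ConditionBuilder.column("c4").isEqualToFloat(0.0F))
        .and(ConditionBuilder.column("c5").isEqualToDouble(0.0))
        .build();

// Enable implicit pre-read so the condition can be checked without an explicit prior read.
Put put =
    Put.newBuilder()
        .namespace("ns")
        .table("tbl")
        .partitionKey(partitionKey)
        .clusteringKey(clusteringKey)
        .floatValue("c4", 1.23F)
        .doubleValue("c5", 4.56)
        .condition(condition)
        .enableImplicitPreRead()
        .build();

transaction.put(put);
```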
- - -##### Conditions for `Put` - -You can specify a condition in a `Put` operation as follows: - -```java -// Build a condition. -MutationCondition condition = - ConditionBuilder.putIf(ConditionBuilder.column("c4").isEqualToFloat(0.0F)) - .and(ConditionBuilder.column("c5").isEqualToDouble(0.0)) - .build(); - -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .condition(condition) // condition - .build(); - -// Execute the `Put` operation. -transaction.put(put); -``` - -In addition to using the `putIf` condition, you can specify the `putIfExists` and `putIfNotExists` conditions as follows: - -```java -// Build a `putIfExists` condition. -MutationCondition putIfExistsCondition = ConditionBuilder.putIfExists(); - -// Build a `putIfNotExists` condition. -MutationCondition putIfNotExistsCondition = ConditionBuilder.putIfNotExists(); -``` - -##### Conditions for `Delete` - -You can specify a condition in a `Delete` operation as follows: - -```java -// Build a condition. -MutationCondition condition = - ConditionBuilder.deleteIf(ConditionBuilder.column("c4").isEqualToFloat(0.0F)) - .and(ConditionBuilder.column("c5").isEqualToDouble(0.0)) - .build(); - -Delete delete = - Delete.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .condition(condition) // condition - .build(); - -// Execute the `Delete` operation. -transaction.delete(delete); -``` - -In addition to using the `deleteIf` condition, you can specify the `deleteIfExists` condition as follows: - -```java -// Build a `deleteIfExists` condition. -MutationCondition deleteIfExistsCondition = ConditionBuilder.deleteIfExists(); -``` - -#### Mutate operation - -Mutate is an operation to execute multiple mutations (`Put` and `Delete` operations). - -You need to create mutation objects first, and then you can execute the objects by using the `transaction.mutate()` method as follows: - -```java -// Create `Put` and `Delete` operations. -Key partitionKey = Key.ofInt("c1", 10); - -Key clusteringKeyForPut = Key.of("c2", "aaa", "c3", 100L); - -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKeyForPut) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .build(); - -Key clusteringKeyForDelete = Key.of("c2", "bbb", "c3", 200L); - -Delete delete = - Delete.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKeyForDelete) - .build(); - -// Execute the operations. -transaction.mutate(Arrays.asList(put, delete)); -``` - -#### Default namespace for CRUD operations - -A default namespace for all CRUD operations can be set by using a property in the ScalarDB configuration. - -```properties -scalar.db.default_namespace_name= -``` - -Any operation that does not specify a namespace will use the default namespace set in the configuration. - -```java -// This operation will target the default namespace. -Scan scanUsingDefaultNamespace = - Scan.newBuilder() - .table("tbl") - .all() - .build(); -// This operation will target the "ns" namespace. -Scan scanUsingSpecifiedNamespace = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .all() - .build(); -``` - -### Commit a transaction - -After executing CRUD operations, you need to commit a transaction to finish it. - -You can commit a transaction as follows: - -```java -// Commit a transaction. 
-transaction.commit(); -``` - -### Roll back or abort a transaction - -If an error occurs when executing a transaction, you can roll back or abort the transaction. - -You can roll back a transaction as follows: - -```java -// Roll back a transaction. -transaction.rollback(); -``` - -Or, you can abort a transaction as follows: - -```java -// Abort a transaction. -transaction.abort(); -``` - -For details about how to handle exceptions in ScalarDB, see [How to handle exceptions](#how-to-handle-exceptions). - -## How to handle exceptions - -When executing a transaction, you will also need to handle exceptions properly. - -{% capture notice--warning %} -**Attention** - -If you don't handle exceptions properly, you may face anomalies or data inconsistency. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -The following sample code shows how to handle exceptions: - -```java -public class Sample { - public static void main(String[] args) throws Exception { - TransactionFactory factory = TransactionFactory.create(""); - DistributedTransactionManager transactionManager = factory.getTransactionManager(); - - int retryCount = 0; - TransactionException lastException = null; - - while (true) { - if (retryCount++ > 0) { - // Retry the transaction three times maximum. - if (retryCount >= 3) { - // Throw the last exception if the number of retries exceeds the maximum. - throw lastException; - } - - // Sleep 100 milliseconds before retrying the transaction. - TimeUnit.MILLISECONDS.sleep(100); - } - - DistributedTransaction transaction = null; - try { - // Begin a transaction. - transaction = transactionManager.begin(); - - // Execute CRUD operations in the transaction. - Optional result = transaction.get(...); - List results = transaction.scan(...); - transaction.put(...); - transaction.delete(...); - - // Commit the transaction. - transaction.commit(); - } catch (UnsatisfiedConditionException e) { - // You need to handle `UnsatisfiedConditionException` only if a mutation operation specifies a condition. - // This exception indicates the condition for the mutation operation is not met. - - try { - transaction.rollback(); - } catch (RollbackException ex) { - // Rolling back the transaction failed. Since the transaction should eventually recover, - // you don't need to do anything further. You can simply log the occurrence here. - } - - // You can handle the exception here, according to your application requirements. - - return; - } catch (UnknownTransactionStatusException e) { - // If you catch `UnknownTransactionStatusException` when committing the transaction, - // it indicates that the status of the transaction, whether it was successful or not, is unknown. - // In such a case, you need to check if the transaction is committed successfully or not and - // retry the transaction if it failed. How to identify a transaction status is delegated to users. - return; - } catch (TransactionException e) { - // For other exceptions, you can try retrying the transaction. - - // For `CrudConflictException`, `CommitConflictException`, and `TransactionNotFoundException`, - // you can basically retry the transaction. However, for the other exceptions, the transaction - // will still fail if the cause of the exception is non-transient. In such a case, you will - // exhaust the number of retries and throw the last exception. - - if (transaction != null) { - try { - transaction.rollback(); - } catch (RollbackException ex) { - // Rolling back the transaction failed. The transaction should eventually recover, - // so you don't need to do anything further. You can simply log the occurrence here. - } - } - - lastException = e; - } - } - } -} -``` - -### `TransactionException` and `TransactionNotFoundException` - -The `begin()` API could throw `TransactionException` or `TransactionNotFoundException`: - -- If you catch `TransactionException`, this exception indicates that the transaction has failed to begin due to transient or non-transient faults. You can try retrying the transaction, but you may not be able to begin the transaction due to non-transient faults. -- If you catch `TransactionNotFoundException`, this exception indicates that the transaction has failed to begin due to transient faults. In this case, you can retry the transaction. - -The `join()` API could also throw `TransactionNotFoundException`. 
You can handle this exception in the same way that you handle the exceptions for the `begin()` API. - -### `CrudException` and `CrudConflictException` - -The APIs for CRUD operations (`get()`, `scan()`, `put()`, `delete()`, and `mutate()`) could throw `CrudException` or `CrudConflictException`: - -- If you catch `CrudException`, this exception indicates that the transaction CRUD operation has failed due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction may still fail if the cause is non-transient. -- If you catch `CrudConflictException`, this exception indicates that the transaction CRUD operation has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning. - -### `UnsatisfiedConditionException` - -The APIs for mutation operations (`put()`, `delete()`, and `mutate()`) could also throw `UnsatisfiedConditionException`. - -If you catch `UnsatisfiedConditionException`, this exception indicates that the condition for the mutation operation is not met. You can handle this exception according to your application requirements. - -### `CommitException`, `CommitConflictException`, and `UnknownTransactionStatusException` - -The `commit()` API could throw `CommitException`, `CommitConflictException`, or `UnknownTransactionStatusException`: - -- If you catch `CommitException`, this exception indicates that committing the transaction fails due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction may still fail if the cause is non-transient. -- If you catch `CommitConflictException`, this exception indicates that committing the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning. -- If you catch `UnknownTransactionStatusException`, this exception indicates that the status of the transaction, whether it was successful or not, is unknown. In this case, you need to check if the transaction is committed successfully and retry the transaction if it has failed. - -How to identify a transaction status is delegated to users. You may want to create a transaction status table and update it transactionally with other application data so that you can get the status of a transaction from the status table. - -### Notes about some exceptions - -Although not illustrated in the sample code, the `resume()` API could also throw `TransactionNotFoundException`. This exception indicates that the transaction associated with the specified ID was not found and/or the transaction might have expired. In either case, you can retry the transaction from the beginning since the cause of this exception is basically transient. - -In the sample code, for `UnknownTransactionStatusException`, the transaction is not retried because the application must check if the transaction was successful to avoid potential duplicate operations. For other exceptions, the transaction is retried because the cause of the exception is transient or non-transient. If the cause of the exception is transient, the transaction may succeed if you retry it. However, if the cause of the exception is non-transient, the transaction will still fail even if you retry it. In such a case, you will exhaust the number of retries. - -{% capture notice--info %} -**Note** - -In the sample code, the transaction is retried three times maximum and sleeps for 100 milliseconds before it is retried. 
But you can choose a retry policy, such as exponential backoff, according to your application requirements. -{% endcapture %} - -
{{ notice--info | markdownify }}
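As one possible alternative to the fixed 100-millisecond sleep in the sample code, the delay can grow exponentially between retries. The following is a minimal sketch of such a policy; the initial delay and the cap are illustrative values, not recommendations.

```java
// Exponential backoff: double the delay before each retry, up to a cap.
long backoffMillis = 100;             // illustrative initial delay
final long maxBackoffMillis = 1_000;  // illustrative cap

// Inside the retry loop of the sample code, before retrying the transaction:
TimeUnit.MILLISECONDS.sleep(backoffMillis);
backoffMillis = Math.min(backoffMillis * 2, maxBackoffMillis);
```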
- -### Group commit for the Coordinator table - -The Coordinator table that is used for Consensus Commit transactions is a vital data store, and using robust storage for it is recommended. However, utilizing more robust storage options, such as internally leveraging multi-AZ or multi-region replication, may lead to increased latency when writing records to the storage, resulting in poor throughput performance. - -ScalarDB provides a group commit feature for the Coordinator table that groups multiple record writes into a single write operation, improving write throughput. In this case, latency may increase or decrease, depending on the underlying database and the workload. - -To enable the group commit feature, add the following configuration: - -```properties -# By default, this configuration is set to `false`. -scalar.db.consensus_commit.coordinator.group_commit.enabled=true - -# These properties are for tuning the performance of the group commit feature. -# scalar.db.consensus_commit.coordinator.group_commit.group_size_fix_timeout_millis=40 -# scalar.db.consensus_commit.coordinator.group_commit.delayed_slot_move_timeout_millis=800 -# scalar.db.consensus_commit.coordinator.group_commit.old_group_abort_timeout_millis=30000 -# scalar.db.consensus_commit.coordinator.group_commit.timeout_check_interval_millis=10 -# scalar.db.consensus_commit.coordinator.group_commit.metrics_monitor_log_enabled=true -``` - -#### Limitations - -This section describes the limitations of the group commit feature. - -##### Custom transaction ID passed by users - -The group commit feature implicitly generates an internal value and uses it as a part of transaction ID. Therefore, a custom transaction ID manually passed by users via `com.scalar.db.transaction.consensuscommit.ConsensusCommitManager.begin(String txId)` or `com.scalar.db.transaction.consensuscommit.TwoPhaseConsensusCommitManager.begin(String txId)` can't be used as is for later API calls. You need to use a transaction ID returned from`com.scalar.db.transaction.consensuscommit.ConsensusCommit.getId()` or `com.scalar.db.transaction.consensuscommit.TwoPhaseConsensusCommit.getId()` instead. - -```java - // This custom transaction ID needs to be used for ScalarDB transactions. - String myTxId = UUID.randomUUID().toString(); - - ... - - DistributedTransaction transaction = manager.begin(myTxId); - - ... - - // When the group commit feature is enabled, a custom transaction ID passed by users can't be used as is. - // logger.info("The transaction state: {}", manager.getState(myTxId)); - logger.info("The transaction state: {}", manager.getState(transaction.getId())); -``` - -##### Prohibition of use with a two-phase commit interface - -The group commit feature manages all ongoing transactions in memory. If this feature is enabled with a two-phase commit interface, the information must be solely maintained by the coordinator service to prevent conflicts caused by participant services' inconsistent writes to the Coordinator table, which may contain different transaction distributions over groups. - -This limitation introduces some complexities and inflexibilities related to application development. Therefore, combining the use of the group commit feature with a two-phase commit interface is currently prohibited. 
- -## Investigating Consensus Commit transaction manager errors - -To investigate errors when using the Consensus Commit transaction manager, you can enable a configuration that will return table metadata augmented with transaction metadata columns, which can be helpful when investigating transaction-related issues. This configuration, which is only available when troubleshooting the Consensus Commit transaction manager, enables you to see transaction metadata column details for a given table by using the `DistributedTransactionAdmin.getTableMetadata()` method. - -By adding the following configuration, `Get` and `Scan` operations results will contain [transaction metadata](schema-loader.md#internal-metadata-for-consensus-commit): - -```properties -# By default, this configuration is set to `false`. -scalar.db.consensus_commit.include_metadata.enabled=true -``` diff --git a/docs/backup-restore.md b/docs/backup-restore.md deleted file mode 100644 index b9daf146f8..0000000000 --- a/docs/backup-restore.md +++ /dev/null @@ -1,229 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# How to Back Up and Restore Databases Used Through ScalarDB - -Since ScalarDB provides transaction capabilities on top of non-transactional or transactional databases non-invasively, you need to take special care to back up and restore the databases in a transactionally consistent way. - -This guide describes how to back up and restore the databases that ScalarDB supports. - -## Create a backup - -How you create a backup depends on which database you're using and whether or not you're using multiple databases. The following decision tree shows which approach you should take. - -```mermaid -flowchart TD - A[Are you using a single database with ScalarDB?] - A -->|Yes| B[Does the database have transaction support?] - B -->|Yes| C[Perform back up without explicit pausing] - B ---->|No| D[Perform back up with explicit pausing] - A ---->|No| D -``` - -### Back up without explicit pausing - -If you're using ScalarDB with a single database with support for transactions, you can create a backup of the database even while ScalarDB continues to accept transactions. - -{% capture notice--warning %} -**Attention** - -Before creating a backup, you should consider the safest way to create a transactionally consistent backup of your databases and understand any risks that are associated with the backup process. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -One requirement for creating a backup in ScalarDB is that backups for all the ScalarDB-managed tables (including the Coordinator table) need to be transactionally consistent or automatically recoverable to a transactionally consistent state. That means that you need to create a consistent backup by dumping all tables in a single transaction. - -How you create a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB. - -{% capture notice--info %} -**Note** - -The backup methods by database listed below are just examples of some of the databases that ScalarDB supports. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -
-
- - - - - -
- -
- -You can restore to any point within the backup retention period by using the automated backup feature. -
-
- -Use the `mysqldump` command with the `--single-transaction` option. -
-
- -Use the `pg_dump` command. -
-
- -Use the `.backup` command with the `.timeout` command as specified in [Special commands to sqlite3 (dot-commands)](https://www.sqlite.org/cli.html#special_commands_to_sqlite3_dot_commands_) - -For an example, see [BASH: SQLite3 .backup command](https://stackoverflow.com/questions/23164445/bash-sqlite3-backup-command). -
-
- -Clusters are backed up automatically based on the backup policy, and these backups are retained for a specific duration. You can also perform on-demand backups. For details on performing backups, see [YugabyteDB Managed: Back up and restore clusters](https://docs.yugabyte.com/preview/yugabyte-cloud/cloud-clusters/backup-clusters/). -
-
- -### Back up with explicit pausing - -Another way to create a transactionally consistent backup is to create a backup while a cluster of ScalarDB instances does not have any outstanding transactions. Creating the backup depends on the following: - -- If the underlying database has a point-in-time snapshot or backup feature, you can create a backup during the period when no outstanding transactions exist. -- If the underlying database has a point-in-time restore or recovery (PITR) feature, you can set a restore point to a time (preferably the mid-time) in the pause duration period when no outstanding transactions exist. - -{% capture notice--info %} -**Note** - -When using a PITR feature, you should minimize the clock drifts between clients and servers by using clock synchronization, such as NTP. Otherwise, the time you get as the paused duration might be too different from the time in which the pause was actually conducted, which could restore the backup to a point where ongoing transactions exist. - -In addition, you should pause for a sufficient amount of time (for example, five seconds) and use the mid-time of the paused duration as a restore point since clock synchronization cannot perfectly synchronize clocks between nodes. -{% endcapture %} - -
{{ notice--info | markdownify }}
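As a rough illustration of the mid-time guidance above, the following minimal Java sketch derives a PITR restore point from a recorded pause window; the timestamps, the five-second pause, and the class name are assumptions for illustration, not values prescribed by ScalarDB.

```java
import java.time.Duration;
import java.time.Instant;

public class RestorePointCalculator {
  public static void main(String[] args) {
    // Assumed values: record these from your own pausing procedure,
    // for example immediately before and after pausing via the Scalar Admin client tool.
    Instant pauseStart = Instant.parse("2024-01-01T00:00:00Z");
    Instant pauseEnd = pauseStart.plus(Duration.ofSeconds(5)); // pause for a sufficient amount of time

    // Use the mid-time of the paused duration as the restore point to absorb clock drift between nodes.
    Instant restorePoint = pauseStart.plus(Duration.between(pauseStart, pauseEnd).dividedBy(2));
    System.out.println("PITR restore point: " + restorePoint);
  }
}
```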
-
-To make ScalarDB drain outstanding requests and stop accepting new requests so that a pause duration can be created, you should implement the [Scalar Admin](https://github.com/scalar-labs/scalar-admin) interface properly in your application that uses ScalarDB or use [ScalarDB Cluster (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/), which implements the Scalar Admin interface.
-
-By using the [Scalar Admin client tool](https://github.com/scalar-labs/scalar-admin/tree/main/java#scalar-admin-client-tool), you can pause nodes, servers, or applications that implement the Scalar Admin interface without losing ongoing transactions.
-
-How you create a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to create a transactionally consistent backup for ScalarDB.
-
-{% capture notice--info %}
-**Note**
-
-The backup methods by database listed below are just examples of some of the databases that ScalarDB supports.
-{% endcapture %}
-
{{ notice--info | markdownify }}
- -
-
- - - - -
- -
- -Cassandra has a built-in replication feature, so you do not always have to create a transactionally consistent backup. For example, if the replication factor is set to `3` and only the data of one of the nodes in a Cassandra cluster is lost, you won't need a transactionally consistent backup (snapshot) because the node can be recovered by using a normal, transactionally inconsistent backup (snapshot) and the repair feature. - -However, if the quorum of cluster nodes loses their data, you will need a transactionally consistent backup (snapshot) to restore the cluster to a certain transactionally consistent point. - -To create a transactionally consistent cluster-wide backup (snapshot), pause the application that is using ScalarDB or [ScalarDB Cluster (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/) and create backups (snapshots) of the nodes as described in [Back up with explicit pausing](#back-up-with-explicit-pausing) or stop the Cassandra cluster, take copies of all the data in the nodes, and start the cluster. -
-
- -You must create a Cosmos DB for NoSQL account with a continuous backup policy that has the PITR feature enabled. After enabling the feature, backups are created continuously. - -To specify a transactionally consistent restore point, pause your application that is using ScalarDB with Cosmos DB for NoSQL as described in [Back up with explicit pausing](#back-up-with-explicit-pausing). -
-
- -You must enable the PITR feature for DynamoDB tables. If you're using [ScalarDB Schema Loader](schema-loader.md) to create schemas, the tool enables the PITR feature for tables by default. - -To specify a transactionally consistent restore point, pause your application that is using ScalarDB with DynamoDB as described in [Back up with explicit pausing](#back-up-with-explicit-pausing). -
-
- -You can perform on-demand backups or scheduled backups during a paused duration. For details on performing backups, see [YugabyteDB Managed: Back up and restore clusters](https://docs.yugabyte.com/preview/yugabyte-cloud/cloud-clusters/backup-clusters/). -
-
-
-## Restore a backup
-
-How you restore a transactionally consistent backup depends on the type of database that you're using. Select a database to see how to restore a transactionally consistent backup for ScalarDB.
-
-{% capture notice--info %}
-**Note**
-
-The restore methods by database listed below are just examples of some of the databases that ScalarDB supports.
-{% endcapture %}
-
{{ notice--info | markdownify }}
- -
-
- - - - - - - - -
- -
- -You can restore to any point within the backup retention period by using the automated backup feature. -
-
- -First, stop all the nodes of the Cassandra cluster. Then, clean the `data`, `commitlog`, and `hints` directories, and place the backups (snapshots) in each node. - -After placing the backups (snapshots) in each node, start all the nodes of the Cassandra Cluster. -
-
- -Follow the official Azure documentation for [restore an account by using Azure portal](https://docs.microsoft.com/en-us/azure/cosmos-db/restore-account-continuous-backup#restore-account-portal). After restoring a backup, [configure the default consistency level](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/how-to-manage-consistency#configure-the-default-consistency-level) of the restored databases to `STRONG`. In addition, you should use the mid-time of the paused duration as the restore point as previously explained. - -ScalarDB implements the Cosmos DB adapter by using its stored procedures, which are installed when creating schemas by using ScalarDB Schema Loader. However, the PITR feature of Cosmos DB doesn't restore stored procedures. Because of this, you need to re-install the required stored procedures for all tables after restoration. You can do this by using ScalarDB Schema Loader with the `--repair-all` option. For details, see [Repair tables](schema-loader.md#repair-tables). -
-
- -Follow the official AWS documentation for [restoring a DynamoDB table to a point in time](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/PointInTimeRecovery.Tutorial.html), but keep in mind that a table can only be restored with an alias. Because of this, you will need to restore the table with an alias, delete the original table, and rename the alias to the original name to restore the tables with the same name. - -To do this procedure: - -1. Create a backup. - 1. Select the mid-time of the paused duration as the restore point. - 2. Restore by using the PITR of table A to table B. - 3. Create a backup of the restored table B (assuming that the backup is named backup B). - 4. Remove table B. -2. Restore the backup. - 1. Remove table A. - 2. Create a table named A by using backup B. - -{% capture notice--info %} -**Note** - -* You must do the steps mentioned above for each table because tables can only be restored one at a time. -* Configurations such as PITR and auto-scaling policies are reset to the default values for restored tables, so you must manually configure the required settings. For details, see the official AWS documentation for [How to restore DynamoDB tables with DynamoDB](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/CreateBackup.html#CreateBackup_HowItWorks-restore). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -
-
- -If you used `mysqldump` to create the backup file, use the `mysql` command to restore the backup as specified in [Reloading SQL-Format Backups](https://dev.mysql.com/doc/mysql-backup-excerpt/8.0/en/reloading-sql-format-dumps.html). -
-
- -If you used `pg_dump` to create the backup file, use the `psql` command to restore the backup as specified in [Restoring the Dump](https://www.postgresql.org/docs/current/backup-dump.html#BACKUP-DUMP-RESTORE). -
-
- -Use the `.restore` command as specified in [Special commands to sqlite3 (dot-commands)](https://www.sqlite.org/cli.html#special_commands_to_sqlite3_dot_commands_). -
-
- -You can restore from the scheduled or on-demand backup within the backup retention period. For details on performing backups, see [YugabyteDB Managed: Back up and restore clusters](https://docs.yugabyte.com/preview/yugabyte-cloud/cloud-clusters/backup-clusters/). -
-
diff --git a/docs/configurations.md b/docs/configurations.md deleted file mode 100644 index b801812ec1..0000000000 --- a/docs/configurations.md +++ /dev/null @@ -1,288 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB Configurations - -This page describes the available configurations for ScalarDB. - -## ScalarDB client configurations - -ScalarDB provides its own transaction protocol called Consensus Commit. You can use the Consensus Commit protocol directly through the ScalarDB client library or through [ScalarDB Cluster (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/), which is a component that is available only in the ScalarDB Enterprise edition. - -### Use Consensus Commit directly - -Consensus Commit is the default transaction manager type in ScalarDB. To use the Consensus Commit transaction manager, add the following to the ScalarDB properties file: - -```properties -scalar.db.transaction_manager=consensus-commit -``` - -{% capture notice--info %} -**Note** - -If you don't specify the `scalar.db.transaction_manager` property, `consensus-commit` will be the default value. -{% endcapture %} - -
{{ notice--info | markdownify }}
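For reference, the following is a minimal Java sketch of obtaining a Consensus Commit transaction manager from such a properties file and running a transaction; the file name `scalardb.properties` and the omitted CRUD operations are placeholders, not requirements.

```java
import com.scalar.db.api.DistributedTransaction;
import com.scalar.db.api.DistributedTransactionManager;
import com.scalar.db.service.TransactionFactory;

public class ConsensusCommitExample {
  public static void main(String[] args) throws Exception {
    // Assumes a properties file that sets scalar.db.transaction_manager=consensus-commit
    // (or omits it, since consensus-commit is the default) and configures the underlying storage.
    TransactionFactory factory = TransactionFactory.create("scalardb.properties");
    DistributedTransactionManager manager = factory.getTransactionManager();

    DistributedTransaction tx = manager.start();
    try {
      // CRUD operations (Get, Scan, Put, Delete) on the transaction would go here.
      tx.commit();
    } catch (Exception e) {
      tx.abort();
      throw e;
    } finally {
      manager.close();
    }
  }
}
```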
- -#### Basic configurations - -The following basic configurations are available for the Consensus Commit transaction manager: - -| Name | Description | Default | -|-------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------| -| `scalar.db.transaction_manager` | `consensus-commit` should be specified. | - | -| `scalar.db.consensus_commit.isolation_level` | Isolation level used for Consensus Commit. Either `SNAPSHOT` or `SERIALIZABLE` can be specified. | `SNAPSHOT` | -| `scalar.db.consensus_commit.serializable_strategy` | Serializable strategy used for Consensus Commit. Either `EXTRA_READ` or `EXTRA_WRITE` can be specified. If `SNAPSHOT` is specified in the property `scalar.db.consensus_commit.isolation_level`, this configuration will be ignored. | `EXTRA_READ` | -| `scalar.db.consensus_commit.coordinator.namespace` | Namespace name of Coordinator tables. | `coordinator` | -| `scalar.db.consensus_commit.include_metadata.enabled` | If set to `true`, `Get` and `Scan` operations results will contain transaction metadata. To see the transaction metadata columns details for a given table, you can use the `DistributedTransactionAdmin.getTableMetadata()` method, which will return the table metadata augmented with the transaction metadata columns. Using this configuration can be useful to investigate transaction-related issues. | `false` | - -#### Performance-related configurations - -The following performance-related configurations are available for the Consensus Commit transaction manager: - -| Name | Description | Default | -|----------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------| -| `scalar.db.consensus_commit.parallel_executor_count` | Number of executors (threads) for parallel execution. | `128` | -| `scalar.db.consensus_commit.parallel_preparation.enabled` | Whether or not the preparation phase is executed in parallel. | `true` | -| `scalar.db.consensus_commit.parallel_validation.enabled` | Whether or not the validation phase (in `EXTRA_READ`) is executed in parallel. | The value of `scalar.db.consensus_commit.parallel_commit.enabled` | -| `scalar.db.consensus_commit.parallel_commit.enabled` | Whether or not the commit phase is executed in parallel. | `true` | -| `scalar.db.consensus_commit.parallel_rollback.enabled` | Whether or not the rollback phase is executed in parallel. | The value of `scalar.db.consensus_commit.parallel_commit.enabled` | -| `scalar.db.consensus_commit.async_commit.enabled` | Whether or not the commit phase is executed asynchronously. | `false` | -| `scalar.db.consensus_commit.async_rollback.enabled` | Whether or not the rollback phase is executed asynchronously. 
| The value of `scalar.db.consensus_commit.async_commit.enabled` | -| `scalar.db.consensus_commit.parallel_implicit_pre_read.enabled` | Whether or not implicit pre-read is executed in parallel. | `true` | -| `scalar.db.consensus_commit.coordinator.group_commit.enabled` | Whether or not committing the transaction state is executed in batch mode. This feature can't be used with a two-phase commit interface. | `false` | -| `scalar.db.consensus_commit.coordinator.group_commit.slot_capacity` | Maximum number of slots in a group for the group commit feature. A large value improves the efficiency of group commit, but may also increase latency and the likelihood of transaction conflicts.[^1] | `20` | -| `scalar.db.consensus_commit.coordinator.group_commit.group_size_fix_timeout_millis` | Timeout to fix the size of slots in a group. A large value improves the efficiency of group commit, but may also increase latency and the likelihood of transaction conflicts.[^1] | `40` | -| `scalar.db.consensus_commit.coordinator.group_commit.delayed_slot_move_timeout_millis` | Timeout to move delayed slots from a group to another isolated group to prevent the original group from being affected by delayed transactions. A large value improves the efficiency of group commit, but may also increase the latency and the likelihood of transaction conflicts.[^1] | `1200` | -| `scalar.db.consensus_commit.coordinator.group_commit.old_group_abort_timeout_millis` | Timeout to abort an old ongoing group. A small value reduces resource consumption through aggressive aborts, but may also increase the likelihood of unnecessary aborts for long-running transactions. | `60000` | -| `scalar.db.consensus_commit.coordinator.group_commit.timeout_check_interval_millis` | Interval for checking the group commit–related timeouts. | `20` | -| `scalar.db.consensus_commit.coordinator.group_commit.metrics_monitor_log_enabled` | Whether or not the metrics of the group commit are logged periodically. | `false` | - -#### Underlying storage or database configurations - -Consensus Commit has a storage abstraction layer and supports multiple underlying storages. You can specify the storage implementation by using the `scalar.db.storage` property. - -Select a database to see the configurations available for each storage. - -
-
- - - - -
- -
- -The following configurations are available for Cassandra: - -| Name | Description | Default | -|-----------------------------------------|-----------------------------------------------------------------------|------------| -| `scalar.db.storage` | `cassandra` must be specified. | - | -| `scalar.db.contact_points` | Comma-separated contact points. | | -| `scalar.db.contact_port` | Port number for all the contact points. | | -| `scalar.db.username` | Username to access the database. | | -| `scalar.db.password` | Password to access the database. | | - -
-
- -The following configurations are available for CosmosDB for NoSQL: - -| Name | Description | Default | -|--------------------------------------|----------------------------------------------------------------------------------------------------------|----------| -| `scalar.db.storage` | `cosmos` must be specified. | - | -| `scalar.db.contact_points` | Azure Cosmos DB for NoSQL endpoint with which ScalarDB should communicate. | | -| `scalar.db.password` | Either a master or read-only key used to perform authentication for accessing Azure Cosmos DB for NoSQL. | | -| `scalar.db.cosmos.consistency_level` | Consistency level used for Cosmos DB operations. `STRONG` or `BOUNDED_STALENESS` can be specified. | `STRONG` | - -
-
- -The following configurations are available for DynamoDB: - -| Name | Description | Default | -|---------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------| -| `scalar.db.storage` | `dynamo` must be specified. | - | -| `scalar.db.contact_points` | AWS region with which ScalarDB should communicate (e.g., `us-east-1`). | | -| `scalar.db.username` | AWS access key used to identify the user interacting with AWS. | | -| `scalar.db.password` | AWS secret access key used to authenticate the user interacting with AWS. | | -| `scalar.db.dynamo.endpoint_override` | Amazon DynamoDB endpoint with which ScalarDB should communicate. This is primarily used for testing with a local instance instead of an AWS service. | | -| `scalar.db.dynamo.namespace.prefix` | Prefix for the user namespaces and metadata namespace names. Since AWS requires having unique tables names in a single AWS region, this is useful if you want to use multiple ScalarDB environments (development, production, etc.) in a single AWS region. | | - -
-
- -The following configurations are available for JDBC databases: - -| Name | Description | Default | -|-----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------| -| `scalar.db.storage` | `jdbc` must be specified. | - | -| `scalar.db.contact_points` | JDBC connection URL. | | -| `scalar.db.username` | Username to access the database. | | -| `scalar.db.password` | Password to access the database. | | -| `scalar.db.jdbc.connection_pool.min_idle` | Minimum number of idle connections in the connection pool. | `20` | -| `scalar.db.jdbc.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool. | `50` | -| `scalar.db.jdbc.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool. Use a negative value for no limit. | `100` | -| `scalar.db.jdbc.prepared_statements_pool.enabled` | Setting this property to `true` enables prepared-statement pooling. | `false` | -| `scalar.db.jdbc.prepared_statements_pool.max_open` | Maximum number of open statements that can be allocated from the statement pool at the same time. Use a negative value for no limit. | `-1` | -| `scalar.db.jdbc.isolation_level` | Isolation level for JDBC. `READ_UNCOMMITTED`, `READ_COMMITTED`, `REPEATABLE_READ`, or `SERIALIZABLE` can be specified. | Underlying-database specific | -| `scalar.db.jdbc.table_metadata.connection_pool.min_idle` | Minimum number of idle connections in the connection pool for the table metadata. | `5` | -| `scalar.db.jdbc.table_metadata.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool for the table metadata. | `10` | -| `scalar.db.jdbc.table_metadata.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool for the table metadata. Use a negative value for no limit. | `25` | -| `scalar.db.jdbc.admin.connection_pool.min_idle` | Minimum number of idle connections in the connection pool for admin. | `5` | -| `scalar.db.jdbc.admin.connection_pool.max_idle` | Maximum number of connections that can remain idle in the connection pool for admin. | `10` | -| `scalar.db.jdbc.admin.connection_pool.max_total` | Maximum total number of idle and borrowed connections that can be active at the same time for the connection pool for admin. Use a negative value for no limit. | `25` | - -{% capture notice--info %} -**Note** - -If you're using SQLite3 as a JDBC database, you must set `scalar.db.contact_points` as follows: - -```properties -scalar.db.contact_points=jdbc:sqlite:.sqlite3?busy_timeout=10000 -``` - -Unlike other JDBC databases, [SQLite3 doesn't fully support concurrent access](https://www.sqlite.org/lang_transaction.html). -To avoid frequent errors caused internally by [`SQLITE_BUSY`](https://www.sqlite.org/rescode.html#busy), we recommend setting a [`busy_timeout`](https://www.sqlite.org/c3ref/busy_timeout.html) parameter. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -
-
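To tie the storage options above together, here is one possible single-database configuration using the JDBC storage with PostgreSQL; the connection URL and credentials are placeholders, and any of the storages listed above can be substituted by changing `scalar.db.storage` and the related properties.

```properties
# Storage implementation (JDBC in this sketch).
scalar.db.storage=jdbc

# JDBC connection URL (placeholder host, port, and database name).
scalar.db.contact_points=jdbc:postgresql://localhost:5432/scalardb

# Credentials to access the database (placeholders).
scalar.db.username=postgres
scalar.db.password=postgres
```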
-
-##### Multi-storage support
-
-ScalarDB supports using multiple storage implementations simultaneously. You can use multiple storages by specifying `multi-storage` as the value for the `scalar.db.storage` property.
-
-For details about using multiple storages, see [Multi-Storage Transactions](multi-storage-transactions.md).
-
-### Use Consensus Commit through ScalarDB Cluster
-
-[ScalarDB Cluster (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/) is a component that provides a gRPC interface to ScalarDB.
-
-For details about client configurations, see the ScalarDB Cluster [client configurations (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/developer-guide-for-scalardb-cluster-with-java-api/#client-configurations).
-
-## Cross-partition scan configurations
-
-By enabling the cross-partition scan option as described below, the `Scan` operation can retrieve all records across partitions. In addition, you can specify arbitrary conditions and orderings in the cross-partition `Scan` operation by enabling `cross_partition_scan.filtering` and `cross_partition_scan.ordering`, respectively. Currently, the cross-partition scan with ordering is available only for JDBC databases. To enable filtering and ordering, `scalar.db.cross_partition_scan.enabled` must be set to `true`.
-
-For details on how to use cross-partition scan, see [Scan operation](./api-guide.md#scan-operation).
-
-{% capture notice--warning %}
-**Attention**
-
-For non-JDBC databases, we do not recommend enabling cross-partition scan with the `SERIALIZABLE` isolation level because transactions could be executed at a lower isolation level (that is, `SNAPSHOT`). When using non-JDBC databases, use cross-partition scan at your own risk only if consistency does not matter for your transactions.
-{% endcapture %}
-
{{ notice--warning | markdownify }}
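As a minimal sketch (assuming `scalar.db.cross_partition_scan.enabled=true` and hypothetical namespace and table names), a cross-partition scan is built by requesting all records instead of specifying a partition key:

```java
import com.scalar.db.api.DistributedTransaction;
import com.scalar.db.api.Result;
import com.scalar.db.api.Scan;
import com.scalar.db.exception.transaction.TransactionException;
import java.util.List;

public class CrossPartitionScanExample {
  // Retrieves records across all partitions of the given table within the passed transaction.
  public static List<Result> scanAllRecords(DistributedTransaction tx, String namespace, String table)
      throws TransactionException {
    Scan scan =
        Scan.newBuilder()
            .namespace(namespace)
            .table(table)
            .all() // no partition key: the scan spans all partitions
            .build();
    return tx.scan(scan);
  }
}
```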
- -| Name | Description | Default | -|----------------------------------------------------|-----------------------------------------------|---------| -| `scalar.db.cross_partition_scan.enabled` | Enable cross-partition scan. | `false` | -| `scalar.db.cross_partition_scan.filtering.enabled` | Enable filtering in cross-partition scan. | `false` | -| `scalar.db.cross_partition_scan.ordering.enabled` | Enable ordering in cross-partition scan. | `false` | - -## Other ScalarDB configurations - -The following are additional configurations available for ScalarDB: - -| Name | Description | Default | -|------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------| -| `scalar.db.metadata.cache_expiration_time_secs` | ScalarDB has a metadata cache to reduce the number of requests to the database. This setting specifies the expiration time of the cache in seconds. | `-1` (no expiration) | -| `scalar.db.active_transaction_management.expiration_time_millis` | ScalarDB maintains ongoing transactions, which can be resumed by using a transaction ID. This setting specifies the expiration time of this transaction management feature in milliseconds. | `-1` (no expiration) | -| `scalar.db.default_namespace_name` | The given namespace name will be used by operations that do not already specify a namespace. | | -| `scalar.db.system_namespace_name` | The given namespace name will be used by ScalarDB internally. | `scalardb` | - -## Placeholder usage - -You can use placeholders in the values, and they are replaced with environment variables (`${env:}`) or system properties (`${sys:}`). You can also specify default values in placeholders like `${sys::-}`. - -The following is an example of a configuration that uses placeholders: - -```properties -scalar.db.username=${env::-admin} -scalar.db.password=${env:} -``` - -In this example configuration, ScalarDB reads the username and password from environment variables. If the environment variable `SCALAR_DB_USERNAME` does not exist, ScalarDB uses the default value `admin`. - -## Configuration examples - -This section provides some configuration examples. - -### Configuration example #1 - App and database - -```mermaid -flowchart LR - app["App
(ScalarDB library with
Consensus Commit)"] - db[(Underlying storage or database)] - app --> db -``` - -In this example configuration, the app (ScalarDB library with Consensus Commit) connects to an underlying storage or database (in this case, Cassandra) directly. - -{% capture notice--warning %} -**Attention** - -This configuration exists only for development purposes and isn’t suitable for a production environment. This is because the app needs to implement the [Scalar Admin](https://github.com/scalar-labs/scalar-admin) interface to take transactionally consistent backups for ScalarDB, which requires additional configurations. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -The following is an example of the configuration for connecting the app to the underlying database through ScalarDB: - -```properties -# Transaction manager implementation. -scalar.db.transaction_manager=consensus-commit - -# Storage implementation. -scalar.db.storage=cassandra - -# Comma-separated contact points. -scalar.db.contact_points= - -# Credential information to access the database. -scalar.db.username= -scalar.db.password= -``` - -### Configuration example #2 - App, ScalarDB Cluster, and database - -```mermaid -flowchart LR - app["App -
ScalarDB library with gRPC"] - cluster["ScalarDB Cluster -
(ScalarDB library with
Consensus Commit)"] - db[(Underlying storage or database)] - app --> cluster --> db -``` - -In this example configuration, the app (ScalarDB library with gRPC) connects to an underlying storage or database (in this case, Cassandra) through ScalarDB Cluster, which is a component that is available only in the ScalarDB Enterprise edition. - -{% capture notice--info %} -**Note** - -This configuration is acceptable for production use because ScalarDB Cluster implements the [Scalar Admin](https://github.com/scalar-labs/scalar-admin) interface, which enables you to take transactionally consistent backups for ScalarDB by pausing ScalarDB Cluster. - -{% endcapture %} - -
{{ notice--info | markdownify }}
- -The following is an example of the configuration for connecting the app to the underlying database through ScalarDB Cluster: - -```properties -# Transaction manager implementation. -scalar.db.transaction_manager=cluster - -# Contact point of the cluster. -scalar.db.contact_points=indirect: -``` - -For details about client configurations, see the ScalarDB Cluster [client configurations (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/developer-guide-for-scalardb-cluster-with-java-api/#client-configurations). - -[^1]: It's worth benchmarking the performance with a few variations (for example, 75% and 125% of the default value) on the same underlying storage that your application uses, considering your application's access pattern, to determine the optimal configuration as it really depends on those factors. Also, it's important to benchmark combinations of these parameters (for example, first, `slot_capacity:20` and `group_size_fix_timeout_millis:40`; second, `slot_capacity:30` and `group_size_fix_timeout_millis:40`; and third, `slot_capacity:20` and `group_size_fix_timeout_millis:80`) to determine the optimal combination. diff --git a/docs/data-modeling.md b/docs/data-modeling.md deleted file mode 100644 index 1a3f65929a..0000000000 --- a/docs/data-modeling.md +++ /dev/null @@ -1,130 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Model Your Data - -Data modeling (or in other words, designing your database schemas) is the process of conceptualizing and visualizing how data will be stored and used by identifying the patterns used to access data and the types of queries to be performed within business operations. - -This page first explains the ScalarDB data model and then describes how to design your database schemas based on the data model. - -## ScalarDB data model - -ScalarDB's data model is an extended key-value model inspired by the Bigtable data model. It is similar to the relational model but differs in several ways, as described below. The data model is chosen to abstract various databases, such as relational databases, NoSQL databases, and NewSQL databases. - -The following diagram shows an example of ScalarDB tables, each of which is a collection of records. This section first explains what objects, such as tables and records, ScalarDB defines and then describes how to locate the records. - -![ScalarDB data model](images/scalardb_data_model.png) - -### Objects in ScalarDB - -The ScalarDB data model has several objects. - -#### Namespace - -A namespace is a collection of tables analogous to an SQL namespace or database. - -#### Table - -A table is a collection of partitions. A namespace most often contains one or more tables, each identified by a name. - -#### Partition - -A partition is a collection of records and a unit of distribution to nodes, whether logical or physical. Therefore, records within the same partition are placed in the same node. ScalarDB assumes multiple partitions are distributed by hashing. - -#### Record / row - -A record or row is a set of columns that is uniquely identifiable among all other records. 
- -#### Column - -A column is a fundamental data element and does not need to be broken down any further. Each record is composed of one or more columns. Each column has a data type. For details about the data type, refer to [Data-type mapping between ScalarDB and other databases](schema-loader.md#data-type-mapping-between-scalardb-and-other-databases). - -#### Secondary index - -A secondary index is a sorted copy of a column in a single base table. Each index entry is linked to a corresponding table partition. ScalarDB currently doesn't support multi-column indexes, so it can create indexes with only one column. - -### How to locate records - -This section discusses how to locate records from a table. - -#### Primary key - -A primary key uniquely identifies each record; no two records can have the same primary key. Therefore, you can locate a record by specifying a primary key. A primary key comprises a partition key and, optionally, a clustering key. - -#### Partition key - -A partition key uniquely identifies a partition. A partition key comprises a set of columns, which are called partition key columns. When you specify only a partition key, you can get a set of records that belong to the partition. - -#### Clustering key - -A clustering key uniquely identifies a record within a partition. It comprises a set of columns called clustering-key columns. When you want to specify a clustering key, you should specify a partition key for efficient lookups. When you specify a clustering key without a partition key, you end up scanning all the partitions. Scanning all the partitions is time consuming, especially when the amount of data is large, so only do so at your own discretion. - -Records within a partition are assumed to be sorted by clustering-key columns, specified as a clustering order. Therefore, you can specify a part of clustering-key columns in the defined order to narrow down the results to be returned. - -#### Index key - -An index key identifies records by looking up the key in indexes. An index key lookup spans all the partitions, so it is not necessarily efficient, especially if the selectivity of a lookup is not low. - -## How to design your database schemas - -You can design your database schemas similarly to the relational model, but there is a basic principle and are a few best practices to follow. - -### Query-driven data modeling - -In relational databases, data is organized in normalized tables with foreign keys used to reference related data in other tables. The queries that the application will make are structured by the tables, and the related data is queried as table joins. - -Although ScalarDB supports join operations in ScalarDB SQL, data modeling should be more query-driven, like NoSQL databases. The data access patterns and application queries should determine the structure and organization of tables. - -### Best practices - -This section describes best practices for designing your database schemas. - -#### Consider data distribution - -Preferably, you should try to balance loads to partitions by properly selecting partition and clustering keys. - -For example, in a banking application, if you choose an account ID as a partition key, you can perform any account operations for a specific account within the partition to which the account belongs. So, if you operate on different account IDs, you will access different partitions. 
- -On the other hand, if you choose a branch ID as a partition key and an account ID as a clustering key, all the accesses to a branch's account IDs go to the same partition, causing an imbalance in loads and data sizes. In addition, you should choose a high-cardinality column as a partition key because creating a small number of large partitions also causes an imbalance in loads and data sizes. - -#### Try to read a single partition - -Because of the data model characteristics, single partition lookup is most efficient. If you need to issue a scan or select a request that requires multi-partition lookups or scans, which you can [enable with cross-partition scan](configurations.md#cross-partition-scan-configurations), do so at your own discretion and consider updating the schemas if possible. - -For example, in a banking application, if you choose email as a partition key and an account ID as a clustering key, and issue a query that specifies an account ID, the query will span all the partitions because it cannot identify the corresponding partition efficiently. In such a case, you should always look up the table with an account ID. - -:::note - -If you read multiple partitions on a relational database with proper indexes, your query might be efficient because the query is pushed down to the database. - -::: - -#### Try to avoid using secondary indexes - -Similarly to the above, if you need to issue a scan or select a request that uses a secondary index, the request will span all the partitions of a table. Therefore, you should try to avoid using secondary indexes. If you need to use a secondary index, use it through a low-selectivity query, which looks up a small portion. - -As an alternative to secondary indexes, you can create another table that works as a clustered index of a base table. - -For example, assume there is a table with three columns: `table1(A, B, C)`, with the primary key `A`. Then, you can create a table like `index-table1(C, A, B)` with `C` as the primary key so that you can look up a single partition by specifying a value for `C`. This approach could speed up read queries but might create more load to write queries because you need to write to two tables by using ScalarDB transactions. - -:::note - -There are plans to have a table-based secondary-index feature in ScalarDB in the future. - -::: - -#### Consider data is assumed to be distributed by hashing - -In the current ScalarDB data model, data is assumed to be distributed by hashing. Therefore, you can't perform range queries efficiently without a partition key. - -If you want to issue range queries efficiently, you need to do so within a partition. However, if you follow this approach, you must specify a partition key. This can pose scalability issues as the range queries always go to the same partition, potentially overloading it. This limitation is not specific to ScalarDB but to databases where data is distributed by hashing for scalability. - -:::note - -If you run ScalarDB on a relational database with proper indexes, your range query might be efficient because the query is pushed down to the database. - -::: - diff --git a/docs/design.md b/docs/design.md deleted file mode 100644 index 6271b5fb67..0000000000 --- a/docs/design.md +++ /dev/null @@ -1,12 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. 
-> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB Design Document - -For details about the design and implementation of ScalarDB, please see the following documents, which we presented at the VLDB 2023 conference: - -- **Speakerdeck presentation:** [ScalarDB: Universal Transaction Manager for Polystores](https://speakerdeck.com/scalar/scalardb-universal-transaction-manager-for-polystores-vldb23) -- **Detailed paper:** [ScalarDB: Universal Transaction Manager for Polystores](https://www.vldb.org/pvldb/vol16/p3768-yamada.pdf) diff --git a/docs/getting-started-kotlin/build.gradle.kts b/docs/getting-started-kotlin/build.gradle.kts deleted file mode 100644 index 1bc59a683f..0000000000 --- a/docs/getting-started-kotlin/build.gradle.kts +++ /dev/null @@ -1,35 +0,0 @@ -import org.jetbrains.kotlin.gradle.tasks.KotlinCompile - -plugins { - kotlin("jvm") version "1.8.21" - application -} - -group = "com.scalar-labs" -version = "1.0-SNAPSHOT" - -repositories { - mavenCentral() -} - -dependencies { - implementation("com.scalar-labs", "scalardb", "3.13.0") - testImplementation(kotlin("test")) -} - -tasks.test { - useJUnitPlatform() -} - -tasks.withType { - kotlinOptions.jvmTarget = "1.8" -} - -java { - sourceCompatibility = JavaVersion.VERSION_1_8 - targetCompatibility = JavaVersion.VERSION_1_8 -} - -application { - mainClass.set("MainKt") -} diff --git a/docs/getting-started-kotlin/gradle.properties b/docs/getting-started-kotlin/gradle.properties deleted file mode 100644 index 7fc6f1ff27..0000000000 --- a/docs/getting-started-kotlin/gradle.properties +++ /dev/null @@ -1 +0,0 @@ -kotlin.code.style=official diff --git a/docs/getting-started-kotlin/gradle/wrapper/gradle-wrapper.jar b/docs/getting-started-kotlin/gradle/wrapper/gradle-wrapper.jar deleted file mode 100644 index e6441136f3..0000000000 Binary files a/docs/getting-started-kotlin/gradle/wrapper/gradle-wrapper.jar and /dev/null differ diff --git a/docs/getting-started-kotlin/gradle/wrapper/gradle-wrapper.properties b/docs/getting-started-kotlin/gradle/wrapper/gradle-wrapper.properties deleted file mode 100644 index b82aa23a4f..0000000000 --- a/docs/getting-started-kotlin/gradle/wrapper/gradle-wrapper.properties +++ /dev/null @@ -1,7 +0,0 @@ -distributionBase=GRADLE_USER_HOME -distributionPath=wrapper/dists -distributionUrl=https\://services.gradle.org/distributions/gradle-8.7-bin.zip -networkTimeout=10000 -validateDistributionUrl=true -zipStoreBase=GRADLE_USER_HOME -zipStorePath=wrapper/dists diff --git a/docs/getting-started-kotlin/gradlew b/docs/getting-started-kotlin/gradlew deleted file mode 100755 index 1aa94a4269..0000000000 --- a/docs/getting-started-kotlin/gradlew +++ /dev/null @@ -1,249 +0,0 @@ -#!/bin/sh - -# -# Copyright © 2015-2021 the original authors. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -############################################################################## -# -# Gradle start up script for POSIX generated by Gradle. 
-# -# Important for running: -# -# (1) You need a POSIX-compliant shell to run this script. If your /bin/sh is -# noncompliant, but you have some other compliant shell such as ksh or -# bash, then to run this script, type that shell name before the whole -# command line, like: -# -# ksh Gradle -# -# Busybox and similar reduced shells will NOT work, because this script -# requires all of these POSIX shell features: -# * functions; -# * expansions «$var», «${var}», «${var:-default}», «${var+SET}», -# «${var#prefix}», «${var%suffix}», and «$( cmd )»; -# * compound commands having a testable exit status, especially «case»; -# * various built-in commands including «command», «set», and «ulimit». -# -# Important for patching: -# -# (2) This script targets any POSIX shell, so it avoids extensions provided -# by Bash, Ksh, etc; in particular arrays are avoided. -# -# The "traditional" practice of packing multiple parameters into a -# space-separated string is a well documented source of bugs and security -# problems, so this is (mostly) avoided, by progressively accumulating -# options in "$@", and eventually passing that to Java. -# -# Where the inherited environment variables (DEFAULT_JVM_OPTS, JAVA_OPTS, -# and GRADLE_OPTS) rely on word-splitting, this is performed explicitly; -# see the in-line comments for details. -# -# There are tweaks for specific operating systems such as AIX, CygWin, -# Darwin, MinGW, and NonStop. -# -# (3) This script is generated from the Groovy template -# https://github.com/gradle/gradle/blob/HEAD/subprojects/plugins/src/main/resources/org/gradle/api/internal/plugins/unixStartScript.txt -# within the Gradle project. -# -# You can find Gradle at https://github.com/gradle/gradle/. -# -############################################################################## - -# Attempt to set APP_HOME - -# Resolve links: $0 may be a link -app_path=$0 - -# Need this for daisy-chained symlinks. -while - APP_HOME=${app_path%"${app_path##*/}"} # leaves a trailing /; empty if no leading path - [ -h "$app_path" ] -do - ls=$( ls -ld "$app_path" ) - link=${ls#*' -> '} - case $link in #( - /*) app_path=$link ;; #( - *) app_path=$APP_HOME$link ;; - esac -done - -# This is normally unused -# shellcheck disable=SC2034 -APP_BASE_NAME=${0##*/} -# Discard cd standard output in case $CDPATH is set (https://github.com/gradle/gradle/issues/25036) -APP_HOME=$( cd "${APP_HOME:-./}" > /dev/null && pwd -P ) || exit - -# Use the maximum available, or set MAX_FD != -1 to use that value. -MAX_FD=maximum - -warn () { - echo "$*" -} >&2 - -die () { - echo - echo "$*" - echo - exit 1 -} >&2 - -# OS specific support (must be 'true' or 'false'). -cygwin=false -msys=false -darwin=false -nonstop=false -case "$( uname )" in #( - CYGWIN* ) cygwin=true ;; #( - Darwin* ) darwin=true ;; #( - MSYS* | MINGW* ) msys=true ;; #( - NONSTOP* ) nonstop=true ;; -esac - -CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar - - -# Determine the Java command to use to start the JVM. -if [ -n "$JAVA_HOME" ] ; then - if [ -x "$JAVA_HOME/jre/sh/java" ] ; then - # IBM's JDK on AIX uses strange locations for the executables - JAVACMD=$JAVA_HOME/jre/sh/java - else - JAVACMD=$JAVA_HOME/bin/java - fi - if [ ! -x "$JAVACMD" ] ; then - die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME - -Please set the JAVA_HOME variable in your environment to match the -location of your Java installation." - fi -else - JAVACMD=java - if ! 
command -v java >/dev/null 2>&1 - then - die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. - -Please set the JAVA_HOME variable in your environment to match the -location of your Java installation." - fi -fi - -# Increase the maximum file descriptors if we can. -if ! "$cygwin" && ! "$darwin" && ! "$nonstop" ; then - case $MAX_FD in #( - max*) - # In POSIX sh, ulimit -H is undefined. That's why the result is checked to see if it worked. - # shellcheck disable=SC2039,SC3045 - MAX_FD=$( ulimit -H -n ) || - warn "Could not query maximum file descriptor limit" - esac - case $MAX_FD in #( - '' | soft) :;; #( - *) - # In POSIX sh, ulimit -n is undefined. That's why the result is checked to see if it worked. - # shellcheck disable=SC2039,SC3045 - ulimit -n "$MAX_FD" || - warn "Could not set maximum file descriptor limit to $MAX_FD" - esac -fi - -# Collect all arguments for the java command, stacking in reverse order: -# * args from the command line -# * the main class name -# * -classpath -# * -D...appname settings -# * --module-path (only if needed) -# * DEFAULT_JVM_OPTS, JAVA_OPTS, and GRADLE_OPTS environment variables. - -# For Cygwin or MSYS, switch paths to Windows format before running java -if "$cygwin" || "$msys" ; then - APP_HOME=$( cygpath --path --mixed "$APP_HOME" ) - CLASSPATH=$( cygpath --path --mixed "$CLASSPATH" ) - - JAVACMD=$( cygpath --unix "$JAVACMD" ) - - # Now convert the arguments - kludge to limit ourselves to /bin/sh - for arg do - if - case $arg in #( - -*) false ;; # don't mess with options #( - /?*) t=${arg#/} t=/${t%%/*} # looks like a POSIX filepath - [ -e "$t" ] ;; #( - *) false ;; - esac - then - arg=$( cygpath --path --ignore --mixed "$arg" ) - fi - # Roll the args list around exactly as many times as the number of - # args, so each arg winds up back in the position where it started, but - # possibly modified. - # - # NB: a `for` loop captures its iteration list before it begins, so - # changing the positional parameters here affects neither the number of - # iterations, nor the values presented in `arg`. - shift # remove old arg - set -- "$@" "$arg" # push replacement arg - done -fi - - -# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. -DEFAULT_JVM_OPTS='"-Xmx64m" "-Xms64m"' - -# Collect all arguments for the java command: -# * DEFAULT_JVM_OPTS, JAVA_OPTS, JAVA_OPTS, and optsEnvironmentVar are not allowed to contain shell fragments, -# and any embedded shellness will be escaped. -# * For example: A user cannot expect ${Hostname} to be expanded, as it is an environment variable and will be -# treated as '${Hostname}' itself on the command line. - -set -- \ - "-Dorg.gradle.appname=$APP_BASE_NAME" \ - -classpath "$CLASSPATH" \ - org.gradle.wrapper.GradleWrapperMain \ - "$@" - -# Stop when "xargs" is not available. -if ! command -v xargs >/dev/null 2>&1 -then - die "xargs is not available" -fi - -# Use "xargs" to parse quoted args. -# -# With -n1 it outputs one arg per line, with the quotes and backslashes removed. 
-# -# In Bash we could simply go: -# -# readarray ARGS < <( xargs -n1 <<<"$var" ) && -# set -- "${ARGS[@]}" "$@" -# -# but POSIX shell has neither arrays nor command substitution, so instead we -# post-process each arg (as a line of input to sed) to backslash-escape any -# character that might be a shell metacharacter, then use eval to reverse -# that process (while maintaining the separation between arguments), and wrap -# the whole thing up as a single "set" statement. -# -# This will of course break if any of these variables contains a newline or -# an unmatched quote. -# - -eval "set -- $( - printf '%s\n' "$DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS" | - xargs -n1 | - sed ' s~[^-[:alnum:]+,./:=@_]~\\&~g; ' | - tr '\n' ' ' - )" '"$@"' - -exec "$JAVACMD" "$@" diff --git a/docs/getting-started-kotlin/gradlew.bat b/docs/getting-started-kotlin/gradlew.bat deleted file mode 100644 index 7101f8e467..0000000000 --- a/docs/getting-started-kotlin/gradlew.bat +++ /dev/null @@ -1,92 +0,0 @@ -@rem -@rem Copyright 2015 the original author or authors. -@rem -@rem Licensed under the Apache License, Version 2.0 (the "License"); -@rem you may not use this file except in compliance with the License. -@rem You may obtain a copy of the License at -@rem -@rem https://www.apache.org/licenses/LICENSE-2.0 -@rem -@rem Unless required by applicable law or agreed to in writing, software -@rem distributed under the License is distributed on an "AS IS" BASIS, -@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -@rem See the License for the specific language governing permissions and -@rem limitations under the License. -@rem - -@if "%DEBUG%"=="" @echo off -@rem ########################################################################## -@rem -@rem Gradle startup script for Windows -@rem -@rem ########################################################################## - -@rem Set local scope for the variables with windows NT shell -if "%OS%"=="Windows_NT" setlocal - -set DIRNAME=%~dp0 -if "%DIRNAME%"=="" set DIRNAME=. -@rem This is normally unused -set APP_BASE_NAME=%~n0 -set APP_HOME=%DIRNAME% - -@rem Resolve any "." and ".." in APP_HOME to make it shorter. -for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi - -@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. -set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m" - -@rem Find java.exe -if defined JAVA_HOME goto findJavaFromJavaHome - -set JAVA_EXE=java.exe -%JAVA_EXE% -version >NUL 2>&1 -if %ERRORLEVEL% equ 0 goto execute - -echo. 1>&2 -echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. 1>&2 -echo. 1>&2 -echo Please set the JAVA_HOME variable in your environment to match the 1>&2 -echo location of your Java installation. 1>&2 - -goto fail - -:findJavaFromJavaHome -set JAVA_HOME=%JAVA_HOME:"=% -set JAVA_EXE=%JAVA_HOME%/bin/java.exe - -if exist "%JAVA_EXE%" goto execute - -echo. 1>&2 -echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME% 1>&2 -echo. 1>&2 -echo Please set the JAVA_HOME variable in your environment to match the 1>&2 -echo location of your Java installation. 
1>&2 - -goto fail - -:execute -@rem Setup the command line - -set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar - - -@rem Execute Gradle -"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %* - -:end -@rem End local scope for the variables with windows NT shell -if %ERRORLEVEL% equ 0 goto mainEnd - -:fail -rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of -rem the _cmd.exe /c_ return code! -set EXIT_CODE=%ERRORLEVEL% -if %EXIT_CODE% equ 0 set EXIT_CODE=1 -if not ""=="%GRADLE_EXIT_CONSOLE%" exit %EXIT_CODE% -exit /b %EXIT_CODE% - -:mainEnd -if "%OS%"=="Windows_NT" endlocal - -:omega diff --git a/docs/getting-started-kotlin/scalardb.properties b/docs/getting-started-kotlin/scalardb.properties deleted file mode 100644 index b1dbcbdec6..0000000000 --- a/docs/getting-started-kotlin/scalardb.properties +++ /dev/null @@ -1,12 +0,0 @@ -# Comma-separated contact points -scalar.db.contact_points=localhost - -# Port number for all the contact points. Default port number for each database is used if empty. -#scalar.db.contact_port= - -# Credential information to access the database -scalar.db.username=cassandra -scalar.db.password=cassandra - -# Storage implementation. Either cassandra or cosmos or dynamo or jdbc can be set. Default storage is cassandra. -#scalar.db.storage=cassandra diff --git a/docs/getting-started-kotlin/settings.gradle.kts b/docs/getting-started-kotlin/settings.gradle.kts deleted file mode 100644 index 897772e394..0000000000 --- a/docs/getting-started-kotlin/settings.gradle.kts +++ /dev/null @@ -1,3 +0,0 @@ - -rootProject.name = "scalardb-kotlin-sample" - diff --git a/docs/getting-started-kotlin/src/main/kotlin/Main.kt b/docs/getting-started-kotlin/src/main/kotlin/Main.kt deleted file mode 100644 index 543fe10bf6..0000000000 --- a/docs/getting-started-kotlin/src/main/kotlin/Main.kt +++ /dev/null @@ -1,67 +0,0 @@ -import sample.ElectronicMoney -import java.io.File -import kotlin.system.exitProcess - -fun main(args: Array) { - var action: String? = null - var amount = 0 - var to: String? = null - var from: String? = null - var id: String? = null - var scalarDBProperties: String? 
= null - var i = 0 - while (i < args.size) { - if ("-action" == args[i]) { - action = args[++i] - } else if ("-amount" == args[i]) { - amount = args[++i].toInt() - } else if ("-to" == args[i]) { - to = args[++i] - } else if ("-from" == args[i]) { - from = args[++i] - } else if ("-id" == args[i]) { - id = args[++i] - } else if ("-config" == args[i]) { - scalarDBProperties = args[++i] - } else if ("-help" == args[i]) { - printUsageAndExit() - return - } - ++i - } - if (action == null) { - printUsageAndExit() - return - } - val eMoney = ElectronicMoney( - scalarDBProperties ?: (System.getProperty("user.dir") + File.separator + "scalardb.properties") - ) - if (action.equals("charge", ignoreCase = true)) { - if (to == null || amount < 0) { - printUsageAndExit() - return - } - eMoney.charge(to, amount) - } else if (action.equals("pay", ignoreCase = true)) { - if (to == null || amount < 0 || from == null) { - printUsageAndExit() - return - } - eMoney.pay(from, to, amount) - } else if (action.equals("getBalance", ignoreCase = true)) { - if (id == null) { - printUsageAndExit() - return - } - val balance = eMoney.getBalance(id) - println("The balance for $id is $balance") - } - eMoney.close() -} - -fun printUsageAndExit() { - System.err.println( - "ElectronicMoneyMain -action charge/pay/getBalance [-amount number (needed for charge and pay)] [-to id (needed for charge and pay)] [-from id (needed for pay)] [-id id (needed for getBalance)]" - ) - exitProcess(1) -} diff --git a/docs/getting-started-kotlin/src/main/kotlin/sample/ElectronicMoney.kt b/docs/getting-started-kotlin/src/main/kotlin/sample/ElectronicMoney.kt deleted file mode 100644 index 15cefaa5f7..0000000000 --- a/docs/getting-started-kotlin/src/main/kotlin/sample/ElectronicMoney.kt +++ /dev/null @@ -1,141 +0,0 @@ -package sample - -import com.scalar.db.api.DistributedTransactionManager -import com.scalar.db.api.Get -import com.scalar.db.api.Put -import com.scalar.db.exception.transaction.TransactionException -import com.scalar.db.io.Key -import com.scalar.db.service.TransactionFactory - -class ElectronicMoney(scalarDBProperties: String) { - companion object { - private const val NAMESPACE = "emoney" - private const val TABLENAME = "account" - private const val ID = "id" - private const val BALANCE = "balance" - } - - private val manager: DistributedTransactionManager - - init { - val factory = TransactionFactory.create(scalarDBProperties) - manager = factory.transactionManager - } - - @Throws(TransactionException::class) - fun charge(id: String, amount: Int) { - // Start a transaction - val tx = manager.start() - try { - // Retrieve the current balance for id - val get = Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .build() - val result = tx.get(get) - - // Calculate the balance - var balance = amount - if (result.isPresent) { - val current = result.get().getInt(BALANCE) - balance += current - } - - // Update the balance - val put = Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .intValue(BALANCE, balance) - .build() - tx.put(put) - - // Commit the transaction (records are automatically recovered in case of failure) - tx.commit() - } catch (e: Exception) { - tx.abort() - throw e - } - } - - @Throws(TransactionException::class) - fun pay(fromId: String, toId: String, amount: Int) { - // Start a transaction - val tx = manager.start() - try { - // Retrieve the current balances for ids - val fromGet = Get.newBuilder() - 
.namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, fromId)) - .build() - val toGet = Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, toId)) - .build() - val fromResult = tx.get(fromGet) - val toResult = tx.get(toGet) - - // Calculate the balances (it assumes that both accounts exist) - val newFromBalance = fromResult.get().getInt(BALANCE) - amount - val newToBalance = toResult.get().getInt(BALANCE) + amount - if (newFromBalance < 0) { - throw RuntimeException("$fromId doesn't have enough balance.") - } - - // Update the balances - val fromPut = Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, fromId)) - .intValue(BALANCE, newFromBalance) - .build() - val toPut = Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, toId)) - .intValue(BALANCE, newToBalance) - .build() - tx.put(fromPut) - tx.put(toPut) - - // Commit the transaction (records are automatically recovered in case of failure) - tx.commit() - } catch (e: Exception) { - tx.abort() - throw e - } - } - - @Throws(TransactionException::class) - fun getBalance(id: String): Int { - // Start a transaction - val tx = manager.start() - return try { - // Retrieve the current balances for id - val get = Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .build() - val result = tx.get(get) - var balance = -1 - if (result.isPresent) { - balance = result.get().getInt(BALANCE) - } - - // Commit the transaction - tx.commit() - balance - } catch (e: Exception) { - tx.abort() - throw e - } - } - - fun close() { - manager.close() - } -} diff --git a/docs/getting-started-with-scalardb-by-using-kotlin.md b/docs/getting-started-with-scalardb-by-using-kotlin.md deleted file mode 100644 index 302c16a3b6..0000000000 --- a/docs/getting-started-with-scalardb-by-using-kotlin.md +++ /dev/null @@ -1,379 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Getting Started with ScalarDB by Using Kotlin - -This getting started tutorial explains how to configure your preferred database in ScalarDB and set up a basic electronic money application by using Kotlin. Since Kotlin has Java interoperability, you can use ScalarDB directly from Kotlin. - -{% capture notice--warning %} -**Warning** - -The electronic money application is simplified for this tutorial and isn't suitable for a production environment. -{% endcapture %} - -
    <div class="notice--warning">{{ notice--warning | markdownify }}</div>
    
- -## Install a JDK - -Because ScalarDB is written in Java, you must have one of the following Java Development Kits (JDKs) installed in your environment: - -- [Oracle JDK](https://www.oracle.com/java/technologies/downloads/) LTS version (8, 11, or 17) -- [OpenJDK](https://openjdk.org/install/) LTS version (8, 11, or 17) - -{% capture notice--info %} -**Note** - -We recommend using the LTS versions mentioned above, but other non-LTS versions may work. - -In addition, other JDKs should work with ScalarDB, but we haven't tested them. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -## Clone the `scalardb` repository - -Open a terminal window, and go to your working directory. Then, clone the [scalardb](https://github.com/scalar-labs/scalardb) repository by running the following command: - -```shell -$ git clone https://github.com/scalar-labs/scalardb -``` - -Then, go to the `scalardb/docs/getting-started-kotlin` directory in the cloned repository by running the following command: - -```shell -$ cd scalardb/docs/getting-started-kotlin -``` - -## Set up your database for ScalarDB - -Select your database, and follow the instructions to configure it for ScalarDB. - -For a list of databases that ScalarDB supports, see [Supported Databases](scalardb-supported-databases.md). - -
    
- -Confirm that you have Cassandra installed. If Cassandra isn't installed, visit [Downloading Cassandra](https://cassandra.apache.org/_/download.html). - -### Configure Cassandra -{:.no_toc} - -Open **cassandra.yaml** in your preferred IDE. Then, change `commitlog_sync` from `periodic` to `batch` so that you don't lose data if a quorum of replica nodes goes down. - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK and Cassandra in your local environment, and Cassandra is running on your localhost. - -The **scalardb.properties** file in the `docs/getting-started-kotlin` directory holds database configurations for ScalarDB. The following is a basic configuration for Cassandra. Be sure to change the values for `scalar.db.username` and `scalar.db.password` as described. - -```properties -# The Cassandra storage implementation is used for Consensus Commit. -scalar.db.storage=cassandra - -# Comma-separated contact points. -scalar.db.contact_points=localhost - -# The port number for all the contact points. -scalar.db.contact_port=9042 - -# The username and password to access the database. -scalar.db.username= -scalar.db.password= -``` -
    
- -To use Azure Cosmos DB for NoSQL, you must have an Azure account. If you don't have an Azure account, visit [Create an Azure Cosmos DB account](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/quickstart-portal#create-account). - -### Configure Cosmos DB for NoSQL -{:.no_toc} - -Set the **default consistency level** to **Strong** according to the official document at [Configure the default consistency level](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/how-to-manage-consistency#configure-the-default-consistency-level). - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK in your local environment and properly configured your Cosmos DB for NoSQL account in Azure. - -The **scalardb.properties** file in the `docs/getting-started-kotlin` directory holds database configurations for ScalarDB. Be sure to change the values for `scalar.db.contact_points` and `scalar.db.password` as described. - -```properties -# The Cosmos DB for NoSQL storage implementation is used for Consensus Commit. -scalar.db.storage=cosmos - -# The Cosmos DB for NoSQL URI. -scalar.db.contact_points= - -# The Cosmos DB for NoSQL key to access the database. -scalar.db.password= -``` - -{% capture notice--info %} -**Note** - -You can use a primary key or a secondary key as the value for `scalar.db.password`. -{% endcapture %} -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -To use Amazon DynamoDB, you must have an AWS account. If you don't have an AWS account, visit [Getting started: Are you a first-time AWS user?](https://docs.aws.amazon.com/accounts/latest/reference/welcome-first-time-user.html). - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK in your local environment. - -The **scalardb.properties** file in the `docs/getting-started-kotlin` directory holds database configurations for ScalarDB. Be sure to change the values for `scalar.db.contact_points`, `scalar.db.username`, and `scalar.db.password` as described. - -```properties -# The DynamoDB storage implementation is used for Consensus Commit. -scalar.db.storage=dynamo - -# The AWS region. -scalar.db.contact_points= - -# The AWS access key ID and secret access key to access the database. -scalar.db.username= -scalar.db.password= -``` -
    
- -Confirm that you have a JDBC database installed. For a list of supported JDBC databases, see [Supported Databases](scalardb-supported-databases.md). - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK and JDBC database in your local environment, and the JDBC database is running on your localhost. - -The **scalardb.properties** file in the `docs/getting-started-kotlin` directory holds database configurations for ScalarDB. The following is a basic configuration for JDBC databases. - -{% capture notice--info %} -**Note** - -Be sure to uncomment the `scalar.db.contact_points` variable and change the value of the JDBC database you are using, and change the values for `scalar.db.username` and `scalar.db.password` as described. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```properties -# The JDBC database storage implementation is used for Consensus Commit. -scalar.db.storage=jdbc - -# The JDBC database URL for the type of database you are using. -# scalar.db.contact_points=jdbc:mysql://localhost:3306/ -# scalar.db.contact_points=jdbc:oracle:thin:@//localhost:1521/ -# scalar.db.contact_points=jdbc:postgresql://localhost:5432/ -# scalar.db.contact_points=jdbc:sqlserver://localhost:1433; -# scalar.db.contact_points=jdbc:sqlite://localhost:3306.sqlite3?busy_timeout=10000 -# scalar.db.contact_points=jdbc:yugabytedb://127.0.0.1:5433\\,127.0.0.2:5433\\,127.0.0.3:5433/?load-balance=true - -# The username and password for connecting to the database. -scalar.db.username= -scalar.db.password= -``` -
    
- -## Create and load the database schema - -You need to define the database schema (the method in which the data will be organized) in the application. For details about the supported data types, see [Data type mapping between ScalarDB and other databases](schema-loader.md#data-type-mapping-between-scalardb-and-the-other-databases). - -For this tutorial, create a file named **emoney.json** in the `scalardb/docs/getting-started-kotlin` directory. Then, add the following JSON code to define the schema. - -```json -{ - "emoney.account": { - "transaction": true, - "partition-key": [ - "id" - ], - "clustering-key": [], - "columns": { - "id": "TEXT", - "balance": "INT" - } - } -} -``` - -To apply the schema, go to the [`scalardb` Releases](https://github.com/scalar-labs/scalardb/releases) page and download the ScalarDB Schema Loader that matches the version of ScalarDB that you are using to the `getting-started` folder. - -Then, based on your database, run the following command, replacing `` with the version of the ScalarDB Schema Loader that you downloaded: - -
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator --replication-factor=1 -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). - -In addition, the `--replication-factor=1` option has an effect only when using Cassandra. The default replication factor is `3`, but to facilitate the setup in this tutorial, `1` is used so that you only need to prepare a cluster with one node instead of three nodes. However, keep in mind that a replication factor of `1` is not suited for production. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -## Execute transactions and retrieve data in the basic electronic money application - -After loading the schema, you can execute transactions and retrieve data in the basic electronic money application that is included in the repository that you cloned. - -The application supports the following types of transactions: - -- Create an account. -- Add funds to an account. -- Send funds between two accounts. -- Get an account balance. - -{% capture notice--info %} -**Note** - -When you first execute a Gradle command, Gradle will automatically install the necessary libraries. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -### Create an account with a balance - -You need an account with a balance so that you can send funds between accounts. - -To create an account for **customer1** that has a balance of **500**, run the following command: - -```shell -$ ./gradlew run --args="-action charge -amount 500 -to customer1" -``` - -### Create an account without a balance - -After setting up an account that has a balance, you need another account for sending funds to. - -To create an account for **merchant1** that has a balance of **0**, run the following command: - -```shell -$ ./gradlew run --args="-action charge -amount 0 -to merchant1" -``` - -### Add funds to an account - -You can add funds to an account in the same way that you created and added funds to an account in [Create an account with a balance](#create-an-account-with-a-balance). - -To add **500** to the account for **customer1**, run the following command: - -```shell -$ ./gradlew run --args="-action charge -amount 500 -to customer1" -``` - -The account for **customer1** will now have a balance of **1000**. - -### Send electronic money between two accounts - -Now that you have created two accounts, with at least one of those accounts having a balance, you can send funds from one account to the other account. - -To have **customer1** pay **100** to **merchant1**, run the following command: - -```shell -$ ./gradlew run --args="-action pay -amount 100 -from customer1 -to merchant1" -``` - -### Get an account balance - -After sending funds from one account to the other, you can check the balance of each account. - -To get the balance of **customer1**, run the following command: - -```shell -$ ./gradlew run --args="-action getBalance -id customer1" -``` - -You should see the following output: - -```shell -... -The balance for customer1 is 900 -... -``` - -To get the balance of **merchant1**, run the following command: - -```shell -$ ./gradlew run --args="-action getBalance -id merchant1" -``` - -You should see the following output: - -```shell -... -The balance for merchant1 is 100 -... -``` - -## Reference - -To see the source code for the electronic money application used in this tutorial, see [`ElectronicMoney.kt`](./getting-started-kotlin/src/main/kotlin/sample/ElectronicMoney.kt). diff --git a/docs/getting-started-with-scalardb.md b/docs/getting-started-with-scalardb.md deleted file mode 100644 index 7e6debc1df..0000000000 --- a/docs/getting-started-with-scalardb.md +++ /dev/null @@ -1,379 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Getting Started with ScalarDB - -This getting started tutorial explains how to configure your preferred database in ScalarDB and set up a basic electronic money application. - -{% capture notice--warning %} -**Warning** - -The electronic money application is simplified for this tutorial and isn't suitable for a production environment. -{% endcapture %} - -
    <div class="notice--warning">{{ notice--warning | markdownify }}</div>
    
- -## Install a JDK - -Because ScalarDB is written in Java, you must have one of the following Java Development Kits (JDKs) installed in your environment: - -- [Oracle JDK](https://www.oracle.com/java/technologies/downloads/) LTS version (8, 11, or 17) -- [OpenJDK](https://openjdk.org/install/) LTS version (8, 11, or 17) - -{% capture notice--info %} -**Note** - -We recommend using the LTS versions mentioned above, but other non-LTS versions may work. - -In addition, other JDKs should work with ScalarDB, but we haven't tested them. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -## Clone the `scalardb` repository - -Open a terminal window, and go to your working directory. Then, clone the [scalardb](https://github.com/scalar-labs/scalardb) repository by running the following command: - -```shell -$ git clone https://github.com/scalar-labs/scalardb -``` - -Then, go to the `scalardb/docs/getting-started` directory in the cloned repository by running the following command: - -```shell -$ cd scalardb/docs/getting-started -``` - -## Set up your database for ScalarDB - -Select your database, and follow the instructions to configure it for ScalarDB. - -For a list of databases that ScalarDB supports, see [Supported Databases](scalardb-supported-databases.md). - -
    
- -Confirm that you have Cassandra installed. If Cassandra isn't installed, visit [Downloading Cassandra](https://cassandra.apache.org/_/download.html). - -### Configure Cassandra -{:.no_toc} - -Open **cassandra.yaml** in your preferred IDE. Then, change `commitlog_sync` from `periodic` to `batch` so that you don't lose data if a quorum of replica nodes goes down. - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK and Cassandra in your local environment, and Cassandra is running on your localhost. - -The **scalardb.properties** file in the `docs/getting-started` directory holds database configurations for ScalarDB. The following is a basic configuration for Cassandra. Be sure to change the values for `scalar.db.username` and `scalar.db.password` as described. - -```properties -# The Cassandra storage implementation is used for Consensus Commit. -scalar.db.storage=cassandra - -# Comma-separated contact points. -scalar.db.contact_points=localhost - -# The port number for all the contact points. -scalar.db.contact_port=9042 - -# The username and password to access the database. -scalar.db.username= -scalar.db.password= -``` -
    
- -To use Azure Cosmos DB for NoSQL, you must have an Azure account. If you don't have an Azure account, visit [Create an Azure Cosmos DB account](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/quickstart-portal#create-account). - -### Configure Cosmos DB for NoSQL -{:.no_toc} - -Set the **default consistency level** to **Strong** according to the official document at [Configure the default consistency level](https://learn.microsoft.com/en-us/azure/cosmos-db/nosql/how-to-manage-consistency#configure-the-default-consistency-level). - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK in your local environment and properly configured your Cosmos DB for NoSQL account in Azure. - -The **scalardb.properties** file in the `docs/getting-started` directory holds database configurations for ScalarDB. Be sure to change the values for `scalar.db.contact_points` and `scalar.db.password` as described. - -```properties -# The Cosmos DB for NoSQL storage implementation is used for Consensus Commit. -scalar.db.storage=cosmos - -# The Cosmos DB for NoSQL URI. -scalar.db.contact_points= - -# The Cosmos DB for NoSQL key to access the database. -scalar.db.password= -``` - -{% capture notice--info %} -**Note** - -You can use a primary key or a secondary key as the value for `scalar.db.password`. -{% endcapture %} -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -To use Amazon DynamoDB, you must have an AWS account. If you don't have an AWS account, visit [Getting started: Are you a first-time AWS user?](https://docs.aws.amazon.com/accounts/latest/reference/welcome-first-time-user.html). - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK in your local environment. - -The **scalardb.properties** file in the `docs/getting-started` directory holds database configurations for ScalarDB. Be sure to change the values for `scalar.db.contact_points`, `scalar.db.username`, and `scalar.db.password` as described. - -```properties -# The DynamoDB storage implementation is used for Consensus Commit. -scalar.db.storage=dynamo - -# The AWS region. -scalar.db.contact_points= - -# The AWS access key ID and secret access key to access the database. -scalar.db.username= -scalar.db.password= -``` -
    
- -Confirm that you have a JDBC database installed. For a list of supported JDBC databases, see [Supported Databases](scalardb-supported-databases.md). - -### Configure ScalarDB -{:.no_toc} - -The following instructions assume that you have properly installed and configured the JDK and JDBC database in your local environment, and the JDBC database is running on your localhost. - -The **scalardb.properties** file in the `docs/getting-started` directory holds database configurations for ScalarDB. The following is a basic configuration for JDBC databases. - -{% capture notice--info %} -**Note** - -Be sure to uncomment the `scalar.db.contact_points` variable and change the value of the JDBC database you are using, and change the values for `scalar.db.username` and `scalar.db.password` as described. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```properties -# The JDBC database storage implementation is used for Consensus Commit. -scalar.db.storage=jdbc - -# The JDBC database URL for the type of database you are using. -# scalar.db.contact_points=jdbc:mysql://localhost:3306/ -# scalar.db.contact_points=jdbc:oracle:thin:@//localhost:1521/ -# scalar.db.contact_points=jdbc:postgresql://localhost:5432/ -# scalar.db.contact_points=jdbc:sqlserver://localhost:1433; -# scalar.db.contact_points=jdbc:sqlite://localhost:3306.sqlite3?busy_timeout=10000 -# scalar.db.contact_points=jdbc:yugabytedb://127.0.0.1:5433\\,127.0.0.2:5433\\,127.0.0.3:5433/?load-balance=true - -# The username and password for connecting to the database. -scalar.db.username= -scalar.db.password= -``` -
    
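
    Whichever database you configured above, the tutorial application consumes the result the same way: the path to **scalardb.properties** is handed to `TransactionFactory`, which returns the `DistributedTransactionManager` used for every transaction, exactly as the `ElectronicMoney` class referenced later in this tutorial does. The following is a minimal, illustrative Java sketch of that bootstrap step (the `Bootstrap` class name and the hard-coded file path are placeholders, not part of the sample project):

    ```java
    import com.scalar.db.api.DistributedTransaction;
    import com.scalar.db.api.DistributedTransactionManager;
    import com.scalar.db.service.TransactionFactory;

    public class Bootstrap {
      public static void main(String[] args) throws Exception {
        // Path to the scalardb.properties file edited in the steps above (placeholder path).
        TransactionFactory factory = TransactionFactory.create("scalardb.properties");
        DistributedTransactionManager manager = factory.getTransactionManager();

        // Transactions are obtained from the manager and must be committed or aborted.
        DistributedTransaction tx = manager.start();
        tx.commit();

        manager.close();
      }
    }
    ```

    Error handling and the actual reads and writes are omitted here; the full flow is shown in `ElectronicMoney.java` later in this diff.
    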
- -## Create and load the database schema - -You need to define the database schema (the method in which the data will be organized) in the application. For details about the supported data types, see [Data type mapping between ScalarDB and other databases](schema-loader.md#data-type-mapping-between-scalardb-and-the-other-databases). - -For this tutorial, create a file named **emoney.json** in the `scalardb/docs/getting-started` directory. Then, add the following JSON code to define the schema. - -```json -{ - "emoney.account": { - "transaction": true, - "partition-key": [ - "id" - ], - "clustering-key": [], - "columns": { - "id": "TEXT", - "balance": "INT" - } - } -} -``` - -To apply the schema, go to the [`scalardb` Releases](https://github.com/scalar-labs/scalardb/releases) page and download the ScalarDB Schema Loader that matches the version of ScalarDB that you are using to the `getting-started` folder. - -Then, based on your database, run the following command, replacing `` with the version of the ScalarDB Schema Loader that you downloaded: - -
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator --replication-factor=1 -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). - -In addition, the `--replication-factor=1` option has an effect only when using Cassandra. The default replication factor is `3`, but to facilitate the setup in this tutorial, `1` is used so that you only need to prepare a cluster with one node instead of three nodes. However, keep in mind that a replication factor of `1` is not suited for production. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -```shell -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties --schema-file emoney.json --coordinator -``` - -{% capture notice--info %} -**Note** - -The `--coordinator` option is specified because a table with `transaction` set to `true` exists in the schema. For details about configuring and loading a schema, see [ScalarDB Schema Loader](schema-loader.md). -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
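
    Schema Loader is the path this tutorial uses, but the same `emoney.account` table can also be created programmatically through ScalarDB's admin interface. The sketch below is an illustration only and is not part of the sample project; the `DistributedTransactionAdmin` calls shown here should be checked against the Javadoc for the ScalarDB version you are running:

    ```java
    import com.scalar.db.api.DistributedTransactionAdmin;
    import com.scalar.db.api.TableMetadata;
    import com.scalar.db.io.DataType;
    import com.scalar.db.service.TransactionFactory;

    public class CreateEmoneySchema {
      public static void main(String[] args) throws Exception {
        TransactionFactory factory = TransactionFactory.create("scalardb.properties");
        DistributedTransactionAdmin admin = factory.getTransactionAdmin();

        // Programmatic equivalent of the "emoney.account" definition in emoney.json.
        TableMetadata metadata =
            TableMetadata.newBuilder()
                .addColumn("id", DataType.TEXT)
                .addColumn("balance", DataType.INT)
                .addPartitionKey("id")
                .build();

        admin.createNamespace("emoney", true);        // true = create only if it doesn't exist
        admin.createTable("emoney", "account", metadata, true);
        admin.createCoordinatorTables(true);          // counterpart of the --coordinator option above

        admin.close();
      }
    }
    ```
    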
- -## Execute transactions and retrieve data in the basic electronic money application - -After loading the schema, you can execute transactions and retrieve data in the basic electronic money application that is included in the repository that you cloned. - -The application supports the following types of transactions: - -- Create an account. -- Add funds to an account. -- Send funds between two accounts. -- Get an account balance. - -{% capture notice--info %} -**Note** - -When you first execute a Gradle command, Gradle will automatically install the necessary libraries. -{% endcapture %} - -
    <div class="notice--info">{{ notice--info | markdownify }}</div>
    
- -### Create an account with a balance - -You need an account with a balance so that you can send funds between accounts. - -To create an account for **customer1** that has a balance of **500**, run the following command: - -```shell -$ ./gradlew run --args="-action charge -amount 500 -to customer1" -``` - -### Create an account without a balance - -After setting up an account that has a balance, you need another account for sending funds to. - -To create an account for **merchant1** that has a balance of **0**, run the following command: - -```shell -$ ./gradlew run --args="-action charge -amount 0 -to merchant1" -``` - -### Add funds to an account - -You can add funds to an account in the same way that you created and added funds to an account in [Create an account with a balance](#create-an-account-with-a-balance). - -To add **500** to the account for **customer1**, run the following command: - -```shell -$ ./gradlew run --args="-action charge -amount 500 -to customer1" -``` - -The account for **customer1** will now have a balance of **1000**. - -### Send electronic money between two accounts - -Now that you have created two accounts, with at least one of those accounts having a balance, you can send funds from one account to the other account. - -To have **customer1** pay **100** to **merchant1**, run the following command: - -```shell -$ ./gradlew run --args="-action pay -amount 100 -from customer1 -to merchant1" -``` - -### Get an account balance - -After sending funds from one account to the other, you can check the balance of each account. - -To get the balance of **customer1**, run the following command: - -```shell -$ ./gradlew run --args="-action getBalance -id customer1" -``` - -You should see the following output: - -```shell -... -The balance for customer1 is 900 -... -``` - -To get the balance of **merchant1**, run the following command: - -```shell -$ ./gradlew run --args="-action getBalance -id merchant1" -``` - -You should see the following output: - -```shell -... -The balance for merchant1 is 100 -... -``` - -## Reference - -To see the source code for the electronic money application used in this tutorial, see [`ElectronicMoney.java`](./getting-started/src/main/java/sample/ElectronicMoney.java). 
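
    For reference, the `./gradlew run` commands above simply parse their arguments and delegate to the `ElectronicMoney` class linked in the Reference section. A minimal, illustrative sketch of driving that class directly from Java (the `Demo` class name is a placeholder) could look like this:

    ```java
    import sample.ElectronicMoney;

    public class Demo {
      public static void main(String[] args) throws Exception {
        ElectronicMoney eMoney = new ElectronicMoney("scalardb.properties");

        eMoney.charge("customer1", 500);                     // create customer1 with a balance of 500
        eMoney.charge("merchant1", 0);                       // create merchant1 with no balance
        eMoney.pay("customer1", "merchant1", 100);           // customer1 pays merchant1
        System.out.println(eMoney.getBalance("customer1"));  // prints 400 (500 - 100) in this sketch

        eMoney.close();
      }
    }
    ```
    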
diff --git a/docs/getting-started/build.gradle b/docs/getting-started/build.gradle deleted file mode 100644 index ba6b5c8781..0000000000 --- a/docs/getting-started/build.gradle +++ /dev/null @@ -1,19 +0,0 @@ -apply plugin: 'java' -apply plugin: 'idea' -apply plugin: 'application' - -repositories { - mavenCentral() -} - -application { - mainClass = "sample.ElectronicMoneyMain" -} - -dependencies { - implementation 'com.scalar-labs:scalardb:3.13.0' - implementation 'org.slf4j:slf4j-simple:1.7.30' -} - -sourceCompatibility = 1.8 -targetCompatibility = 1.8 diff --git a/docs/getting-started/gradle/wrapper/gradle-wrapper.jar b/docs/getting-started/gradle/wrapper/gradle-wrapper.jar deleted file mode 100644 index e6441136f3..0000000000 Binary files a/docs/getting-started/gradle/wrapper/gradle-wrapper.jar and /dev/null differ diff --git a/docs/getting-started/gradle/wrapper/gradle-wrapper.properties b/docs/getting-started/gradle/wrapper/gradle-wrapper.properties deleted file mode 100644 index b82aa23a4f..0000000000 --- a/docs/getting-started/gradle/wrapper/gradle-wrapper.properties +++ /dev/null @@ -1,7 +0,0 @@ -distributionBase=GRADLE_USER_HOME -distributionPath=wrapper/dists -distributionUrl=https\://services.gradle.org/distributions/gradle-8.7-bin.zip -networkTimeout=10000 -validateDistributionUrl=true -zipStoreBase=GRADLE_USER_HOME -zipStorePath=wrapper/dists diff --git a/docs/getting-started/gradlew b/docs/getting-started/gradlew deleted file mode 100755 index 1aa94a4269..0000000000 --- a/docs/getting-started/gradlew +++ /dev/null @@ -1,249 +0,0 @@ -#!/bin/sh - -# -# Copyright © 2015-2021 the original authors. -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# https://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -############################################################################## -# -# Gradle start up script for POSIX generated by Gradle. -# -# Important for running: -# -# (1) You need a POSIX-compliant shell to run this script. If your /bin/sh is -# noncompliant, but you have some other compliant shell such as ksh or -# bash, then to run this script, type that shell name before the whole -# command line, like: -# -# ksh Gradle -# -# Busybox and similar reduced shells will NOT work, because this script -# requires all of these POSIX shell features: -# * functions; -# * expansions «$var», «${var}», «${var:-default}», «${var+SET}», -# «${var#prefix}», «${var%suffix}», and «$( cmd )»; -# * compound commands having a testable exit status, especially «case»; -# * various built-in commands including «command», «set», and «ulimit». -# -# Important for patching: -# -# (2) This script targets any POSIX shell, so it avoids extensions provided -# by Bash, Ksh, etc; in particular arrays are avoided. -# -# The "traditional" practice of packing multiple parameters into a -# space-separated string is a well documented source of bugs and security -# problems, so this is (mostly) avoided, by progressively accumulating -# options in "$@", and eventually passing that to Java. 
-# -# Where the inherited environment variables (DEFAULT_JVM_OPTS, JAVA_OPTS, -# and GRADLE_OPTS) rely on word-splitting, this is performed explicitly; -# see the in-line comments for details. -# -# There are tweaks for specific operating systems such as AIX, CygWin, -# Darwin, MinGW, and NonStop. -# -# (3) This script is generated from the Groovy template -# https://github.com/gradle/gradle/blob/HEAD/subprojects/plugins/src/main/resources/org/gradle/api/internal/plugins/unixStartScript.txt -# within the Gradle project. -# -# You can find Gradle at https://github.com/gradle/gradle/. -# -############################################################################## - -# Attempt to set APP_HOME - -# Resolve links: $0 may be a link -app_path=$0 - -# Need this for daisy-chained symlinks. -while - APP_HOME=${app_path%"${app_path##*/}"} # leaves a trailing /; empty if no leading path - [ -h "$app_path" ] -do - ls=$( ls -ld "$app_path" ) - link=${ls#*' -> '} - case $link in #( - /*) app_path=$link ;; #( - *) app_path=$APP_HOME$link ;; - esac -done - -# This is normally unused -# shellcheck disable=SC2034 -APP_BASE_NAME=${0##*/} -# Discard cd standard output in case $CDPATH is set (https://github.com/gradle/gradle/issues/25036) -APP_HOME=$( cd "${APP_HOME:-./}" > /dev/null && pwd -P ) || exit - -# Use the maximum available, or set MAX_FD != -1 to use that value. -MAX_FD=maximum - -warn () { - echo "$*" -} >&2 - -die () { - echo - echo "$*" - echo - exit 1 -} >&2 - -# OS specific support (must be 'true' or 'false'). -cygwin=false -msys=false -darwin=false -nonstop=false -case "$( uname )" in #( - CYGWIN* ) cygwin=true ;; #( - Darwin* ) darwin=true ;; #( - MSYS* | MINGW* ) msys=true ;; #( - NONSTOP* ) nonstop=true ;; -esac - -CLASSPATH=$APP_HOME/gradle/wrapper/gradle-wrapper.jar - - -# Determine the Java command to use to start the JVM. -if [ -n "$JAVA_HOME" ] ; then - if [ -x "$JAVA_HOME/jre/sh/java" ] ; then - # IBM's JDK on AIX uses strange locations for the executables - JAVACMD=$JAVA_HOME/jre/sh/java - else - JAVACMD=$JAVA_HOME/bin/java - fi - if [ ! -x "$JAVACMD" ] ; then - die "ERROR: JAVA_HOME is set to an invalid directory: $JAVA_HOME - -Please set the JAVA_HOME variable in your environment to match the -location of your Java installation." - fi -else - JAVACMD=java - if ! command -v java >/dev/null 2>&1 - then - die "ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. - -Please set the JAVA_HOME variable in your environment to match the -location of your Java installation." - fi -fi - -# Increase the maximum file descriptors if we can. -if ! "$cygwin" && ! "$darwin" && ! "$nonstop" ; then - case $MAX_FD in #( - max*) - # In POSIX sh, ulimit -H is undefined. That's why the result is checked to see if it worked. - # shellcheck disable=SC2039,SC3045 - MAX_FD=$( ulimit -H -n ) || - warn "Could not query maximum file descriptor limit" - esac - case $MAX_FD in #( - '' | soft) :;; #( - *) - # In POSIX sh, ulimit -n is undefined. That's why the result is checked to see if it worked. - # shellcheck disable=SC2039,SC3045 - ulimit -n "$MAX_FD" || - warn "Could not set maximum file descriptor limit to $MAX_FD" - esac -fi - -# Collect all arguments for the java command, stacking in reverse order: -# * args from the command line -# * the main class name -# * -classpath -# * -D...appname settings -# * --module-path (only if needed) -# * DEFAULT_JVM_OPTS, JAVA_OPTS, and GRADLE_OPTS environment variables. 
- -# For Cygwin or MSYS, switch paths to Windows format before running java -if "$cygwin" || "$msys" ; then - APP_HOME=$( cygpath --path --mixed "$APP_HOME" ) - CLASSPATH=$( cygpath --path --mixed "$CLASSPATH" ) - - JAVACMD=$( cygpath --unix "$JAVACMD" ) - - # Now convert the arguments - kludge to limit ourselves to /bin/sh - for arg do - if - case $arg in #( - -*) false ;; # don't mess with options #( - /?*) t=${arg#/} t=/${t%%/*} # looks like a POSIX filepath - [ -e "$t" ] ;; #( - *) false ;; - esac - then - arg=$( cygpath --path --ignore --mixed "$arg" ) - fi - # Roll the args list around exactly as many times as the number of - # args, so each arg winds up back in the position where it started, but - # possibly modified. - # - # NB: a `for` loop captures its iteration list before it begins, so - # changing the positional parameters here affects neither the number of - # iterations, nor the values presented in `arg`. - shift # remove old arg - set -- "$@" "$arg" # push replacement arg - done -fi - - -# Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. -DEFAULT_JVM_OPTS='"-Xmx64m" "-Xms64m"' - -# Collect all arguments for the java command: -# * DEFAULT_JVM_OPTS, JAVA_OPTS, JAVA_OPTS, and optsEnvironmentVar are not allowed to contain shell fragments, -# and any embedded shellness will be escaped. -# * For example: A user cannot expect ${Hostname} to be expanded, as it is an environment variable and will be -# treated as '${Hostname}' itself on the command line. - -set -- \ - "-Dorg.gradle.appname=$APP_BASE_NAME" \ - -classpath "$CLASSPATH" \ - org.gradle.wrapper.GradleWrapperMain \ - "$@" - -# Stop when "xargs" is not available. -if ! command -v xargs >/dev/null 2>&1 -then - die "xargs is not available" -fi - -# Use "xargs" to parse quoted args. -# -# With -n1 it outputs one arg per line, with the quotes and backslashes removed. -# -# In Bash we could simply go: -# -# readarray ARGS < <( xargs -n1 <<<"$var" ) && -# set -- "${ARGS[@]}" "$@" -# -# but POSIX shell has neither arrays nor command substitution, so instead we -# post-process each arg (as a line of input to sed) to backslash-escape any -# character that might be a shell metacharacter, then use eval to reverse -# that process (while maintaining the separation between arguments), and wrap -# the whole thing up as a single "set" statement. -# -# This will of course break if any of these variables contains a newline or -# an unmatched quote. -# - -eval "set -- $( - printf '%s\n' "$DEFAULT_JVM_OPTS $JAVA_OPTS $GRADLE_OPTS" | - xargs -n1 | - sed ' s~[^-[:alnum:]+,./:=@_]~\\&~g; ' | - tr '\n' ' ' - )" '"$@"' - -exec "$JAVACMD" "$@" diff --git a/docs/getting-started/gradlew.bat b/docs/getting-started/gradlew.bat deleted file mode 100644 index 7101f8e467..0000000000 --- a/docs/getting-started/gradlew.bat +++ /dev/null @@ -1,92 +0,0 @@ -@rem -@rem Copyright 2015 the original author or authors. -@rem -@rem Licensed under the Apache License, Version 2.0 (the "License"); -@rem you may not use this file except in compliance with the License. -@rem You may obtain a copy of the License at -@rem -@rem https://www.apache.org/licenses/LICENSE-2.0 -@rem -@rem Unless required by applicable law or agreed to in writing, software -@rem distributed under the License is distributed on an "AS IS" BASIS, -@rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -@rem See the License for the specific language governing permissions and -@rem limitations under the License. 
-@rem - -@if "%DEBUG%"=="" @echo off -@rem ########################################################################## -@rem -@rem Gradle startup script for Windows -@rem -@rem ########################################################################## - -@rem Set local scope for the variables with windows NT shell -if "%OS%"=="Windows_NT" setlocal - -set DIRNAME=%~dp0 -if "%DIRNAME%"=="" set DIRNAME=. -@rem This is normally unused -set APP_BASE_NAME=%~n0 -set APP_HOME=%DIRNAME% - -@rem Resolve any "." and ".." in APP_HOME to make it shorter. -for %%i in ("%APP_HOME%") do set APP_HOME=%%~fi - -@rem Add default JVM options here. You can also use JAVA_OPTS and GRADLE_OPTS to pass JVM options to this script. -set DEFAULT_JVM_OPTS="-Xmx64m" "-Xms64m" - -@rem Find java.exe -if defined JAVA_HOME goto findJavaFromJavaHome - -set JAVA_EXE=java.exe -%JAVA_EXE% -version >NUL 2>&1 -if %ERRORLEVEL% equ 0 goto execute - -echo. 1>&2 -echo ERROR: JAVA_HOME is not set and no 'java' command could be found in your PATH. 1>&2 -echo. 1>&2 -echo Please set the JAVA_HOME variable in your environment to match the 1>&2 -echo location of your Java installation. 1>&2 - -goto fail - -:findJavaFromJavaHome -set JAVA_HOME=%JAVA_HOME:"=% -set JAVA_EXE=%JAVA_HOME%/bin/java.exe - -if exist "%JAVA_EXE%" goto execute - -echo. 1>&2 -echo ERROR: JAVA_HOME is set to an invalid directory: %JAVA_HOME% 1>&2 -echo. 1>&2 -echo Please set the JAVA_HOME variable in your environment to match the 1>&2 -echo location of your Java installation. 1>&2 - -goto fail - -:execute -@rem Setup the command line - -set CLASSPATH=%APP_HOME%\gradle\wrapper\gradle-wrapper.jar - - -@rem Execute Gradle -"%JAVA_EXE%" %DEFAULT_JVM_OPTS% %JAVA_OPTS% %GRADLE_OPTS% "-Dorg.gradle.appname=%APP_BASE_NAME%" -classpath "%CLASSPATH%" org.gradle.wrapper.GradleWrapperMain %* - -:end -@rem End local scope for the variables with windows NT shell -if %ERRORLEVEL% equ 0 goto mainEnd - -:fail -rem Set variable GRADLE_EXIT_CONSOLE if you need the _script_ return code instead of -rem the _cmd.exe /c_ return code! -set EXIT_CODE=%ERRORLEVEL% -if %EXIT_CODE% equ 0 set EXIT_CODE=1 -if not ""=="%GRADLE_EXIT_CONSOLE%" exit %EXIT_CODE% -exit /b %EXIT_CODE% - -:mainEnd -if "%OS%"=="Windows_NT" endlocal - -:omega diff --git a/docs/getting-started/scalardb.properties b/docs/getting-started/scalardb.properties deleted file mode 100755 index b1dbcbdec6..0000000000 --- a/docs/getting-started/scalardb.properties +++ /dev/null @@ -1,12 +0,0 @@ -# Comma-separated contact points -scalar.db.contact_points=localhost - -# Port number for all the contact points. Default port number for each database is used if empty. -#scalar.db.contact_port= - -# Credential information to access the database -scalar.db.username=cassandra -scalar.db.password=cassandra - -# Storage implementation. Either cassandra or cosmos or dynamo or jdbc can be set. Default storage is cassandra. 
-#scalar.db.storage=cassandra diff --git a/docs/getting-started/settings.gradle b/docs/getting-started/settings.gradle deleted file mode 100644 index 744e2a3e71..0000000000 --- a/docs/getting-started/settings.gradle +++ /dev/null @@ -1 +0,0 @@ -rootProject.name = 'getting-started' diff --git a/docs/getting-started/src/main/java/sample/ElectronicMoney.java b/docs/getting-started/src/main/java/sample/ElectronicMoney.java deleted file mode 100644 index 2af60ca27f..0000000000 --- a/docs/getting-started/src/main/java/sample/ElectronicMoney.java +++ /dev/null @@ -1,153 +0,0 @@ -package sample; - -import com.scalar.db.api.DistributedTransaction; -import com.scalar.db.api.DistributedTransactionManager; -import com.scalar.db.api.Get; -import com.scalar.db.api.Put; -import com.scalar.db.api.Result; -import com.scalar.db.exception.transaction.TransactionException; -import com.scalar.db.io.Key; -import com.scalar.db.service.TransactionFactory; -import java.io.IOException; -import java.util.Optional; - -public class ElectronicMoney { - - private static final String NAMESPACE = "emoney"; - private static final String TABLENAME = "account"; - private static final String ID = "id"; - private static final String BALANCE = "balance"; - - private final DistributedTransactionManager manager; - - public ElectronicMoney(String scalarDBProperties) throws IOException { - TransactionFactory factory = TransactionFactory.create(scalarDBProperties); - manager = factory.getTransactionManager(); - } - - public void charge(String id, int amount) throws TransactionException { - // Start a transaction - DistributedTransaction tx = manager.start(); - - try { - // Retrieve the current balance for id - Get get = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .build(); - Optional result = tx.get(get); - - // Calculate the balance - int balance = amount; - if (result.isPresent()) { - int current = result.get().getInt(BALANCE); - balance += current; - } - - // Update the balance - Put put = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .intValue(BALANCE, balance) - .build(); - tx.put(put); - - // Commit the transaction (records are automatically recovered in case of failure) - tx.commit(); - } catch (Exception e) { - tx.abort(); - throw e; - } - } - - public void pay(String fromId, String toId, int amount) throws TransactionException { - // Start a transaction - DistributedTransaction tx = manager.start(); - - try { - // Retrieve the current balances for ids - Get fromGet = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, fromId)) - .build(); - Get toGet = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, toId)) - .build(); - Optional fromResult = tx.get(fromGet); - Optional toResult = tx.get(toGet); - - // Calculate the balances (it assumes that both accounts exist) - int newFromBalance = fromResult.get().getInt(BALANCE) - amount; - int newToBalance = toResult.get().getInt(BALANCE) + amount; - if (newFromBalance < 0) { - throw new RuntimeException(fromId + " doesn't have enough balance."); - } - - // Update the balances - Put fromPut = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, fromId)) - .intValue(BALANCE, newFromBalance) - .build(); - Put toPut = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, toId)) - .intValue(BALANCE, newToBalance) 
- .build(); - tx.put(fromPut); - tx.put(toPut); - - // Commit the transaction (records are automatically recovered in case of failure) - tx.commit(); - } catch (Exception e) { - tx.abort(); - throw e; - } - } - - public int getBalance(String id) throws TransactionException { - // Start a transaction - DistributedTransaction tx = manager.start(); - - try { - // Retrieve the current balances for id - Get get = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .build(); - Optional result = tx.get(get); - - int balance = -1; - if (result.isPresent()) { - balance = result.get().getInt(BALANCE); - } - - // Commit the transaction - tx.commit(); - - return balance; - } catch (Exception e) { - tx.abort(); - throw e; - } - } - - public void close() { - manager.close(); - } -} diff --git a/docs/getting-started/src/main/java/sample/ElectronicMoneyMain.java b/docs/getting-started/src/main/java/sample/ElectronicMoneyMain.java deleted file mode 100644 index 533487440e..0000000000 --- a/docs/getting-started/src/main/java/sample/ElectronicMoneyMain.java +++ /dev/null @@ -1,75 +0,0 @@ -package sample; - -import java.io.File; - -public class ElectronicMoneyMain { - - public static void main(String[] args) throws Exception { - String action = null; - int amount = 0; - String to = null; - String from = null; - String id = null; - String scalarDBProperties = null; - - for (int i = 0; i < args.length; ++i) { - if ("-action".equals(args[i])) { - action = args[++i]; - } else if ("-amount".equals(args[i])) { - amount = Integer.parseInt(args[++i]); - } else if ("-to".equals(args[i])) { - to = args[++i]; - } else if ("-from".equals(args[i])) { - from = args[++i]; - } else if ("-id".equals(args[i])) { - id = args[++i]; - } else if ("-config".equals(args[i])) { - scalarDBProperties = args[++i]; - } else if ("-help".equals(args[i])) { - printUsageAndExit(); - return; - } - } - - if (action == null) { - printUsageAndExit(); - return; - } - - ElectronicMoney eMoney; - if (scalarDBProperties != null) { - eMoney = new ElectronicMoney(scalarDBProperties); - } else { - scalarDBProperties = System.getProperty("user.dir") + File.separator + "scalardb.properties"; - eMoney = new ElectronicMoney(scalarDBProperties); - } - - if (action.equalsIgnoreCase("charge")) { - if (to == null || amount < 0) { - printUsageAndExit(); - return; - } - eMoney.charge(to, amount); - } else if (action.equalsIgnoreCase("pay")) { - if (to == null || amount < 0 || from == null) { - printUsageAndExit(); - return; - } - eMoney.pay(from, to, amount); - } else if (action.equalsIgnoreCase("getBalance")) { - if (id == null) { - printUsageAndExit(); - return; - } - int balance = eMoney.getBalance(id); - System.out.println("The balance for " + id + " is " + balance); - } - eMoney.close(); - } - - private static void printUsageAndExit() { - System.err.println( - "ElectronicMoneyMain -action charge/pay/getBalance [-amount number (needed for charge and pay)] [-to id (needed for charge and pay)] [-from id (needed for pay)] [-id id (needed for getBalance)]"); - System.exit(1); - } -} diff --git a/docs/images/data_model.png b/docs/images/data_model.png deleted file mode 100644 index 15a0e4d4bf..0000000000 Binary files a/docs/images/data_model.png and /dev/null differ diff --git a/docs/images/scalardb.png b/docs/images/scalardb.png deleted file mode 100644 index 658486cbb0..0000000000 Binary files a/docs/images/scalardb.png and /dev/null differ diff --git a/docs/images/scalardb_data_model.png 
b/docs/images/scalardb_data_model.png deleted file mode 100644 index 7a02fa2345..0000000000 Binary files a/docs/images/scalardb_data_model.png and /dev/null differ diff --git a/docs/images/software_stack.png b/docs/images/software_stack.png deleted file mode 100644 index 75fba6e623..0000000000 Binary files a/docs/images/software_stack.png and /dev/null differ diff --git a/docs/images/two_phase_commit_load_balancing.png b/docs/images/two_phase_commit_load_balancing.png deleted file mode 100644 index 5cdc26f085..0000000000 Binary files a/docs/images/two_phase_commit_load_balancing.png and /dev/null differ diff --git a/docs/images/two_phase_commit_sequence_diagram.png b/docs/images/two_phase_commit_sequence_diagram.png deleted file mode 100644 index 116ef635e2..0000000000 Binary files a/docs/images/two_phase_commit_sequence_diagram.png and /dev/null differ diff --git a/docs/index.md b/docs/index.md deleted file mode 100644 index e69fc03951..0000000000 --- a/docs/index.md +++ /dev/null @@ -1,89 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB - -[![CI](https://github.com/scalar-labs/scalardb/actions/workflows/ci.yaml/badge.svg?branch=master)](https://github.com/scalar-labs/scalardb/actions/workflows/ci.yaml) - -ScalarDB is a universal transaction manager that achieves: -- database/storage-agnostic ACID transactions in a scalable manner even if an underlying database or storage is not ACID-compliant. -- multi-storage/database/service ACID transactions that can span multiple (possibly different) databases, storages, and services. - -## Install -The library is available on [maven central repository](https://mvnrepository.com/artifact/com.scalar-labs/scalardb). -You can install it in your application using your build tool such as Gradle and Maven. 
- -To add a dependency on ScalarDB using Gradle, use the following: -```gradle -dependencies { - implementation 'com.scalar-labs:scalardb:3.12.2' -} -``` - -To add a dependency using Maven: -```xml - - com.scalar-labs - scalardb - 3.12.2 - -``` - -## Docs -* [Getting started](getting-started-with-scalardb.md) -* [Java API Guide](api-guide.md) -* [ScalarDB Samples](https://github.com/scalar-labs/scalardb-samples) -* [ScalarDB Server](scalardb-server.md) -* [Multi-storage Transactions](multi-storage-transactions.md) -* [Two-phase Commit Transactions](two-phase-commit-transactions.md) -* [Design document](design.md) -* [Schema Loader](schema-loader.md) -* [Requirements and Recommendations for the Underlying Databases of ScalarDB](requirements.md) -* [How to Back up and Restore](backup-restore.md) -* [ScalarDB supported databases](scalardb-supported-databases.md) -* [Configurations](configurations.md) -* [Storage abstraction](storage-abstraction.md) -* Slides - * [Making Cassandra more capable, faster, and more reliable](https://speakerdeck.com/scalar/making-cassandra-more-capable-faster-and-more-reliable-at-apachecon-at-home-2020) at ApacheCon@Home 2020 - * [Scalar DB: A library that makes non-ACID databases ACID-compliant](https://speakerdeck.com/scalar/scalar-db-a-library-that-makes-non-acid-databases-acid-compliant) at Database Lounge Tokyo #6 2020 - * [Transaction Management on Cassandra](https://speakerdeck.com/scalar/transaction-management-on-cassandra) at Next Generation Cassandra Conference / ApacheCon NA 2019 -* Javadoc - * [scalardb](https://javadoc.io/doc/com.scalar-labs/scalardb/latest/index.html) - ScalarDB: A universal transaction manager that achieves database-agnostic transactions and distributed transactions that span multiple databases - * [scalardb-rpc](https://javadoc.io/doc/com.scalar-labs/scalardb-rpc/latest/index.html) - ScalarDB RPC libraries - * [scalardb-server](https://javadoc.io/doc/com.scalar-labs/scalardb-server/latest/index.html) - ScalarDB Server: A gRPC interface of ScalarDB - * [scalardb-schema-loader](https://javadoc.io/doc/com.scalar-labs/scalardb-schema-loader/latest/index.html) - ScalarDB Schema Loader: A tool for schema creation and schema deletion in ScalarDB -* [Jepsen tests](https://github.com/scalar-labs/scalar-jepsen) -* [TLA+](https://github.com/scalar-labs/scalardb/tree/master/tla+/consensus-commit) - -## Contributing -This library is mainly maintained by the Scalar Engineering Team, but of course we appreciate any help. - -* For asking questions, finding answers and helping other users, please go to [stackoverflow](https://stackoverflow.com/) and use [scalardb](https://stackoverflow.com/questions/tagged/scalardb) tag. -* For filing bugs, suggesting improvements, or requesting new features, help us out by opening an issue. - -Here are the contributors we are especially thankful for: -- [Toshihiro Suzuki](https://github.com/brfrn169) - created [Phoenix adapter](https://github.com/scalar-labs/scalardb-phoenix) for ScalarDB -- [Yonezawa-T2](https://github.com/Yonezawa-T2) - reported bugs around Serializable and proposed a new Serializable strategy (now named Extra-Read) - -## Development - -### Pre-commit hook - -This project uses [pre-commit](https://pre-commit.com/) to automate code format and so on as much as possible. If you're interested in the development of ScalarDB, please [install pre-commit](https://pre-commit.com/#installation) and the git hook script as follows. 
- -``` -$ ls -a .pre-commit-config.yaml -.pre-commit-config.yaml -$ pre-commit install -``` - -The code formatter is automatically executed when committing files. A commit will fail and be formatted by the formatter when any invalid code format is detected. Try to commit the change again. - -## License -ScalarDB is dual-licensed under both the Apache 2.0 License (found in the LICENSE file in the root directory) and a commercial license. -You may select, at your option, one of the above-listed licenses. -The commercial license includes several enterprise-grade features such as management tools and declarative query interfaces like GraphQL and SQL interfaces. -Regarding the commercial license, please [contact us](https://scalar-labs.com/contact_us/) for more information. diff --git a/docs/multi-storage-transactions.md b/docs/multi-storage-transactions.md deleted file mode 100644 index 9ff54af9ca..0000000000 --- a/docs/multi-storage-transactions.md +++ /dev/null @@ -1,66 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Multi-Storage Transactions - -ScalarDB transactions can span multiple storages or databases while maintaining ACID compliance by using a feature called *multi-storage transactions*. - -This page explains how multi-storage transactions work and how to configure the feature in ScalarDB. - -## How multi-storage transactions work in ScalarDB - -In ScalarDB, the `multi-storage` implementation holds multiple storage instances and has mappings from a namespace name to a proper storage instance. When an operation is executed, the multi-storage transactions feature chooses a proper storage instance from the specified namespace by using the namespace-storage mapping and uses that storage instance. - -## How to configure ScalarDB to support multi-storage transactions - -To enable multi-storage transactions, you need to specify `consensus-commit` as the value for `scalar.db.transaction_manager`, `multi-storage` as the value for `scalar.db.storage`, and configure your databases in the ScalarDB properties file. - -The following is an example of configurations for multi-storage transactions: - -```properties -# Consensus Commit is required to support multi-storage transactions. -scalar.db.transaction_manager=consensus-commit - -# Multi-storage implementation is used for Consensus Commit. -scalar.db.storage=multi-storage - -# Define storage names by using a comma-separated format. -# In this case, "cassandra" and "mysql" are used. -scalar.db.multi_storage.storages=cassandra,mysql - -# Define the "cassandra" storage. -# When setting storage properties, such as `storage`, `contact_points`, `username`, and `password`, for multi-storage transactions, the format is `scalar.db.multi_storage.storages..`. -# For example, to configure the `scalar.db.contact_points` property for Cassandra, specify `scalar.db.multi_storage.storages.cassandra.contact_point`. -scalar.db.multi_storage.storages.cassandra.storage=cassandra -scalar.db.multi_storage.storages.cassandra.contact_points=localhost -scalar.db.multi_storage.storages.cassandra.username=cassandra -scalar.db.multi_storage.storages.cassandra.password=cassandra - -# Define the "mysql" storage. 
-# When defining JDBC-specific configurations for multi-storage transactions, you can follow a similar format of `scalar.db.multi_storage.storages..`. -# For example, to configure the `scalar.db.jdbc.connection_pool.min_idle` property for MySQL, specify `scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.min_idle`. -scalar.db.multi_storage.storages.mysql.storage=jdbc -scalar.db.multi_storage.storages.mysql.contact_points=jdbc:mysql://localhost:3306/ -scalar.db.multi_storage.storages.mysql.username=root -scalar.db.multi_storage.storages.mysql.password=mysql -# Define the JDBC-specific configurations for the "mysql" storage. -scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.min_idle=5 -scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.max_idle=10 -scalar.db.multi_storage.storages.mysql.jdbc.connection_pool.max_total=25 - -# Define namespace mapping from a namespace name to a storage. -# The format is ":,...". -scalar.db.multi_storage.namespace_mapping=user:cassandra,coordinator:mysql - -# Define the default storage that's used if a specified table doesn't have any mapping. -scalar.db.multi_storage.default_storage=cassandra -``` - -For additional configurations, see [ScalarDB Configurations](configurations.md). - -## Hands-on tutorial - -For a hands-on tutorial, see [Create a Sample Application That Supports Multi-Storage Transactions](https://github.com/scalar-labs/scalardb-samples/tree/main/multi-storage-transaction-sample). diff --git a/docs/overview.md b/docs/overview.md deleted file mode 100644 index 6c23a576f0..0000000000 --- a/docs/overview.md +++ /dev/null @@ -1,69 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB Overview - -This page describes what ScalarDB is and its primary use cases. - -## What is ScalarDB? - -ScalarDB is a hybrid transaction/analytical processing (HTAP) engine for diverse databases. It runs as middleware on databases and virtually unifies diverse databases by achieving ACID transactions and real-time analytics across them to simplify the complexity of managing multiple databases or multiple instances of a single database. - -![How ScalarDB simplifies complex data management architecture.](images/scalardb.png) - -As a versatile solution, ScalarDB supports a range of databases, including: - -- Relational databases that support JDBC, such as MariaDB, Microsoft SQL Server, MySQL, Oracle Database, PostgreSQL, SQLite, and their compatible databases, like Amazon Aurora, Google AlloyDB, TiDB, and YugabyteDB. -- NoSQL databases like Amazon DynamoDB, Apache Cassandra, and Azure Cosmos DB. - -For details on which databases ScalarDB supports, refer to [Supported Databases](scalardb-supported-databases.md). - -## Why ScalarDB? - -Several solutions, such as global transaction managers, data federation engines, and HTAP systems, have similar goals, but they are limited in the following perspectives: - -- Global transaction managers (like Oracle MicroTx and Atomikos) are designed to run transactions across a limited set of heterogeneous databases (like only XA-compliant databases). -- Data federation engines (like Denodo and Starburst) are designed to run analytical queries across heterogeneous databases. 
-- HTAP systems (like TiDB and SingleStore) run both transactions and analytical queries only on homogeneous databases. - -In other words, they virtually unify databases, but with limitations. For example, with data federation engines, users can run read-only analytical queries on a virtualized view across multiple databases. However, they often need to run update queries separately for each database. - -Unlike other solutions, ScalarDB stands out by offering the ability to run both transactional and analytical queries on heterogeneous databases, which can significantly simplify database management. - -The following table summarizes how ScalarDB is different from the other solutions. - -| | Transactions across heterogeneous databases | Analytics across heterogeneous databases | -| :------------------------------------------------------------: | :------------------------------------------------------------------: | :--------------------------------------: | -| Global transaction managers (like Oracle MicroTx and Atomikos) | Yes (but existing solutions support only a limited set of databases) | No | -| Data federation engines (like Denodo and Starburst) | No | Yes | -| HTAP systems (like TiDB and SingleStore) | No (support homogeneous databases only) | No (support homogeneous databases only) | -| **ScalarDB** | **Yes (supports various databases)** | **Yes** | - - -## ScalarDB use cases - -ScalarDB can be used in various ways. Here are the three primary use cases of ScalarDB. - -### Managing siloed databases easily -Many enterprises comprise several organizations, departments, and business units to support agile business operations, which often leads to siloed information systems. In particular, different organizations likely manage different applications with different databases. Managing such siloed databases is challenging because applications must communicate with each database separately and properly deal with the differences between databases. - -ScalarDB simplifies the management of siloed databases with a unified interface, enabling users to treat the databases as if they were a single database. For example, users can run (analytical) join queries over multiple databases without interacting with the databases respectively. - -### Managing consistency between multiple database -Modern architectures, like the microservice architecture, encourage a system to separate a service and its database into smaller subsets to increase system modularity and development efficiency. However, managing diverse databases, especially of different kinds, is challenging because applications must ensure the correct states (or, in other words, consistencies) of those databases, even using transaction management patterns like Saga and TCC. - -ScalarDB simplifies managing such diverse databases with a correctness guarantee (or, in other words, ACID with strict serializability), enabling you to focus on application development without worrying about guaranteeing consistency between databases. - -### Reducing database migration hurdles - -Applications tend to be locked into using a certain database because of the specific capabilities that the database provides. Such database lock-in discourages upgrading or changing the database because doing so often requires rewriting the application. - -ScalarDB provides a unified interface for diverse databases. 
Thus, once an application is written by using the ScalarDB interface, it becomes portable, which helps to achieve seamless database migration without rewriting the application. - -## Further reading - -- [ScalarDB Technical Overview](https://speakerdeck.com/scalar/scalar-db-universal-transaction-manager) -- [ScalarDB Research Paper [VLDB'23]](https://dl.acm.org/doi/10.14778/3611540.3611563) \ No newline at end of file diff --git a/docs/requirements.md b/docs/requirements.md deleted file mode 100644 index f5238ee8da..0000000000 --- a/docs/requirements.md +++ /dev/null @@ -1,149 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Requirements and Recommendations for the Underlying Databases of ScalarDB - -This document explains the requirements and recommendations in the underlying databases of ScalarDB to make ScalarDB applications work correctly and efficiently. - -## Requirements - -ScalarDB requires each underlying database to provide certain capabilities to run transactions and analytics on the databases. This document explains the general requirements and how to configure each database to achieve the requirements. - -### General requirements - -#### Transactions -{:.no_toc} -ScalarDB requires each underlying database to provide at least the following capabilities to run transactions on the databases: - -- Linearizable read and conditional mutations (write and delete) on a single database record. -- Durability of written database records. -- Ability to store arbitrary data besides application data in each database record. - -#### Analytics -{:.no_toc} -ScalarDB requires each underlying database to provide the following capability to run analytics on the databases: - -- Ability to return only committed records. - -{% capture notice--info %} -**Note** - -You need to have database accounts that have enough privileges to access the databases through ScalarDB since ScalarDB runs on the underlying databases not only for CRUD operations but also for performing operations like creating or altering schemas, tables, or indexes. ScalarDB basically requires a fully privileged account to access the underlying databases. -{% endcapture %} - -
{{ notice--info | markdownify }}
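To make the transaction requirements listed above more concrete, the following is a minimal sketch that is not part of the original document. It shows the kind of single-record, linearizable conditional write that ScalarDB issues against the underlying storage, assuming the ScalarDB 3.x builder API, a `database.properties` file, and a hypothetical `sample.users` table with an INT partition key `user_id` and a TEXT column `name`.

```java
// Hypothetical sketch: a single-record conditional mutation through the ScalarDB Storage API.
// Table and property-file names are assumptions, not taken from the original document.
import com.scalar.db.api.ConditionBuilder;
import com.scalar.db.api.DistributedStorage;
import com.scalar.db.api.Put;
import com.scalar.db.io.Key;
import com.scalar.db.service.StorageFactory;

public class ConditionalPutSketch {
  public static void main(String[] args) throws Exception {
    // Load the storage configuration from a properties file (path is an assumption).
    DistributedStorage storage = StorageFactory.create("database.properties").getStorage();

    // Write the record only if it does not already exist; the underlying database
    // must apply this check-and-write atomically and linearizably on the single record.
    Put put =
        Put.newBuilder()
            .namespace("sample")
            .table("users")
            .partitionKey(Key.ofInt("user_id", 1))
            .textValue("name", "Alice")
            .condition(ConditionBuilder.putIfNotExists())
            .build();
    storage.put(put);

    storage.close();
  }
}
```

If the underlying database cannot apply such a conditional mutation atomically on a single record, ScalarDB cannot build correct transactions on top of it, which is why this capability is listed as a hard requirement.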
- -### How to configure databases to achieve the general requirements - -Select your database for details on how to configure it to achieve the general requirements. - -
-
- - - - -
- -
- -#### Transactions -{:.no_toc} -- Use a single primary server or synchronized multi-primary servers for all operations (no read operations on read replicas that are asynchronously replicated from a primary database). -- Use read-committed or stricter isolation levels. - -#### Analytics -{:.no_toc} -- Use read-committed or stricter isolation levels. - -
- -
- -#### Transactions -{:.no_toc} -- Use a single primary region for all operations. (No read or write operations on global tables in non-primary regions.) - - DynamoDB has no concept of a primary region, so you must designate a primary region yourself. - -#### Analytics -{:.no_toc} -- Not applicable. DynamoDB always returns committed records, so there are no DynamoDB-specific requirements. - -
- -
- -#### Transactions -{:.no_toc} -- Use a single primary region for all operations with `Strong` or `Bounded Staleness` consistency. - -#### Analytics -{:.no_toc} -- Not applicable. Cosmos DB always returns committed records, so there are no Cosmos DB–specific requirements. - -
- -
- -#### Transactions -{:.no_toc} -- Use a single primary cluster for all operations (no read or write operations in non-primary clusters). -- Use `batch` or `group` for `commitlog_sync`. -- If you're using Cassandra-compatible databases, those databases must properly support lightweight transactions (LWT). - -#### Analytics -{:.no_toc} -- Not applicable. Cassandra always returns committed records, so there are no Cassandra-specific requirements. - -
-
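Once an underlying database is configured as described above, applications interact with it only through the ScalarDB transaction API. The following is a minimal sketch, not part of the original document, of a transactional read-modify-write. It assumes the ScalarDB 3.x builder API, a `database.properties` file, and a hypothetical `sample.accounts` table with an INT partition key `account_id` and an INT column `balance`.

```java
// Hypothetical sketch: a transactional read-modify-write with the ScalarDB transaction API.
// Table, column, and property-file names are assumptions, not taken from the original document.
import com.scalar.db.api.DistributedTransaction;
import com.scalar.db.api.DistributedTransactionManager;
import com.scalar.db.api.Get;
import com.scalar.db.api.Put;
import com.scalar.db.api.Result;
import com.scalar.db.io.Key;
import com.scalar.db.service.TransactionFactory;
import java.util.Optional;

public class ReadModifyWriteSketch {
  public static void main(String[] args) throws Exception {
    // Build a transaction manager from the configured properties file (path is an assumption).
    TransactionFactory factory = TransactionFactory.create("database.properties");
    DistributedTransactionManager manager = factory.getTransactionManager();

    DistributedTransaction tx = manager.start();
    try {
      // Read the current balance of account 1 within the transaction.
      Optional<Result> result =
          tx.get(
              Get.newBuilder()
                  .namespace("sample")
                  .table("accounts")
                  .partitionKey(Key.ofInt("account_id", 1))
                  .build());
      int balance = result.map(r -> r.getInt("balance")).orElse(0);

      // Update the balance and commit; ScalarDB handles the transactional bookkeeping.
      tx.put(
          Put.newBuilder()
              .namespace("sample")
              .table("accounts")
              .partitionKey(Key.ofInt("account_id", 1))
              .intValue("balance", balance + 100)
              .build());
      tx.commit();
    } catch (Exception e) {
      tx.abort();
      throw e;
    } finally {
      manager.close();
    }
  }
}
```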
- -## Recommendations - -We recommend properly configuring each underlying database of ScalarDB for high performance and high availability. The following recommendations cover the main knobs and configurations to update. - -{% capture notice--info %} -**Note** - -Because ScalarDB runs as an application on top of the underlying databases, you may also want to tune other knobs and configurations that are commonly used to improve their efficiency. -{% endcapture %} -
{{ notice--info | markdownify }}
- -
-
- - - - -
- -
- -- Use read-committed isolation for better performance. -- Follow the performance optimization best practices for each database. For example, increasing the buffer size (such as `shared_buffers` in PostgreSQL) and increasing the maximum number of connections (such as `max_connections` in PostgreSQL) are usually recommended. -
- -
-- Increase the number of read capacity units (RCUs) and write capacity units (WCUs) for high throughput. -- Enable point-in-time recovery (PITR). - -{% capture notice--info %} -**Note** - -Since DynamoDB stores data in multiple availability zones by default, you don’t need to adjust any configurations to improve availability. -{% endcapture %} -
{{ notice--info | markdownify }}
-
- -
-- Increase the number of Request Units (RUs) for high throughput. -- Enable point-in-time restore (PITR). -- Enable availability zones. -
- -
-- Increase `concurrent_reads` and `concurrent_writes` for high throughput. For details, see the official Cassandra documentation about [`concurrent_writes`](https://cassandra.apache.org/doc/stable/cassandra/configuration/cass_yaml_file.html#concurrent_writes). -
-
diff --git a/docs/scalardb-server.md b/docs/scalardb-server.md deleted file mode 100644 index 6c77573a36..0000000000 --- a/docs/scalardb-server.md +++ /dev/null @@ -1,184 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB Server - -ScalarDB Server is a gRPC server that implements ScalarDB interface. -With ScalarDB Server, you can use ScalarDB features from multiple programming languages that are supported by gRPC. - -Currently, we provide only a Java client officially, and we will support other language clients officially in the future. -Of course, you can generate language-specific client stubs by yourself. -However, note that it is not necessarily straightforward to implement a client since it's using a bidirectional streaming RPC in gRPC, and you need to be familiar with it. - -This document explains how to install and use ScalarDB Server. - -## Install prerequisites - -ScalarDB Server is written in Java. So the following software is required to run it. - -* [Oracle JDK 8](https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html) (OpenJDK 8) or higher - -## Install ScalarDB Server - -We have Docker images in [our repository](https://github.com/orgs/scalar-labs/packages/container/package/scalardb-server) and zip archives of ScalarDB Server available in [releases](https://github.com/scalar-labs/scalardb/releases). - -If you are interested in building from source, run the following command: - -```shell -$ ./gradlew installDist -``` - -Of course, you can archive the jar and libraries by `./gradlew distZip` and so on. - -Also, you can build a Docker image from the source as follows. - -```shell -$ ./gradlew :server:docker -``` - -## Configure ScalarDB Server - -You need a property file holding the configuration for ScalarDB Server. -The property file must contain two sections: ScalarDB Server configurations and transaction manager configurations. - -```properties -# -# ScalarDB Server configurations -# - -# Port number of ScalarDB Server. The default is `60051`. -scalar.db.server.port=60051 - -# Prometheus exporter port. Prometheus exporter will not be started if a negative number is given. The default is `8080`. -scalar.db.server.prometheus_exporter_port=8080 - -# The maximum message size allowed to be received. If not specified, use the gRPC default value. -scalar.db.server.grpc.max_inbound_message_size= - -# The maximum size of metadata allowed to be received. If not specified, use the gRPC default value. -scalar.db.server.grpc.max_inbound_metadata_size= - -# The decommissioning duration in seconds. The default is `30`. -scalar.db.server.decommissioning_duration_secs=30 - -# -# Transaction manager configurations -# - -# Transaction manager implementation. The default is `consensus-commit`. -scalar.db.transaction_manager=consensus-commit - -# Storage implementation used for Consensus Commit. The default is `cassandra`. -scalar.db.storage=cassandra - -# Comma-separated contact points. -scalar.db.contact_points=localhost - -# Port number for all the contact points. -#scalar.db.contact_port= - -# Credential information to access the database. 
-scalar.db.username=cassandra -scalar.db.password=cassandra - -# Isolation level used for Consensus Commit. Either `SNAPSHOT` or `SERIALIZABLE` can be specified. The default is `SNAPSHOT`. -scalar.db.consensus_commit.isolation_level=SNAPSHOT - -# Serializable strategy used for Consensus Commit. -# Either `EXTRA_READ` or `EXTRA_WRITE` can be specified. The default is `EXTRA_READ`. -# If `SNAPSHOT` is specified in the property `scalar.db.consensus_commit.isolation_level`, this is ignored. -scalar.db.consensus_commit.serializable_strategy= -``` - -You can set some sensitive data (e.g., credentials) as the values of properties using environment variables. - -```properties -scalar.db.username=${env:SCALAR_DB_USERNAME} -scalar.db.password=${env:SCALAR_DB_PASSWORD} -``` - -For details about transaction manager configurations, see [ScalarDB Configurations](configurations.md). - -## Start ScalarDB Server - -### Docker images - -For Docker images, you need to pull the ScalarDB Server image first: -```shell -$ docker pull ghcr.io/scalar-labs/scalardb-server: -``` - -And then, you can start ScalarDB Server with the following command: -```shell -$ docker run -v :/scalardb/server/database.properties -d -p 60051:60051 -p 8080:8080 ghcr.io/scalar-labs/scalardb-server: -``` - -You can also start it with DEBUG logging as follows: -```shell -$ docker run -v :/scalardb/server/database.properties -e SCALAR_DB_LOG_LEVEL=DEBUG -d -p 60051:60051 -p 8080:8080 ghcr.io/scalar-labs/scalardb-server: -```` - -You can also start it with your custom log configuration as follows: -```shell -$ docker run -v :/scalardb/server/database.properties -v :/scalardb/server/log4j2.properties -d -p 60051:60051 -p 8080:8080 ghcr.io/scalar-labs/scalardb-server: -``` - -You can also start it with environment variables as follows: -```shell -$ docker run --env SCALAR_DB_CONTACT_POINTS=cassandra --env SCALAR_DB_CONTACT_PORT=9042 --env SCALAR_DB_USERNAME=cassandra --env SCALAR_DB_PASSWORD=cassandra --env SCALAR_DB_STORAGE=cassandra -d -p 60051:60051 -p 8080:8080 ghcr.io/scalar-labs/scalardb-server: -``` - -You can also start it with JMX as follows: -```shell -$ docker run -v :/scalardb/server/database.properties -e JAVA_OPTS="-Dlog4j.configurationFile=file:log4j2.properties -Djava.rmi.server.hostname= -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=9990 -Dcom.sun.management.jmxremote.rmi.port=9990 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false" -d -p 60051:60051 -p 8080:8080 -p 9990:9990 ghcr.io/scalar-labs/scalardb-server: -``` - -### Zip archives - -For zip archives, you can start ScalarDB Server with the following commands: - -```shell -$ unzip scalardb-server-.zip -$ cd scalardb-server- -$ export JAVA_OPTS="" -$ bin/scalardb-server --config -``` - -## Usage of the Java client of ScalarDB Server - -You can use the Java client of ScalarDB Server in almost the same way as other storages/databases. -The difference is that you need to set `scalar.db.transaction_manager` to `grpc` in your client side property file. - -```properties -# Transaction manager implementation. -scalar.db.transaction_manager=grpc - -# Comma-separated contact points. -scalar.db.contact_points= - -# Port number for all the contact points. -scalar.db.contact_port=60051 - -# The deadline duration for gRPC connections. The default is `60000` milliseconds (60 seconds). 
-scalar.db.grpc.deadline_duration_millis=60000 - -# The maximum message size allowed for a single gRPC frame. If not specified, use the gRPC default value. -scalar.db.grpc.max_inbound_message_size= - -# The maximum size of metadata allowed to be received. If not specified, use the gRPC default value. -scalar.db.grpc.max_inbound_metadata_size= -``` - -## Further reading - -Please see the following sample to learn ScalarDB Server further: - -- [ScalarDB Server Sample](https://github.com/scalar-labs/scalardb-samples/tree/main/scalardb-server-sample) - -Please also see the following documents to learn how to deploy ScalarDB Server: - -- [Deploy ScalarDB Server on AWS](https://github.com/scalar-labs/scalar-kubernetes/blob/master/docs/ManualDeploymentGuideScalarDBServerOnEKS.md) -- [Deploy ScalarDB Server on Azure](https://github.com/scalar-labs/scalar-kubernetes/blob/master/docs/ManualDeploymentGuideScalarDBServerOnAKS.md) diff --git a/docs/scalardb-supported-databases.md b/docs/scalardb-supported-databases.md deleted file mode 100644 index 36c0817f06..0000000000 --- a/docs/scalardb-supported-databases.md +++ /dev/null @@ -1,159 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Supported Databases - -ScalarDB supports the following databases and their versions. - -## Amazon DynamoDB - -| Version | DynamoDB | -|:------------------|:----------| -| **ScalarDB 3.12** | ✅ | -| **ScalarDB 3.11** | ✅ | -| **ScalarDB 3.10** | ✅ | -| **ScalarDB 3.9** | ✅ | -| **ScalarDB 3.8** | ✅ | -| **ScalarDB 3.7** | ✅ | - -## Apache Cassandra - -{% capture notice--info %} -**Note** - -For requirements when using Cassandra or Cassandra-compatible databases, see [How to configure databases to achieve the general requirements](requirements.md#how-to-configure-databases-to-achieve-the-general-requirements). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -| Version | Cassandra 4.1 | Cassandra 4.0 | Cassandra 3.11 | Cassandra 3.0 | -|:------------------|:---------------|:---------------|:----------------|:---------------| -| **ScalarDB 3.12** | ❌ | ❌ | ✅ | ✅ | -| **ScalarDB 3.11** | ❌ | ❌ | ✅ | ✅ | -| **ScalarDB 3.10** | ❌ | ❌ | ✅ | ✅ | -| **ScalarDB 3.9** | ❌ | ❌ | ✅ | ✅ | -| **ScalarDB 3.8** | ❌ | ❌ | ✅ | ✅ | -| **ScalarDB 3.7** | ❌ | ❌ | ✅ | ✅ | - -## Azure Cosmos DB for NoSQL - -| Version | Cosmos DB for NoSQL | -|:------------------|:---------------------| -| **ScalarDB 3.12** | ✅ | -| **ScalarDB 3.11** | ✅ | -| **ScalarDB 3.10** | ✅ | -| **ScalarDB 3.9** | ✅ | -| **ScalarDB 3.8** | ✅ | -| **ScalarDB 3.7** | ✅ | - -## JDBC databases - -{% capture notice--info %} -**Note** - -For recommendations when using JDBC databases, see [Recommendations](requirements.md#recommendations). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Amazon Aurora MySQL - -| Version | Aurora MySQL 3 | Aurora MySQL 2 | -|:------------------|:----------------|:----------------| -| **ScalarDB 3.12** | ✅ | ✅ | -| **ScalarDB 3.11** | ✅ | ✅ | -| **ScalarDB 3.10** | ✅ | ✅ | -| **ScalarDB 3.9** | ✅ | ✅ | -| **ScalarDB 3.8** | ✅ | ✅ | -| **ScalarDB 3.7** | ✅ | ✅ | - -### Amazon Aurora PostgreSQL - -| Version | Aurora PostgreSQL 15 | Aurora PostgreSQL 14 | Aurora PostgreSQL 13 | Aurora PostgreSQL 12 | -|:------------------|:----------------------|:----------------------|:----------------------|:----------------------| -| **ScalarDB 3.12** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.11** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.10** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.9** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.8** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.7** | ✅ | ✅ | ✅ | ✅ | - -### MariaDB - -| Version | MariaDB 10.11 | -|:------------------|:--------------| -| **ScalarDB 3.12** | ✅ | -| **ScalarDB 3.11** | ✅ | -| **ScalarDB 3.10** | ✅ | -| **ScalarDB 3.9** | ✅ | -| **ScalarDB 3.8** | ✅ | -| **ScalarDB 3.7** | ✅ | - -### Microsoft SQL Server - -| Version | SQL Server 2022 | SQL Server 2019 | SQL Server 2017 | -|:------------------|:-----------------|:-----------------|:-----------------| -| **ScalarDB 3.12** | ✅ | ✅ | ✅ | -| **ScalarDB 3.11** | ✅ | ✅ | ✅ | -| **ScalarDB 3.10** | ✅ | ✅ | ✅ | -| **ScalarDB 3.9** | ✅ | ✅ | ✅ | -| **ScalarDB 3.8** | ✅ | ✅ | ✅ | -| **ScalarDB 3.7** | ✅ | ✅ | ✅ | - -### MySQL - -| Version | MySQL 8.1 | MySQL 8.0 | MySQL 5.7 | -|:------------------|:-----------|:-----------|:-----------| -| **ScalarDB 3.12** | ✅ | ✅ | ✅ | -| **ScalarDB 3.11** | ✅ | ✅ | ✅ | -| **ScalarDB 3.10** | ✅ | ✅ | ✅ | -| **ScalarDB 3.9** | ✅ | ✅ | ✅ | -| **ScalarDB 3.8** | ✅ | ✅ | ✅ | -| **ScalarDB 3.7** | ✅ | ✅ | ✅ | - -### Oracle - -| Version | Oracle 23.2.0-free | Oracle 21.3.0-xe | Oracle 18.4.0-xe | -|:------------------|:--------------------|:------------------|:------------------| -| **ScalarDB 3.12** | ✅ | ✅ | ✅ | -| **ScalarDB 3.11** | ✅ | ✅ | ✅ | -| **ScalarDB 3.10** | ✅ | ✅ | ✅ | -| **ScalarDB 3.9** | ✅ | ✅ | ✅ | -| **ScalarDB 3.8** | ✅ | ✅ | ✅ | -| **ScalarDB 3.7** | ✅ | ✅ | ✅ | - -### PostgreSQL - -| Version | PostgreSQL 15 | PostgreSQL 14 | PostgreSQL 13 | PostgreSQL 12 | -|:------------------|:---------------|:---------------|:---------------|:---------------| -| **ScalarDB 3.12** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.11** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.10** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.9** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.8** | ✅ | ✅ | ✅ | ✅ | -| **ScalarDB 3.7** | ✅ | ✅ | ✅ | ✅ | - -### SQLite - -| Version | SQLite 3 | -|:------------------|:----------| -| **ScalarDB 3.12** | ✅ | -| **ScalarDB 3.11** | ✅ | -| **ScalarDB 3.10** | ✅ | -| **ScalarDB 3.9** | ✅ | -| **ScalarDB 3.8** | ❌ | -| **ScalarDB 3.7** | ❌ | - -### YugabyteDB - -| Version | YugabyteDB 2 | -|:------------------|:-------------| -| **ScalarDB 3.12** | ❌ | -| **ScalarDB 3.11** | ❌ | -| **ScalarDB 3.10** | ❌ | -| **ScalarDB 3.9** | ❌ | -| **ScalarDB 3.8** | ❌ | -| **ScalarDB 3.7** | ❌ | diff --git a/docs/schema-loader-import.md b/docs/schema-loader-import.md deleted file mode 100644 index b5e2ea393b..0000000000 --- a/docs/schema-loader-import.md +++ /dev/null @@ -1,285 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. 
-> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Importing Existing Tables to ScalarDB by Using ScalarDB Schema Loader - -You might want to use ScalarDB (e.g., for database-spanning transactions) with your existing databases. In that case, you can import those databases under the ScalarDB control using ScalarDB Schema Loader. ScalarDB Schema Loader automatically adds ScalarDB-internal metadata columns in each existing table and metadata tables to enable various ScalarDB functionalities including transaction management across multiple databases. - -## Before you begin - -{% capture notice--warning %} -**Attention** - -You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and the ScalarDB metadata tables. In this case, there would also be several differences between your database and ScalarDB, as well as some limitations. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -### What will be added to your databases - -- **ScalarDB metadata tables:** ScalarDB manages namespace names and table metadata in a namespace (schema or database in underlying databases) called 'scalardb'. -- **Transaction metadata columns:** The Consensus Commit transaction manager requires metadata (for example, transaction ID, record version, and transaction status) stored along with the actual records to handle transactions properly. Thus, this tool adds the metadata columns if you use the Consensus Commit transaction manager. - -{% capture notice--info %} -**Note** - -This tool only changes database metadata. Thus, the processing time does not increase in proportion to the database size and usually takes only several seconds. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Requirements - -- [JDBC databases](./scalardb-supported-databases.md#jdbc-databases), except for SQLite, can be imported. -- Each table must have primary key columns. (Composite primary keys can be available.) -- Target tables must only have columns with supported data types. For details, see [Data-type mapping from JDBC databases to ScalarDB](#data-type-mapping-from-jdbc-databases-to-scalardb)). - -### Set up Schema Loader - -To set up Schema Loader for importing existing tables, see [Set up Schema Loader](./schema-loader.md#set-up-schema-loader). - -## Run Schema Loader for importing existing tables - -You can import an existing table in JDBC databases to ScalarDB by using the `--import` option and an import-specific schema file. To import tables, run the following command, replacing the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config -f --import -``` - -- ``: Version of ScalarDB Schema Loader that you set up. -- ``: Path to a properties file for ScalarDB. For a sample properties file, see [`database.properties`](https://github.com/scalar-labs/scalardb/blob/master/conf/database.properties). -- ``: Path to an import schema file. For a sample, see [Sample import schema file](#sample-import-schema-file). - -If you use the Consensus Commit transaction manager after importing existing tables, run the following command separately, replacing the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config --coordinator -``` - -## Sample import schema file - -The following is a sample schema for importing tables. For the sample schema file, see [`import_schema_sample.json`](https://github.com/scalar-labs/scalardb/blob/master/schema-loader/sample/import_schema_sample.json). - -```json -{ - "sample_namespace1.sample_table1": { - "transaction": true - }, - "sample_namespace1.sample_table2": { - "transaction": true - }, - "sample_namespace2.sample_table3": { - "transaction": false - } -} -``` - -The import table schema consists of a namespace name, a table name, and a `transaction` field. The `transaction` field indicates whether the table will be imported for transactions or not. If you set the `transaction` field to `true` or don't specify the `transaction` field, this tool creates a table with transaction metadata if needed. If you set the `transaction` field to `false`, this tool imports a table without adding transaction metadata (that is, for a table using the [Storage API](storage-abstraction.md)). - -## Data-type mapping from JDBC databases to ScalarDB - -The following table shows the supported data types in each JDBC database and their mapping to the ScalarDB data types. Select your database and check if your existing tables can be imported. - -
-
- - - - -
- -
- -| MySQL | ScalarDB | Notes | -|--------------|----------|-----------------------| -| bigint | BIGINT | [*1](#warn-data-size) | -| binary | BLOB | | -| bit | BOOLEAN | | -| blob | BLOB | [*2](#warn-data-size) | -| char | TEXT | [*2](#warn-data-size) | -| double | DOUBLE | | -| float | FLOAT | | -| int | INT | | -| int unsigned | BIGINT | [*2](#warn-data-size) | -| integer | INT | | -| longblob | BLOB | | -| longtext | TEXT | | -| mediumblob | BLOB | [*2](#warn-data-size) | -| mediumint | INT | [*2](#warn-data-size) | -| mediumtext | TEXT | [*2](#warn-data-size) | -| smallint | INT | [*2](#warn-data-size) | -| text | TEXT | [*2](#warn-data-size) | -| tinyblob | BLOB | [*2](#warn-data-size) | -| tinyint | INT | [*2](#warn-data-size) | -| tinyint(1) | BOOLEAN | | -| tinytext | TEXT | [*2](#warn-data-size) | -| varbinary | BLOB | [*2](#warn-data-size) | -| varchar | TEXT | [*2](#warn-data-size) | - -Data types not listed in the above are not supported. The following are some common data types that are not supported: - -- bigint unsigned -- bit(n) (n > 1) -- date -- datetime -- decimal -- enum -- geometry -- json -- numeric -- set -- time -- timestamp -- year - -
- -
- -| PostgreSQL/YugabyteDB | ScalarDB | Notes | -|-----------------------|----------|-----------------------| -| bigint | BIGINT | [*1](#warn-data-size) | -| boolean | BOOLEAN | | -| bytea | BLOB | | -| character | TEXT | [*2](#warn-data-size) | -| character varying | TEXT | [*2](#warn-data-size) | -| double precision | DOUBLE | | -| integer | INT | | -| real | FLOAT | | -| smallint | INT | [*2](#warn-data-size) | -| text | TEXT | | - -Data types not listed in the above are not supported. The following are some common data types that are not supported: - -- bigserial -- bit -- box -- cidr -- circle -- date -- inet -- interval -- json -- jsonb -- line -- lseg -- macaddr -- macaddr8 -- money -- numeric -- path -- pg_lsn -- pg_snapshot -- point -- polygon -- smallserial -- serial -- time -- timestamp -- tsquery -- tsvector -- txid_snapshot -- uuid -- xml - -
- -
- -| Oracle | ScalarDB | Notes | -|---------------|-----------------|-----------------------| -| binary_double | DOUBLE | | -| binary_float | FLOAT | | -| blob | BLOB | [*3](#warn-data-size) | -| char | TEXT | [*2](#warn-data-size) | -| clob | TEXT | | -| float | DOUBLE | [*4](#warn-data-size) | -| long | TEXT | | -| long raw | BLOB | | -| nchar | TEXT | [*2](#warn-data-size) | -| nclob | TEXT | | -| number | BIGINT / DOUBLE | [*5](#warn-data-size) | -| nvarchar2 | TEXT | [*2](#warn-data-size) | -| raw | BLOB | [*2](#warn-data-size) | -| varchar2 | TEXT | [*2](#warn-data-size) | - -Data types not listed in the above are not supported. The following are some common data types that are not supported: - -- date -- timestamp -- interval -- rowid -- urowid -- bfile -- json - -
- -
- -| SQL Server | ScalarDB | Notes | -|------------|----------|-----------------------| -| bigint | BIGINT | [*1](#warn-data-size) | -| binary | BLOB | [*2](#warn-data-size) | -| bit | BOOLEAN | | -| char | TEXT | [*2](#warn-data-size) | -| float | DOUBLE | | -| image | BLOB | | -| int | INT | | -| nchar | TEXT | [*2](#warn-data-size) | -| ntext | TEXT | | -| nvarchar | TEXT | [*2](#warn-data-size) | -| real | FLOAT | | -| smallint | INT | [*2](#warn-data-size) | -| text | TEXT | | -| tinyint | INT | [*2](#warn-data-size) | -| varbinary | BLOB | [*2](#warn-data-size) | -| varchar | TEXT | [*2](#warn-data-size) | - -Data types not listed in the above are not supported. The following are some common data types that are not supported: - -- cursor -- date -- datetime -- datetime2 -- datetimeoffset -- decimal -- geography -- geometry -- hierarchyid -- money -- numeric -- rowversion -- smalldatetime -- smallmoney -- sql_variant -- time -- uniqueidentifier -- xml - -
- -
- -{% capture notice--warning %} -**Attention** - -1. The value range of `BIGINT` in ScalarDB is from -2^53 to 2^53, regardless of the size of `bigint` in the underlying database. Thus, if data outside this range exists in the imported table, ScalarDB cannot read it. -2. For certain data types noted above, ScalarDB may map a data type larger than that of the underlying database. In that case, you will see errors when putting a value that is larger than the size specified in the underlying database. -3. The maximum size of `BLOB` in ScalarDB is about 2GB (precisely 2^31-1 bytes). In contrast, an Oracle `blob` can hold up to (4GB-1)*(number of blocks). Thus, if data larger than 2GB exists in the imported table, ScalarDB cannot read it. -4. ScalarDB does not support Oracle `float` columns that have a higher precision than `DOUBLE` in ScalarDB. -5. ScalarDB does not support Oracle `numeric(p, s)` columns (`p` is precision and `s` is scale) when `p` is larger than 15 due to the maximum size of the data type in ScalarDB. Note that ScalarDB maps the column to `BIGINT` if `s` is zero; otherwise, ScalarDB maps the column to `DOUBLE`. In the latter case, be aware that rounding up or rounding off can happen in the underlying database since the floating-point value will be cast to a fixed-point value. - -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -## Use import function in your application - -You can use the import function in your application by using the following interfaces: - -- [ScalarDB Admin API](./api-guide.md#import-a-table) -- [ScalarDB Schema Loader API](./schema-loader.md#use-schema-loader-in-your-application) diff --git a/docs/schema-loader.md b/docs/schema-loader.md deleted file mode 100644 index 48a4ed1879..0000000000 --- a/docs/schema-loader.md +++ /dev/null @@ -1,792 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# ScalarDB Schema Loader - -ScalarDB has its own data model and schema that maps to the implementation-specific data model and schema. In addition, ScalarDB stores internal metadata, such as transaction IDs, record versions, and transaction statuses, to manage transaction logs and statuses when you use the Consensus Commit transaction manager. - -Since managing the schema mapping and metadata for transactions can be difficult, you can use ScalarDB Schema Loader, which is a tool to create schemas that doesn't require you to need in-depth knowledge about schema mapping or metadata. - -You have two options to specify general CLI options in Schema Loader: - -- Pass the ScalarDB properties file and database-specific or storage-specific options. -- Pass database-specific or storage-specific options without the ScalarDB properties file. (Deprecated) - -{% capture notice--info %} -**Note** - -This tool supports only basic options to create, delete, repair, or alter a table. If you want to use the advanced features of a database, you must alter your tables with a database-specific tool after creating the tables with this tool. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -## Set up Schema Loader - -Select your preferred method to set up Schema Loader, and follow the instructions. - -
-
- - -
- -
- -You can download the release versions of Schema Loader from the [ScalarDB Releases](https://github.com/scalar-labs/scalardb/releases) page. -
-
- -You can pull the Docker image from the [Scalar container registry](https://github.com/orgs/scalar-labs/packages/container/package/scalardb-schema-loader) by running the following command, replacing the contents in the angle brackets as described: - -```console -$ docker run --rm -v : [-v :] ghcr.io/scalar-labs/scalardb-schema-loader: -``` - -{% capture notice--info %} -**Note** - -You can specify the same command arguments even if you use the fat JAR or the container. In the [Available commands](#available-commands) section, the JAR is used, but you can run the commands by using the container in the same way by replacing `java -jar scalardb-schema-loader-.jar` with `docker run --rm -v : [-v :] ghcr.io/scalar-labs/scalardb-schema-loader:`. -{% endcapture %} - -
{{ notice--info | markdownify }}
-
-
- -## Run Schema Loader - -This section explains how to run Schema Loader. - -### Available commands - -Select how you would like to configure Schema Loader for your database. The preferred method is to use the properties file since other, database-specific methods are deprecated. - -The following commands are available when using the properties file: - -```console -Usage: java -jar scalardb-schema-loader-.jar [-D] [--coordinator] - [--no-backup] [--no-scaling] -c= - [--compaction-strategy=] [-f=] - [--replication-factor=] - [--replication-strategy=] [--ru=] -Create/Delete schemas in the storage defined in the config file - -A, --alter Alter tables : it will add new columns and create/delete - secondary index for existing tables. It compares the - provided table schema to the existing schema to decide - which columns need to be added and which indexes need - to be created or deleted - -c, --config= - Path to the config file of ScalarDB - --compaction-strategy= - The compaction strategy, must be LCS, STCS or TWCS - (supported in Cassandra) - --coordinator Create/delete/repair Coordinator tables - -D, --delete-all Delete tables - -f, --schema-file= - -I, --import Import tables : it will import existing non-ScalarDB - tables to ScalarDB. - Path to the schema json file - --no-backup Disable continuous backup (supported in DynamoDB) - --no-scaling Disable auto-scaling (supported in DynamoDB, Cosmos DB) - --repair-all Repair namespaces and tables that are in an unknown - state: it re-creates namespaces, tables, secondary - indexes, and their metadata if necessary. - --replication-factor= - The replication factor (supported in Cassandra) - --replication-strategy= - The replication strategy, must be SimpleStrategy or - NetworkTopologyStrategy (supported in Cassandra) - --ru= Base resource unit (supported in DynamoDB, Cosmos DB) - --upgrade Upgrades the ScalarDB environment to support the latest - version of the ScalarDB API. Typically, as indicated in - the release notes, you will need to run this command - after updating the ScalarDB version that your - application environment uses. - -``` - -For a sample properties file, see [`database.properties`](https://github.com/scalar-labs/scalardb/blob/master/conf/database.properties). - -{% capture notice--info %} -**Note** - -The following database-specific methods have been deprecated. Please use the [commands for configuring the properties file](#available-commands) instead. - -
-
- - - - -
- -
- -```console -Usage: java -jar scalardb-schema-loader-.jar --cassandra [-D] - [-c=] -f= -h= - [-n=] [-p=] [-P=] - [-R=] [-u=] -Create/Delete Cassandra schemas - -A, --alter Alter tables : it will add new columns and create/delete - secondary index for existing tables. It compares the - provided table schema to the existing schema to decide - which columns need to be added and which indexes need - to be created or deleted - -c, --compaction-strategy= - Cassandra compaction strategy, must be LCS, STCS or TWCS - -D, --delete-all Delete tables - -f, --schema-file= - Path to the schema json file - -h, --host= Cassandra host IP - -n, --network-strategy= - Cassandra network strategy, must be SimpleStrategy or - NetworkTopologyStrategy - -p, --password= - Cassandra password - -P, --port= Cassandra Port - -R, --replication-factor= - Cassandra replication factor - --repair-all Repair tables : it repairs the table metadata of - existing tables - -u, --user= Cassandra user -``` -
-
- -```console -Usage: java -jar scalardb-schema-loader-.jar --cosmos [-D] - [--no-scaling] -f= -h= -p= [-r=] -Create/Delete Cosmos DB schemas - -A, --alter Alter tables : it will add new columns and create/delete - secondary index for existing tables. It compares the - provided table schema to the existing schema to decide - which columns need to be added and which indexes need - to be created or deleted - -D, --delete-all Delete tables - -f, --schema-file= - Path to the schema json file - -h, --host= Cosmos DB account URI - --no-scaling Disable auto-scaling for Cosmos DB - -p, --password= Cosmos DB key - -r, --ru= Base resource unit - --repair-all Repair tables : it repairs the table metadata of - existing tables and repairs stored procedure - attached to each table -``` -
-
- -```console -Usage: java -jar scalardb-schema-loader-.jar --dynamo [-D] - [--no-backup] [--no-scaling] [--endpoint-override=] - -f= -p= [-r=] --region= - -u= -Create/Delete DynamoDB schemas - -A, --alter Alter tables : it will add new columns and create/delete - secondary index for existing tables. It compares the - provided table schema to the existing schema to decide - which columns need to be added and which indexes need - to be created or deleted - -D, --delete-all Delete tables - --endpoint-override= - Endpoint with which the DynamoDB SDK should - communicate - -f, --schema-file= - Path to the schema json file - --no-backup Disable continuous backup for DynamoDB - --no-scaling Disable auto-scaling for DynamoDB - -p, --password= AWS access secret key - -r, --ru= Base resource unit - --region= AWS region - --repair-all Repair tables : it repairs the table metadata of - existing tables - -u, --user= AWS access key ID -``` -
-
- -```console -Usage: java -jar scalardb-schema-loader-.jar --jdbc [-D] - -f= -j= -p= -u= -Create/Delete JDBC schemas - -A, --alter Alter tables : it will add new columns and create/delete - secondary index for existing tables. It compares the - provided table schema to the existing schema to decide - which columns need to be added and which indexes need - to be created or deleted - -D, --delete-all Delete tables - -f, --schema-file= - Path to the schema json file - -j, --jdbc-url= JDBC URL - -p, --password= - JDBC password - --repair-all Repair tables : it repairs the table metadata of - existing tables - -u, --user= JDBC user -``` -
-
-{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Create namespaces and tables - -To create namespaces and tables by using a properties file, run the following command, replacing the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config -f [--coordinator] -``` - -If `--coordinator` is specified, a [Coordinator table](api-guide.md#specify-operations-for-the-coordinator-table) will be created. - -{% capture notice--info %} -**Note** - -The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead. - -
-
- - - - -
- -
- -```console -$ java -jar scalardb-schema-loader-.jar --cassandra -h [-P ] [-u ] [-p ] -f [-n ] [-R ] -``` - -- If `-P ` is not supplied, it defaults to `9042`. -- If `-u ` is not supplied, it defaults to `cassandra`. -- If `-p ` is not supplied, it defaults to `cassandra`. -- `` should be `SimpleStrategy` or `NetworkTopologyStrategy` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --cosmos -h -p -f [-r BASE_RESOURCE_UNIT] -``` - -- `` you can use a primary key or a secondary key. -- `-r BASE_RESOURCE_UNIT` is an option. You can specify the RU of each database. The maximum RU in tables in the database will be set. If you don't specify RU of tables, the database RU will be set with this option. By default, it's 400. -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --dynamo -u -p --region -f [-r BASE_RESOURCE_UNIT] -``` - -- `` should be a string to specify an AWS region like `ap-northeast-1`. -- `-r` option is almost the same as Cosmos DB for NoSQL option. However, the unit means DynamoDB capacity unit. The read and write capacity units are set the same value. -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --jdbc -j -u -p -f -``` -
-
-{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Alter tables - -You can use a command to add new columns to and create or delete a secondary index for existing tables. This command compares the provided table schema to the existing schema to decide which columns need to be added and which indexes need to be created or deleted. - -To add new colums to and create or delete a secondary index for existing tables, run the following command, replacing the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config -f --alter -``` - -{% capture notice--info %} -**Note** - -The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead. - -
-
- - - - -
- -
- -```console -$ java -jar scalardb-schema-loader-.jar --cassandra -h [-P ] [-u ] [-p ] -f --alter -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --cosmos -h -p -f --alter -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --dynamo -u -p --region -f --alter -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --jdbc -j -u -p -f --alter -``` -
-
-{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Delete tables - -You can delete tables by using the properties file. To delete tables, run the following command, replacing the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config -f [--coordinator] -D -``` - -If `--coordinator` is specified, the Coordinator table will be deleted as well. - -{% capture notice--info %} -**Note** - -The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead. - -
-
- - - - -
- -
- -```console -$ java -jar scalardb-schema-loader-.jar --cassandra -h [-P ] [-u ] [-p ] -f -D -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --cosmos -h -p -f -D -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --dynamo -u -p --region -f -D -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --jdbc -j -u -p -f -D -``` -
-
-{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Repair namespaces and tables - -You can repair namespaces and tables by using the properties file. The reason for repairing namespaces and tables is because they can be in an unknown state, such as a namespace or table exists in the underlying storage but not its ScalarDB metadata or vice versa. Repairing the namespaces, the tables, the secondary indexes, and their metadata requires re-creating them if necessary. To repair them, run the following command, replacing the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config -f [--coordinator] --repair-all -``` - -If `--coordinator` is specified, the Coordinator table will be repaired as well. In addition, if you're using Cosmos DB for NoSQL, running this command will also repair stored procedures attached to each table. - -{% capture notice--info %} -**Note** - -The following database-specific CLI arguments have been deprecated. Please use the CLI arguments for configuring the properties file instead. - -
-
- - - - -
- -
- -```console -$ java -jar scalardb-schema-loader-.jar --cassandra -h [-P ] [-u ] [-p ] -f --repair-all -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --cosmos -h -p -f --repair-all -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --dynamo -u -p --region [--no-backup] -f --repair-all -``` -
-
- -```console -$ java -jar scalardb-schema-loader-.jar --jdbc -j -u -p -f --repair-all -``` -
-
-{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Import tables - -You can import an existing table in JDBC databases to ScalarDB by using the `--import` option and an import-specific schema file. For details, see [Importing Existing Tables to ScalarDB by Using ScalarDB Schema Loader](./schema-loader-import.md). - -### Upgrade the environment to support the latest ScalarDB API - -You can upgrade the ScalarDB environment to support the latest version of the ScalarDB API. Typically, as indicated in the release notes, you will need to run this command after updating the ScalarDB version that your application environment uses. When running the following command, be sure to replace the contents in the angle brackets as described: - -```console -$ java -jar scalardb-schema-loader-.jar --config --upgrade -``` - -### Sample schema file - -The following is a sample schema. For a sample schema file, see [`schema_sample.json`](https://github.com/scalar-labs/scalardb/blob/master/schema-loader/sample/schema_sample.json). - -```json -{ - "sample_db.sample_table": { - "transaction": false, - "partition-key": [ - "c1" - ], - "clustering-key": [ - "c4 ASC", - "c6 DESC" - ], - "columns": { - "c1": "INT", - "c2": "TEXT", - "c3": "BLOB", - "c4": "INT", - "c5": "BOOLEAN", - "c6": "INT" - }, - "secondary-index": [ - "c2", - "c4" - ] - }, - - "sample_db.sample_table1": { - "transaction": true, - "partition-key": [ - "c1" - ], - "clustering-key": [ - "c4" - ], - "columns": { - "c1": "INT", - "c2": "TEXT", - "c3": "INT", - "c4": "INT", - "c5": "BOOLEAN" - } - }, - - "sample_db.sample_table2": { - "transaction": false, - "partition-key": [ - "c1" - ], - "clustering-key": [ - "c4", - "c3" - ], - "columns": { - "c1": "INT", - "c2": "TEXT", - "c3": "INT", - "c4": "INT", - "c5": "BOOLEAN" - } - } -} -``` - -The schema has table definitions that include `columns`, `partition-key`, `clustering-key`, `secondary-index`, and `transaction` fields. - -- The `columns` field defines columns of the table and their data types. -- The `partition-key` field defines which columns the partition key is composed of. -- The `clustering-key` field defines which columns the clustering key is composed of. -- The `secondary-index` field defines which columns are indexed. -- The `transaction` field indicates whether the table is for transactions or not. - - If you set the `transaction` field to `true` or don't specify the `transaction` field, this tool creates a table with transaction metadata if needed. - - If you set the `transaction` field to `false`, this tool creates a table without any transaction metadata (that is, for a table with [Storage API](storage-abstraction.md)). - -You can also specify database or storage-specific options in the table definition as follows: - -```json -{ - "sample_db.sample_table3": { - "partition-key": [ - "c1" - ], - "columns": { - "c1": "INT", - "c2": "TEXT", - "c3": "BLOB" - }, - "compaction-strategy": "LCS", - "ru": 5000 - } -} -``` - -The database or storage-specific options you can specify are as follows: - -
-
- - - - -
- -
- -The `compaction-strategy` option is the compaction strategy used. This option should be `STCS` (SizeTieredCompaction), `LCS` (LeveledCompactionStrategy), or `TWCS` (TimeWindowCompactionStrategy). -
-
- -The `ru` option stands for Request Units. For details, see [RUs](#rus). -
-
- -The `ru` option stands for Request Units. For details, see [RUs](#rus). -
-
- -No options are available for JDBC databases. -
-
- -## Scale for performance when using Cosmos DB for NoSQL or DynamoDB - -When using Cosmos DB for NoSQL or DynamoDB, you can scale by using Request Units (RUs) or auto-scaling. - -### RUs - -You can scale the throughput of Cosmos DB for NoSQL and DynamoDB by specifying the `--ru` option. When specifying this option, scaling applies to all tables or the `ru` parameter for each table. - -If the `--ru` option is not set, the default values will be `400` for Cosmos DB for NoSQL and `10` for DynamoDB. - -{% capture notice--info %} -**Note** - -- Schema Loader abstracts [Request Units](https://docs.microsoft.com/azure/cosmos-db/request-units) for Cosmos DB for NoSQL and [Capacity Units](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html#HowItWorks.ProvisionedThroughput.Manual) for DynamoDB with `RU`. Therefore, be sure to set an appropriate value depending on the database implementation. -- Be aware that Schema Loader sets the same value to both read capacity unit and write capacity unit for DynamoDB. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -### Auto-scaling - -By default, Schema Loader enables auto-scaling of RUs for all tables: RUs scale between 10 percent and 100 percent of a specified RU depending on the workload. For example, if you specify `-r 10000`, the RUs of each table auto-scales between `1000` and `10000`. - -{% capture notice--info %} -**Note** - -Auto-scaling for Cosmos DB for NoSQL is enabled only when this option is set to `4000` or more. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -## Data-type mapping between ScalarDB and other databases - -The following table shows the supported data types in ScalarDB and their mapping to the data types of other databases. - -| ScalarDB | Cassandra | Cosmos DB for NoSQL | DynamoDB | MySQL | PostgreSQL/YugabyteDB | Oracle | SQL Server | SQLite | -|-----------|-----------|---------------------|----------|----------|-----------------------|----------------|-----------------|---------| -| BOOLEAN | boolean | boolean (JSON) | BOOL | boolean | boolean | number(1) | bit | boolean | -| INT | int | number (JSON) | N | int | int | number(10) | int | int | -| BIGINT | bigint | number (JSON) | N | bigint | bigint | number(19) | bigint | bigint | -| FLOAT | float | number (JSON) | N | real | real | binary_float | float(24) | float | -| DOUBLE | double | number (JSON) | N | double | double precision | binary_double | float | double | -| TEXT | text | string (JSON) | S | longtext | text | varchar2(4000) | varchar(8000) | text | -| BLOB | blob | string (JSON) | B | longblob | bytea | RAW(2000) | varbinary(8000) | blob | - -However, the following data types in JDBC databases are converted differently when they are used as a primary key or a secondary index key. This is due to the limitations of RDB data types. - -| ScalarDB | MySQL | PostgreSQL/YugabyteDB | Oracle | -|----------|---------------|-----------------------|--------------| -| TEXT | VARCHAR(64) | VARCHAR(10485760) | VARCHAR2(64) | -| BLOB | VARBINARY(64) | | RAW(64) | - -The value range of `BIGINT` in ScalarDB is from -2^53 to 2^53, regardless of the underlying database. - -{% capture notice--info %} -**Note** - -YugabyteDB has limitations that prevent floating point types (FLOAT and DOUBLE) from functioning correctly as a primary key, clustering keys, or secondary index keys. -{% endcapture %} - -
{{ notice--info | markdownify }}
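For reference, the following sketch spells out the `BIGINT` bounds stated above as Java constants; values outside this range should not be stored in a `BIGINT` column, regardless of the underlying database.

```java
// The BIGINT range stated above, expressed as long constants.
long maxBigint = 9007199254740992L;  // 2^53
long minBigint = -9007199254740992L; // -(2^53)
```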
- -If this data-type mapping doesn't match your application, please alter the tables to change the data types after creating them by using this tool. - -## Internal metadata for Consensus Commit - -The Consensus Commit transaction manager manages metadata (for example, transaction ID, record version, and transaction status) stored along with the actual records to handle transactions properly. - -Thus, along with any columns that the application requires, additional columns for the metadata need to be defined in the schema. Additionally, this tool creates a table with the metadata if you use the Consensus Commit transaction manager. - -## Use Schema Loader in your application - -You can check the version of Schema Loader from the [Maven Central Repository](https://mvnrepository.com/artifact/com.scalar-labs/scalardb-schema-loader). For example in Gradle, you can add the following dependency to your `build.gradle` file, replacing `` with the version of Schema Loader that you want to use: - -```gradle -dependencies { - implementation 'com.scalar-labs:scalardb-schema-loader:' -} -``` - -### Sample usage - -By using the `SchemaLoader` class, you can execute the same commands as the CLI: - -- To create, alter, delete, or repair tables that are defined in the schema, you can pass a ScalarDB properties file, schema, and additional options, if needed. -- To upgrade the environment, you can pass a ScalarDB properties and additional options, if needed. - -```java -public class SchemaLoaderSample { - public static int main(String... args) throws SchemaLoaderException { - Path configFilePath = Paths.get("database.properties"); - // "sample_schema.json" and "altered_sample_schema.json" can be found in the "/sample" directory. - Path schemaFilePath = Paths.get("sample_schema.json"); - Path alteredSchemaFilePath = Paths.get("altered_sample_schema.json"); - boolean createCoordinatorTables = true; // whether to create the Coordinator table or not - boolean deleteCoordinatorTables = true; // whether to delete the Coordinator table or not - boolean repairCoordinatorTables = true; // whether to repair the Coordinator table or not - - Map tableCreationOptions = new HashMap<>(); - - tableCreationOptions.put( - CassandraAdmin.REPLICATION_STRATEGY, ReplicationStrategy.SIMPLE_STRATEGY.toString()); - tableCreationOptions.put(CassandraAdmin.COMPACTION_STRATEGY, CompactionStrategy.LCS.toString()); - tableCreationOptions.put(CassandraAdmin.REPLICATION_FACTOR, "1"); - - tableCreationOptions.put(DynamoAdmin.REQUEST_UNIT, "1"); - tableCreationOptions.put(DynamoAdmin.NO_SCALING, "true"); - tableCreationOptions.put(DynamoAdmin.NO_BACKUP, "true"); - - Map indexCreationOptions = new HashMap<>(); - indexCreationOptions.put(DynamoAdmin.NO_SCALING, "true"); - - Map reparationOptions = new HashMap<>(); - reparationOptions.put(DynamoAdmin.NO_BACKUP, "true"); - - Map upgradeOptions = new HashMap<>(tableCreationOptions); - - // Create tables. - SchemaLoader.load(configFilePath, schemaFilePath, tableCreationOptions, createCoordinatorTables); - - // Alter tables. - SchemaLoader.alterTables(configFilePath, alteredSchemaFilePath, indexCreationOptions); - - // Repair namespaces and tables. - SchemaLoader.repairAll(configFilePath, schemaFilePath, reparationOptions, repairCoordinatorTables); - - // Delete tables. 
- SchemaLoader.unload(configFilePath, schemaFilePath, deleteCoordinatorTables); - - // Upgrade the environment - SchemaLoader.upgrade(configFilePath, upgradeOptions); - - return 0; - } -} -``` - -You can also create, delete, or repair a schema by passing a serialized-schema JSON string (the raw text of a schema file) as shown below: - -```java -// Create tables. -SchemaLoader.load(configFilePath, serializedSchemaJson, tableCreationOptions, createCoordinatorTables); - -// Alter tables. -SchemaLoader.alterTables(configFilePath, serializedAlteredSchemaFilePath, indexCreationOptions); - -// Repair namespaces and tables. -SchemaLoader.repairAll(configFilePath, serializedSchemaJson, reparationOptions, repairCoordinatorTables); - -// Delete tables. -SchemaLoader.unload(configFilePath, serializedSchemaJson, deleteCoordinatorTables); -``` - -When configuring ScalarDB, you can use a `Properties` object as well, as shown below: - -```java -// Create tables. -SchemaLoader.load(properties, serializedSchemaJson, tableCreationOptions, createCoordinatorTables); - -// Alter tables. -SchemaLoader.alterTables(properties, serializedAlteredSchemaFilePath, indexCreationOptions); - -// Repair namespaces and tables. -SchemaLoader.repairAll(properties, serializedSchemaJson, reparationOptions, repairCoordinatorTables); - -// Delete tables. -SchemaLoader.unload(properties, serializedSchemaJson, deleteCoordinatorTables); - -// Upgrade the environment -SchemaLoader.upgrade(properties, upgradeOptions); -``` - -### Import tables - -You can import an existing JDBC database table to ScalarDB by using the `--import` option and an import-specific schema file, in a similar manner as shown in [Sample schema file](#sample-schema-file). For details, see [Importing Existing Tables to ScalarDB by Using ScalarDB Schema Loader](./schema-loader-import.md). - -{% capture notice--warning %} -**Attention** - -You should carefully plan to import a table to ScalarDB in production because it will add transaction metadata columns to your database tables and the ScalarDB metadata tables. In this case, there would also be several differences between your database and ScalarDB, as well as some limitations. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -The following is an import sample: - -```java -public class SchemaLoaderImportSample { - public static int main(String... args) throws SchemaLoaderException { - Path configFilePath = Paths.get("database.properties"); - // "import_sample_schema.json" can be found in the "/sample" directory. - Path schemaFilePath = Paths.get("import_sample_schema.json"); - Map tableImportOptions = new HashMap<>(); - - // Import tables. - // You can also use a Properties object instead of configFilePath and a serialized-schema JSON - // string instead of schemaFilePath. - SchemaLoader.importTables(configFilePath, schemaFilePath, tableImportOptions); - - return 0; - } -} -``` diff --git a/docs/slides/TransactionManagementOnCassandra.pdf b/docs/slides/TransactionManagementOnCassandra.pdf deleted file mode 100644 index 40e855bdff..0000000000 Binary files a/docs/slides/TransactionManagementOnCassandra.pdf and /dev/null differ diff --git a/docs/storage-abstraction.md b/docs/storage-abstraction.md deleted file mode 100644 index a7430cf7b2..0000000000 --- a/docs/storage-abstraction.md +++ /dev/null @@ -1,900 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Storage Abstraction and API Guide - -This page explains how to use the Storage API for users who are experts in ScalarDB. - -One of the keys to achieving storage-agnostic or database-agnostic ACID transactions on top of existing storage and database systems is the storage abstraction capabilities that ScalarDB provides. Storage abstraction defines a [data model](design.md#data-model) and the APIs (Storage API) that issue operations on the basis of the data model. - -Although you will likely use the [Transactional API](api-guide.md#transactional-api) in most cases, another option is to use the Storage API. - -The benefits of using the Storage API include the following: - -- As with the Transactional API, you can write your application code without worrying too much about the underlying storage implementation. -- If you don't need transactions for some of the data in your application, you can use the Storage API to partially avoid transactions, which results in faster execution. - -{% capture notice--warning %} -**Attention** - -Directly using the Storage API or mixing the Transactional API and the Storage API could cause unexpected behavior. For example, since the Storage API cannot provide transaction capability, the API could cause anomalies or data inconsistency if failures occur when executing operations. - -Therefore, you should be *very* careful about using the Storage API and use it only if you know exactly what you are doing. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -## Storage API Example - -This section explains how the Storage API can be used in a basic electronic money application. - -{% capture notice--warning %} -**Attention** - -The electronic money application is simplified for this example and isn’t suitable for a production environment. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -### ScalarDB configuration - -Before you begin, you should configure ScalarDB in the same way mentioned in [Getting Started with ScalarDB](getting-started-with-scalardb.md). - -With that in mind, this Storage API example assumes that the configuration file `scalardb.properties` exists. - -### Set up the database schema - -You need to define the database schema (the method in which the data will be organized) in the application. For details about the supported data types, see [Data type mapping between ScalarDB and other databases](https://scalardb.scalar-labs.com/docs/latest/schema-loader/#data-type-mapping-between-scalardb-and-the-other-databases). - -For this example, create a file named `emoney-storage.json` in the `scalardb/docs/getting-started` directory. Then, add the following JSON code to define the schema. - -{% capture notice--info %} -**Note** - -In the following JSON, the `transaction` field is set to `false`, which indicates that you should use this table with the Storage API. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -```json -{ - "emoney.account": { - "transaction": false, - "partition-key": [ - "id" - ], - "clustering-key": [], - "columns": { - "id": "TEXT", - "balance": "INT" - } - } -} -``` - -To apply the schema, go to the [ScalarDB Releases](https://github.com/scalar-labs/scalardb/releases) page and download the ScalarDB Schema Loader that matches the version of ScalarDB that you are using to the `getting-started` folder. - -Then, run the following command, replacing `` with the version of the ScalarDB Schema Loader that you downloaded: - -```console -$ java -jar scalardb-schema-loader-.jar --config scalardb.properties -f emoney-storage.json -``` - -### Example code - -The following is example source code for the electronic money application that uses the Storage API. - -{% capture notice--warning %} -**Attention** - -As previously mentioned, since the Storage API cannot provide transaction capability, the API could cause anomalies or data inconsistency if failures occur when executing operations. Therefore, you should be *very* careful about using the Storage API and use it only if you know exactly what you are doing. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -```java -public class ElectronicMoney { - - private static final String SCALARDB_PROPERTIES = - System.getProperty("user.dir") + File.separator + "scalardb.properties"; - private static final String NAMESPACE = "emoney"; - private static final String TABLENAME = "account"; - private static final String ID = "id"; - private static final String BALANCE = "balance"; - - private final DistributedStorage storage; - - public ElectronicMoney() throws IOException { - StorageFactory factory = StorageFactory.create(SCALARDB_PROPERTIES); - storage = factory.getStorage(); - } - - public void charge(String id, int amount) throws ExecutionException { - // Retrieve the current balance for id - Get get = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .build(); - Optional result = storage.get(get); - - // Calculate the balance - int balance = amount; - if (result.isPresent()) { - int current = result.get().getInt(BALANCE); - balance += current; - } - - // Update the balance - Put put = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .intValue(BALANCE, balance) - .build(); - storage.put(put); - } - - public void pay(String fromId, String toId, int amount) throws ExecutionException { - // Retrieve the current balances for ids - Get fromGet = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, fromId)) - .build(); - Get toGet = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, toId)) - .build(); - Optional fromResult = storage.get(fromGet); - Optional toResult = storage.get(toGet); - - // Calculate the balances (it assumes that both accounts exist) - int newFromBalance = fromResult.get().getInt(BALANCE) - amount; - int newToBalance = toResult.get().getInt(BALANCE) + amount; - if (newFromBalance < 0) { - throw new RuntimeException(fromId + " doesn't have enough balance."); - } - - // Update the balances - Put fromPut = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, fromId)) - .intValue(BALANCE, newFromBalance) - .build(); - Put toPut = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, toId)) - .intValue(BALANCE, newToBalance) - .build(); - storage.put(fromPut); - storage.put(toPut); - } - - public int getBalance(String id) throws ExecutionException { - // Retrieve the current balances for id - Get get = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLENAME) - .partitionKey(Key.ofText(ID, id)) - .build(); - Optional result = storage.get(get); - - int balance = -1; - if (result.isPresent()) { - balance = result.get().getInt(BALANCE); - } - return balance; - } - - public void close() { - storage.close(); - } -} -``` - -## Storage API guide - -The Storage API is composed of the Administrative API and CRUD API. - -### Administrative API - -You can execute administrative operations programmatically as described in this section. - -{% capture notice--info %} -**Note** - -Another method that you could use to execute administrative operations is by using [Schema Loader](schema-loader.md). -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### Get a `DistributedStorageAdmin` instance - -To execute administrative operations, you first need to get a `DistributedStorageAdmin` instance. You can obtain the `DistributedStorageAdmin` instance from `StorageFactory` as follows: - -```java -StorageFactory storageFactory = StorageFactory.create(""); -DistributedStorageAdmin admin = storageFactory.getStorageAdmin(); -``` - -For details about configurations, see [ScalarDB Configurations](configurations.md). - -After you have executed all administrative operations, you should close the `DistributedStorageAdmin` instance as follows: - -```java -admin.close(); -``` - -#### Create a namespace - -Before creating tables, namespaces must be created since a table belongs to one namespace. - -You can create a namespace as follows: - -```java -// Create the namespace "ns". If the namespace already exists, an exception will be thrown. -admin.createNamespace("ns"); - -// Create the namespace only if it does not already exist. -boolean ifNotExists = true; -admin.createNamespace("ns", ifNotExists); - -// Create the namespace with options. -Map options = ...; -admin.createNamespace("ns", options); -``` - -For details about creation options, see [Creation options](api-guide.md#creation-options). - -#### Create a table - -When creating a table, you should define the table metadata and then create the table. - -To define the table metadata, you can use `TableMetadata`. The following shows how to define the columns, partition key, clustering key including clustering orders, and secondary indexes of a table: - -```java -// Define the table metadata. -TableMetadata tableMetadata = - TableMetadata.newBuilder() - .addColumn("c1", DataType.INT) - .addColumn("c2", DataType.TEXT) - .addColumn("c3", DataType.BIGINT) - .addColumn("c4", DataType.FLOAT) - .addColumn("c5", DataType.DOUBLE) - .addPartitionKey("c1") - .addClusteringKey("c2", Scan.Ordering.Order.DESC) - .addClusteringKey("c3", Scan.Ordering.Order.ASC) - .addSecondaryIndex("c4") - .build(); -``` - -For details about the data model of ScalarDB, see [Data Model](design.md#data-model). - -Then, create a table as follows: - -```java -// Create the table "ns.tbl". If the table already exists, an exception will be thrown. -admin.createTable("ns", "tbl", tableMetadata); - -// Create the table only if it does not already exist. -boolean ifNotExists = true; -admin.createTable("ns", "tbl", tableMetadata, ifNotExists); - -// Create the table with options. -Map options = ...; -admin.createTable("ns", "tbl", tableMetadata, options); -``` - -#### Create a secondary index - -You can create a secondary index as follows: - -```java -// Create a secondary index on column "c5" for table "ns.tbl". If a secondary index already exists, an exception will be thrown. -admin.createIndex("ns", "tbl", "c5"); - -// Create the secondary index only if it does not already exist. -boolean ifNotExists = true; -admin.createIndex("ns", "tbl", "c5", ifNotExists); - -// Create the secondary index with options. -Map options = ...; -admin.createIndex("ns", "tbl", "c5", options); -``` - -#### Add a new column to a table - -You can add a new, non-partition key column to a table as follows: - -```java -// Add a new column "c6" with the INT data type to the table "ns.tbl". -admin.addNewColumnToTable("ns", "tbl", "c6", DataType.INT) -``` - -{% capture notice--warning %} -**Attention** - -You should carefully consider adding a new column to a table because the execution time may vary greatly depending on the underlying storage. 
Please plan accordingly and consider the following, especially if the database runs in production: - -- **For Cosmos DB for NoSQL and DynamoDB:** Adding a column is almost instantaneous as the table schema is not modified. Only the table metadata stored in a separate table is updated. -- **For Cassandra:** Adding a column will only update the schema metadata and will not modify the existing schema records. The cluster topology is the main factor for the execution time. Changes to the schema metadata are shared to each cluster node via a gossip protocol. Because of this, the larger the cluster, the longer it will take for all nodes to be updated. -- **For relational databases (MySQL, Oracle, etc.):** Adding a column shouldn't take a long time to execute. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -#### Truncate a table - -You can truncate a table as follows: - -```java -// Truncate the table "ns.tbl". -admin.truncateTable("ns", "tbl"); -``` - -#### Drop a secondary index - -You can drop a secondary index as follows: - -```java -// Drop the secondary index on column "c5" from table "ns.tbl". If the secondary index does not exist, an exception will be thrown. -admin.dropIndex("ns", "tbl", "c5"); - -// Drop the secondary index only if it exists. -boolean ifExists = true; -admin.dropIndex("ns", "tbl", "c5", ifExists); -``` - -#### Drop a table - -You can drop a table as follows: - -```java -// Drop the table "ns.tbl". If the table does not exist, an exception will be thrown. -admin.dropTable("ns", "tbl"); - -// Drop the table only if it exists. -boolean ifExists = true; -admin.dropTable("ns", "tbl", ifExists); -``` - -#### Drop a namespace - -You can drop a namespace as follows: - -```java -// Drop the namespace "ns". If the namespace does not exist, an exception will be thrown. -admin.dropNamespace("ns"); - -// Drop the namespace only if it exists. -boolean ifExists = true; -admin.dropNamespace("ns", ifExists); -``` - -#### Get existing namespaces - -You can get the existing namespaces as follows: - -```java -Set namespaces = admin.getNamespaceNames(); -``` - -### Get the tables of a namespace - -You can get the tables of a namespace as follows: - -```java -// Get the tables of the namespace "ns". -Set tables = admin.getNamespaceTableNames("ns"); -``` - -#### Get table metadata - -You can get table metadata as follows: - -```java -// Get the table metadata for "ns.tbl". -TableMetadata tableMetadata = admin.getTableMetadata("ns", "tbl"); -``` - -### Repair a namespace - -If a namespace is in an unknown state, such as the namespace exists in the underlying storage but not its ScalarDB metadata or vice versa, this method will re-create the namespace and its metadata if necessary. - -You can repair the namespace as follows: - -```java -// Repair the namespace "ns" with options. -Map options = ...; - admin.repairNamespace("ns", options); -``` - -### Repair a table - -If a table is in an unknown state, such as the table exists in the underlying storage but not its ScalarDB metadata or vice versa, this method will re-create the table, its secondary indexes, and their metadata if necessary. - -You can repair the table as follows: - -```java -// Repair the table "ns.tbl" with options. -TableMetadata tableMetadata = - TableMetadata.newBuilder() - ... - .build(); -Map options = ...; -admin.repairTable("ns", "tbl", tableMetadata, options); -``` - -### Upgrade the environment to support the latest ScalarDB API - -You can upgrade the ScalarDB environment to support the latest version of the ScalarDB API. Typically, as indicated in the release notes, you will need to run this method after updating the ScalarDB version that your application environment uses. - -```java -// Upgrade the ScalarDB environment. -Map options = ...; -admin.upgrade(options); -``` - -### Implement CRUD operations - -The following sections describe CRUD operations. - -#### Get a `DistributedStorage` instance - -To execute CRUD operations in the Storage API, you need to get a `DistributedStorage` instance. 
- -You can get an instance as follows: - -```java -StorageFactory storageFactory = StorageFactory.create(""); -DistributedStorage storage = storageFactory.getStorage(); -``` - -After you have executed all CRUD operations, you should close the `DistributedStorage` instance as follows: - -```java -storage.close(); -``` - -#### `Get` operation - -`Get` is an operation to retrieve a single record specified by a primary key. - -You need to create a `Get` object first, and then you can execute the object by using the `storage.get()` method as follows: - -```java -// Create a `Get` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key clusteringKey = Key.of("c2", "aaa", "c3", 100L); - -Get get = - Get.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .projections("c1", "c2", "c3", "c4") - .build(); - -// Execute the `Get` operation. -Optional result = storage.get(get); -``` - -You can also specify projections to choose which columns are returned. - -For details about how to construct `Key` objects, see [Key construction](api-guide.md#key-construction). And, for details about how to handle `Result` objects, see [Handle Result objects](api-guide.md#handle-result-objects). - -##### Specify a consistency level - -You can specify a consistency level in each operation (`Get`, `Scan`, `Put`, and `Delete`) in the Storage API as follows: - -```java -Get get = - Get.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .consistency(Consistency.LINEARIZABLE) // Consistency level - .build(); -``` - -The following table describes the three consistency levels: - -| Consistency level | Description | -| ------------------- | ----------- | -| `SEQUENTIAL` | Sequential consistency assumes that the underlying storage implementation makes all operations appear to take effect in some sequential order and the operations of each individual process appear in this sequence. | -| `EVENTUAL` | Eventual consistency assumes that the underlying storage implementation makes all operations take effect eventually. | -| `LINEARIZABLE` | Linearizable consistency assumes that the underlying storage implementation makes each operation appear to take effect atomically at some point between its invocation and completion. | - -##### Execute `Get` by using a secondary index - -You can execute a `Get` operation by using a secondary index. - -Instead of specifying a partition key, you can specify an index key (indexed column) to use a secondary index as follows: - -```java -// Create a `Get` operation by using a secondary index. -Key indexKey = Key.ofFloat("c4", 1.23F); - -Get get = - Get.newBuilder() - .namespace("ns") - .table("tbl") - .indexKey(indexKey) - .projections("c1", "c2", "c3", "c4") - .build(); - -// Execute the `Get` operation. -Optional result = storage.get(get); -``` - -{% capture notice--info %} -**Note** - -If the result has more than one record, `storage.get()` will throw an exception. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### `Scan` operation - -`Scan` is an operation to retrieve multiple records within a partition. You can specify clustering-key boundaries and orderings for clustering-key columns in `Scan` operations. - -You need to create a `Scan` object first, and then you can execute the object by using the `storage.scan()` method as follows: - -```java -// Create a `Scan` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key startClusteringKey = Key.of("c2", "aaa", "c3", 100L); -Key endClusteringKey = Key.of("c2", "aaa", "c3", 300L); - -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .start(startClusteringKey, true) // Include startClusteringKey - .end(endClusteringKey, false) // Exclude endClusteringKey - .projections("c1", "c2", "c3", "c4") - .orderings(Scan.Ordering.desc("c2"), Scan.Ordering.asc("c3")) - .limit(10) - .build(); - -// Execute the `Scan` operation. -Scanner scanner = storage.scan(scan); -``` - -You can omit the clustering-key boundaries or specify either a `start` boundary or an `end` boundary. If you don't specify `orderings`, you will get results ordered by the clustering order that you defined when creating the table. - -In addition, you can specify `projections` to choose which columns are returned and use `limit` to specify the number of records to return in `Scan` operations. - -##### Handle `Scanner` objects - -A `Scan` operation in the Storage API returns a `Scanner` object. - -If you want to get results one by one from the `Scanner` object, you can use the `one()` method as follows: - -```java -Optional result = scanner.one(); -``` - -Or, if you want to get a list of all results, you can use the `all()` method as follows: - -```java -List results = scanner.all(); -``` - -In addition, since `Scanner` implements `Iterable`, you can use `Scanner` in a for-each loop as follows: - -```java -for (Result result : scanner) { - ... -} -``` - -Remember to close the `Scanner` object after getting the results: - -```java -scanner.close(); -``` - -Or you can use `try`-with-resources as follows: - -```java -try (Scanner scanner = storage.scan(scan)) { - ... -} -``` - -##### Execute `Scan` by using a secondary index - -You can execute a `Scan` operation by using a secondary index. - -Instead of specifying a partition key, you can specify an index key (indexed column) to use a secondary index as follows: - -```java -// Create a `Scan` operation by using a secondary index. -Key indexKey = Key.ofFloat("c4", 1.23F); - -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .indexKey(indexKey) - .projections("c1", "c2", "c3", "c4") - .limit(10) - .build(); - -// Execute the `Scan` operation. -Scanner scanner = storage.scan(scan); -``` - -{% capture notice--info %} -**Note** - -You can't specify clustering-key boundaries and orderings in `Scan` by using a secondary index. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -##### Execute `Scan` without specifying a partition key to retrieve all the records of a table - -You can execute a `Scan` operation without specifying a partition key. - -Instead of calling the `partitionKey()` method in the builder, you can call the `all()` method to scan a table without specifying a partition key as follows: - -```java -// Create a `Scan` operation without specifying a partition key. -Scan scan = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .all() - .projections("c1", "c2", "c3", "c4") - .limit(10) - .build(); - -// Execute the `Scan` operation. -Scanner scanner = storage.scan(scan); -``` - -{% capture notice--info %} -**Note** - -You can't specify clustering-key boundaries and orderings in `Scan` without specifying a partition key. -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### `Put` operation - -`Put` is an operation to put a record specified by a primary key. The operation behaves as an upsert operation for a record, in which the operation updates the record if the record exists or inserts the record if the record does not exist. - -You need to create a `Put` object first, and then you can execute the object by using the `storage.put()` method as follows: - -```java -// Create a `Put` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key clusteringKey = Key.of("c2", "aaa", "c3", 100L); - -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .build(); - -// Execute the `Put` operation. -storage.put(put); -``` - -You can also put a record with `null` values as follows: - -```java -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", null) - .doubleValue("c5", null) - .build(); -``` - -{% capture notice--info %} -**Note** - -If you specify `enableImplicitPreRead()`, `disableImplicitPreRead()`, or `implicitPreReadEnabled()` in the `Put` operation builder, they will be ignored. - -{% endcapture %} - -
{{ notice--info | markdownify }}
- -#### `Delete` operation - -`Delete` is an operation to delete a record specified by a primary key. - -You need to create a `Delete` object first, and then you can execute the object by using the `storage.delete()` method as follows: - -```java -// Create a `Delete` operation. -Key partitionKey = Key.ofInt("c1", 10); -Key clusteringKey = Key.of("c2", "aaa", "c3", 100L); - -Delete delete = - Delete.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .build(); - -// Execute the `Delete` operation. -storage.delete(delete); -``` - -#### `Put` and `Delete` with a condition - -You can write arbitrary conditions (for example, a bank account balance must be equal to or more than zero) that you require an operation to meet before being executed by implementing logic that checks the conditions. Alternatively, you can write simple conditions in a mutation operation, such as `Put` and `Delete`. - -When a `Put` or `Delete` operation includes a condition, the operation is executed only if the specified condition is met. If the condition is not met when the operation is executed, an exception called `NoMutationException` will be thrown. - -##### Conditions for `Put` - -In a `Put` operation in the Storage API, you can specify a condition that causes the `Put` operation to be executed only when the specified condition matches. This operation is like a compare-and-swap operation where the condition is compared and the update is performed atomically. - -You can specify a condition in a `Put` operation as follows: - -```java -// Build a condition. -MutationCondition condition = - ConditionBuilder.putIf(ConditionBuilder.column("c4").isEqualToFloat(0.0F)) - .and(ConditionBuilder.column("c5").isEqualToDouble(0.0)) - .build(); - -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .condition(condition) // condition - .build(); -``` - -Other than the `putIf` condition, you can specify the `putIfExists` and `putIfNotExists` conditions as follows: - -```java -// Build a `putIfExists` condition. -MutationCondition putIfExistsCondition = ConditionBuilder.putIfExists(); - -// Build a `putIfNotExists` condition. -MutationCondition putIfNotExistsCondition = ConditionBuilder.putIfNotExists(); -``` - -##### Conditions for `Delete` - -Similar to a `Put` operation, you can specify a condition in a `Delete` operation in the Storage API. - -You can specify a condition in a `Delete` operation as follows: - -```java -// Build a condition. -MutationCondition condition = - ConditionBuilder.deleteIf(ConditionBuilder.column("c4").isEqualToFloat(0.0F)) - .and(ConditionBuilder.column("c5").isEqualToDouble(0.0)) - .build(); - -Delete delete = - Delete.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKey) - .condition(condition) // condition - .build(); -``` - -In addition to using the `deleteIf` condition, you can specify the `deleteIfExists` condition as follows: - -```java -// Build a `deleteIfExists` condition. -MutationCondition deleteIfExistsCondition = ConditionBuilder.deleteIfExists(); -``` - -#### Mutate operation - -Mutate is an operation to execute multiple mutations (`Put` and `Delete` operations) in a single partition. 
- -You need to create mutation objects first, and then you can execute the objects by using the `storage.mutate()` method as follows: - -```java -// Create `Put` and `Delete` operations. -Key partitionKey = Key.ofInt("c1", 10); - -Key clusteringKeyForPut = Key.of("c2", "aaa", "c3", 100L); - -Put put = - Put.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKeyForPut) - .floatValue("c4", 1.23F) - .doubleValue("c5", 4.56) - .build(); - -Key clusteringKeyForDelete = Key.of("c2", "bbb", "c3", 200L); - -Delete delete = - Delete.newBuilder() - .namespace("ns") - .table("tbl") - .partitionKey(partitionKey) - .clusteringKey(clusteringKeyForDelete) - .build(); - -// Execute the operations. -storage.mutate(Arrays.asList(put, delete)); -``` - -{% capture notice--info %} -**Note** - -A Mutate operation only accepts mutations for a single partition; otherwise, an exception will be thrown. - -In addition, if you specify multiple conditions in a Mutate operation, the operation will be executed only when all the conditions match. -{% endcapture %} - -
{{ notice--info | markdownify }}
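As a minimal sketch of that last point, the following Mutate operation combines the conditional `Put` and `Delete` patterns shown earlier; the operation is executed only if both conditions are met, and `NoMutationException` is thrown otherwise. The keys and values are illustrative.

```java
// Minimal sketch: a Mutate operation in which both mutations carry a condition.
// The operation is executed only when all the conditions match (illustrative keys and values).
Key partitionKey = Key.ofInt("c1", 10);

Put conditionalPut =
    Put.newBuilder()
        .namespace("ns")
        .table("tbl")
        .partitionKey(partitionKey)
        .clusteringKey(Key.of("c2", "aaa", "c3", 100L))
        .floatValue("c4", 1.23F)
        .condition(ConditionBuilder.putIfExists())
        .build();

Delete conditionalDelete =
    Delete.newBuilder()
        .namespace("ns")
        .table("tbl")
        .partitionKey(partitionKey)
        .clusteringKey(Key.of("c2", "bbb", "c3", 200L))
        .condition(ConditionBuilder.deleteIfExists())
        .build();

// Both mutations must target the same partition.
storage.mutate(Arrays.asList(conditionalPut, conditionalDelete));
```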
- -#### Default namespace for CRUD operations - -A default namespace for all CRUD operations can be set by using a property in the ScalarDB configuration. - -```properties -scalar.db.default_namespace_name= -``` - -Any operation that does not specify a namespace will use the default namespace set in the configuration. - -```java -// This operation will target the default namespace. -Scan scanUsingDefaultNamespace = - Scan.newBuilder() - .table("tbl") - .all() - .build(); -// This operation will target the "ns" namespace. -Scan scanUsingSpecifiedNamespace = - Scan.newBuilder() - .namespace("ns") - .table("tbl") - .all() - .build(); -``` diff --git a/docs/two-phase-commit-transactions.md b/docs/two-phase-commit-transactions.md deleted file mode 100644 index 8d1f97d5c6..0000000000 --- a/docs/two-phase-commit-transactions.md +++ /dev/null @@ -1,760 +0,0 @@ -> [!CAUTION] -> -> This documentation has been moved to the centralized ScalarDB documentation repository, [docs-internal-scalardb](https://github.com/scalar-labs/docs-internal-scalardb). Please update this documentation in that repository instead. -> -> To view the ScalarDB documentation, visit [ScalarDB Documentation](https://scalardb.scalar-labs.com/docs/). - -# Transactions with a Two-Phase Commit Interface - -ScalarDB supports executing transactions with a two-phase commit interface. With the two-phase commit interface, you can execute a transaction that spans multiple processes or applications, like in a microservice architecture. - -This page explains how transactions with a two-phase commit interface work in ScalarDB and how to configure and execute them in ScalarDB. - -## How transactions with a two-phase commit interface work in ScalarDB - -ScalarDB normally executes transactions in a single transaction manager instance with a one-phase commit interface. In transactions with a one-phase commit interface, you begin a transaction, execute CRUD operations, and commit the transaction in the same transaction manager instance. - -In ScalarDB, you can execute transactions with a two-phase commit interface that span multiple transaction manager instances. The transaction manager instances can be in the same process or application, or the instances can be in different processes or applications. For example, if you have transaction manager instances in multiple microservices, you can execute a transaction that spans multiple microservices. - -In transactions with a two-phase commit interface, there are two roles—Coordinator and a participant—that collaboratively execute a single transaction. - -The Coordinator process and the participant processes all have different transaction manager instances. The Coordinator process first begins or starts a transaction, and the participant processes join the transaction. After executing CRUD operations, the Coordinator process and the participant processes commit the transaction by using the two-phase interface. - -## How to execute transactions with a two-phase commit interface - -To execute a two-phase commit transaction, you must get the transaction manager instance. Then, the Coordinator process can begin or start the transaction, and the participant can process the transaction. - -### Get a `TwoPhaseCommitTransactionManager` instance - -You first need to get a `TwoPhaseCommitTransactionManager` instance to execute transactions with a two-phase commit interface. 
- -To get a `TwoPhaseCommitTransactionManager` instance, you can use `TransactionFactory` as follows: - -```java -TransactionFactory factory = TransactionFactory.create(""); -TwoPhaseCommitTransactionManager transactionManager = factory.getTwoPhaseCommitTransactionManager(); -``` - -### Begin or start a transaction (for Coordinator) - -For the process or application that begins the transaction to act as Coordinator, you should use the following `begin` method: - -```java -// Begin a transaction. -TwoPhaseCommitTransaction tx = transactionManager.begin(); -``` - -Or, for the process or application that begins the transaction to act as Coordinator, you should use the following `start` method: - -```java -// Start a transaction. -TwoPhaseCommitTransaction tx = transactionManager.start(); -``` - -Alternatively, you can use the `begin` method for a transaction by specifying a transaction ID as follows: - -```java -// Begin a transaction by specifying a transaction ID. -TwoPhaseCommitTransaction tx = transactionManager.begin(""); -``` - -Or, you can use the `start` method for a transaction by specifying a transaction ID as follows: - -```java -// Start a transaction by specifying a transaction ID. -TwoPhaseCommitTransaction tx = transactionManager.start(""); -``` - -### Join a transaction (for participants) - -For participants, you can join a transaction by specifying the transaction ID associated with the transaction that Coordinator has started or begun as follows: - -```java -TwoPhaseCommitTransaction tx = transactionManager.join(""); -``` - -{% capture notice--info %} -**Note** - -To get the transaction ID with `getId()`, you can specify the following: - -```java -tx.getId(); -``` -{% endcapture %} - -
{{ notice--info | markdownify }}
- -### CRUD operations for the transaction - -The CRUD operations for `TwoPhaseCommitTransacton` are the same as the operations for `DistributedTransaction`. For details, see [CRUD operations](api-guide.md#crud-operations). - -The following is example code for CRUD operations in transactions with a two-phase commit interface: - -```java -TwoPhaseCommitTransaction tx = ... - -// Retrieve the current balances by ID. -Get fromGet = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLE) - .partitionKey(new Key(ID, fromId)) - .build(); - -Get toGet = - Get.newBuilder() - .namespace(NAMESPACE) - .table(TABLE) - .partitionKey(new Key(ID, toId)) - .build(); - -Optional fromResult = tx.get(fromGet); -Optional toResult = tx.get(toGet); - -// Calculate the balances (assuming that both accounts exist). -int newFromBalance = fromResult.get().getInt(BALANCE) - amount; -int newToBalance = toResult.get().getInt(BALANCE) + amount; - -// Update the balances. -Put fromPut = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLE) - .partitionKey(new Key(ID, fromId)) - .intValue(BALANCE, newFromBalance) - .build(); - -Put toPut = - Put.newBuilder() - .namespace(NAMESPACE) - .table(TABLE) - .partitionKey(new Key(ID, toId)) - .intValue(BALANCE, newToBalance) - .build(); - -tx.put(fromPut); -tx.put(toPut); -``` - -### Prepare, commit, or roll back a transaction - -After finishing CRUD operations, you need to commit the transaction. As with the standard two-phase commit protocol, there are two phases: prepare and commit. - -In all the Coordinator and participant processes, you need to prepare and then commit the transaction as follows: - -```java -TwoPhaseCommitTransaction tx = ... - -try { - // Execute CRUD operations in the Coordinator and participant processes. - ... - - // Prepare phase: Prepare the transaction in all the Coordinator and participant processes. - tx.prepare(); - ... - - // Commit phase: Commit the transaction in all the Coordinator and participant processes. - tx.commit(); - ... -} catch (TransactionException e) { - // If an error happens, you will need to roll back the transaction in all the Coordinator and participant processes. - tx.rollback(); - ... -} -``` - -For `prepare()`, if any of the Coordinator or participant processes fail to prepare the transaction, you will need to call `rollback()` (or `abort()`) in all the Coordinator and participant processes. - -For `commit()`, if any of the Coordinator or participant processes successfully commit the transaction, you can consider the transaction as committed. When a transaction has been committed, you can ignore any errors in the other Coordinator and participant processes. If all the Coordinator and participant processes fail to commit the transaction, you will need to call `rollback()` (or `abort()`) in all the Coordinator and participant processes. - -For better performance, you can call `prepare()`, `commit()`, and `rollback()` in the Coordinator and participant processes in parallel, respectively. - -#### Validate the transaction - -Depending on the concurrency control protocol, you need to call `validate()` in all the Coordinator and participant processes after `prepare()` and before `commit()`, as shown below: - -```java -// Prepare phase 1: Prepare the transaction in all the Coordinator and participant processes. -tx.prepare(); -... - -// Prepare phase 2: Validate the transaction in all the Coordinator and participant processes. -tx.validate(); -... 
- -// Commit phase: Commit the transaction in all the Coordinator and participant processes. -tx.commit(); -... -``` - -Similar to `prepare()`, if any of the Coordinator or participant processes fail to validate the transaction, you will need to call `rollback()` (or `abort()`) in all the Coordinator and participant processes. In addition, you can call `validate()` in the Coordinator and participant processes in parallel for better performance. - -{% capture notice--info %} -**Note** - -When using the [Consensus Commit](configurations/#consensus-commit) transaction manager with `EXTRA_READ` set as the value for `scalar.db.consensus_commit.serializable_strategy` and `SERIALIZABLE` set as the value for `scalar.db.consensus_commit.isolation_level`, you need to call `validate()`. However, if you are not using Consensus Commit, specifying `validate()` will not have any effect. -{% endcapture %} - -
{{ notice--info | markdownify }}
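For reference, the following is a minimal sketch of that configuration set on a `Properties` object; the property names and values are the ones mentioned in the note above, and this assumes the `TransactionFactory.create(Properties)` overload is used.

```java
// Minimal sketch: Consensus Commit settings for which validate() must be called
// between prepare() and commit(). The property names are the ones mentioned in the note above.
Properties properties = new Properties();
properties.setProperty("scalar.db.consensus_commit.isolation_level", "SERIALIZABLE");
properties.setProperty("scalar.db.consensus_commit.serializable_strategy", "EXTRA_READ");

TransactionFactory factory = TransactionFactory.create(properties);
TwoPhaseCommitTransactionManager transactionManager = factory.getTwoPhaseCommitTransactionManager();
```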
- -### Execute a transaction by using multiple transaction manager instances - -By using the APIs described above, you can execute a transaction by using multiple transaction manager instances as follows: - -```java -TransactionFactory factory1 = - TransactionFactory.create(""); -TwoPhaseCommitTransactionManager transactionManager1 = - factory1.getTwoPhaseCommitTransactionManager(); - -TransactionFactory factory2 = - TransactionFactory.create(""); -TwoPhaseCommitTransactionManager transactionManager2 = - factory2.getTwoPhaseCommitTransactionManager(); - -TwoPhaseCommitTransaction transaction1 = null; -TwoPhaseCommitTransaction transaction2 = null; -try { - // Begin a transaction. - transaction1 = transactionManager1.begin(); - - // Join the transaction begun by `transactionManager1` by getting the transaction ID. - transaction2 = transactionManager2.join(transaction1.getId()); - - // Execute CRUD operations in the transaction. - Optional result = transaction1.get(...); - List results = transaction2.scan(...); - transaction1.put(...); - transaction2.delete(...); - - // Prepare the transaction. - transaction1.prepare(); - transaction2.prepare(); - - // Validate the transaction. - transaction1.validate(); - transaction2.validate(); - - // Commit the transaction. If any of the transactions successfully commit, - // you can regard the transaction as committed. - AtomicReference exception = new AtomicReference<>(); - boolean anyMatch = - Stream.of(transaction1, transaction2) - .anyMatch( - t -> { - try { - t.commit(); - return true; - } catch (TransactionException e) { - exception.set(e); - return false; - } - }); - - // If all the transactions fail to commit, throw the exception and roll back the transaction. - if (!anyMatch) { - throw exception.get(); - } -} catch (TransactionException e) { - // Roll back the transaction. - if (transaction1 != null) { - try { - transaction1.rollback(); - } catch (RollbackException e1) { - // Handle the exception. - } - } - if (transaction2 != null) { - try { - transaction2.rollback(); - } catch (RollbackException e1) { - // Handle the exception. - } - } -} -``` - -For simplicity, the above example code doesn't handle the exceptions that the APIs may throw. For details about handling exceptions, see [How to handle exceptions](#how-to-handle-exceptions). - -As previously mentioned, for `commit()`, if any of the Coordinator or participant processes succeed in committing the transaction, you can consider the transaction as committed. Also, for better performance, you can execute `prepare()`, `validate()`, and `commit()` in parallel, respectively. - -### Resume or re-join a transaction - -Given that processes or applications that use transactions with a two-phase commit interface usually involve multiple request and response exchanges, you might need to execute a transaction across various endpoints or APIs. For such scenarios, you can use `resume()` or `join()` to get a transaction object (an instance of `TwoPhaseCommitTransaction`) for the transaction that you previously joined. - -The following shows how `resume()` and `join()` work: - -```java -// Join (or begin) the transaction. -TwoPhaseCommitTransaction tx = transactionManager.join(""); - -... - -// Resume the transaction by using the transaction ID. -TwoPhaseCommitTransaction tx1 = transactionManager.resume(""); - -// Or you can re-join the transaction by using the transaction ID. 
-TwoPhaseCommitTransaction tx2 = transactionManager.join(""); -``` - -{% capture notice--info %} -**Note** - -To get the transaction ID with `getId()`, you can specify the following: - -```java -tx.getId(); -``` - -In addition, when using `join()` to re-join a transaction, if you have not joined the transaction before, a new transaction object will be returned. On the other hand, when using `resume()` to resume a transaction, if you have not joined the transaction before, `TransactionNotFoundException` will be thrown. - -{% endcapture %} - -
{{ notice--info | markdownify }}
- -The following is an example of two services that have multiple endpoints: - -```java -interface ServiceA { - void facadeEndpoint() throws Exception; -} - -interface ServiceB { - void endpoint1(String txId) throws Exception; - - void endpoint2(String txId) throws Exception; - - void prepare(String txId) throws Exception; - - void commit(String txId) throws Exception; - - void rollback(String txId) throws Exception; -} -``` - -The following is an example of a client calling `ServiceA.facadeEndpoint()` that begins a transaction that spans the two services (`ServiceA` and `ServiceB`): - -```java -public class ServiceAImpl implements ServiceA { - - private TwoPhaseCommitTransactionManager transactionManager = ...; - private ServiceB serviceB = ...; - - ... - - @Override - public void facadeEndpoint() throws Exception { - TwoPhaseCommitTransaction tx = transactionManager.begin(); - - try { - ... - - // Call `ServiceB` `endpoint1`. - serviceB.endpoint1(tx.getId()); - - ... - - // Call `ServiceB` `endpoint2`. - serviceB.endpoint2(tx.getId()); - - ... - - // Prepare. - tx.prepare(); - serviceB.prepare(tx.getId()); - - // Commit. - tx.commit(); - serviceB.commit(tx.getId()); - } catch (Exception e) { - // Roll back. - tx.rollback(); - serviceB.rollback(tx.getId()); - } - } -} -``` - -As shown above, the facade endpoint in `ServiceA` calls multiple endpoints (`endpoint1()`, `endpoint2()`, `prepare()`, `commit()`, and `rollback()`) of `ServiceB`. In addition, in transactions with a two-phase commit interface, you need to use the same transaction object across the endpoints. - -In this situation, you can resume the transaction. The implementation of `ServiceB` is as follows: - -```java -public class ServiceBImpl implements ServiceB { - - private TwoPhaseCommitTransactionManager transactionManager = ...; - - ... - - @Override - public void endpoint1(String txId) throws Exception { - // Join the transaction. - TwoPhaseCommitTransaction tx = transactionManager.join(txId); - - ... - } - - @Override - public void endpoint2(String txId) throws Exception { - // Resume the transaction that you joined in `endpoint1()`. - TwoPhaseCommitTransaction tx = transactionManager.resume(txId); - - // Or re-join the transaction that you joined in `endpoint1()`. - // TwoPhaseCommitTransaction tx = transactionManager.join(txId); - - ... - } - - @Override - public void prepare(String txId) throws Exception { - // Resume the transaction. - TwoPhaseCommitTransaction tx = transactionManager.resume(txId); - - // Or re-join the transaction. - // TwoPhaseCommitTransaction tx = transactionManager.join(txId); - - ... - - // Prepare. - tx.prepare(); - } - - @Override - public void commit(String txId) throws Exception { - // Resume the transaction. - TwoPhaseCommitTransaction tx = transactionManager.resume(txId); - - // Or re-join the transaction. - // TwoPhaseCommitTransaction tx = transactionManager.join(txId); - - ... - - // Commit. - tx.commit(); - } - - @Override - public void rollback(String txId) throws Exception { - // Resume the transaction. - TwoPhaseCommitTransaction tx = transactionManager.resume(txId); - - // Or re-join the transaction. - // TwoPhaseCommitTransaction tx = transactionManager.join(txId); - - ... - - // Roll back. - tx.rollback(); - } -} -``` - -As shown above, by resuming or re-joining the transaction, you can share the same transaction object across multiple endpoints in `ServiceB`. 
- -## How to handle exceptions - -When executing a transaction by using multiple transaction manager instances, you will also need to handle exceptions properly. - -{% capture notice--warning %} -**Attention** - -If you don't handle exceptions properly, you may face anomalies or data inconsistency. -{% endcapture %} - -
{{ notice--warning | markdownify }}
- -For instance, in the example code in [Execute a transaction by using multiple transaction manager instances](#execute-a-transaction-by-using-multiple-transaction-manager-instances), multiple transaction managers (`transactionManager1` and `transactionManager2`) are used in a single process for ease of explanation. However, that example code doesn't include a way to handle exceptions. - -The following example code shows how to handle exceptions in transactions with a two-phase commit interface: - -```java -public class Sample { - public static void main(String[] args) throws Exception { - TransactionFactory factory1 = - TransactionFactory.create(""); - TwoPhaseCommitTransactionManager transactionManager1 = - factory1.getTwoPhaseCommitTransactionManager(); - - TransactionFactory factory2 = - TransactionFactory.create(""); - TwoPhaseCommitTransactionManager transactionManager2 = - factory2.getTwoPhaseCommitTransactionManager(); - - int retryCount = 0; - TransactionException lastException = null; - - while (true) { - if (retryCount++ > 0) { - // Retry the transaction three times maximum in this sample code. - if (retryCount >= 3) { - // Throw the last exception if the number of retries exceeds the maximum. - throw lastException; - } - - // Sleep 100 milliseconds before retrying the transaction in this sample code. - TimeUnit.MILLISECONDS.sleep(100); - } - - TwoPhaseCommitTransaction transaction1 = null; - TwoPhaseCommitTransaction transaction2 = null; - try { - // Begin a transaction. - transaction1 = transactionManager1.begin(); - - // Join the transaction that `transactionManager1` begun by using the transaction ID. - transaction2 = transactionManager2.join(transaction1.getId()); - - // Execute CRUD operations in the transaction. - Optional result = transaction1.get(...); - List results = transaction2.scan(...); - transaction1.put(...); - transaction2.delete(...); - - // Prepare the transaction. - prepare(transaction1, transaction2); - - // Validate the transaction. - validate(transaction1, transaction2); - - // Commit the transaction. - commit(transaction1, transaction2); - } catch (UnsatisfiedConditionException e) { - // You need to handle `UnsatisfiedConditionException` only if a mutation operation specifies - // a condition. This exception indicates the condition for the mutation operation is not met. - - rollback(transaction1, transaction2); - - // You can handle the exception here, according to your application requirements. - - return; - } catch (UnknownTransactionStatusException e) { - // If you catch `UnknownTransactionStatusException` when committing the transaction, - // it indicates that the status of the transaction, whether it was successful or not, is unknown. - // In such a case, you need to check if the transaction is committed successfully or not and - // retry the transaction if it failed. How to identify a transaction status is delegated to users. - return; - } catch (TransactionException e) { - // For other exceptions, you can try retrying the transaction. - - // For `CrudConflictException`, `PreparationConflictException`, `ValidationConflictException`, - // `CommitConflictException`, and `TransactionNotFoundException`, you can basically retry the - // transaction. However, for the other exceptions, the transaction will still fail if the cause of - // the exception is non-transient. In such a case, you will exhaust the number of retries and - // throw the last exception. 
- - rollback(transaction1, transaction2); - - lastException = e; - } - } - } - - private static void prepare(TwoPhaseCommitTransaction... transactions) - throws TransactionException { - // You can execute `prepare()` in parallel. - List exceptions = - Stream.of(transactions) - .parallel() - .map( - t -> { - try { - t.prepare(); - return null; - } catch (TransactionException e) { - return e; - } - }) - .filter(Objects::nonNull) - .collect(Collectors.toList()); - - // If any of the transactions failed to prepare, throw the exception. - if (!exceptions.isEmpty()) { - throw exceptions.get(0); - } - } - - private static void validate(TwoPhaseCommitTransaction... transactions) - throws TransactionException { - // You can execute `validate()` in parallel. - List exceptions = - Stream.of(transactions) - .parallel() - .map( - t -> { - try { - t.validate(); - return null; - } catch (TransactionException e) { - return e; - } - }) - .filter(Objects::nonNull) - .collect(Collectors.toList()); - - // If any of the transactions failed to validate, throw the exception. - if (!exceptions.isEmpty()) { - throw exceptions.get(0); - } - } - - private static void commit(TwoPhaseCommitTransaction... transactions) - throws TransactionException { - // You can execute `commit()` in parallel. - List exceptions = - Stream.of(transactions) - .parallel() - .map( - t -> { - try { - t.commit(); - return null; - } catch (TransactionException e) { - return e; - } - }) - .filter(Objects::nonNull) - .collect(Collectors.toList()); - - // If any of the transactions successfully committed, you can regard the transaction as committed. - if (exceptions.size() < transactions.length) { - if (!exceptions.isEmpty()) { - // You can log the exceptions here if you want. - } - - return; // Commit was successful. - } - - // - // If all the transactions failed to commit: - // - - // If any of the transactions failed to commit due to `UnknownTransactionStatusException`, throw - // it because you should not retry the transaction in such a case. - Optional unknownTransactionStatusException = - exceptions.stream().filter(e -> e instanceof UnknownTransactionStatusException).findFirst(); - if (unknownTransactionStatusException.isPresent()) { - throw unknownTransactionStatusException.get(); - } - - // Otherwise, throw the first exception. - throw exceptions.get(0); - } - - private static void rollback(TwoPhaseCommitTransaction... transactions) { - Stream.of(transactions) - .parallel() - .filter(Objects::nonNull) - .forEach( - t -> { - try { - t.rollback(); - } catch (RollbackException e) { - // Rolling back the transaction failed. The transaction should eventually recover, - // so you don't need to do anything further. You can simply log the occurrence here. - } - }); - } -} -``` - -### `TransactionException` and `TransactionNotFoundException` - -The `begin()` API could throw `TransactionException` or `TransactionNotFoundException`: - -- If you catch `TransactionException`, this exception indicates that the transaction has failed to begin due to transient or non-transient faults. You can try retrying the transaction, but you may not be able to begin the transaction due to non-transient faults. -- If you catch `TransactionNotFoundException`, this exception indicates that the transaction has failed to begin due to transient faults. In this case, you can retry the transaction. - -The `join()` API could also throw `TransactionNotFoundException`. You can handle this exception in the same way that you handle the exceptions for the `begin()` API. 
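As a minimal sketch of that distinction, the following retries `begin()` a few times. The retry limit and sleep time are illustrative, and the surrounding method is assumed to declare the thrown exceptions, as in the sample above.

```java
// Minimal sketch: retry begin() a few times. TransactionNotFoundException indicates a
// transient fault and is always worth retrying; other TransactionExceptions may be
// non-transient, so the retry limit keeps the loop bounded.
TwoPhaseCommitTransaction tx = null;
for (int attempt = 0; ; attempt++) {
  try {
    tx = transactionManager.begin();
    break;
  } catch (TransactionException e) { // also catches TransactionNotFoundException
    if (attempt >= 2) {
      throw e; // give up after three attempts
    }
    TimeUnit.MILLISECONDS.sleep(100);
  }
}
```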
-
-### `CrudException` and `CrudConflictException`
-
-The APIs for CRUD operations (`get()`, `scan()`, `put()`, `delete()`, and `mutate()`) could throw `CrudException` or `CrudConflictException`:
-
-- If you catch `CrudException`, this exception indicates that the transaction CRUD operation has failed due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient.
-- If you catch `CrudConflictException`, this exception indicates that the transaction CRUD operation has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning.
-
-### `UnsatisfiedConditionException`
-
-The APIs for mutation operations (`put()`, `delete()`, and `mutate()`) could also throw `UnsatisfiedConditionException`.
-
-If you catch `UnsatisfiedConditionException`, this exception indicates that the condition for the mutation operation is not met. You can handle this exception according to your application requirements.
-
-### `PreparationException` and `PreparationConflictException`
-
-The `prepare()` API could throw `PreparationException` or `PreparationConflictException`:
-
-- If you catch `PreparationException`, this exception indicates that preparing the transaction has failed due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient.
-- If you catch `PreparationConflictException`, this exception indicates that preparing the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning.
-
-### `ValidationException` and `ValidationConflictException`
-
-The `validate()` API could throw `ValidationException` or `ValidationConflictException`:
-
-- If you catch `ValidationException`, this exception indicates that validating the transaction has failed due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient.
-- If you catch `ValidationConflictException`, this exception indicates that validating the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning.
-
-### `CommitException`, `CommitConflictException`, and `UnknownTransactionStatusException`
-
-The `commit()` API could throw `CommitException`, `CommitConflictException`, or `UnknownTransactionStatusException`:
-
-- If you catch `CommitException`, this exception indicates that committing the transaction has failed due to transient or non-transient faults. You can try retrying the transaction from the beginning, but the transaction will still fail if the cause is non-transient.
-- If you catch `CommitConflictException`, this exception indicates that committing the transaction has failed due to transient faults (for example, a conflict error). In this case, you can retry the transaction from the beginning.
-- If you catch `UnknownTransactionStatusException`, this exception indicates that the status of the transaction, whether it was successful or not, is unknown. In this case, you need to check whether the transaction was committed successfully and retry it if it failed.
-
-How to identify a transaction status is delegated to users. You may want to create a transaction status table and update it transactionally with other application data so that you can get the status of a transaction from the status table.
-
-### Notes about some exceptions
-
-Although not illustrated in the example code, the `resume()` API could also throw `TransactionNotFoundException`. This exception indicates that the transaction associated with the specified ID was not found or that it might have expired. In either case, you can retry the transaction from the beginning since the cause of this exception is basically transient.
-
-In the sample code, the transaction is not retried for `UnknownTransactionStatusException` because the application must check whether the transaction was successful to avoid potential duplicate operations. Likewise, the transaction is not retried for `UnsatisfiedConditionException` because how to handle that exception depends on your application requirements. For other exceptions, the transaction is retried because the cause of the exception may be either transient or non-transient: if the cause is transient, the transaction may succeed when you retry it; if the cause is non-transient, the transaction will keep failing until you exhaust the number of retries.
-
-{% capture notice--info %}
-**Note**
-
-If you begin a transaction by specifying a transaction ID, you must use a different ID when you retry the transaction.
-
-In addition, the sample code retries the transaction at most three times and sleeps for 100 milliseconds before each retry. However, you can choose a retry policy, such as exponential backoff, according to your application requirements.
-{% endcapture %}
-
-<div class="notice--info">{{ notice--info | markdownify }}</div>
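-
-As one possible way to implement the transaction status table mentioned above, the following sketch writes a status record into a hypothetical `app.tx_status` table within the same transaction as the application data and looks that record up after an `UnknownTransactionStatusException`. The namespace, table, and column names are made up for illustration (the table must be created beforehand, for example with Schema Loader), and the import paths follow recent ScalarDB releases:
-
-```java
-import com.scalar.db.api.Get;
-import com.scalar.db.api.Put;
-import com.scalar.db.api.Result;
-import com.scalar.db.api.TwoPhaseCommitTransaction;
-import com.scalar.db.api.TwoPhaseCommitTransactionManager;
-import com.scalar.db.exception.transaction.TransactionException;
-import com.scalar.db.io.Key;
-import java.util.Optional;
-
-public class TransactionStatusTableSketch {
-  // Write a status record keyed by the transaction ID as part of the same transaction,
-  // so the record is committed if and only if the application data is committed.
-  static void putStatusRecord(TwoPhaseCommitTransaction transaction) throws TransactionException {
-    Put put =
-        Put.newBuilder()
-            .namespace("app") // hypothetical namespace
-            .table("tx_status") // hypothetical status table
-            .partitionKey(Key.ofText("tx_id", transaction.getId()))
-            .textValue("status", "committed")
-            .build();
-    transaction.put(put);
-  }
-
-  // After catching `UnknownTransactionStatusException`, check whether the status record
-  // exists. If it does, the original transaction was committed successfully.
-  static boolean wasCommitted(TwoPhaseCommitTransactionManager transactionManager, String txId)
-      throws TransactionException {
-    TwoPhaseCommitTransaction checkTx = transactionManager.begin();
-    try {
-      Get get =
-          Get.newBuilder()
-              .namespace("app")
-              .table("tx_status")
-              .partitionKey(Key.ofText("tx_id", txId))
-              .build();
-      Optional<Result> record = checkTx.get(get);
-      return record.isPresent();
-    } finally {
-      // The check is read-only, so simply roll the checking transaction back.
-      checkTx.rollback();
-    }
-  }
-}
-```
-
-Whether a missing record can be treated as a definite abort depends on how and when the record is written, so treat this only as a starting point and adapt it to your application's requirements.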
-
-## Request routing in transactions with a two-phase commit interface
-
-Services that use transactions with a two-phase commit interface usually execute a transaction by exchanging multiple requests and responses, as shown in the following diagram:
-
-![Sequence diagram for transactions with a two-phase commit interface](images/two_phase_commit_sequence_diagram.png)
-
-In addition, each service typically has multiple servers (or hosts) for scalability and availability and uses server-side (proxy) or client-side load balancing to distribute requests to the servers. In such a case, since transaction processing with a two-phase commit interface is stateful, requests in a transaction must be routed to the same servers, while different transactions need to be distributed to balance the load, as shown in the following diagram:
-
-![Load balancing for transactions with a two-phase commit interface](images/two_phase_commit_load_balancing.png)
-
-There are several approaches to achieve load balancing for transactions with a two-phase commit interface, depending on the protocol between the services. The following sections describe the approaches for gRPC and HTTP/1.1, as well as [ScalarDB Cluster (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/), which is a component that is available only in the ScalarDB Enterprise edition.
-
-### gRPC
-
-When you use a client-side load balancer, you can use the same gRPC connection to send requests in a transaction, which guarantees that the requests go to the same servers.
-
-When you use a server-side (proxy) load balancer, the solution differs between an L3/L4 (transport-level) load balancer and an L7 (application-level) load balancer:
-
-- When using an L3/L4 load balancer, you can use the same gRPC connection to send requests in a transaction, similar to when you use a client-side load balancer. In this case, requests in the same gRPC connection always go to the same server.
-- When using an L7 load balancer, since requests in the same gRPC connection don't necessarily go to the same server, you need to use cookies or a similar method to route requests to the correct server.
-  - For example, if you use [Envoy](https://www.envoyproxy.io/), you can use session affinity (sticky session) for gRPC. Alternatively, you can use [bidirectional streaming RPC in gRPC](https://grpc.io/docs/what-is-grpc/core-concepts/#bidirectional-streaming-rpc) since the L7 load balancer distributes requests in the same stream to the same server.
-
-For more details about load balancing in gRPC, see [gRPC Load Balancing](https://grpc.io/blog/grpc-load-balancing/).
-
-### HTTP/1.1
-
-Typically, you use a server-side (proxy) load balancer with HTTP/1.1:
-
-- When using an L3/L4 load balancer, you can use the same HTTP connection to send requests in a transaction, which guarantees that the requests go to the same server.
-- When using an L7 load balancer, since requests in the same HTTP connection don't necessarily go to the same server, you need to use cookies or a similar method to route requests to the correct server. You can use session affinity (sticky session) in that case.
-
-### ScalarDB Cluster
-
-ScalarDB Cluster addresses request routing by providing a routing mechanism that is capable of directing requests to the appropriate node within the cluster. For details about ScalarDB Cluster, see [ScalarDB Cluster (redirects to the Enterprise docs site)](https://scalardb.scalar-labs.com/docs/latest/scalardb-cluster/).
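-
-To make the connection-pinning approach described in the gRPC and HTTP/1.1 sections above concrete, the following sketch keeps one dedicated gRPC channel, and one stub created from it, for the lifetime of a single transaction so that every request of that transaction travels over the same connection. `OrderServiceGrpc` stands in for a stub generated from your own protobuf definition (it is not part of ScalarDB), and creating a channel per transaction is only for illustration; in practice, you would typically reuse or pool channels:
-
-```java
-import io.grpc.ManagedChannel;
-import io.grpc.ManagedChannelBuilder;
-
-public class TransactionChannel implements AutoCloseable {
-  // One channel (and therefore, behind an L3/L4 load balancer, one backend server)
-  // is used for all requests that belong to the same transaction.
-  private final ManagedChannel channel;
-  private final OrderServiceGrpc.OrderServiceBlockingStub stub; // hypothetical generated stub
-
-  public TransactionChannel(String host, int port) {
-    this.channel = ManagedChannelBuilder.forAddress(host, port).usePlaintext().build();
-    this.stub = OrderServiceGrpc.newBlockingStub(channel);
-  }
-
-  public OrderServiceGrpc.OrderServiceBlockingStub stub() {
-    // Every call in the transaction goes through this stub, and thus the same connection.
-    return stub;
-  }
-
-  @Override
-  public void close() {
-    // Close the channel after the transaction has been committed or rolled back.
-    channel.shutdown();
-  }
-}
-```
-
-With an L7 load balancer, pinning the connection is not enough on its own, so you would additionally rely on session affinity or bidirectional streaming as described above.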
- -## Hands-on tutorial - -One of the use cases for transactions with a two-phase commit interface is microservice transactions. For a hands-on tutorial, see [Create a Sample Application That Supports Microservice Transactions](https://github.com/scalar-labs/scalardb-samples/tree/main/microservice-transaction-sample).