Skip to content

DDL statements, Batching and Async Execution

Knut Olav Løite edited this page Feb 18, 2018 · 2 revisions

DDL Statements

Data Definition Language statements (CREATE TABLE, CREATE INDEX, ALTER TABLE, DROP TABLE,...) can be issued using this driver. The syntax follows the normal syntax of Google Cloud Spanner. The reference documentation can be found here: https://cloud.google.com/spanner/docs/data-definition-language

DDL statements are quite slow on Google Cloud Spanner. The JDBC driver therefore contains a number of tricks and special commands to make it easier to deal with this.

IF [NOT] EXISTS

The driver supports MySQL style IF [NOT] EXISTS clauses in CREATE TABLE/INDEX and DROP TABLE/INDEX statements:

CREATE TABLE IF NOT EXISTS TEST1 (ID INT64 NOT NULL, NAME STRING(100) NOT NULL, DESCRIPTION STRING(200)) PRIMARY KEY (ID);
CREATE INDEX IF NOT EXISTS IDX_TEST1 ON TEST1 (DESCRIPTION);
CREATE UNIQUE INDEX IF NOT EXISTS IDX_TEST2 ON TEST1 (NAME);

DROP TABLE IF EXISTS TEST1;
DROP INDEX IF EXISTS IDX_TEST1;

Asynchronous Execution

DDL statements can be sent to Google Cloud Spanner for execution in the background, releasing the connection for other statements while the DDL operation runs in the background. Use the connection property AsyncDdlOperations to enable/disable asynchronous execution of DDL statements. The property can be set during the creation of the connection (as a property or a part of the URL), or it can be changed dynamically using a SET_CONNECTION_PROPERTY statement. Issuing the following statements enables, disables and resets the async mode (resetting means setting the value equal to the initial value for the connection):

SET_CONNECTION_PROPERTY AsyncDdlOperations=true;
SET_CONNECTION_PROPERTY AsyncDdlOperations=false;
RESET_CONNECTION_PROPERTY AsyncDdlOperations;

Enabling and disabling the property dynamically allows you to include it in scripts containing several DDL-statements like this:

SET_CONNECTION_PROPERTY AsyncDdlOperations=true;

drop table test1;
drop table test2;
drop table test3;

create table test1 (id int64 not null, name string(100)) primary key (id);
create table test2 (id int64 not null, name string(100)) primary key (id);
create table test3 (id int64 not null, name string(100)) primary key (id);

RESET_CONNECTION_PROPERTY AsyncDdlOperations;

You can view the status of DDL operations running in the background using the statement SHOW_DDL_OPERATIONS. You can also wait for a batch of asynchronous DDL operations to finish with the statement WAIT_FOR_DDL_OPERATIONS. These commands can only be used on the same connection that issued the DDL statements:

SET_CONNECTION_PROPERTY AsyncDdlOperations=true;

drop table test1;
drop table test2;
drop table test3;

-- Show running operations
SHOW_DDL_OPERATIONS;
-- Wait for drops to finish
WAIT_FOR_DDL_OPERATIONS;
-- Show the operations once more (they should now all be finished)
SHOW_DDL_OPERATIONS;

RESET_CONNECTION_PROPERTY AsyncDdlOperations;

Batched Execution

Google Cloud Spanner also allows DDL statements to be batched together and executed largely in parallel. This also significantly speeds up the total execution time for a large DDL script.

Programmatically

When you use the JDBC driver programmatically, you can achieve this batching easily by using the batching functionality of Statement:

Statement statement = connection.createStatement();
for(String ddl : largeScript)
{
   statement.addBatch(ddl);
}
statement.executeBatch();

Scripted (Standard database tool)

If you use the driver in combination with a standard database tool, like for example DBeaver, you do not have control over how the tool sends a script to the database. The driver therefore offers the possibility of automatically batching DDL statements together using the AutoBatchDdlOperations property. This property can be set both as a property or part of the connection URL, as well as dynamically using a SET_CONNECTION_PROPERTY statement. The latter is the preferred way of using this mode, as it gives you better control over what happens. You can combine batching with background execution.

SET_CONNECTION_PROPERTY AsyncDdlOperations=true;
SET_CONNECTION_PROPERTY AutoBatchDdlOperations=true;

drop table test1;
drop table test2;
drop table test3;

create table test1 (id int64 not null, name string(100)) primary key (id);
create table test2 (id int64 not null, name string(100)) primary key (id);
create table test3 (id int64 not null, name string(100)) primary key (id);

-- Execute batch
EXECUTE_DDL_BATCH;
-- Show running operations
SHOW_DDL_OPERATIONS;
-- Wait for operations to finish
WAIT_FOR_DDL_OPERATIONS;
-- Show the operations once more (they should now all be finished)
SHOW_DDL_OPERATIONS;

RESET_CONNECTION_PROPERTY AsyncDdlOperations;
RESET_CONNECTION_PROPERTY AutoBatchDdlOperations;