Skip to content

How to enable Bulk Copy API for batch insert operations

Peter Bae edited this page Jul 4, 2018 · 5 revisions

Using Bulk Copy API for batch insert operations is supported starting from JDBC driver preview release 6.5.4. This feature allows the users to utilize Bulk Copy API underneath when executing batch insert operations against Azure Data Warehouses, and improves the performance significantly. This feature aims to achieve the performance improvement while inserting the same data as it would have with regular batch insert operation, by parsing the user's SQL Query and leveraging the Bulk Copy API in lieu of the usual batch insert operation. This page shows various ways to enable batch insert using Bulk Copy API, as well as its limitations.

This feature is applicable to only PreparedStatement and CallableStatement's executeBatch & executeLargeBatch APIs.

Prerequisites

There are two prerequisites to enable Bulk Copy API for batch insert.

  1. The server must be Azure Data Warehouse.
  2. The query must be an insert query (the query may contain comments, but the query must start with the INSERT keyword for this feature to come into effect).

Enabling Bulk Copy API for batch insert

There are three ways to enable Bulk Copy API for batch insert.

1. Enabling with connection property:

Adding useBulkCopyForBatchInsert=true; to the connection string enables this feature.

connection = DriverManager.getConnection("jdbc:sqlserver://localhost;userName=user;password=password;database=test;useBulkCopyForBatchInsert=true;");

2. Enabling with setUseBulkCopyForBatchInsert() method from SQLServerConnection object:

Calling SQLServerConnection.setUseBulkCopyForBatchInsert(true) enables this feature.

SQLServerConnection.getUseBulkCopyForBatchInsert() retrieves the current value for useBulkCopyForBatchInsert connection property.

Note that the value for useBulkCopyForBatchInsert stays constant for each PreparedStatement at the time of its initialization, and subsequent calls to SQLServerConnection.setUseBulkCopyForBatchInsert() will not affect the already created PreparedStatement with regard to the value for useBulkCopyForBatchInsert.

3. Enabling with setUseBulkCopyForBatchInsert() method from SQLServerDataSource object:

Similar to above, but using SQLServerDataSource to create a SQLServerConnection object. Both methods achieve the same result.

Limitations

There are currently these limitations that apply to this feature.

  1. Insert queries that contain INSERT-SELECT expressions (e.g. INSERT INTO TABLE SELECT * FROM TABLE2) are not supported.
  2. Insert queries that contain multiple VALUE expressions (e.g. INSERT INTO TABLE VALUES (1, 2) (3, 4)) are not supported.
  3. Insert queries that are followed by the OPTION clause or joined with other queries that are not part of the INSERT query are not supported.

If the query fails due to non-SQL server related errors, the driver will log the error message and fallback to the original logic for batch insert.