Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bulk Copy cannot handle inserts with subset or differently ordered columns #1554

Open
uuilee opened this issue Mar 30, 2021 · 5 comments
Open
Labels
Backlog The topic in question has been recognized and added to development backlog Documentation Pull requests that update documentation Enhancement An enhancement to the driver. Lower priority than bugs.
Projects

Comments

@uuilee
Copy link

uuilee commented Mar 30, 2021

Driver version

9.2.1.jre15

SQL Server version

Microsoft SQL Server 2016 (SP2) (KB4052908) - 13.0.5026.0 (X64)

Client Operating System

Windows 10

JAVA/JVM version

openjdk-15.0.2

Table schema

Any table

Problem description

When using useBulkCopyForBatchInsert=true a SQLServerPreparedStatement seems to expect any insert statements to contain all columns in the exact order returned from sp_executesql N'SET FMTONLY ON SELECT * FROM " (line 1990), as this is the order the column metadata is added.
The problem with this is that any insert statement only specifying a subset of columns or columns in the wrong order can have the wrong column metadata mapped. The consequence of this seems to be that some columns will be set to null when inserted. Ideally the executeBatch will look at the column list and derive the right metadata from the colum name instead of the coumn index.

Workaround is to make sure that the SQLServerBulkBatchInsertRecord has the columnList, valueList and columnMetadata with the same number of elements in the same order.

  1. Expected behaviour: Should insert all values and behave equivalent to when useBulkCopyForBatchInsert=false
  2. Actual behaviour: Some columns are null (depending on the ordering or columns)
  3. Error message/stack trace: None

Reproduction code

Have not tested this simplified version

public class Repro {
    public static void main(String[] args) throws Throwable {
        SQLServerDataSource ds = new SQLServerDataSource();
        ds.setUser("<user>");
        ds.setPassword("<password>");
        ds.setURL("<db-host>;useBulkCopyForBatchInsert=true");

        Date time = new Date();
        Connection c = ds.getConnection();
        PreparedStatement ps;
        ResultSet rs;
        try {
            c.setAutoCommit(false);
            String sql = "INSERT INTO MyTable (id, a, b) VALUES (?,?,?)";
            ps = c.prepareStatement(sql);
            try {
                ps.setLong(1, 1);
                ps.setLong(2, 123);
                ps.setLong(3, 456);
                ps.addBatch();
                ps.executeBatch();
                c.commit();
            } finally {
                ps.close();
            }
        } catch (Throwable t) {
            throw t;
        }
    }
}
@peterbae
Copy link
Contributor

Thanks @uuilee for contacting us. Currently the feature assumes that the orders coming from the user is equivalent to how the columns are ordered on the server's end. Ideally, useBulkCopyForBatchInsert=false and useBulkCopyForBatchInsert=true would behave the same, but because the driver has to manually parse the user's query, there are some limitations to the feature, which can be found here. The document doesn't explicitly say that the column order must be preserved/subsets cannot be used, but I'm inclined to document this as a limitation for now and let the users workaround the issue. Does that sound ok with you?

@uuilee
Copy link
Author

uuilee commented Mar 31, 2021

Hi @peterbae. Thanks for the speedy reply. I think this should be at least documented and throw an error in the case a subset of columns is provided (and if possible, even when a different order is provided). I think however that ideally it would figure out the right metadata by column name provided and support any subset in any order. Would that be a big change?

@peterbae
Copy link
Contributor

Currently the team is busy with higher priority items. It would be a non-trivial change to support subset of data / any order, but I agree that we should document this and throw an error. I'll let you know when we have an update on this item.

@lilgreenbird lilgreenbird added the Documentation Pull requests that update documentation label Mar 31, 2021
@uuilee
Copy link
Author

uuilee commented Apr 1, 2021

Thanks!

@lilgreenbird lilgreenbird added the Enhancement An enhancement to the driver. Lower priority than bugs. label Apr 6, 2021
@lilgreenbird lilgreenbird added this to Backlog in MSSQL JDBC Apr 13, 2021
@Jeffery-Wasty Jeffery-Wasty added the Backlog The topic in question has been recognized and added to development backlog label Mar 10, 2022
@mvrueden
Copy link

mvrueden commented Aug 29, 2022

I encountered the same issue.
Even if I use the same order as returned by sp_executesql N'SET FMTONLY ON SELECT * FROM <table>' the inserted data is null.

Edit:
Nevermind, I have the same issue as mentioned here. I cannot reproduce null values when using the "correct" order.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Backlog The topic in question has been recognized and added to development backlog Documentation Pull requests that update documentation Enhancement An enhancement to the driver. Lower priority than bugs.
Projects
MSSQL JDBC
  
Backlog
Development

No branches or pull requests

5 participants