Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

getBytes() using default charset can make the encrypted data not able to be decrypted. #658

Closed
zxiex opened this issue Mar 15, 2018 · 8 comments
Projects

Comments

@zxiex
Copy link

zxiex commented Mar 15, 2018

Driver version or jar name

6.5.0

SQL Server version

latest version

Client operating system

Debian

Java/JVM version

java version "1.8.*"

Table schema

any table contains an encrypted column (using AE), and the type is set as "varchar" (with typical definition of encryption information)..

Problem description

https://github.com/Microsoft/mssql-jdbc/blob/dev/src/main/java/com/microsoft/sqlserver/jdbc/dtv.java#L1663

This depends on the default charset encoding, if it is UTF8, and the string contains unicode, getBytes() produces more bytes as expected (1 byte for each character in the string). As a result, the encrypted data (if we use Always Encrypted for that column) can't be decrypted in some cases. A string as "\u201C" + "A" + "\u201D" + " A " + "\u201C" + "A" + "\u201D" and jvm option -Dfile.encoding=UTF8 can encrypt the data but not decrypted.

Expected behavior and actual behavior

throw exception if the number of bytes after getBytes is different than the length of the string.

Repro code

create a table with a column having type varchar(128) (encrypted using some keys), and try to insert a string "\u201C" + "A" + "\u201D" + " A " + "\u201C" + "A" + "\u201D" to that column and then read it back.

@zxiex zxiex changed the title getBytes() using default charset can make the encrypted data not decrypted. getBytes() using default charset can make the encrypted data not able to be decrypted. Mar 15, 2018
@peterbae
Copy link
Contributor

Hi @zxiex, thanks for reporting this issue. After setting the charset encoding to UTF8, I was able to reproduce the error "Specified ciphertext has an invalid authentication tag" (please let me know if this was not what you saw), while trying to insert your Unicode string into an encrypted varchar column on SQL Server's end.

However, I believe this is an effect of trying to insert a Unicode character into a varchar column. Unicode strings should be stored in nvarchar columns instead, especially if the column is encrypted. After creating an encrypted nvarchar column and using SQLServerPreparedStatement.setNString to perform the same operation, I was able to insert and retrieve the unicode string from an encrypted nvarchar column.

Please let me know if this solves your problem, or there's anything I might've missed.

@zxiex
Copy link
Author

zxiex commented Mar 16, 2018

Hi @peterbae , yes, I got the same exception as you did during the decryption process. I agree that creating the encrypetd column as nvarchar can avoid this issue. The problem is that we initially did not know there would be unicode characters in the java string (from user input), and the encryption does not throw any exception or complain at all, but suddenly the decryption failed and the initlal data before encryption is lost. It is reasonable to assume that if encryption is OK, the decryption should also be OK. So, I would suggest to let jdbc throw exception for possible cases like this (the check condition can be tha the length of the string should be the length of the byte array after getBytes()), which is assumed in a comment a few lines beforegetBytes(). Depending on a system property ( file.encoding) and generating some encrypted data that can’t be decrypted should be avoided IMHO.

@peterbae
Copy link
Contributor

I see. That's a good point - I will create a PR soon that addresses this issue, so that the driver will not attempt to encrypt data that cannot be decrypted. Thanks!

@cheenamalhotra cheenamalhotra added this to Under Investigation in MSSQL JDBC Mar 20, 2018
@peterbae peterbae moved this from Under Investigation to In progress in MSSQL JDBC Mar 20, 2018
@peterbae
Copy link
Contributor

Hi @zxiex, after looking into this problem more, it's come to my attention that this implementation will have issues where the client and server have different text encoding. In the case where the client and server have different encoding, there will be letters that require one byte on one end but not on the other, which will either fail to prevent this problem, or incorrectly not allow the user to insert a valid character into the SQL Server (which would be a regression, and may break the code for existing users). Also, since this check would have to be applied to every row and for every encrypted varchar column, it's going to have a non-negligible impact on performance as well.

With these newfound potential issues, I've decided to leave this problem for now and leave it to the users to employ setNString as well as an nvarchar column if the user experiences this problem. Please let me know if there's anything else you'd like to mention, or I'll be closing this issue in the near future.

@cheenamalhotra
Copy link
Member

Hi @zxiex

The root cause of exception during decryption is that during encryption and insertion, driver's default behavior is to pass String values with Unicode supported (pass Strings to database as nvarchar values). But in your case the underlying column is of type 'varchar' which cannot hold nvarchar values.

This happens when "sendStringParametersAsUnicode" is "true" which is set by default. If you disable this connection property, by setting it to false, String values will not be passed as Unicode (by default, unless specified) and thus encryption/decryption would be successful.

Other possible solution to this is as mentioned by @peterbae above, using NVARCHAR column instead to allow Unicode characters when passed.

Let us know if either of the two solutions resolve your problem.

@zxiex
Copy link
Author

zxiex commented Mar 21, 2018

Hi @cheenamalhotra

We set "sendStringParametersAsUnicode" to be "true" according to the following doc.

https://docs.microsoft.com/en-us/sql/connect/jdbc/using-always-encrypted-with-the-jdbc-driver

When Always Encrypted is enabled and a char/varchar/varchar(max) column is encrypted, the value of sendStringParametersAsUnicode must be set to true (or be left as the default).

This is such a strong and complete statement, and it matches our initial column definition, so it prevented me from reading anything further of the doc regarding that property.

I agree with @peterbae that the suggested change might have a non-negligible impact on performance, and I am OK to not make any changes, but I still think data encrypted without any issue but not able to be decrypted is dangerous. I would suggest to at least to make some changes to the doc to warn user for this kind of scenarios considering that there is no way for the developer to know what caused the decryption to fail once the data is encrypted and the developer can be easily trapped into the encryption and decryption algorithm based on the exception ("Specified ciphertext has an invalid authentication tag") JDBC throws.

@cheenamalhotra
Copy link
Member

Hi @zxiex

Yes we are updating documentation to address this issue. It doesn't seem right as tested. I have requested an update internally and will keep this issue open till the document gets updated.

Let us know if there's anything else we can help you with.

@peterbae
Copy link
Contributor

The doc change has been merged. Closing the issue.

MSSQL JDBC automation moved this from In progress to Closed Issues Mar 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
MSSQL JDBC
  
Closed Issues
Development

No branches or pull requests

3 participants