New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
getBytes() using default charset can make the encrypted data not able to be decrypted. #658
Comments
Hi @zxiex, thanks for reporting this issue. After setting the charset encoding to UTF8, I was able to reproduce the error "Specified ciphertext has an invalid authentication tag" (please let me know if this was not what you saw), while trying to insert your Unicode string into an encrypted varchar column on SQL Server's end. However, I believe this is an effect of trying to insert a Unicode character into a varchar column. Unicode strings should be stored in nvarchar columns instead, especially if the column is encrypted. After creating an encrypted nvarchar column and using SQLServerPreparedStatement.setNString to perform the same operation, I was able to insert and retrieve the unicode string from an encrypted nvarchar column. Please let me know if this solves your problem, or there's anything I might've missed. |
Hi @peterbae , yes, I got the same exception as you did during the decryption process. I agree that creating the encrypetd column as nvarchar can avoid this issue. The problem is that we initially did not know there would be unicode characters in the java string (from user input), and the encryption does not throw any exception or complain at all, but suddenly the decryption failed and the initlal data before encryption is lost. It is reasonable to assume that if encryption is OK, the decryption should also be OK. So, I would suggest to let jdbc throw exception for possible cases like this (the check condition can be tha the length of the string should be the length of the byte array after getBytes()), which is assumed in a comment a few lines beforegetBytes(). Depending on a system property ( file.encoding) and generating some encrypted data that can’t be decrypted should be avoided IMHO. |
I see. That's a good point - I will create a PR soon that addresses this issue, so that the driver will not attempt to encrypt data that cannot be decrypted. Thanks! |
Hi @zxiex, after looking into this problem more, it's come to my attention that this implementation will have issues where the client and server have different text encoding. In the case where the client and server have different encoding, there will be letters that require one byte on one end but not on the other, which will either fail to prevent this problem, or incorrectly not allow the user to insert a valid character into the SQL Server (which would be a regression, and may break the code for existing users). Also, since this check would have to be applied to every row and for every encrypted varchar column, it's going to have a non-negligible impact on performance as well. With these newfound potential issues, I've decided to leave this problem for now and leave it to the users to employ setNString as well as an nvarchar column if the user experiences this problem. Please let me know if there's anything else you'd like to mention, or I'll be closing this issue in the near future. |
Hi @zxiex The root cause of exception during decryption is that during encryption and insertion, driver's default behavior is to pass String values with Unicode supported (pass Strings to database as nvarchar values). But in your case the underlying column is of type 'varchar' which cannot hold nvarchar values. This happens when "sendStringParametersAsUnicode" is "true" which is set by default. If you disable this connection property, by setting it to false, String values will not be passed as Unicode (by default, unless specified) and thus encryption/decryption would be successful. Other possible solution to this is as mentioned by @peterbae above, using NVARCHAR column instead to allow Unicode characters when passed. Let us know if either of the two solutions resolve your problem. |
We set "sendStringParametersAsUnicode" to be "true" according to the following doc. https://docs.microsoft.com/en-us/sql/connect/jdbc/using-always-encrypted-with-the-jdbc-driver
This is such a strong and complete statement, and it matches our initial column definition, so it prevented me from reading anything further of the doc regarding that property. I agree with @peterbae that the suggested change might have a non-negligible impact on performance, and I am OK to not make any changes, but I still think data encrypted without any issue but not able to be decrypted is dangerous. I would suggest to at least to make some changes to the doc to warn user for this kind of scenarios considering that there is no way for the developer to know what caused the decryption to fail once the data is encrypted and the developer can be easily trapped into the encryption and decryption algorithm based on the exception ("Specified ciphertext has an invalid authentication tag") JDBC throws. |
Hi @zxiex Yes we are updating documentation to address this issue. It doesn't seem right as tested. I have requested an update internally and will keep this issue open till the document gets updated. Let us know if there's anything else we can help you with. |
The doc change has been merged. Closing the issue. |
Driver version or jar name
6.5.0
SQL Server version
latest version
Client operating system
Debian
Java/JVM version
java version "1.8.*"
Table schema
any table contains an encrypted column (using AE), and the type is set as "varchar" (with typical definition of encryption information)..
Problem description
https://github.com/Microsoft/mssql-jdbc/blob/dev/src/main/java/com/microsoft/sqlserver/jdbc/dtv.java#L1663
This depends on the default charset encoding, if it is UTF8, and the string contains unicode, getBytes() produces more bytes as expected (1 byte for each character in the string). As a result, the encrypted data (if we use Always Encrypted for that column) can't be decrypted in some cases. A string as "\u201C" + "A" + "\u201D" + " A " + "\u201C" + "A" + "\u201D" and jvm option -Dfile.encoding=UTF8 can encrypt the data but not decrypted.
Expected behavior and actual behavior
throw exception if the number of bytes after getBytes is different than the length of the string.
Repro code
create a table with a column having type varchar(128) (encrypted using some keys), and try to insert a string "\u201C" + "A" + "\u201D" + " A " + "\u201C" + "A" + "\u201D" to that column and then read it back.
The text was updated successfully, but these errors were encountered: