New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce support for "Data Classification Specifications" on fetched resultsets #709

Merged
merged 16 commits into from Jun 23, 2018

Conversation

6 participants
@cheenamalhotra
Member

cheenamalhotra commented May 31, 2018

This PR introduces new APIs that provide support and read Data Sensitivity Classification information from SQL Server.

This feature will be available/functional from new versions of SQL Servers (from 2018). For older versions of SQL Server, request for Data Classification Feature-Ack will be a No-Op.

cheenamalhotra added some commits May 11, 2018

@codecov-io

This comment has been minimized.

codecov-io commented May 31, 2018

Codecov Report

Merging #709 into dev will decrease coverage by 0.14%.
The diff coverage is 15.31%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev     #709      +/-   ##
============================================
- Coverage     48.09%   47.94%   -0.15%     
- Complexity     2623     2628       +5     
============================================
  Files           112      118       +6     
  Lines         26643    26753     +110     
  Branches       4477     4493      +16     
============================================
+ Hits          12813    12827      +14     
- Misses        11695    11802     +107     
+ Partials       2135     2124      -11
Flag Coverage Δ Complexity Δ
#JDBC42 47.42% <15.31%> (?) 2582 <3> (?)
#JDBC43 47.85% <15.31%> (-0.25%) 2623 <3> (ø)
Impacted Files Coverage Δ Complexity Δ
...m/microsoft/sqlserver/jdbc/SQLServerException.java 77.23% <ø> (ø) 30 <0> (ø) ⬇️
...om/microsoft/sqlserver/jdbc/SQLServerResource.java 100% <ø> (ø) 4 <0> (ø) ⬇️
...r/jdbc/dataclassification/SensitivityProperty.java 0% <0%> (ø) 0 <0> (?)
.../dataclassification/SensitivityClassification.java 0% <0%> (ø) 0 <0> (?)
...erver/jdbc/dataclassification/InformationType.java 0% <0%> (ø) 0 <0> (?)
...n/java/com/microsoft/sqlserver/jdbc/tdsparser.java 67.54% <0%> (ø) 0 <0> (ø) ⬇️
...osoft/sqlserver/jdbc/dataclassification/Label.java 0% <0%> (ø) 0 <0> (?)
...m/microsoft/sqlserver/jdbc/SQLServerResultSet.java 33.3% <0%> (+0.06%) 247 <0> (+1) ⬆️
...ver/jdbc/dataclassification/ColumnSensitivity.java 0% <0%> (ø) 0 <0> (?)
...va/com/microsoft/sqlserver/jdbc/StreamColumns.java 54.16% <2.12%> (-33.68%) 21 <0> (ø)
... and 17 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abc9851...b9d8988. Read the comment docs.

@cheenamalhotra cheenamalhotra added this to Under Peer Review in MSSQL JDBC Jun 1, 2018

@cheenamalhotra cheenamalhotra added this to the 6.5.4 milestone Jun 1, 2018

@@ -0,0 +1,20 @@
package com.microsoft.sqlserver.jdbc;

This comment has been minimized.

@ulvii

ulvii Jun 4, 2018

Member

InformationType and Label classes are exactly the same. Please also make sure the class names are descriptive.

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 5, 2018

Member

We cannot rename classes, as they are as per design specs.

On the other hand, I think it would be better to move all these new classes to dataclassification package so that customers would relate Label and InformationType classes to data classification information and not mix with other driver classes.

Other drivers have implemented differently with all classes defined in a common class DataClassification , but in Java we shall do in separate individual classes as they are public APIs.

public String getId() {
return id;
}
}

This comment has been minimized.

@ulvii

ulvii Jun 5, 2018

Member

Please add a new line to the end of the file.

}
public String getName() {
return name;

This comment has been minimized.

@ulvii

ulvii Jun 5, 2018

Member

Please apply the formatter.

if (write) {
// Write Feature ID, length of the version# field and Sensitivity Classification Version#
tdsWriter.writeByte((byte) TDS.TDS_FEATUREEXT_DATACLASSIFICATION);

This comment has been minimized.

@ulvii

ulvii Jun 5, 2018

Member

Let's declare TDS_FEATUREEXT_DATACLASSIFICATION as a byte instead.

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 5, 2018

Member

Cannot change it right away, as while reading token, featureId is compared in int type, as you can see here. If it will be changed for one, it will be changed for others too and need to see its impact on the driver behavior.

This comment has been minimized.

@ulvii

ulvii Jun 5, 2018

Member

featureID is a byte too, it should not be int at the first place.

// 0x06 is for x_eFeatureExtensionId_LoginToken
// 0x07 is for x_eFeatureExtensionId_ClientSideTelemetry
// Data Classification constants
static final int TDS_FEATUREEXT_DATACLASSIFICATION = 0x09;

This comment has been minimized.

@ulvii

ulvii Jun 5, 2018

Member

Can we keep/make the naming consistent? There are 3 more similar constants:
TDS_FEATURE_EXT_AE, TDS_FEATURE_EXT_FEDAUTH, TDS_FEATURE_EXTSION_ACK. I would suggest using TDS_FEATURE_EXT for all of them.

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 5, 2018

Member

I would not rename others, but would make the one I added TDS_FEATURE_EXT_DATACLASSIFICATION

* @throws SQLException
*/
private void dropTable(Statement stmt) throws SQLException {
stmt.execute("DROP TABLE " + tableName);

This comment has been minimized.

@ulvii

ulvii Jun 5, 2018

Member

Please use Utils.dropTableIfExists() instead.

private final byte valueBytes[] = new byte[256];
private static final AtomicInteger lastReaderID = new AtomicInteger(0);
protected SensitivityClassification sensitivityClassification;

This comment has been minimized.

@peterbae

peterbae Jun 12, 2018

Member

formatting

@@ -7114,6 +7138,13 @@ final void TryProcessFeatureExtAck(boolean featureExtAckReceived) throws SQLServ
if (isColumnEncryptionSettingEnabled() && !featureExtAckReceived)
throw new SQLServerException(this, SQLServerException.getErrString("R_AE_NotSupportedByServer"), null, 0, false);
}
boolean TrySetSensitivityClassification(SensitivityClassification sensitivityClassification) {

This comment has been minimized.

@peterbae

peterbae Jun 12, 2018

Member

method names should start with a lowercase

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 13, 2018

Member

Kept it consistent with other feature methods, will better rename all such methods.

@@ -3421,7 +3426,20 @@ int writeFedAuthFeatureRequest(boolean write,
}
return totalLen;
}
int writeDataClassificationFeatureRequest (boolean write /* if false just calculates the length */,

This comment has been minimized.

@peterbae

peterbae Jun 12, 2018

Member

I'd put the protected keyword here explicitly to maintain consistency throughout the file.

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 13, 2018

Member

Similar feature request methods - writeAEFeatureRequest and writeFedAuthFeatureRequest are not protected too, hence kept like that.

if (data.length != 2) {

This comment has been minimized.

@peterbae

peterbae Jun 12, 2018

Member

Do we need to check 1 > data.length and data.length != 2 separately?

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 13, 2018

Member

Didn't want to change the way/sequence we check for this token, exact same way is done for AE feature as well as in .NET driver.

@@ -393,5 +393,7 @@ static String getResource(String key) {
{"R_invalidSSLProtocol", "SSL Protocol {0} label is not valid. Only TLS, TLSv1, TLSv1.1, and TLSv1.2 are supported."},
{"R_cancelQueryTimeoutPropertyDescription", "The number of seconds to wait to cancel sending a query timeout."},
{"R_invalidCancelQueryTimeout", "The cancel timeout value {0} is not valid."},
{"R_UnknownDataClsTokenNumber","Unknown token for Data Classification."}, // From Server
{"R_InvalidDataClsVersionNumber","Invalid version number {0} for Data Classification."}, // From Server

This comment has been minimized.

@peterbae

peterbae Jun 12, 2018

Member

where is R_InvalidDataClsVersionNumber being used? I don't see it in our code.

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 13, 2018

Member

Should be used here. Good catch!

This comment has been minimized.

@peterbae

peterbae Jun 13, 2018

Member

oh yeah, also add a space between the comma and the description part

* @param Statement
* @return boolean
*/
private boolean serverSupportsDataClassification(Statement stmt) {

This comment has been minimized.

@ulvii

ulvii Jun 12, 2018

Member

Can we move this to Utils class?

This comment has been minimized.

@cheenamalhotra
// get the information type count
int numInformationTypes = tdsReader.readUnsignedShort();
List<InformationType> informationTypes = new ArrayList<InformationType>(numInformationTypes);

This comment has been minimized.

@rene-ye

rene-ye Jun 13, 2018

Member

I think we should generally use LinkedLists over ArrayLists, seeing how little we do random access compared to iteration. Also new ArrayList<InformationType> is repetitive, the compiler will infer the template type if we just use new ArrayList<>.

byte enabled = data[1];
serverSupportsDataClassification = (enabled == 0) ? false : true;
break;
}

This comment has been minimized.

@lilgreenbird

lilgreenbird Jun 14, 2018

Member

is it just me or the spacing is all off in this method?

if (data.length != 2) {

This comment has been minimized.

@lilgreenbird

lilgreenbird Jun 14, 2018

Member

can we keep this consistent? above is (1> data.length) etc..

throw new SQLServerException(SQLServerException.getErrString("R_UnknownDataClsTokenNumber"), null);
}
byte enabled = data[1];

This comment has been minimized.

@lilgreenbird

lilgreenbird Jun 15, 2018

Member

this variable seem unnecessary?

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 15, 2018

Member

used in the next line. It basically tells us is the Feature_Ack enabled and TDS packets will be available in stream

This comment has been minimized.

@lilgreenbird

lilgreenbird Jun 15, 2018

Member

I mean, it's only used once in the next line, seems unnecessary to define a variable for it...no biggie tho..

@@ -0,0 +1,134 @@
package com.microsoft.sqlserver.jdbc.resultset;

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Please add the license header.

// 0x07 is for x_eFeatureExtensionId_ClientSideTelemetry
// Data Classification constants
static final byte TDS_FEATURE_EXT_DATACLASSIFICATION = 0x09;
static final byte DATA_CLASSIFICATION_NOT_ENABLED = 0x00;

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

DATA_CLASSIFICATION_NOT_ENABLED is not used anywhere.

protected SensitivityClassification sensitivityClassification;
private static final AtomicInteger lastReaderID = new AtomicInteger(0);

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Please apply the formatter to your changes.

boolean trySetSensitivityClassification(SensitivityClassification sensitivityClassification) {
this.sensitivityClassification = sensitivityClassification;
return true;

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Why does this method return boolean? return true makes no sense.

connectionlogger.fine(toString() + " Received feature extension acknowledgement for Data Classification.");
}
if (1 > data.length) {

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Please remove this block and move if (data.length != 2) before if ((0 == supportedDataClassificationVersion) || (supportedDataClassificationVersion > TDS.MAX_SUPPORTED_DATA_CLASSIFICATION_VERSION)).

@@ -72,6 +72,11 @@ final int getErrorCode() {
static final int DRIVER_ERROR_INTERMITTENT_TLS_FAILED = 7;
static final int ERROR_SOCKET_TIMEOUT = 8;
static final int ERROR_QUERY_TIMEOUT = 9;
static final int DataClassificationInvalidVersion = 24;

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Is there a reason why you specified the error codes but didn't use any of them? You could use DataClassificationInvalidVersion with throw new SQLServerException(SQLServerException.getErrString("R_InvalidDataClsVersionNumber"), null); for example.

I think there is actually no need for these error codes in JDBC driver.

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 18, 2018

Member

The code is ported from .NET driver as such, tokens and error codes not used here are not used in ADO driver too, but may be used in future expansion of this feature. I let them stay as such as they will be useful to track changes in future in comparison to ADO changes.

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Does 24 have a special meaning or this was just 24th error code in .Net driver? If that's the case, we shouldn't jump from 9 to 24.
Please also keep the naming consistent with other error codes.

@@ -0,0 +1,16 @@
package com.microsoft.sqlserver.jdbc.dataclassification;

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Please add license headers too all new files.

This comment has been minimized.

@cheenamalhotra
@@ -167,7 +166,7 @@ final StreamError getDatabaseError() {
return databaseError;
}
TDSTokenHandler(String logContext) {
TDSTokenHandler(String logContext) {

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Formatting.

*
* @param connection
* @param stmt
* @param tableName

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

tableName is not a @param

@@ -0,0 +1,134 @@
package com.microsoft.sqlserver.jdbc.resultset;
import java.sql.Connection;

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

Apply the formatter please

This comment has been minimized.

@cheenamalhotra

cheenamalhotra Jun 18, 2018

Member

Already applied - I don't see any change on doing again

* Selects data from the table and triggers verifySensitivityClassification method
*
* @param stmt
* @param queries

This comment has been minimized.

@ulvii

ulvii Jun 18, 2018

Member

queries is not a @param

This comment has been minimized.

@cheenamalhotra

@cheenamalhotra cheenamalhotra dismissed stale reviews from lilgreenbird and rene-ye via 29e33c5 Jun 18, 2018

@cheenamalhotra cheenamalhotra dismissed stale reviews from ulvii and peterbae via c62fbcf Jun 22, 2018

@cheenamalhotra cheenamalhotra merged commit 09d7967 into Microsoft:dev Jun 23, 2018

3 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
license/cla All CLA requirements met.
Details

MSSQL JDBC automation moved this from Under Peer Review to Closed/Merged PRs Jun 23, 2018

cheenamalhotra added a commit that referenced this pull request Jun 30, 2018

Merge dev to master for 6.5.4 release (#733)
* 623 fix

* 623 change stash

* Prepared Statement Caching fix for 'handle not found' errors

* Fix for PS Caching issue - Calling reset instead of on type def changes

* Updated comparison

* Change back assert check.

* Adding call to removeReference back + Fix for Batch processes intermittent failures.

* Removed DBName and made changes to resetPrepStmtHandle method

* Check for null handle before proceed

* Adding Old Constrcutor back to AKV Implementation

* Making baseURL final

* Remove unnecessary code.

* Use Bulk Copy API for batch insert operation

* Parse bug fixing and test added

* bug fix + additional tests

* change reflection for testing

* more test changes

* Add parsing logic for -- comment

* refactoring

* Update snapshot

* Bug fix / testing change

* Reflect comment change

* Feature | AKV Old Constructor changes - Reformatted code + Deprecated old Constructor and added a new constructor with 1 param

* Mark computed columns as IS_GENERATEDCOLUMN in the result set returned by getColumns() (#695)

* Fix | getColumns() API, changed column name from SS_IS_COMPUTED to IS_AUTOINCREMENT per JDBC specs | issue #600

* Fix | getColumns() API, changed column name from SS_IS_COMPUTED to IS_GENERATEDCOLUMN per JDBC specs | issue #600

* fix issue with redirection

* Fix | PS Caching - Remove commented lines

* Trigger Appveyor test

* Fix | AKV Old Constructor - Calling the other constructor instead.

* Fix | Reversed null checks

* Resolving alignment problems and comments

* Refactor two Bulk files into a common parent

* javadoc changes

* Applied formatter

* fix problem with precision / scale

* Fix | Fix some of the Javadoc warnings (#702)

* fix issue with setting all to true

* Resolved maven build warnings and java warnings regarding deprecated API (#701)

* Resolving maven warnings

* Removing jreVersion property

Does not make sense now that we use final name in the build itself. Only used in 1 place, hard-coding java version for different builds as that's what it represents anyways.

* java warnings

* make bamoo fixes

* resource bundle for junit test error strings (#698)

resource bundle for error message strings in junit tests

* undo some changes made to SQLServerConnection

* apply resource bundling changes

* Add support for JDK 10 in both Maven and Gradle (#691)

* Feature | Added support for JDK 10 in both Maven and Gradle - builds jre10 jars for the driver, replacing jre9

* JDK 10 | Merge 42 classes to base classes to reduce class redundancy.

* JDK 10 | Attempt to run JDK 10 with Appveyor

* Remove unwanted space

* Updating Travis script to use JDK 10

* Testing without addons

* Update script for Jacoco report to build 43 profile

* Revert driver changes for 42 compliance - to be added in a separate PR

* Revert Test class changes for 42 compliance - to be done in a separate PR

* Reformatted code

* Add ID to jacoco plugin execution task

* Kerberos Constrained Delegation Impersonated Credential Expiry fix (#636)

fix for automatic credential discarding

* update felix to 3.5.0

* Revised implementation

Decided to not dispose user created credentials at all.

* Updated flag set location

* changes for 6.5.3 preview release

* Revert "changes for 6.5.3 preview release"

This reverts commit 5c6ccd3.

* Changes in preparation for 6.5.3 preview release (#710)

* changes for preview release

* requested changes

* jre version update changes

* snapshot updates post release

* remove on_dw, and remove redundant fmtonly

* formatting

* fix for getSchema when using "-" in name

* Reformatting + adding more tests

* inherit the connection property in statement + fix issue with null / empty string being passed in as values

* Request Boundary methods - beginRequest()/endRequest() implementation (#708)

* Add | Request Boundary Methods - beginRequest()/endRequest() implementation

* Fix | Remove unused import from AbstractTest

* Fix | Applying review comments

* Fix | Moving RequestBoundaryMethodsTest.java to connection package

* added error message in resource file and changed files accordingly

* comment revisions

* use TestResource

* test changes

removed finals
removed database creation tracking

* drop database before creating

* replaced dropDBIfExists with Utils function

* added try-with-resources nest

avoid manually closing statements, and safetly handles resources.

* Fixing logic / adding more tests

* dont use test database in tests

* Change exception handling as per JDBC specs

* Add | Add missing license headers (#725)

* remove some comments

* Enable verify data (#724)

Fix to enable data verification in Junit tests. Also addresses intermittent failures with Time/Timestamp where the precision was being inaccurately judged.

* Fix | Refactored socket creation to simplify handling of socket creation

Refactors socket creation in SocketFinder.findSocket(...) to simplify handling of socket creation.

When the host resolves to a single address the driver now defers to getConnectedSocket(...)
to create the socket without spawning any threads. This happens regardless of whether we're
running on an IBM JDK. Previously the single address case would still use NIO on an IBM JDK.

On non-IBM JDKs the driver now handles both IPv4 and IPv6 addresses concurrently with a single
shared timeout. Previously hosts that resolved to both types of addresses were allowed half the
timeout for socket creation per address type with the resolution performed sequentially.

* reflect comments

* Add support for UTF-8 feature extension. (#722)

* Add | Support for UTF8 changes

* changed how logger works, refactored code in SQLServerBulkCommon due to that, changed exception being thrown to BatchUpdateException, added same logic for parsing in executeLargeBatch, and added tests accordingly.

* add more tests, make the prepared statement property go away

* Feature | Introduce support for "Data Classification Specifications" on fetched resultsets (#709)

* Feature | Data Classification Project | Phase 1 (contains temporary skipping 2 bytes)

* Feature | Data Classification - Removing extra bytes added before

* Feature | Data Classification - Added new test class for testing Data Classification support in the driver

* Remove one println

* Feature | Repackaged newly added files for Data Classification + improvements in source code

* Feature | Changing tokens to bytes instead of int

* Feature | Making variables private

* Formatted code + dropTable method called from Utils

* Feature | Data Classification - Changes as per review comments

* Fix | Review comment changes

* Change exception codes to follow series

* Fix Conflict issue

* Added missing Javadocs and headers for all new files

* Added getter/setter public for the useBulkCopyForBatchInsert connection property.

* Change implementation of child classes a bit

* Remove dependencies from tests that are from outside required libraries

* also remove hex from DBTable

* Fix bamboo problem + refactor test code

* Replace all connection and statements with try blocks

* change spacing

* refactor code

* refactoring

* Fix | Making driver default compliant to JDBC 4.2 Specs and update ADAL4J dependency to 1.6.0 (#711)

* Feature | Added support for JDK 10 in both Maven and Gradle - builds jre10 jars for the driver, replacing jre9

* JDK 10 | Merge 42 classes to base classes to reduce class redundancy.

* JDK 10 | Attempt to run JDK 10 with Appveyor

* Remove unwanted space

* Updating Travis script to use JDK 10

* Testing without addons

* Update script for Jacoco report to build 43 profile

* Minor fix in formatting to avoid conflicts

* moving driver specific functions

for SQLServerPreparedStatement

* Remove unwanted code + Update Adal4J library dependency

* changes for CallableStatement

repeptitive delcarations

* Remove an extra bracket due to conflict

* changes for ISQLServerConnection

there are problems with moving all Driver sepcific public methods. SQLServerConnectionPoolProxy also implements this interface and there are many public APIs (such as preparedstmt cacheing stuff) which it doesn't implement, and cannot be moved into the interface at this time.

* lambda touch-up

should generally stick to 1 line if possible.

* changes for ISQLServerDataSource

* updates for ISQLServerResultSet

* Improvements | Missing interface APIs added + Code improvements

* More changes for Interface missing methods

* Implemented missing methods in SQLServerConnectionPoolProxy

* Removed ISQLServerConnection43 for duplicated method definitions

* Added APIs in interface for SQLServerResultSet

* More cleanup done

* Fix minor issues

* Fix test failures and implement Serialization for HashKey

* Fix JavaDoc errors and warnigs

* More changes for CallableStatement APIs

* More changes for Statement and Prepared Statament public APIs

* Javadoc fix

* More changes for SQL Server Bulk Record interface

* Callable Statement missing APIs for Interface

* Add missing desciptions

* Reverting pom.xml change for this PR

* Attempt to resolve conflicts

* Remove Interface as not needed.

* Added missing docs

* Changes for Clob/Blob classes for compliance

* Update ADAL4J with latest version

* Changes for Data Source classes

* Minor fixes to the new changes

* Fix for failing tests

* More changes for compliance

* Add Javadocs and class headers

* Fixed Malformed HTML Error in Javadocs

* javadoc changes

* more javadoc changes to make the abbreviations more clear

* fix unchecked warning issue

* Change HashKey in the driver to 256 Hash

* Add Interface back to SQLServerConnection43 class

* Revert "Change HashKey in the driver to 256 Hash"

This reverts commit e6bef4e.

* Changes for exceptions to throw SQLServerException type

* 6.5.4 preview release changelog (#731)

Release | Changelog for 6.5.4 preview release (#731)

* Fix Conflict issues with master branch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment