Skip to content

Conversation

@paul-rogers
Copy link
Contributor

Validates offset vectors in VarChar and repeated vectors. Validates the
special case of repeated VarChar vectors (two layers of offsets.)

Provides two new session variables to turn on validation. One enables
the existing operator (iterator) validation, the other adds vector
validation. This allows validation to occur in a “production” Drill
(without restarting Drill with assertions, as previously required.)

Unit tests validate the validator. Another test validates the
integration, but requires manual steps, so is ignored by default.

This version is first-cut: all work is done within a single class.
Allows back-porting to an earlier version to solve a specific issues. A
revision should move some of the work into generated code (or refactor
vectors to allow outside access), since offset vectors appear for each
subclass; not on a base class that would allow generic operations.

* each batch passed to each iterator.
*/
String ENABLE_VECTOR_VALIDATION = "debug.validate_vectors";
BooleanValidator ENABLE_VECTOR_VALIDATOR = new BooleanValidator(ENABLE_VECTOR_VALIDATION, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

false, by default, here and below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Fixed.

But, that error actually accidentally caught a bug...

if (! enableValidation) {
enableValidation = context.getOptionSet().getOption(ExecConstants.ENABLE_ITERATOR_VALIDATOR);
}
if (enableValidation) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (AssertionUtil.isAssertionsEnabled() ||  
     context.getOptionSet().getOption(ExecConstants.ENABLE_ITERATOR_VALIDATOR) { ... }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

private void validateWrapper(VectorWrapper<? extends ValueVector> w) {
if (w instanceof SimpleVectorWrapper) {
validateVector(w.getValueVector());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned above that HyperVectorWrapper is not validated. Can you open a ticket for the functionality to-be-implemented in this validator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. See DRILL-5526.

}
}

public void validate() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought. Is there a way to enable these checks (and fail if invalid) for pre-commit tests as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! Added a config option that forces vector validation. Add the following to the pom.xml file in the Surefire options:

{code}
-Ddrill.exec.debug.validate_vectors=true
{code}

Will try this out and enable the checks as a different JIRA ticket and PR.

@sudheeshkatkam
Copy link
Contributor

+1

Please squash the commits.

@paul-rogers
Copy link
Contributor Author

Commits squashed.

Validates offset vectors in VarChar and repeated vectors. Validates the
special case of repeated VarChar vectors (two layers of offsets.)

Provides two new session variables to turn on validation. One enables
the existing operator (iterator) validation, the other adds vector
validation. This allows validation to occur in a “production” Drill
(without restarting Drill with assertions, as previously required.)

Unit tests validate the validator. Another test validates the
integration, but requires manual steps, so is ignored by default.

This version is first-cut: all work is done within a single class.
Allows back-porting to an earlier version to solve a specific issues. A
revision should move some of the work into generated code (or refactor
vectors to allow outside access), since offset vectors appear for each
subclass; not on a base class that would allow generic operations.

* Added boot-time options to allow enabling vector validation in Maven
unit tests.
* Code cleanup per suggestions.
* Additional (manual) tests for boot-time options and default options.
@paul-rogers
Copy link
Contributor Author

Fixed typo in log message and rebased onto latest master.

sudheeshkatkam pushed a commit to sudheeshkatkam/drill that referenced this pull request May 26, 2017
Validates offset vectors in VarChar and repeated vectors. Validates the
special case of repeated VarChar vectors (two layers of offsets.)

Provides two new session variables to turn on validation. One enables
the existing operator (iterator) validation, the other adds vector
validation. This allows validation to occur in a “production” Drill
(without restarting Drill with assertions, as previously required.)

Unit tests validate the validator. Another test validates the
integration, but requires manual steps, so is ignored by default.

This version is first-cut: all work is done within a single class.
Allows back-porting to an earlier version to solve a specific issues. A
revision should move some of the work into generated code (or refactor
vectors to allow outside access), since offset vectors appear for each
subclass; not on a base class that would allow generic operations.

* Added boot-time options to allow enabling vector validation in Maven
unit tests.
* Code cleanup per suggestions.
* Additional (manual) tests for boot-time options and default options.

closes apache#832
@asfgit asfgit closed this in d7bc213 Jun 3, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants