DRILL-5504: Vector validator to diagnose offset vector issues #832
 * each batch passed to each iterator.
 */
String ENABLE_VECTOR_VALIDATION = "debug.validate_vectors";
BooleanValidator ENABLE_VECTOR_VALIDATOR = new BooleanValidator(ENABLE_VECTOR_VALIDATION, true);
false, by default, here and below.
Good catch. Fixed.
But, that error actually accidentally caught a bug...
if (! enableValidation) {
  enableValidation = context.getOptionSet().getOption(ExecConstants.ENABLE_ITERATOR_VALIDATOR);
}
if (enableValidation) {
if (AssertionUtil.isAssertionsEnabled() ||
    context.getOptionSet().getOption(ExecConstants.ENABLE_ITERATOR_VALIDATOR)) { ... }
Fixed.
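For readers following the thread, the suggested condition can be sketched in isolation. This is only an illustration: `ValidationSwitch`, `assertionsEnabled()` and `shouldValidate()` are hypothetical names, not Drill's `AssertionUtil` or its option API, and the option value is passed in as a plain boolean.

```java
/** Hypothetical stand-alone sketch; not the code from this PR. */
final class ValidationSwitch {

  /** Returns true when the JVM runs this class with assertions enabled (-ea). */
  static boolean assertionsEnabled() {
    boolean enabled = false;
    // The assignment inside the assert executes only when assertions are on.
    assert enabled = true;
    return enabled;
  }

  /** Validation is on when assertions are on OR the option is set. */
  static boolean shouldValidate(boolean assertionsEnabled, boolean optionValue) {
    return assertionsEnabled || optionValue;
  }

  public static void main(String[] args) {
    // The option value is taken from the command line purely for the demo.
    boolean option = args.length > 0 && Boolean.parseBoolean(args[0]);
    System.out.println("validate = " + shouldValidate(assertionsEnabled(), option));
  }
}
```

Collapsing the two checks into one expression removes the mutable `enableValidation` flag, so the option is consulted only when assertions are off.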
private void validateWrapper(VectorWrapper<? extends ValueVector> w) {
  if (w instanceof SimpleVectorWrapper) {
    validateVector(w.getValueVector());
  }
You mentioned above that HyperVectorWrapper is not validated. Can you open a ticket for the functionality to-be-implemented in this validator?
Done. See DRILL-5526.
  }
}

public void validate() {
Just a thought. Is there a way to enable these checks (and fail if invalid) for pre-commit tests as well?
Great idea! Added a config option that forces vector validation. Add the following to the pom.xml file in the Surefire options:
{code}
-Ddrill.exec.debug.validate_vectors=true
{code}
Will try this out and enable the checks under a separate JIRA ticket and PR.
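A `-D` flag like the one above sets a JVM system property. As a rough sketch of how such a flag could force validation regardless of the session option: the class and method names here are hypothetical, and the real change presumably reads the value through Drill's boot configuration rather than directly from `System` properties.

```java
/** Hypothetical sketch; only the property name comes from the comment above. */
final class BootOption {

  /** Property name as passed on the command line with -D. */
  static final String VALIDATE_VECTORS_PROP = "drill.exec.debug.validate_vectors";

  /** True only when the system property is present and equals "true" (case-insensitive). */
  static boolean vectorValidationForced() {
    return Boolean.getBoolean(VALIDATE_VECTORS_PROP);
  }

  /** The boot-time flag forces validation on; otherwise the session option decides. */
  static boolean effective(boolean sessionOption) {
    return vectorValidationForced() || sessionOption;
  }
}
```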
+1 Please squash the commits.
Commits squashed.
Validates offset vectors in VarChar and repeated vectors. Validates the special case of repeated VarChar vectors (two layers of offsets).

Provides two new session variables to turn on validation. One enables the existing operator (iterator) validation; the other adds vector validation. This allows validation to occur in a “production” Drill (without restarting Drill with assertions, as previously required).

Unit tests validate the validator. Another test validates the integration, but requires manual steps, so is ignored by default.

This version is first-cut: all work is done within a single class, which allows back-porting to an earlier version to solve a specific issue. A revision should move some of the work into generated code (or refactor vectors to allow outside access), since offset vectors appear on each subclass, not on a base class that would allow generic operations.

* Added boot-time options to allow enabling vector validation in Maven unit tests.
* Code cleanup per suggestions.
* Additional (manual) tests for boot-time options and default options.
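To make the description concrete, here is a minimal sketch of the kind of checks an offset-vector validator might perform. It works on plain `int[]` arrays rather than Drill vectors, and every name in it (`OffsetCheck`, `validateOffsets`, `validateRepeatedVarChar`) is hypothetical. The invariants assumed are that the first offset is 0, offsets never decrease, and no offset points past the end of the data it indexes. Treating a zero-value vector as valid without inspecting its offsets is also an assumption.

```java
/** Hypothetical sketch of offset checks; not the validator added in this PR. */
final class OffsetCheck {

  /**
   * Checks one layer of offsets. For valueCount values the array needs
   * valueCount + 1 entries. Returns null if valid, else an error message.
   */
  static String validateOffsets(int[] offsets, int valueCount, int dataLength) {
    if (valueCount == 0) {
      return null; // assumption: an empty vector carries no offsets to check
    }
    if (offsets.length < valueCount + 1) {
      return "offset vector has " + offsets.length + " entries, need " + (valueCount + 1);
    }
    if (offsets[0] != 0) {
      return "offset[0] = " + offsets[0] + ", expected 0";
    }
    int prev = 0;
    for (int i = 1; i <= valueCount; i++) {
      int cur = offsets[i];
      if (cur < prev) {
        return "offset[" + i + "] = " + cur + " is less than previous offset " + prev;
      }
      if (cur > dataLength) {
        return "offset[" + i + "] = " + cur + " exceeds data length " + dataLength;
      }
      prev = cur;
    }
    return null;
  }

  /**
   * Two layers, as in a repeated VarChar: the outer offsets index into the
   * inner values, and the inner offsets index into the byte data.
   */
  static String validateRepeatedVarChar(int[] arrayOffsets, int rowCount,
                                        int[] valueOffsets, int dataLength) {
    int innerCapacity = valueOffsets.length == 0 ? 0 : valueOffsets.length - 1;
    String err = validateOffsets(arrayOffsets, rowCount, innerCapacity);
    if (err != null) {
      return "outer: " + err;
    }
    // The last outer offset gives the number of inner values actually in use.
    int innerCount = rowCount == 0 ? 0 : arrayOffsets[rowCount];
    err = validateOffsets(valueOffsets, innerCount, dataLength);
    return err == null ? null : "inner: " + err;
  }
}
```

For example, outer offsets `{0, 2, 3}` over inner offsets `{0, 1, 4, 6}` with 6 data bytes describe two rows holding two and one strings respectively, and pass both layers of checks.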
Fixed typo in log message and rebased onto latest master.
closes apache#832