Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-6355: [Java] Make range equal visitor reusable #5195

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Expand Up @@ -49,6 +49,11 @@ public class ApproxEqualsVisitor extends RangeEqualsVisitor {
private DiffFunction<Double> doubleDiffFunction =
(Double value1, Double value2) -> Math.abs(value1 - value2);

public ApproxEqualsVisitor(float floatEpsilon, double doubleEpsilon) {
this.floatEpsilon = floatEpsilon;
this.doubleEpsilon = doubleEpsilon;
}

public ApproxEqualsVisitor(ValueVector right, float epsilon) {
this (right, epsilon, epsilon, true);
}
Expand All @@ -66,9 +71,9 @@ public ApproxEqualsVisitor(ValueVector right, float floatEpsilon, double doubleE
*/
public ApproxEqualsVisitor(ValueVector right, float floatEpsilon, double doubleEpsilon, boolean typeCheckNeeded,
int leftStart, int rightStart, int length) {
super(right, rightStart, leftStart, length, typeCheckNeeded);
this.floatEpsilon = floatEpsilon;
this.doubleEpsilon = doubleEpsilon;
set(right, rightStart, leftStart, length, typeCheckNeeded);
}

public void setFloatDiffFunction(DiffFunction<Float> floatDiffFunction) {
Expand Down Expand Up @@ -102,9 +107,9 @@ protected boolean compareUnionVectors(UnionVector left) {
return false;
}

ApproxEqualsVisitor visitor = new ApproxEqualsVisitor(floatEpsilon, doubleEpsilon);
for (int k = 0; k < leftChildren.size(); k++) {
ApproxEqualsVisitor visitor = new ApproxEqualsVisitor(rightChildren.get(k),
floatEpsilon, doubleEpsilon);
visitor.set(rightChildren.get(k), 0, 0, rightChildren.get(k).getValueCount(), true);
if (!leftChildren.get(k).accept(visitor, null)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we give a message if visitor is directly used without setting params?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the code to make leftStart, rightStart and length have -1 as default value. Since each visit method calls validate method first, if the set method is not called, an exception will be thrown.

return false;
}
Expand All @@ -121,9 +126,9 @@ protected boolean compareStructVectors(NonNullableStructVector left) {
return false;
}

ApproxEqualsVisitor visitor = new ApproxEqualsVisitor(floatEpsilon, doubleEpsilon);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also do this for compareListVectors/compareFixedSizeLiseVectors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems compareListVector/compareFixedSizeListVectors in this class does not change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. Fixed now.

for (String name : left.getChildFieldNames()) {
ApproxEqualsVisitor visitor = new ApproxEqualsVisitor(rightVector.getChild(name),
floatEpsilon, doubleEpsilon);
visitor.set(rightVector.getChild(name), 0, 0, rightVector.getChild(name).getValueCount(), true);
Copy link
Contributor

@pravindra pravindra Aug 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two other ways to pass along parameters
a. for parameters that do not change between invocation, can put them in the constructor of the vistor
b. for parameters that are different for each invocation, pass along as a pojo/object in the accept() call (instead of null).

did you consider these options ? The drawback of the current approach is that it's mandating a strict order between (visitor.set and the accept call) i.e the accept call must always be preceeded by the visitor.set call (not doing so will be caught at runtime only).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pravindra Sounds good.
Wrapping changing parameters into a pojo is a good idea. Too many parameters may confuse users and easily produce bugs.

if (!left.getChild(name).accept(visitor, null)) {
return false;
}
Expand All @@ -134,7 +139,7 @@ protected boolean compareStructVectors(NonNullableStructVector left) {

@Override
protected boolean compareListVectors(ListVector left) {

ApproxEqualsVisitor visitor = new ApproxEqualsVisitor(floatEpsilon, doubleEpsilon);
for (int i = 0; i < length; i++) {
int leftIndex = leftStart + i;
int rightIndex = rightStart + i;
Expand All @@ -160,8 +165,8 @@ protected boolean compareListVectors(ListVector left) {
ValueVector leftDataVector = left.getDataVector();
ValueVector rightDataVector = ((ListVector)right).getDataVector();

if (!leftDataVector.accept(new ApproxEqualsVisitor(rightDataVector, floatEpsilon, doubleEpsilon,
typeCheckNeeded, startIndexLeft, startIndexRight, (endIndexLeft - startIndexLeft)), null)) {
visitor.set(rightDataVector, startIndexRight, startIndexLeft, endIndexLeft - startIndexLeft, true);
if (!leftDataVector.accept(visitor, null)) {
return false;
}
}
Expand All @@ -175,6 +180,7 @@ protected boolean compareFixedSizeListVectors(FixedSizeListVector left) {
return false;
}

ApproxEqualsVisitor visitor = new ApproxEqualsVisitor(floatEpsilon, doubleEpsilon);
for (int i = 0; i < length; i++) {
int leftIndex = leftStart + i;
int rightIndex = rightStart + i;
Expand All @@ -200,8 +206,8 @@ protected boolean compareFixedSizeListVectors(FixedSizeListVector left) {
ValueVector leftDataVector = left.getDataVector();
ValueVector rightDataVector = ((FixedSizeListVector)right).getDataVector();

if (!leftDataVector.accept(new ApproxEqualsVisitor(rightDataVector, floatEpsilon, doubleEpsilon,
typeCheckNeeded, startIndexLeft, startIndexRight, (endIndexLeft - startIndexLeft)), null)) {
visitor.set(rightDataVector, rightStart, leftStart, endIndexLeft - startIndexLeft, true);
if (!leftDataVector.accept(visitor, null)) {
return false;
}
}
Expand Down
Expand Up @@ -37,23 +37,24 @@
*/
public class RangeEqualsVisitor implements VectorVisitor<Boolean, Void> {

protected final ValueVector right;
protected int leftStart;
protected int rightStart;
protected int length;
protected ValueVector right;
protected int leftStart = -1;
protected int rightStart = -1;
protected int length = -1;

protected boolean typeCheckNeeded = true;

/**
* The default constructor.
*/
public RangeEqualsVisitor() {
}

/**
* Constructs a new instance.
*/
public RangeEqualsVisitor(ValueVector right, int rightStart, int leftStart, int length, boolean typeCheckNeeded) {
this.leftStart = leftStart;
this.rightStart = rightStart;
this.right = right;
this.length = length;
this.typeCheckNeeded = typeCheckNeeded;
Preconditions.checkArgument(length >= 0, "length must be non negative");
set(right, rightStart, leftStart, length, typeCheckNeeded);
}

/**
Expand All @@ -63,6 +64,18 @@ public RangeEqualsVisitor(ValueVector right, int leftStart, int rightStart, int
this(right, rightStart, leftStart, length, true);
}

/**
* Sets the parameters for comparison.
*/
public void set(ValueVector right, int rightStart, int leftStart, int length, boolean typeCheckNeeded) {
this.right = right;
this.leftStart = leftStart;
this.rightStart = rightStart;
this.length = length;
this.typeCheckNeeded = typeCheckNeeded;
Preconditions.checkArgument(length >= 0, "length must be non negative");
}

/**
* Do some validation work, like type check and indices check.
*/
Expand All @@ -81,6 +94,12 @@ protected boolean validate(ValueVector left) {
Preconditions.checkArgument((rightStart + length) <= right.getValueCount(),
"(rightStart + length) %s out of range[0, %s].", 0, right.getValueCount());

Preconditions.checkArgument(left != null,
"left vector cannot be null");

Preconditions.checkArgument(right != null,
"right vector cannot be null");

return true;
}

Expand Down Expand Up @@ -144,9 +163,10 @@ protected boolean compareUnionVectors(UnionVector left) {
return false;
}

RangeEqualsVisitor visitor = new RangeEqualsVisitor();
for (int k = 0; k < leftChildren.size(); k++) {
RangeEqualsVisitor visitor = new RangeEqualsVisitor(rightChildren.get(k),
leftStart, rightStart, length);
visitor.set(rightChildren.get(k),
rightStart, leftStart, length, true);
if (!leftChildren.get(k).accept(visitor, null)) {
return false;
}
Expand All @@ -162,9 +182,10 @@ protected boolean compareStructVectors(NonNullableStructVector left) {
return false;
}

RangeEqualsVisitor visitor = new RangeEqualsVisitor();
for (String name : left.getChildFieldNames()) {
RangeEqualsVisitor visitor = new RangeEqualsVisitor(rightVector.getChild(name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also for compareListVectors/compareFixedSizeListVectors?

Copy link
Contributor Author

@liyafan82 liyafan82 Aug 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Thanks for the good point.

leftStart, rightStart, length);
visitor.set(rightVector.getChild(name),
rightStart, leftStart, length, true);
if (!left.getChild(name).accept(visitor, null)) {
return false;
}
Expand Down Expand Up @@ -236,7 +257,7 @@ protected boolean compareBaseVariableWidthVectors(BaseVariableWidthVector left)
}

protected boolean compareListVectors(ListVector left) {

RangeEqualsVisitor visitor = new RangeEqualsVisitor();
for (int i = 0; i < length; i++) {
int leftIndex = leftStart + i;
int rightIndex = rightStart + i;
Expand All @@ -262,8 +283,8 @@ protected boolean compareListVectors(ListVector left) {
ValueVector leftDataVector = left.getDataVector();
ValueVector rightDataVector = ((ListVector)right).getDataVector();

if (!leftDataVector.accept(new RangeEqualsVisitor(rightDataVector, startIndexLeft,
startIndexRight, (endIndexLeft - startIndexLeft)), null)) {
visitor.set(rightDataVector, startIndexRight, startIndexLeft, endIndexLeft - startIndexLeft, true);
if (!leftDataVector.accept(visitor, null)) {
return false;
}
}
Expand All @@ -277,6 +298,7 @@ protected boolean compareFixedSizeListVectors(FixedSizeListVector left) {
return false;
}

RangeEqualsVisitor visitor = new RangeEqualsVisitor();
for (int i = 0; i < length; i++) {
int leftIndex = leftStart + i;
int rightIndex = rightStart + i;
Expand All @@ -302,8 +324,8 @@ protected boolean compareFixedSizeListVectors(FixedSizeListVector left) {
ValueVector leftDataVector = left.getDataVector();
ValueVector rightDataVector = ((FixedSizeListVector)right).getDataVector();

if (!leftDataVector.accept(new RangeEqualsVisitor(rightDataVector, startIndexLeft, startIndexRight,
(endIndexLeft - startIndexLeft)), null)) {
visitor.set(rightDataVector, startIndexRight, startIndexLeft, endIndexLeft - startIndexLeft, true);
if (!leftDataVector.accept(visitor, null)) {
return false;
}
}
Expand Down
Expand Up @@ -136,12 +136,13 @@ static final int roundUpToPowerOf2(int size) {
public int getIndex(int indexInArray, ValueVector toEncode) {
int hash = toEncode.hashCode(indexInArray);
int index = indexFor(hash, table.length);

RangeEqualsVisitor equalVisitor = new RangeEqualsVisitor();
for (DictionaryHashTable.Entry e = table[index]; e != null ; e = e.next) {
if (e.hash == hash) {
int dictIndex = e.index;

if (new RangeEqualsVisitor(dictionary, dictIndex, indexInArray, 1, false)
.equals(toEncode)) {
equalVisitor.set(dictionary, dictIndex, indexInArray, 1, false);
if (equalVisitor.equals(toEncode)) {
return dictIndex;
}
}
Expand Down