Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operators In / Not In are not implemented for Residuals #760

Closed
arina-ielchiieva opened this issue Jan 29, 2020 · 6 comments
Closed

Operators In / Not In are not implemented for Residuals #760

arina-ielchiieva opened this issue Jan 29, 2020 · 6 comments

Comments

@arina-ielchiieva
Copy link
Member

arina-ielchiieva commented Jan 29, 2020

Example of the unit tests in org.apache.iceberg.transforms.TestResiduals which should pass:

  @Test
  public void testIn() {
    Schema schema = new Schema(
      Types.NestedField.optional(50, "dateint", Types.IntegerType.get()),
      Types.NestedField.optional(51, "hour", Types.IntegerType.get())
    );

    PartitionSpec spec = PartitionSpec.builderFor(schema)
      .identity("dateint")
      .build();

    ResidualEvaluator resEval = ResidualEvaluator.of(spec, 
      in("dateint", 20170815, 20170816, 20170817), true);

    Expression residual = resEval.residualFor(Row.of(20170815));
    Assert.assertEquals("Residual should be alwaysTrue", alwaysTrue(), residual);
  }

  @Test
  public void testNotIn() {
    Schema schema = new Schema(
      Types.NestedField.optional(50, "dateint", Types.IntegerType.get()),
      Types.NestedField.optional(51, "hour", Types.IntegerType.get())
    );

    PartitionSpec spec = PartitionSpec.builderFor(schema)
      .identity("dateint")
      .build();

    ResidualEvaluator resEval = ResidualEvaluator.of(spec, 
      notIn("dateint", 20170815, 20170816, 20170817),true);

    Expression residual = resEval.residualFor(Row.of(20170815));
    Assert.assertEquals("Residual should be alwaysFalse", alwaysFalse(), residual);
  }

Now both unit tests fail. Example of the exception:

java.lang.UnsupportedOperationException: In operation is not supported by the visitor

	at org.apache.iceberg.expressions.ExpressionVisitors$BoundExpressionVisitor.in(ExpressionVisitors.java:102)
	at org.apache.iceberg.expressions.ExpressionVisitors$BoundExpressionVisitor.predicate(ExpressionVisitors.java:152)
	at org.apache.iceberg.expressions.ResidualEvaluator$ResidualVisitor.predicate(ResidualEvaluator.java:228)
	at org.apache.iceberg.expressions.ResidualEvaluator$ResidualVisitor.predicate(ResidualEvaluator.java:270)
	at org.apache.iceberg.expressions.ResidualEvaluator$ResidualVisitor.predicate(ResidualEvaluator.java:119)
	at org.apache.iceberg.expressions.ExpressionVisitors.visit(ExpressionVisitors.java:283)
	at org.apache.iceberg.expressions.ResidualEvaluator$ResidualVisitor.eval(ResidualEvaluator.java:124)
	at org.apache.iceberg.expressions.ResidualEvaluator$ResidualVisitor.access$100(ResidualEvaluator.java:119)
	at org.apache.iceberg.expressions.ResidualEvaluator.residualFor(ResidualEvaluator.java:116)
	at org.apache.iceberg.transforms.TestResiduals.testIn(TestResiduals.java:176)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)

I think #600 did not add these operators to all visitors, in particular ResidualVisitor was omitted.

@arina-ielchiieva
Copy link
Member Author

FYI @jun-he

@arina-ielchiieva arina-ielchiieva changed the title Operators In / Not In is not implemented for Residuals Operators In / Not In are not implemented for Residuals Jan 29, 2020
@jun-he
Copy link
Collaborator

jun-he commented Feb 1, 2020

Thanks @arina-ielchiieva.
I will add them to ResidualVisitor and double check other visitors.

@jun-he
Copy link
Collaborator

jun-he commented Feb 2, 2020

@arina-ielchiieva I have submitted a PR to address this issue. Thanks.

@arina-ielchiieva
Copy link
Member Author

@jun-he thanks!

@jun-he
Copy link
Collaborator

jun-he commented Feb 7, 2020

Can we close it as the change is merged? Thanks.

@aokolnychyi
Copy link
Contributor

@jun-he, thanks for fixing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants