Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-4409] Improve exception when RelBuilder tries to create a field on a non-struct expression #2272

Merged
merged 1 commit into from Nov 26, 2020

Conversation

rubenada
Copy link
Contributor

Jira: CALCITE-4409

Comment on lines 498 to 501
if (!e.getType().isStruct()) {
throw new IllegalArgumentException("Requested field " + name + " in non-struct expression "
+ e.toString() + " (type: " + e.getType() + ")");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering where is it better to add this check here or inside RexBuilder#makeFieldAccess. Clients who call RexBuilder direclty will still get the NPE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the impression that the RelBuilder build rel trees based on all kinds of assumptions, it assumes that the invoker knows how to behave with the valid operands. We may need to check all the interfaces in RelBuilder, this PR is a good start.

Copy link
Contributor Author

@rubenada rubenada Nov 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. Looking again at this code in particular, I have the impression that a simpler fix for this specific issue would be leaving RelBuilder untouched and modify this line in RelDataTypeImpl.getField:

  @Override public RelDataTypeField getField(String fieldName, boolean caseSensitive,
      boolean elideRecord) {
    for (RelDataTypeField field : fieldList) {
// =>
  @Override public RelDataTypeField getField(String fieldName, boolean caseSensitive,
      boolean elideRecord) {
    for (RelDataTypeField field : getFieldList()) {  // change here

With this change, the scenario in the Jira would fail with an AssertionError (thrown by getFieldList()) instead of NPE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this check should be moved to org.apache.calcite.rel.type.RelDataTypeImpl#getField
Then all uses of RelDataTypeImpl#getField would get the proper exception rather than NPE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vlsi we had the same thought :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed IllegalArgumentException, considering the first argument to be the illegal one (RexNode expr, having a non-struct type). But I can see IllegalStateException fitting here as well.

@zabetak @danny0405 what are your thoughts regarding the new proposal? Are you against adding the check in RexBuilder#makefieldAccess (hence having an AssertionError from RelDataTypleImpl); or for adding check (in which case, would you see it as an IllegalArgumentException or IllegalStateException)?

Copy link
Contributor

@vlsi vlsi Nov 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, I do not like to make RexBuilder aware of the ways getField can fail.
In other words, it is not RexBuilder's business to tell if non-structs are allowed to have fields or not.

I've had a very similar discussion with @julianhyde in https://issues.apache.org/jira/browse/CALCITE-4217, and the takeaway was "don't try to make sense of struct vs fieldList".

That is why I believe the necessary and sufficient solution here is:

  1. Delegate the proper exception message to org.apache.calcite.rel.type.RelDataType#getField(String fieldName, ...) implementations. In other words, if someone implements RelDataType, then it is their business to throw the appropriate exception.

  2. org.apache.calcite.rex.RexBuilder#makeFieldAccess(org.apache.calcite.rex.RexNode, int) is slightly different since it can't delegate to RelDataType, and it has to call type.getFieldList(). It could be left as is, and the type implementation would throw "type ... has no fields" (which might be enough).
    If you want to add clarification with the problematic RexNode, then you could use

try {
  final List<RelDataTypeField> fields = type.getFieldList();
} catch (RuntimeException e) {
  e.addSuppressed(new Throwable("Unable to get field " + i + " from rexNode " + rexNode");
  throw e
}

The important point is in both cases RexBuilder makes no assumptions on struct vs non-struct.

Copy link
Contributor

@vlsi vlsi Nov 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed, RelCrossType is a case when isStruct == false and getFieldList() returns non-null, so .isStruct() check must not be added to RexBuilder.

Copy link
Contributor Author

@rubenada rubenada Nov 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, indeed, RelCrossType is a case when isStruct == false and getFieldList() returns non-null.

Well spotted, thanks for the feedback. I'll revert the changes in RexBuilder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Committed new proposal taking into account @vlsi 's input

@@ -79,6 +79,10 @@ protected RelDataTypeImpl() {

@Override public RelDataTypeField getField(String fieldName, boolean caseSensitive,
boolean elideRecord) {
if (fieldList == null) {
throw new IllegalStateException("Requested field " + fieldName
+ " in a type with null fieldList, type = " + this);
Copy link
Contributor

@danny0405 danny0405 Nov 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case, user can pass in a null fieldList ? I would rather to move the check to the builder that generate the RelDataTypeImpl instance.

The exception message is also confusing, we should tell user why he does wrong but not "we got null field list".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danny0405 any "simple" (i.e. non-struct) type will have a null fieldList.
The problem is not having a null fieldList (which is valid), the problem is trying to access this list (e.g. via getField, but also getFieldList, getFieldNames, getFieldCount...) when this list is null

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that type with null fieldList is too low-level for RelDataType users, however, I don't know the better message.

The following might be slightly bit better: "The current type " + this + " does not support named field lookup, so field " + fieldName + " can't be located"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exception message reviewed to make it more high level

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d prefer a simpler error message, such as “Type does not support fields”. Especially if it’s emitted by RelDataType, which is a low level class and not supposed to be user-friendly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exceptions should always be meaningful, even the low-level ones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is meaningful, too.

The verbose message says that the type doesn’t support named field lookup. So the reader will wonder whether it supports field lookup by other means. No, it doesn’t have fields. Just say that.

A simple message is sufficient for the user to diagnose their problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please suggest exact pattern to insert into new IllegalStateException(...)

@rubenada
Copy link
Contributor Author

@vlsi @zabetak @danny0405 are we satisfied with the current status of the PR? Any other remarks? Shall I go on and squash commits and then merge?

@julianhyde
Copy link
Contributor

@rubenada I think the message should be simple and terse. Less is more.

@julianhyde
Copy link
Contributor

julianhyde commented Nov 25, 2020 via email

@rubenada
Copy link
Contributor Author

Let's not make a big fuss out of an exception message. I have committed a new proposal. I think it is simple, meaningful, not too low level.
If someone has an objection please let me know, otherwise I plan to merge this (small) patch in the next 24h.

@rubenada rubenada added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Nov 25, 2020
}, "Field should fail since we are trying access a field on expression with non-struct type");
assertThat(ex.getMessage(),
allOf(containsString("Trying to access field"),
containsString("in a type with no fields")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose to split into two containsString here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simply because the actual fieldName value goes in between, and it is not really relevant in order to verify the exception message. I just followed the same pattern as the test just above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field name does not change over time, so it would be better to use regular is("....") matcher to get better feedback in IDE and CI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm, the verification with is(...) would look like:
assertThat(ex.getMessage(), is("Trying to access field abc in a type with no fields: SMALLINT"));
Is that what you both would prefer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, changed.

Copy link
Contributor

@danny0405 danny0405 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, looks fine from my side.

@rubenada rubenada merged commit fc912eb into apache:master Nov 26, 2020
@rubenada
Copy link
Contributor Author

Merged.
Thanks everyone for the feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
LGTM-will-merge-soon Overall PR looks OK. Only minor things left.
Projects
None yet
5 participants