Skip to content

Conversation

@cloud-fan
Copy link

No description provided.

Copy link
Owner

@fqaiser94 fqaiser94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @cloud-fan , thanks for doing this.
I'm more than happy to adopt most of these changes.
I see however that this does not pass any of the unit tests. Am I correct in thinking this PR was just your way of sharing your ideas? And that I should fix up the tests/code as necessary after merging it? Please let me know.

s"Intermediate field ${checkedFields.quoted}"
}
TypeCheckResult.TypeCheckFailure(
s"$fieldName must be struct type, got: ${currentStruct.catalogString}")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code does not check if intermediate fields are struct type. As a result, we get java.lang.ClassCastException thrown by line 593 if any intermediate field is not a struct type. Personally, I would prefer an AnalysisException with a relatively descriptive message like I have in my PR. So I wanted to ask, is this change intentional?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, made a typo, I've pushed a new commit to fix it.

case (name, i) => Seq(Literal(name), GetStructField(struct, i, Some(name)))
})
}
If(IsNull(struct), Literal(null, expr.dataType), expr)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if we do something like:

    if (struct.nullable) {
      If(IsNull(struct), Literal(null, expr.dataType), expr)
    } else {
      expr
    }

Right now, with the code in this PR, WithField will always be nullable even if the original struct was not nullable. This change would fix that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a big deal as the optimizer will turn IsNull to false literal if the expression is not nullable. You can see from ScalaReflection that we also add IsNull without checking nullability, to leave optimization work to the optimizer.

lazy val toCreateNamedStruct: Expression = toCreateNamedStruct(structExpr, attributes)
case class WithField(struct: Expression, fieldPath: Seq[String], value: Expression)
extends Unevaluable {
assert(fieldPath.nonEmpty)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be better to do this in checkInputDataTypes function? So we get an AnalysisException with a descriptive message?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do assert here as this should be a bug. Users can't construct an empty fieldPath with public APIs.

@cloud-fan
Copy link
Author

Yea I didn't fix all the tests. I just tried my idea and ran some local examples. Please help to fix the tests after merging it, thanks!

@fqaiser94 fqaiser94 merged commit 1199550 into fqaiser94:SPARK-22231-withField Jun 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants