Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-8241][SQL] string function: concat_ws. #7504

Closed
wants to merge 5 commits into from

Conversation

rxin
Copy link
Contributor

@rxin rxin commented Jul 19, 2015

I also changed the semantics of concat w.r.t. null back to the same behavior as Hive.
That is to say, concat now returns null if any input is null.

@SparkQA
Copy link

SparkQA commented Jul 19, 2015

Test build #37749 has finished for PR 7504 at commit 3c344a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedAttribute(nameParts: Seq[String]) extends Attribute with Unevaluable
    • case class UnresolvedFunction(name: String, children: Seq[Expression])
    • case class UnresolvedStar(table: Option[String]) extends Star with Unevaluable
    • case class ResolvedStar(expressions: Seq[NamedExpression]) extends Star with Unevaluable
    • case class UnresolvedAlias(child: Expression)
    • case class Cast(child: Expression, dataType: DataType)
    • trait Unevaluable
    • case class SortOrder(child: Expression, direction: SortDirection)
    • trait AggregateExpression extends Expression with Unevaluable
    • case class Abs(child: Expression)
    • trait CodegenFallback
    • case class CreateArray(children: Seq[Expression]) extends Expression with CodegenFallback
    • case class CreateStruct(children: Seq[Expression]) extends Expression with CodegenFallback
    • case class CreateNamedStruct(children: Seq[Expression]) extends Expression with CodegenFallback
    • case class CurrentDate() extends LeafExpression with CodegenFallback
    • case class CurrentTimestamp() extends LeafExpression with CodegenFallback
    • case class Explode(child: Expression) extends UnaryExpression with Generator with CodegenFallback
    • case class Literal protected (value: Any, dataType: DataType)
    • case class Hex(child: Expression)
    • case class Unhex(child: Expression)
    • case class PrettyAttribute(name: String) extends Attribute with Unevaluable
    • case class In(value: Expression, list: Seq[Expression]) extends Predicate with CodegenFallback
    • case class NewSet(elementType: DataType) extends LeafExpression with CodegenFallback
    • case class AddItemToSet(item: Expression, set: Expression)
    • case class CombineSets(left: Expression, right: Expression)
    • case class CountSet(child: Expression) extends UnaryExpression with CodegenFallback
    • case class ConcatWs(children: Seq[Expression])
    • case class Upper(child: Expression)
    • case class StringFormat(children: Expression*) extends Expression with CodegenFallback
    • case class StringSpace(child: Expression)
    • case class Ascii(child: Expression)
    • case class Base64(child: Expression)
    • case class UnBase64(child: Expression)

@rxin
Copy link
Contributor Author

rxin commented Jul 19, 2015

Jenkins, retest this please.

@rxin
Copy link
Contributor Author

rxin commented Jul 19, 2015

cc @davies

child.eval(input) match {
case s: UTF8String => Iterator(s)
case arr: Seq[_] => arr.asInstanceOf[Seq[UTF8String]]
case null => Iterator(null.asInstanceOf[UTF8String])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: Can we just ignore the null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How? It won't match s. Also this thing doesn't compile if I do a wildcard match on s, e.g.

case s: _ => ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean we can return an empty Iterator

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that won't work for the 1st value (separator), unless we special case handle that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That make sense, thanks!

@davies
Copy link
Contributor

davies commented Jul 19, 2015

LGTM

@SparkQA
Copy link

SparkQA commented Jul 19, 2015

Test build #37761 has finished for PR 7504 at commit dda1021.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ConcatWs(children: Seq[Expression])

rxin added 5 commits July 19, 2015 13:26
I also changed the semantics of concat w.r.t. null back to the same behavior as Hive.
That is to say, concat now returns null if any input is null.
@@ -79,7 +79,7 @@ abstract class DataType extends AbstractDataType {

override private[sql] def defaultConcreteType: DataType = this

override private[sql] def acceptsType(other: DataType): Boolean = this == other
override private[sql] def acceptsType(other: DataType): Boolean = sameType(other)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @cloud-fan

This is changed to test sameType in order to support ArrayType(StringType)

@SparkQA
Copy link

SparkQA commented Jul 19, 2015

Test build #37781 has finished for PR 7504 at commit 83fd950.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class ConcatWs(children: Seq[Expression])

@asfgit asfgit closed this in 163e3f1 Jul 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants