Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-16278][SPARK-16279][SQL] Implement map_keys/map_values SQL functions #13967

Closed
wants to merge 4 commits into from
Closed

[SPARK-16278][SPARK-16279][SQL] Implement map_keys/map_values SQL functions #13967

wants to merge 4 commits into from

Conversation

dongjoon-hyun
Copy link
Member

What changes were proposed in this pull request?

This PR adds map_keys and map_values SQL functions in order to remove Hive fallback.

How was this patch tested?

Pass the Jenkins tests including new testcases.

@dongjoon-hyun
Copy link
Member Author

cc @rxin and @cloud-fan .

@SparkQA
Copy link

SparkQA commented Jun 29, 2016

Test build #61454 has finished for PR 13967 at commit 2270b39.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class MapKeys(child: Expression)
    • case class MapValues(child: Expression)

override def foldable: Boolean = child.foldable

override def nullSafeEval(map: Any): Any = {
map.asInstanceOf[MapData].valueArray().copy()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we don't call copy here? It looks reasonable to copy, I'm just curious :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually didn't found the corresponding case. Is it safe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar one is GetStructField, it can get an array column from a row without copying it. Maybe it's safe to not copy here too. cc @davies

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for confirming. That's nice.

@dongjoon-hyun
Copy link
Member Author

Thank you for review, @rxin and @cloud-fan .
According to the comments, I removed the redundant copy operations and added function detail description examples.

@SparkQA
Copy link

SparkQA commented Jun 30, 2016

Test build #61536 has finished for PR 13967 at commit 6953b4e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


override def dataType: DataType = ArrayType(child.dataType.asInstanceOf[MapType].keyType)

override def foldable: Boolean = child.foldable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the default of UnaryExpression

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

@dongjoon-hyun
Copy link
Member Author

I added a null map testcase and remove redundant implementation.

For the removal from sql/functions.scala, there is no problem to remove that. But, could you check that again?

@SparkQA
Copy link

SparkQA commented Jun 30, 2016

Test build #61553 has finished for PR 13967 at commit 3881a98.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

LGTM, pending jenkins

@SparkQA
Copy link

SparkQA commented Jul 1, 2016

Test build #61584 has finished for PR 13967 at commit 56d838e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @cloud-fan .
Could you merge this PR? :)

@dongjoon-hyun
Copy link
Member Author

Rebased to the master.

@SparkQA
Copy link

SparkQA commented Jul 2, 2016

Test build #61661 has finished for PR 13967 at commit 8db1e65.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 54b27c1 Jul 3, 2016
@dongjoon-hyun
Copy link
Member Author

Thank you, @cloud-fan and @rxin !

asfgit pushed a commit that referenced this pull request Jul 8, 2016
…ctions

This PR adds `map_keys` and `map_values` SQL functions in order to remove Hive fallback.

Pass the Jenkins tests including new testcases.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #13967 from dongjoon-hyun/SPARK-16278.

(cherry picked from commit 54b27c1)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@dongjoon-hyun dongjoon-hyun deleted the SPARK-16278 branch July 20, 2016 07:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants