Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20566][SQL] ColumnVector should support appendFloats for array #17836

Closed
wants to merge 4 commits into from
Closed

[SPARK-20566][SQL] ColumnVector should support appendFloats for array #17836

wants to merge 4 commits into from

Conversation

dongjoon-hyun
Copy link
Member

@dongjoon-hyun dongjoon-hyun commented May 2, 2017

What changes were proposed in this pull request?

This PR aims to add a missing appendFloats API for array into ColumnVector class. For double type, there is appendDoubles for array here.

How was this patch tested?

Pass the Jenkins with a newly added test case.

@dongjoon-hyun
Copy link
Member Author

Hi, @nongli .
Could you review this PR?

@SparkQA
Copy link

SparkQA commented May 2, 2017

Test build #76394 has finished for PR 17836 at commit a03f927.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-20566] ColumnVector should support appendFloats for array [SPARK-20566][SQL] ColumnVector should support appendFloats for array May 3, 2017
@kiszk
Copy link
Member

kiszk commented May 3, 2017

LGTM if tests pass
cc: @sameeragarwal @cloud-fan

@dongjoon-hyun
Copy link
Member Author

Thank you for review, @kiszk

@SparkQA
Copy link

SparkQA commented May 3, 2017

Test build #76404 has finished for PR 17836 at commit d979d0f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@hvanhovell
Copy link
Contributor

cc @michal-databricks

@cloud-fan
Copy link
Contributor

LGTM

@dongjoon-hyun
Copy link
Member Author

Thank you for review, @hvanhovell and @cloud-fan .

@@ -801,6 +801,14 @@ public final int appendFloats(int count, float v) {
return result;
}

public final int appendFloats(int length, float[] src, int offset) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really use this API and also appendDoubles? I scan the codes but didn't find anywhere they are used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. That's the reason why this is missing until now. We need to use them in order to ColumnarBatch independently from Parquet.

@michal-databricks
Copy link
Contributor

michal-databricks commented May 3, 2017

Not sure if intentional, but the added test ("Float APIs") does not cover the new function (it uses putType API mostly). Actually as far as I can see the "appendTypes" APIs are not used at all in the code. The singular versions are also not covered by the tests but are used by toBatch (which in turn is used only in testing).

@@ -320,6 +320,80 @@ class ColumnarBatchSuite extends SparkFunSuite {
}}
}

test("Float APIs") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the added test, do we really test the added appendFloats?

@dongjoon-hyun
Copy link
Member Author

Thank you, @michal-databricks and @viirya .
It was my mistake. I thought I did all. :(
I'll add that correctly, too.

@dongjoon-hyun
Copy link
Member Author

I updated the test cases to include appendFloat/appendFloats/appendDouble/appendDoubles.

@kiszk
Copy link
Member

kiszk commented May 3, 2017

Good catch. Is is better to apply these changes to test cases for other types (e.g. null, boolean, byte, short, int, and long)?

@SparkQA
Copy link

SparkQA commented May 3, 2017

Test build #76411 has finished for PR 17836 at commit 6cec5da.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Sure, @kiszk . I'll update the other test suite to improve coverage.

@SparkQA
Copy link

SparkQA commented May 4, 2017

Test build #76438 has finished for PR 17836 at commit 2b58c9f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Member

kiszk commented May 4, 2017

Thanks, LGTM

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@asfgit asfgit closed this in bfc8c79 May 4, 2017
@dongjoon-hyun
Copy link
Member Author

Thank you for review and merging again, @kiszk and @cloud-fan .

@dongjoon-hyun dongjoon-hyun deleted the SPARK-20566 branch May 4, 2017 15:20
@dongjoon-hyun
Copy link
Member Author

Hi, @cloud-fan .
Apache Spark 2.2 seems to be passed now.
Do you think it is possible to backport this into branch-2.2 for 2.2.1?

@dongjoon-hyun
Copy link
Member Author

It's cherry-pickable into branch-2.2.

@cloud-fan
Copy link
Contributor

to be safe, can we create a new PR for 2.2? thanks

@dongjoon-hyun
Copy link
Member Author

Sure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants