[SPARK-20566][SQL] ColumnVector should support appendFloats for array #17836
Hi, @nongli.
Test build #76394 has finished for PR 17836 at commit
LGTM if tests pass
Thank you for review, @kiszk.
Test build #76404 has finished for PR 17836 at commit
LGTM
Thank you for review, @hvanhovell and @cloud-fan.
@@ -801,6 +801,14 @@ public final int appendFloats(int count, float v) {
    return result;
  }

  public final int appendFloats(int length, float[] src, int offset) {
Do we really use this API, and also appendDoubles? I scanned the code but didn't find anywhere they are used.
Yep. That's the reason why this has been missing until now. We need them in order to use ColumnarBatch independently from Parquet.
Not sure if intentional, but the added test ("Float APIs") does not cover the new function (it mostly uses the putType APIs). In fact, as far as I can see, the "appendTypes" APIs are not used anywhere in the code. The singular versions are not covered by the tests either, but they are used by toBatch (which in turn is used only in testing).
@@ -320,6 +320,80 @@ class ColumnarBatchSuite extends SparkFunSuite {
    }}
  }

  test("Float APIs") {
Regarding the added test, do we really test the added appendFloats?
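One way to make sure an append API is really exercised is to mirror every append into a plain reference list and compare the two afterwards. The sketch below illustrates that pattern only; `AppendCoverageSketch` and `TinyFloatVector` are hypothetical names, the tiny vector is a stand-in and not Spark's ColumnVector, and the test values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical coverage check; the tiny vector below is NOT Spark's ColumnVector.
public class AppendCoverageSketch {
    // Minimal growable float buffer exposing three append variants.
    static final class TinyFloatVector {
        private float[] data = new float[2];
        private int size = 0;

        // Grow the backing array so it can hold at least `required` elements.
        private void reserve(int required) {
            if (required > data.length) {
                data = Arrays.copyOf(data, Math.max(required, data.length * 2));
            }
        }

        int appendFloat(float v) {
            reserve(size + 1);
            data[size] = v;
            return size++;
        }

        int appendFloats(int count, float v) {
            reserve(size + count);
            Arrays.fill(data, size, size + count, v);
            size += count;
            return size - count;
        }

        int appendFloats(int length, float[] src, int offset) {
            reserve(size + length);
            System.arraycopy(src, offset, data, size, length);
            size += length;
            return size - length;
        }

        float getFloat(int i) { return data[i]; }
        int size() { return size; }
    }

    // Mirrors each append into a reference list, then compares element by element.
    public static boolean appendsMatchReference() {
        TinyFloatVector column = new TinyFloatVector();
        List<Float> reference = new ArrayList<>();
        float[] values = {0.1f, 0.2f, 0.3f, 0.4f, 0.5f};

        column.appendFloat(9.0f);            // single-value variant
        reference.add(9.0f);

        column.appendFloats(2, 7.5f);        // repeated-value variant
        reference.add(7.5f);
        reference.add(7.5f);

        column.appendFloats(3, values, 1);   // array variant: values[1], values[2], values[3]
        for (int i = 1; i < 4; i++) {
            reference.add(values[i]);
        }

        if (column.size() != reference.size()) return false;
        for (int i = 0; i < reference.size(); i++) {
            if (column.getFloat(i) != reference.get(i)) return false;
        }
        return true;
    }
}
```

Calling `AppendCoverageSketch.appendsMatchReference()` returns true only if all three append variants wrote what the reference list recorded, so a test built this way fails if any variant is skipped or broken.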
Thank you, @michal-databricks and @viirya.
I updated the test cases to include
Good catch. Is it better to apply these changes to the test cases for other types (e.g. null, boolean, byte, short, int, and long)?
Test build #76411 has finished for PR 17836 at commit
Sure, @kiszk. I'll update the other test suite to improve coverage.
Test build #76438 has finished for PR 17836 at commit
Thanks, LGTM
thanks, merging to master!
Thank you for review and merging again, @kiszk and @cloud-fan.
Hi, @cloud-fan . |
It's cherry-pickable into branch-2.2. |
to be safe, can we create a new PR for 2.2? thanks |
Sure! |
What changes were proposed in this pull request?
This PR aims to add the missing appendFloats API for arrays to the ColumnVector class. For the double type, there is already appendDoubles for arrays here.
How was this patch tested?
Pass the Jenkins with a newly added test case.
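To make the semantics of the array variant concrete, here is a minimal, self-contained sketch of an array append on a growable float buffer. `FloatVectorSketch` and its fields are hypothetical stand-ins, not Spark's ColumnVector; only the signature `appendFloats(int length, float[] src, int offset)` comes from the diff, and returning the index of the first appended element is an assumption based on the `return result;` line in the existing `appendFloats(int count, float v)`.

```java
import java.util.Arrays;

// Hypothetical stand-in for a float column; NOT Spark's ColumnVector.
public class FloatVectorSketch {
    private float[] data = new float[4];
    private int elementsAppended = 0;

    // Grow the backing array so it can hold at least `required` elements.
    private void reserve(int required) {
        if (required > data.length) {
            data = Arrays.copyOf(data, Math.max(required, data.length * 2));
        }
    }

    // Append `count` copies of `v`; returns the index of the first appended element.
    public int appendFloats(int count, float v) {
        reserve(elementsAppended + count);
        int result = elementsAppended;
        Arrays.fill(data, elementsAppended, elementsAppended + count, v);
        elementsAppended += count;
        return result;
    }

    // Append src[offset], ..., src[offset + length - 1]; signature as in the diff.
    // Returning the start index is an assumption of this sketch.
    public int appendFloats(int length, float[] src, int offset) {
        reserve(elementsAppended + length);
        int result = elementsAppended;
        System.arraycopy(src, offset, data, elementsAppended, length);
        elementsAppended += length;
        return result;
    }

    public float getFloat(int rowId) {
        if (rowId < 0 || rowId >= elementsAppended) {
            throw new IndexOutOfBoundsException("rowId out of range: " + rowId);
        }
        return data[rowId];
    }

    public int size() {
        return elementsAppended;
    }
}
```

With this sketch, `appendFloats(3, src, 1)` copies `src[1]` through `src[3]` to the end of the buffer in one call, which is the bulk path a caller would otherwise have to emulate with repeated single-value appends.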