[SPARK-31492][ML] flatten the result dataframe of FValueTest #28268

zhengruifeng · 2020-04-20T03:48:38Z

What changes were proposed in this pull request?

Add a new method: def test(dataset: DataFrame, featuresCol: String, labelCol: String, flatten: Boolean): DataFrame

Why are the changes needed?

Similar to the new test method in ChiSquareTest, it will:
1. support DataFrame operations on the returned DataFrame;
2. keep the driver from becoming a bottleneck when numFeatures is large.
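A minimal sketch of how the flattened result might be used, assuming a Spark build that includes this change and exposes FValueTest publicly. The toy data is made up for illustration, and the column name pValue is an assumption by analogy with the flattened output of ChiSquareTest; check printSchema() for the actual names.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.stat.FValueTest
import org.apache.spark.sql.SparkSession

object FValueTestFlattenExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("FValueTestFlattenExample")
      .getOrCreate()
    import spark.implicits._

    // Toy data (hypothetical): a continuous label and three continuous features.
    val df = Seq(
      (1.0, Vectors.dense(0.9, 5.0, 0.1)),
      (2.0, Vectors.dense(2.1, 3.0, 0.7)),
      (3.0, Vectors.dense(2.9, 4.0, 0.2)),
      (4.0, Vectors.dense(4.2, 1.0, 0.9)),
      (5.0, Vectors.dense(5.1, 2.0, 0.4)),
      (6.0, Vectors.dense(5.8, 6.0, 0.6))
    ).toDF("label", "features")

    // flatten = true: the result is expected to hold one row per feature
    // instead of a single row of arrays.
    val flat = FValueTest.test(df, "features", "label", true)

    // Inspect the actual column names before relying on them.
    flat.printSchema()

    // Because each feature is its own row, ordinary DataFrame operations apply.
    // "pValue" is an assumed column name; adjust it to match printSchema().
    flat.filter($"pValue" < 0.05).orderBy($"pValue").show()

    spark.stop()
  }
}
```

With flatten = false the per-feature results would instead have to be pulled out of array-valued columns in a single row, which is what makes the driver a bottleneck for large numFeatures.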

Does this PR introduce any user-facing change?

Yes, it adds a new method.

How was this patch tested?

Existing test suites.


SparkQA · 2020-04-20T05:27:45Z

Test build #121496 has finished for PR 28268 at commit 3caf7d1.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zhengruifeng · 2020-04-21T03:09:50Z

Merged to master

init

3caf7d1

nit nit nit nit nit

zhengruifeng added the ML label Apr 20, 2020

zhengruifeng closed this in 32259c9 Apr 21, 2020

zhengruifeng deleted the flatten_fvalue branch June 18, 2022 11:10
