[SPARK-33636][PYTHON][ML][3.0] Add labelsArray to PySpark StringIndexer #30580
cc @srowen
cc @HyukjinKwon too.
LGTM but I will leave it to @srowen.
Test build #132080 has finished for PR 30580 at commit
Kubernetes integration test starting
Kubernetes integration test status success
Thanks. Then I will merge this to branch-3.0.
### What changes were proposed in this pull request?

This is a followup to add the missing `labelsArray` to PySpark `StringIndexer`.

### Why are the changes needed?

`labelsArray` is for the multi-column case of `StringIndexer`. We should provide this accessor on the PySpark side too.

### Does this PR introduce _any_ user-facing change?

Yes, `labelsArray` was missing in PySpark `StringIndexer` in Spark 3.0.

### How was this patch tested?

Unit test.

Closes #30580 from viirya/SPARK-33636.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
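For readers unfamiliar with the multi-column case, here is a minimal pure-Python sketch of the kind of value `labelsArray` holds: one ordered list of labels per input column. It is not the Spark implementation. The helper `labels_array`, the sample rows, and the column names are hypothetical, and the ordering assumes the default `frequencyDesc` string order with ties broken alphabetically.

```python
from collections import Counter


def labels_array(rows, input_cols):
    """Hypothetical helper (not part of PySpark): build one label list per
    input column, ordered by descending frequency, ties broken alphabetically."""
    result = []
    for col in input_cols:
        # Count how often each distinct string occurs in this column.
        counts = Counter(row[col] for row in rows)
        # Most frequent first; equal counts fall back to alphabetical order.
        ordered = sorted(counts, key=lambda label: (-counts[label], label))
        result.append(ordered)
    return result


# Illustrative data only; the column names are assumptions.
rows = [
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "M"},
    {"color": "red", "size": "M"},
    {"color": "green", "size": "M"},
]

labels = labels_array(rows, ["color", "size"])
print(labels)
```

In PySpark, the corresponding value would be read from a fitted model, for example `StringIndexer(inputCols=["color", "size"], outputCols=["color_idx", "size_idx"]).fit(df).labelsArray`, where `df` and the output column names are again assumptions made for illustration.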