[SPARK-5891][ML] Add Binarizer ML Transformer #5699
Test build #30952 has finished for PR 5699 at commit
The failure is caused by an unrelated test in streaming.
please retest.
test this please
*/
@AlphaComponent
final class Binarizer extends Transformer
  with HasInputCol with HasOutputCol with HasThreshold {
There is a problem with `HasThreshold`, because its doc says "threshold used in binary classification". Maybe we should implement a `threshold` param in Binarizer itself and document it correctly. Also, we need to document what the output is if the input equals the threshold.
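To make the boundary case concrete, here is a minimal standalone sketch of one possible convention: values strictly greater than the threshold map to 1.0, and values equal to or below it map to 0.0. It is not the code from this PR and does not use Spark; the object `BinarizerSketch`, its method names, and the default threshold of 0.0 are assumptions made for illustration only.

```scala
// Hypothetical standalone sketch; not the PR's implementation and not a Spark API.
object BinarizerSketch {

  /** Returns 1.0 if `value` is strictly greater than `threshold`, otherwise 0.0.
    * Under this convention a value equal to the threshold maps to 0.0.
    * The default threshold of 0.0 is an assumption for this sketch.
    */
  def binarize(value: Double, threshold: Double = 0.0): Double =
    if (value > threshold) 1.0 else 0.0

  /** Applies `binarize` to every element of `values`. */
  def binarizeAll(values: Seq[Double], threshold: Double = 0.0): Seq[Double] =
    values.map(v => binarize(v, threshold))
}
```

With this convention, `BinarizerSketch.binarizeAll(Seq(-0.5, 0.0, 0.2))` gives `Seq(0.0, 0.0, 1.0)`. Whichever convention is chosen, stating it in the param doc removes the ambiguity raised above.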
Test build #30972 has finished for PR 5699 at commit
Test build #31124 has finished for PR 5699 at commit
import org.apache.spark.ml.Transformer
import org.apache.spark.ml.attribute.BinaryAttribute
import org.apache.spark.ml.param._
import org.apache.spark.ml.param.shared._
Only `HasInputCol` and `HasOutputCol` are used, so this import could be more explicit.
@mengxr updated. Thanks for the comments.
Test build #31550 has finished for PR 5699 at commit
retest this please.
Test build #31553 has finished for PR 5699 at commit
override def beforeAll(): Unit = {
  super.beforeAll()
  sqlContext = new SQLContext(sc)
  data = Array(0.1, -0.5, 0.2, -0.3, 0.8, 0.7, -0.1, -0.4)
minor: `data` could be a `val`
LGTM. Merged into master. Thanks!
JIRA: https://issues.apache.org/jira/browse/SPARK-5891
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes apache#5699 from viirya/add_binarizer and squashes the following commits:
1a0b9a4 [Liang-Chi Hsieh] For comments.
bc397f2 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_binarizer
cc4f03c [Liang-Chi Hsieh] Implement threshold param and use merged params map.
7564c63 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into add_binarizer
1682f8c [Liang-Chi Hsieh] Add Binarizer ML Transformer.