[SPARK-7333][MLLIB] Add BinaryClassificationEvaluator to PySpark #5885
Test build #31751 has finished for PR 5885 at commit
test this please
Test build #31752 has finished for PR 5885 at commit
test this please
Test build #31767 has finished for PR 5885 at commit
test this please
Test build #31768 has finished for PR 5885 at commit
super(HasRawPredictionCol, self).__init__()
#: param for raw prediction column name
self.rawPredictionCol = Param(self, "rawPredictionCol", "raw prediction column name")
if 'rawPrediction' is not None:
I think this might be a mistake? You are comparing a string to None.
No, this code is generated. Please see `_shared_param_code_gen.py`.
Ah, my mistake!
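For readers wondering how a generator ends up emitting a comparison between a string literal and None, here is a minimal sketch. It is not the contents of `_shared_param_code_gen.py`; the template text, the `render_param` helper, and the `_setDefault` line are assumptions made for illustration. The idea is that the default value is inlined with `repr()`, so a param with a default produces `if 'rawPrediction' is not None:` and a param without one produces `if None is not None:`.

```python
from string import Template

# Hypothetical template, loosely modeled on the generated snippet under review.
# The _setDefault line is an assumption, not taken from the diff shown above.
_TEMPLATE = Template('''\
self.$name = Param(self, "$name", "$doc")
if $default is not None:
    self._setDefault($name=$default)
''')


def render_param(name, doc, default=None):
    """Render init code for one shared param; the default is inlined via repr()."""
    return _TEMPLATE.substitute(name=name, doc=doc, default=repr(default))


# A param with a default value: the literal string ends up in the condition.
with_default = render_param(
    "rawPredictionCol", "raw prediction column name", "rawPrediction")

# A param without a default value: the condition becomes `None is not None`.
without_default = render_param("featuresCol", "features column name")

print(with_default)
print(without_default)
```

The generated condition is constant in both cases, which keeps the generator simple at the cost of output that looks odd when read in isolation.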
@@ -652,7 +652,7 @@ def _python_to_sql_converter(dataType):

     if isinstance(dataType, StructType):
         names, types = zip(*[(f.name, f.dataType) for f in dataType.fields])
-        converters = map(_python_to_sql_converter, types)
+        converters = [_python_to_sql_converter(t) for t in types]
In Python 3, `map` returns a map object instead of a list. So I changed it to `[...]`, which is compatible with both 2 & 3.
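A small sketch of the difference, run under Python 3. `make_converter` is a hypothetical stand-in for `_python_to_sql_converter`, and the assumption that the converters get iterated more than once (for example, once per row) is mine, not something stated in the diff.

```python
def make_converter(type_name):
    """Hypothetical stand-in: returns a function that tags a value with its type."""
    return lambda value: (type_name, value)


types = ["int", "str", "float"]

# Python 3: map() returns a lazy, one-shot iterator.
lazy = map(make_converter, types)

# A list comprehension builds a real list in both Python 2 and Python 3.
eager = [make_converter(t) for t in types]

# The first pass over the map object consumes it.
first_pass = [convert(1) for convert in lazy]

# The second pass finds nothing left to iterate over.
second_pass = [convert(1) for convert in lazy]

# The list can be iterated as many times as needed.
eager_first = [convert(1) for convert in eager]
eager_second = [convert(1) for convert in eager]

print(first_pass, second_pass)
print(eager_first == eager_second)
```

A map object also lacks `len()` and indexing, so code written against the Python 2 list result can fail in more than one way.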
Test build #31840 has finished for PR 5885 at commit
test this please
Test build #31841 has finished for PR 5885 at commit
test this please
    __metaclass__ = ABCMeta

    @abstractmethod
    def evaluate(self, dataset, params={}):
Should "params" be "paramMap" to match Scala?
Python cannot overload methods. So it should be both `paramMaps` and `paramMap`. I used `params` here.
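One possible reading of this comment, sketched under assumptions: where Scala can declare separate overloads for a single param map and for several, Python has one method per name, so a single neutrally named argument would have to accept either shape and branch on its type. The function below is hypothetical and is not the PySpark implementation; the `threshold` key and the toy metric exist only to make the sketch runnable.

```python
def evaluate(dataset, params=None):
    """Hypothetical sketch: one entry point for a single param dict or a list of them."""
    if params is None:
        params = {}
    if isinstance(params, (list, tuple)):
        # Several param maps: evaluate once per map and return one result each.
        return [evaluate(dataset, p) for p in params]
    if isinstance(params, dict):
        # A single param map: compute a toy metric (fraction of scores at or
        # above an assumed "threshold" key).
        threshold = params.get("threshold", 0.5)
        return sum(1 for score in dataset if score >= threshold) / len(dataset)
    raise TypeError("params must be a dict or a list/tuple of dicts")


scores = [0.2, 0.6, 0.9, 0.4]
single = evaluate(scores, {"threshold": 0.5})
multiple = evaluate(scores, [{"threshold": 0.1}, {"threshold": 0.95}])
print(single, multiple)
```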
I realized I didn't get this. What does "it should be both paramMaps and paramMap" mean?
LGTM pending tests
Test build #31887 has finished for PR 5885 at commit
Merged into master and branch-1.4.
This PR adds `BinaryClassificationEvaluator` to Python ML Pipelines API, which is a simple wrapper of the Scala implementation. oefirouz

Author: Xiangrui Meng <meng@databricks.com>

Closes #5885 from mengxr/SPARK-7333 and squashes the following commits:

25d7451 [Xiangrui Meng] fix tests in python 3
babdde7 [Xiangrui Meng] fix doc
cb51e6a [Xiangrui Meng] add BinaryClassificationEvaluator in PySpark

(cherry picked from commit ee374e8)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
This PR adds `BinaryClassificationEvaluator` to Python ML Pipelines API, which is a simple wrapper of the Scala implementation. oefirouz

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#5885 from mengxr/SPARK-7333 and squashes the following commits:

25d7451 [Xiangrui Meng] fix tests in python 3
babdde7 [Xiangrui Meng] fix doc
cb51e6a [Xiangrui Meng] add BinaryClassificationEvaluator in PySpark
This PR adds `BinaryClassificationEvaluator` to the Python ML Pipelines API, which is a simple wrapper of the Scala implementation. @oefirouz