[SPARK-31012][ML][PySpark][DOCS] Updating ML API docs for 3.0 changes #27762

Closed
huaxingao wants to merge 2 commits from huaxingao/spark-doc

huaxingao
Contributor

What changes were proposed in this pull request?

Updating ML docs for 3.0 changes

Why are the changes needed?

While auditing the 3.0 ML changes, I found that some docs are missing or out of date. These need to be updated.

Does this PR introduce any user-facing change?

Yes, doc changes

How was this patch tested?

Built the docs manually and checked them.

SparkQA commented Mar 2, 2020

Test build #119186 has finished for PR 27762 at commit d567064.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -28,7 +28,8 @@ import org.apache.spark.sql.functions._
 import org.apache.spark.sql.types.DoubleType

 /**
- * Evaluator for binary classification, which expects two input columns: rawPrediction and label.
+ * Evaluator for binary classification, which expects input columns rawPrediction, label and
Contributor

nit: Evaluator for binary classification, which expects input columns: rawPrediction, label and

Member

Text is OK as is I think, but could quote the column names

Contributor Author

I guess I will leave this as is if that's OK by you two. The other two classes MulticlassClassificationEvaluator and RegressionEvaluator don't quote the column names either. If I change, I will have to change all of them.

@@ -110,7 +110,8 @@ def isLargerBetter(self):
 class BinaryClassificationEvaluator(JavaEvaluator, HasLabelCol, HasRawPredictionCol, HasWeightCol,
                                     JavaMLReadable, JavaMLWritable):
     """
-    Evaluator for binary classification, which expects two input columns: rawPrediction and label.
+    Evaluator for binary classification, which expects input columns rawPrediction, label
Contributor

ditto
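
For readers wondering what an extra weight input means for the metric, here is a minimal pure-Python sketch. It is not Spark's implementation and does not use the PySpark API. The function name `weighted_auc` and the sample data are hypothetical, chosen only to show how per-row weights (the role suggested by `HasWeightCol` in the class signature above) can shift an area-under-ROC value.

```python
def weighted_auc(scores, labels, weights=None):
    """Area under the ROC curve via pairwise comparison.

    Hypothetical helper for illustration only; Spark computes its metric
    differently. Each (positive, negative) pair contributes the product of
    the two row weights; tied scores count as half.
    """
    if weights is None:
        # Assumption: a missing weight column behaves like all-ones weights.
        weights = [1.0] * len(scores)

    pos = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 1]
    neg = [(s, w) for s, y, w in zip(scores, labels, weights) if y == 0]

    total = sum(w for _, w in pos) * sum(w for _, w in neg)
    if total == 0:
        raise ValueError("need positive total weight in both classes")

    concordant = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            if sp > sn:
                concordant += wp * wn
            elif sp == sn:
                concordant += 0.5 * wp * wn
    return concordant / total


# Made-up scores and labels; the second row is a highly ranked negative.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]

unweighted = weighted_auc(scores, labels)
# Up-weighting the highly ranked negative lowers the metric.
weighted = weighted_auc(scores, labels, weights=[1.0, 3.0, 1.0, 1.0])
```

With equal weights the sketch reduces to the usual pairwise AUC, so adding a weight column is a backward-compatible extension as long as the default weight is 1.0.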

-                   "(f1|accuracy|weightedPrecision|weightedRecall|weightedTruePositiveRate|"
-                   "weightedFalsePositiveRate|weightedFMeasure|truePositiveRateByLabel|"
-                   "falsePositiveRateByLabel|precisionByLabel|recallByLabel|fMeasureByLabel|"
+                   "(f1|accuracy|weightedPrecision|weightedRecall|weightedTruePositiveRate| "
Contributor

I guess we do not need to add spaces here

Contributor Author

It seems I have to add a whitespace character so that the line can break.
Before:
[image]

After:
[image]
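
The exchange above turns on how adjacent string literals behave. The sketch below assumes the quoted snippet is Python and uses shortened, hypothetical doc strings rather than the real parameter text. Adjacent literals are joined with nothing in between, so without an embedded space the whole metric list is one unbreakable token. `textwrap` stands in here for whatever wraps the text in the generated docs, which is an assumption about the rendering, not a description of it.

```python
import textwrap

# Hypothetical, shortened doc strings; only the trailing space differs.
doc_without_space = (
    "metric name in evaluation "
    "(f1|accuracy|weightedPrecision|"
    "weightedRecall|weightedFMeasure)"
)
doc_with_space = (
    "metric name in evaluation "
    "(f1|accuracy|weightedPrecision| "
    "weightedRecall|weightedFMeasure)"
)


def wrap(text, width=40):
    """Wrap at whitespace only, never inside a token."""
    return textwrap.wrap(text, width=width, break_long_words=False)


wrapped_without = wrap(doc_without_space)
wrapped_with = wrap(doc_with_space)

for line in wrapped_with:
    print(line)
```

Without the space, the metric list stays on a single line that overflows the chosen width; with it, the wrapper has a legal break point. The cost is a literal space inside the displayed option list.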

huaxingao changed the title from "[SPARK-31012][ML][PySpark][DOCS] Updating ML docs for 3.0 changes" to "[SPARK-31012][ML][PySpark][DOCS] Updating ML API docs for 3.0 changes" on Mar 3, 2020
srowen closed this in 4a64901 on Mar 7, 2020
srowen pushed a commit that referenced this pull request Mar 7, 2020
Closes #27762 from huaxingao/spark-doc.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
(cherry picked from commit 4a64901)
Signed-off-by: Sean Owen <srowen@gmail.com>
Member

srowen commented Mar 7, 2020

Merged to master/3.0

@huaxingao
Contributor Author

Thank you very much! @srowen @zhengruifeng

huaxingao deleted the spark-doc branch on March 7, 2020 17:48
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
Closes apache#27762 from huaxingao/spark-doc.

Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>