[SPARK-17718] [Update MLib Classification Documentation] by jagadeesanas2 · Pull Request #15293 · apache/spark

jagadeesanas2 · 2016-09-29T07:59:25Z

What changes were proposed in this pull request?

The loss function given here for logistic regression is confusing. It seems to imply that Spark uses only -1 and 1 as class labels, whereas it actually uses 0 and 1. Added detailed documentation to avoid confusion.

SparkQA · 2016-09-29T08:22:31Z

Test build #66091 has finished for PR 15293 at commit 210dc85.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen

I think it's better to implement the second alternative in the JIRA: Better yet, the loss function should be replaced with that for 0, 1 despite mathematical inconvenience, since that is what is actually implemented.

srowen · 2016-09-29T10:35:35Z

No, these expressions aren't correct then for y=0,1

SparkQA · 2016-09-29T10:50:26Z

Test build #66097 has finished for PR 15293 at commit 1fa016f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

jagadeesanas2 · 2016-09-29T10:55:59Z

The hinge loss provides a relatively tight, convex upper bound on the 0–1 indicator function. Specifically, the hinge loss equals the 0–1 indicator function when sgn(f(x)) = y and |y f(x)| ≥ 1.
Source: https://en.wikipedia.org/wiki/Loss_functions_for_classification#Hinge_loss

Otherwise, could we go back to the previous note in the docs?

@srowen any suggestions?

srowen · 2016-09-29T11:32:51Z

Yes that's the definition, but for y = +/- 1. The expression can't be correct for y = 0/1; when y = 0 the loss is always 1 for example.

Well, I'm looking at what the equivalent is like for 0,1 and it's more complicated really in all cases. It wouldn't match the comments in the source code either. Maybe it is actually better to just move the note, yeah.

@dbtsai what do you think?
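To make the point about y = 0 concrete, here is a minimal sketch in plain Python (not Spark code). It assumes the hinge loss has the form max(0, 1 - y * margin), where `margin` stands in for the model's raw score; the helper names `hinge_loss` and `to_signed` are hypothetical and exist only for this illustration.

```python
def hinge_loss(y, margin):
    """Hinge loss max(0, 1 - y * margin); margin stands in for the raw score."""
    return max(0.0, 1.0 - y * margin)


def to_signed(label01):
    """Map a 0/1 label onto the -1/+1 convention the formula expects."""
    return 2 * label01 - 1


margins = [-3.0, -0.5, 0.0, 0.5, 3.0]

# Plugging y = 0 straight into the formula: the margin term vanishes,
# so the loss is 1 no matter what the model predicts.
raw_zero_losses = [hinge_loss(0, m) for m in margins]

# Mapping the 0 label to -1 first gives a loss that depends on the margin.
signed_zero_losses = [hinge_loss(to_signed(0), m) for m in margins]

print(raw_zero_losses)
print(signed_zero_losses)
```

The first list is constant, which is why the ±1 expression cannot simply be reused with 0/1 labels; the second list varies with the margin once the label is converted.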

dbtsai · 2016-09-29T23:11:41Z

+1 on just having the note. For y = 0/1, it would just be more confusing to have a complicated formulation in the doc.
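As a rough illustration of why the note is enough, the sketch below compares the compact ±1 form of the logistic loss, log(1 + exp(-y * margin)), with the longer cross-entropy form written for 0/1 labels. It is plain Python and not Spark's implementation; both function names are hypothetical, and `margin` again stands in for the raw score.

```python
import math


def logistic_loss_signed(y_signed, margin):
    """Logistic loss for labels in {-1, +1}: log(1 + exp(-y * margin))."""
    return math.log1p(math.exp(-y_signed * margin))


def logistic_loss_01(y01, margin):
    """Cross-entropy form of the same loss for labels in {0, 1}."""
    p = 1.0 / (1.0 + math.exp(-margin))  # sigmoid of the margin
    return -(y01 * math.log(p) + (1 - y01) * math.log(1.0 - p))


# The two forms agree once a 0/1 label is mapped to -1/+1 via 2*y - 1.
for margin in (-2.0, -0.3, 0.0, 1.5):
    for y01 in (0, 1):
        signed = logistic_loss_signed(2 * y01 - 1, margin)
        zero_one = logistic_loss_01(y01, margin)
        assert math.isclose(signed, zero_one, rel_tol=1e-9)
```

Under these assumptions the two forms give the same value, so the formula can stay in its shorter ±1 form as long as the docs state clearly that a negative label is stored as 0.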

SparkQA · 2016-09-30T04:44:15Z

Test build #66151 has finished for PR 15293 at commit 9c50522.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2016-09-30T04:46:24Z

docs/mllib-linear-methods.md

Yeah back to the start, sorry. Looks good, though you can also remove the statement below that this duplicates now. Thank you.

As mentioned in the JIRA, I simply added detailed documentation to avoid future confusion.

This duplicates an existing statement below. The idea was to move it up here rather than copy it.

Removed duplicates

SparkQA · 2016-10-02T11:29:38Z

Test build #66241 has finished for PR 15293 at commit 56821a1.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

srowen · 2016-10-02T12:13:24Z

This still duplicates the message. The point is to move it to a more prominent place; that's all. I can open a PR directly if this is just unclear.

jagadeesanas2 · 2016-10-02T17:24:50Z

Thanks @srowen 👍 please go ahead

srowen · 2016-10-03T09:45:03Z

This can be closed; see #15330

srowen requested changes Sep 29, 2016

jagadeesanas2 force-pushed the SPARK-17718 branch from 210dc85 to 1fa016f Compare September 29, 2016 10:27

jagadeesanas2 force-pushed the SPARK-17718 branch from 1fa016f to 9c50522 Compare September 30, 2016 04:22

srowen approved these changes Sep 30, 2016

[SPARK-17718] [Update MLib Classification Documentation]

56821a1

jagadeesanas2 force-pushed the SPARK-17718 branch from 9c50522 to 56821a1 Compare October 2, 2016 11:07

jagadeesanas2 closed this Oct 3, 2016
