[SPARK-17718] Update MLlib Classification Documentation #15293
jagadeesanas2 wants to merge 1 commit into apache:master from
Conversation
Test build #66091 has finished for PR 15293 at commit
srowen left a comment
I think it's better to implement the second alternative in the JIRA: "Better yet, the loss function should be replaced with that for 0, 1 despite the mathematical inconvenience, since that is what is actually implemented."
Force-pushed from 210dc85 to 1fa016f
No, these expressions aren't correct then for y = 0, 1
Test build #66097 has finished for PR 15293 at commit
Otherwise, could we use the previous note from the docs? @srowen, any suggestions?
Yes, that's the definition, but for y = +/-1. The expression can't be correct for y = 0/1; when y = 0 the exponential term is always 1, so the loss is constant, for example. Well, I'm looking at what the equivalent is like for 0/1, and it's really more complicated in all cases. It wouldn't match the comments in the source code either. Maybe it is actually better to just move the note, yeah. @dbtsai, what do you think?
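The breakdown described above can be checked numerically. A minimal sketch in plain Python (not Spark source code; the function names here are made up for illustration): plugging y = 0 into the ±1-label logistic loss collapses to a constant log 2 regardless of the margin wᵀx, while the 0/1 cross-entropy that is actually implemented does depend on the margin.

```python
import math

def pm1_logistic_loss(margin, y):
    # Logistic loss as written in the docs, for labels y in {-1, +1}:
    # L = log(1 + exp(-y * w^T x))
    return math.log(1.0 + math.exp(-y * margin))

def zero_one_log_loss(margin, y):
    # Binary cross-entropy for labels y in {0, 1}, with p = sigmoid(w^T x)
    p = 1.0 / (1.0 + math.exp(-margin))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# With y = 0, the +/-1 expression collapses to log(2) for every margin:
print(pm1_logistic_loss(2.0, 0), pm1_logistic_loss(-5.0, 0))

# The 0/1 cross-entropy responds to the margin as expected:
print(zero_one_log_loss(2.0, 0), zero_one_log_loss(-5.0, 0))
```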
+1 on just having the note. For y = 0/1, it's just more confusing to have a complicated formulation in the doc.
Force-pushed from 1fa016f to 9c50522
Test build #66151 has finished for PR 15293 at commit
docs/mllib-linear-methods.md (Outdated)
Yeah back to the start, sorry. Looks good, though you can also remove the statement below that this duplicates now. Thank you.
As mentioned in the JIRA, I simply added detailed documentation to avoid future confusion.
This duplicates an existing statement below. The idea was to move it up here rather than copy it.
Removed duplicates
Force-pushed from 9c50522 to 56821a1
Test build #66241 has finished for PR 15293 at commit
This still duplicates the message. The point is to move it to a more prominent place; that's all. I can open a PR directly if this is just unclear.
Thanks @srowen 👍 please go ahead
This can be closed; see #15330

What changes were proposed in this pull request?
The documented loss function for logistic regression is confusing: it seems to imply that Spark uses only -1 and 1 as class labels, whereas it actually uses 0 and 1. Added detailed documentation to avoid confusion.
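As a sanity check of the point above, the two formulations do agree once 0/1 labels are mapped to ±1 via y' = 2y − 1. A minimal sketch in plain Python (not Spark source code; function names are illustrative):

```python
import math

def loss_pm1(margin, y_pm1):
    # Loss as written in the docs, for labels in {-1, +1}
    return math.log(1.0 + math.exp(-y_pm1 * margin))

def loss_01(margin, y01):
    # Loss as implemented, for labels in {0, 1}: binary cross-entropy
    p = 1.0 / (1.0 + math.exp(-margin))
    return -(y01 * math.log(p) + (1 - y01) * math.log(1 - p))

# The two losses match after mapping {0, 1} -> {-1, +1} via y' = 2y - 1:
for margin in (-3.0, 0.5, 2.0):
    for y01 in (0, 1):
        y_pm1 = 2 * y01 - 1
        assert abs(loss_pm1(margin, y_pm1) - loss_01(margin, y01)) < 1e-9
```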