Fix log_loss and add test for multiclass #28
Conversation
    return -np.sum(y_true * np.log(y_prob)) / y_prob.shape[0]

def binary_log_loss(y_true, y_prob):
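For context, here is a self-contained sketch of the two losses being discussed. The function bodies follow the diff context above; the docstrings and the `1e-10` clipping bound (to keep `log()` finite) are assumptions for illustration, not necessarily what the branch does.

```python
import numpy as np


def log_loss(y_true, y_prob):
    """Categorical cross-entropy (negative log likelihood).

    Assumes y_true is a one-hot indicator matrix and each row of
    y_prob sums to 1 (softmax output). Only the probability assigned
    to the true class contributes to the sum.
    """
    y_prob = np.clip(y_prob, 1e-10, 1 - 1e-10)  # assumed clip bound
    return -np.sum(y_true * np.log(y_prob)) / y_prob.shape[0]


def binary_log_loss(y_true, y_prob):
    """Elementwise binary cross-entropy for sigmoid outputs.

    Each output unit is treated as an independent Bernoulli, so both
    the y=1 and y=0 terms contribute for every unit.
    """
    y_prob = np.clip(y_prob, 1e-10, 1 - 1e-10)  # assumed clip bound
    return -np.sum(y_true * np.log(y_prob)
                   + (1 - y_true) * np.log(1 - y_prob)) / y_prob.shape[0]
```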
Don't you think calling it multilabel_log_loss would be more informative?
You mean renaming binary_log_loss to multilabel_log_loss? That makes sense; it's just that this loss function is actually used in both the binary and multilabel cases, since I made a minimal change to BaseMultilayerPerceptron._initialize(). I took the 'binary' here from 'binary cross-entropy'.
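To illustrate the point that one elementwise loss covers both cases, here is a hedged sketch: the same binary cross-entropy applies whether `y_true` has one column (binary, a single sigmoid output) or several (multilabel, one sigmoid per label). The loss body mirrors the diff context; the clipping bound and the sample data are illustrative assumptions.

```python
import numpy as np


def binary_log_loss(y_true, y_prob):
    # Elementwise binary cross-entropy; the 1e-10 clip is an assumed
    # bound to keep log() finite.
    y_prob = np.clip(y_prob, 1e-10, 1 - 1e-10)
    return -np.sum(y_true * np.log(y_prob)
                   + (1 - y_true) * np.log(1 - y_prob)) / y_prob.shape[0]


# Binary case: a single sigmoid output per sample.
y_bin = np.array([[1.0], [0.0]])
p_bin = np.array([[0.9], [0.2]])

# Multilabel case: one sigmoid output per label, labels independent.
y_multi = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
p_multi = np.array([[0.8, 0.1, 0.7], [0.3, 0.6, 0.2]])

print(binary_log_loss(y_bin, p_bin))
print(binary_log_loss(y_multi, p_multi))
```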
I agree, that's a bug.
I'm not sure whether it is better to work on this branch or on scikit-learn#3204, but we can fix it here for now.
I'm also confused about which pull request is going to be merged into master, but I guess we should first merge the two pull requests, 3939 and 3204.
Unfortunately that is not entirely decided. scikit-learn#3204 is more "up to date", but I'm not sure all the additions there are useful.
I see. I was trying to compare the two pull requests through GitHub but couldn't get an informative comparison. I'm just wondering whether it would be better to merge an extensible, minimal working implementation first and consider new features later, since this pull request has been around for nearly two years...
I mostly agree with your assessment. The problem is that we need working default values; if the defaults just don't work, people might be discouraged from using it. I just wrote an email to some people who have worked on this and on deep learning in general to ask their opinion. Apart from benchmarking the learning schedules (these are not really learning rates, imho), the other thing to investigate would be early stopping.
In MLPClassifier, loss is set to 'log_loss' for both the multiclass and multilabel cases. The log_loss implementation in neural_network/base.py is in fact binary cross-entropy, but in the multiclass case, where self.out_activation_ is set to 'softmax', it does not make sense to use a binary cross-entropy loss. I think categorical cross-entropy, otherwise called negative log likelihood, should be used instead.
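The mismatch can be shown numerically. In this sketch (the probabilities are illustrative, not from the PR), the negative log likelihood only charges for the probability of the true class, while elementwise binary cross-entropy additionally penalizes every off-class probability through its (1 - y) * log(1 - p) term, so the two losses, and hence their gradients, disagree on softmax outputs:

```python
import numpy as np

# Softmax probabilities for a 3-class problem (rows sum to 1) and
# one-hot targets; the numbers are purely illustrative.
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])

# Categorical cross-entropy (negative log likelihood): only the
# probability assigned to the true class contributes.
nll = -np.sum(y_true * np.log(y_prob)) / y_prob.shape[0]

# Binary cross-entropy applied elementwise: each off-class
# probability is penalized a second time via log(1 - p).
bce = -np.sum(y_true * np.log(y_prob)
              + (1 - y_true) * np.log(1 - y_prob)) / y_prob.shape[0]

print(nll, bce)  # bce exceeds nll because of the extra off-class terms
```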