In Which We Add Support for Sample Weights to MLPs #75

Merged
merged 3 commits into from Jan 26, 2018
Conversation

waltaskew
Contributor

@waltaskew waltaskew commented Jan 25, 2018

  • support sample weights in MLPClassifier
  • support sample weights in MLPRegressor
  • add tests ensuring sample weights are handled reasonably

@mheilman

@waltaskew waltaskew changed the title In Which We Add Support for Sample Weights to MLP Classifier WIP In Which We Add Support for Sample Weights to MLP Classifier Jan 25, 2018
@waltaskew waltaskew changed the title WIP In Which We Add Support for Sample Weights to MLP Classifier [DSRD-575] WIP In Which We Add Support for Sample Weights to MLP Classifier Jan 25, 2018
@beckermr
Contributor

A quick comment here. We should normalize the losses by the sum of the sample weights in order to keep them on a decent numerical scale.
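For reference, a minimal numpy sketch of the kind of normalization being suggested (illustrative values only; the actual change would use the equivalent TensorFlow ops):

import numpy as np

# Per-example losses and sample weights for one batch (made-up numbers).
losses = np.array([0.7, 1.2, 0.3, 2.0])
weights = np.array([1.0, 5.0, 1.0, 0.5])

# An unnormalized weighted sum grows with the magnitude of the weights ...
unnormalized = np.sum(weights * losses)

# ... while dividing by the sum of the weights keeps the loss on the same
# numerical scale as an ordinary mean over the batch.
normalized = np.sum(weights * losses) / np.sum(weights)

print(unnormalized, normalized)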

@@ -129,19 +129,23 @@ def _init_model_objective_fn(self, t):
        if self.multilabel_:
            cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(
                logits=t, labels=tf.cast(self.input_targets_, np.float32))
            cross_entropy = tf.multiply(cross_entropy, self._sample_weight)
Contributor

Does this broadcast correctly?

For multilabel, I think the shape of cross_entropy is [n_examples, n_classes] and the weights are [n_examples]. You might need to add a trailing dimension to get this to work?
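A small numpy sketch of the shape problem (TensorFlow follows the same broadcasting rules; the numbers are made up):

import numpy as np

n_examples, n_classes = 4, 3
cross_entropy = np.ones((n_examples, n_classes))  # per-example, per-class losses
weights = np.array([1.0, 2.0, 0.5, 1.0])          # shape (n_examples,)

# Shapes are aligned from the trailing dimension, so (4, 3) * (4,) does not
# broadcast the way we want here; it raises an error instead.
try:
    cross_entropy * weights
except ValueError as exc:
    print(exc)

# Adding a trailing dimension gives shape (4, 1), which broadcasts across classes.
weighted = cross_entropy * weights[:, np.newaxis]
print(weighted.shape)  # (4, 3)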

Contributor Author

It does not broadcast correctly. The fact that the tests passed makes me worry how well we're testing the multi-label case.

Contributor

Agreed. Sam and I worked on those tests a lot, but I'm not going to claim they're bulletproof.

@mheilman mheilman changed the title [DSRD-575] WIP In Which We Add Support for Sample Weights to MLP Classifier WIP In Which We Add Support for Sample Weights to MLP Classifier Jan 25, 2018
- support sample weights in MLPClassifier
- support sample weights in MLPRegressor
- add tests ensuring sample weights are handled reasonably
@waltaskew waltaskew changed the title WIP In Which We Add Support for Sample Weights to MLP Classifier In Which We Add Support for Sample Weights to MLP Classifier Jan 25, 2018
@waltaskew waltaskew changed the title In Which We Add Support for Sample Weights to MLP Classifier In Which We Add Support for Sample Weights to MLPs Jan 25, 2018
Contributor

@mheilman mheilman left a comment

Could you also please add this to the changelog in this PR?

sample_weight : numpy array of shape [n_samples,]
    Per-sample weights. Re-scale the loss per sample.
    Higher weights force the estimator to put more emphasis
    on these samples.
Contributor

It's probably worth noting here or in the class docstring that the sample weights are normalized per batch. I understand the argument for this, and I think that as the batch size approaches the size of the dataset it becomes equivalent to normalizing the sample weights once ahead of time, but it does alter the objective function a bit, so I imagine it could lead to odd behavior in edge cases. In practice it probably doesn't matter either way, so I'm fine with keeping it this way, but it'd be good to document in case a user expects the weights not to be normalized.
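To illustrate the point, a rough numpy sketch (hypothetical batch_weighted_loss helper, made-up data) of per-batch normalization versus normalizing once over the whole dataset:

import numpy as np

def batch_weighted_loss(losses, weights):
    # Weighted loss for one batch, normalized by that batch's weight sum.
    return np.sum(weights * losses) / np.sum(weights)

rng = np.random.default_rng(0)
losses = rng.random(100)
weights = rng.uniform(0.5, 2.0, size=100)

# Normalizing once over the full dataset ...
full = batch_weighted_loss(losses, weights)

# ... versus averaging two per-batch-normalized half-size batches.
halves = np.mean([
    batch_weighted_loss(losses[:50], weights[:50]),
    batch_weighted_loss(losses[50:], weights[50:]),
])

# The two agree exactly only when each batch is the whole dataset; otherwise
# they differ slightly, which is the objective-function change noted above.
print(full, halves)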

tf.where(y_finite, self._zeros, cross_entropy))

# reshape to broadcast multiplication along cross_entropy
sample_weight = tf.reshape(self._sample_weight, (-1, 1))
Contributor

So this and the next line of code turn the original (batch_size,) sample weights array into a (batch_size, n_classes_) one with the same value across a row? If so, could you add a comment to that effect?

Also, we're not actually using broadcasting here, right? That comment might be confusing, unless I'm misunderstanding something here.

assert_clfs_almost_equal(clf1, clf2)

# Fitting twice with half sample-weights should result
# in same classifier as fitting once with full weights.
Contributor

Hmm... I don't think running for twice as many epochs with sample weights scaled by 0.5 will lead to the same model for non-vanilla SGD. Also, shouldn't the models be the same after a single fit of clf2 because of the per-batch sample weight normalization?

Does this fail if partial_fit is only called once below for clf2?

Contributor Author

Yeah, I don't think this test makes sense because we're normalizing by the sum of the weights.
My original thinking was that shrinking the loss with the weights would force the training to take smaller steps, but the normalization means that isn't happening -- I'll remove this test.


clf1 = make_classifier_func().fit(X_train[ind], y_train[ind])
clf1 = make_classifier_func().fit(
X_train, y_train, sample_weight=sample_weight
Contributor

Does this fail if you don't include the sample weights here (after fixing the copy-paste error)?

sample_weight = np.bincount(ind, minlength=X_train.shape[0])

clf1 = make_classifier_func().fit(X_train[ind], y_train[ind])
clf1 = make_classifier_func().fit(
Contributor

Copy-paste error? I think this should be clf2.

Why didn't that cause a test failure?

Contributor Author

It doesn't cause a failure because all of the clf1s and clf2s are learning the same model. It's comparing against the wrong clf2, but the wrong clf2 is still trained on the same data.

The tolerance for deciding whether two classifiers make the same predictions may be too loose, though.

Contributor

@beckermr beckermr left a comment

My main comment here is that the tests seem a bit excessive. It seems sufficient to test that the weights are handled properly in a simple technical sense, as opposed to testing our assumptions about how the weights should work. To this end, only the last test in the assert function in the test suite seems relevant.

An even simpler test which just checks some of the computations against similar versions in numpy might be easier to write and more direct.
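As a rough illustration of that idea, here is a numpy-only reference for the weighted, per-batch-normalized sigmoid cross-entropy that such a test could compare the TensorFlow graph's output against (the function name, the exact reduction, and the data are all made up for this sketch):

import numpy as np

def reference_weighted_loss(logits, targets, weights):
    # Numerically stable sigmoid cross-entropy, matching the formula
    # implemented by tf.nn.sigmoid_cross_entropy_with_logits.
    cross_entropy = (np.maximum(logits, 0) - logits * targets
                     + np.log1p(np.exp(-np.abs(logits))))
    # Weight each example and normalize by the sum of the weights; the exact
    # reduction here is one plausible choice, not necessarily the library's.
    weighted = cross_entropy * weights[:, np.newaxis]
    return np.sum(weighted) / np.sum(weights)

logits = np.array([[2.0, -1.0], [0.5, 0.3], [-2.0, 1.5]])
targets = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
weights = np.array([1.0, 2.0, 0.5])

print(reference_weighted_loss(logits, targets, weights))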

@waltaskew
Contributor Author

waltaskew commented Jan 26, 2018

> My main comment here is that the tests seem a bit excessive. It seems sufficient to test that the weights are handled properly in a simple technical sense, as opposed to testing our assumptions about how the weights should work. To this end, only the last test in the assert function in the test suite seems relevant.

I'd like to keep the test asserting that not passing sample weights gives the same result as passing sample weights of all 1s, because that seems like a reasonable invariant that could catch errors in handling absent sample weights.

I'm definitely removing the two-partial-fits test, which leaves the test asserting that duplicated inputs are equivalent to up-weighted inputs. I don't have a particular allegiance to that test.
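A rough sketch of those two checks, reusing the make_classifier_func and assert_clfs_almost_equal helpers that appear in this PR's test diff (the fixture wiring and data are placeholders):

import numpy as np

def check_sample_weight_invariants(make_classifier_func, assert_clfs_almost_equal,
                                   X_train, y_train):
    # Passing weights of all ones should match passing no weights at all.
    clf1 = make_classifier_func().fit(X_train, y_train)
    clf2 = make_classifier_func().fit(
        X_train, y_train, sample_weight=np.ones(X_train.shape[0]))
    assert_clfs_almost_equal(clf1, clf2)

    # Duplicating rows should match up-weighting them by their duplicate counts.
    ind = np.random.RandomState(0).randint(0, X_train.shape[0], X_train.shape[0])
    sample_weight = np.bincount(ind, minlength=X_train.shape[0])
    clf1 = make_classifier_func().fit(X_train[ind], y_train[ind])
    clf2 = make_classifier_func().fit(
        X_train, y_train, sample_weight=sample_weight)
    assert_clfs_almost_equal(clf1, clf2)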

@waltaskew
Contributor Author

I improved the docs and comments and removed the brittle tests. Anything else?

- better docs
- better comments
- fewer brittle tests
- updated CHANGELOG
@beckermr beckermr self-assigned this Jan 26, 2018
CHANGELOG.md Outdated
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## Unreleased

- Added support for the `sample_weight` keyword argument to the `fit`
Contributor

Could you put this under a "### Added" section, please?

    Per-sample weights. Re-scale the loss per sample.
    Higher weights force the estimator to put more emphasis
    on these samples.
    Sample weights are normalized per-batch. As the batch size
Contributor

I think just "Sample weights are normalized per-batch." is fine here.

    Per-sample weights. Re-scale the loss per sample.
    Higher weights force the estimator to put more emphasis
    on these samples.
    Sample weights are normalized per-batch. As the batch size
Contributor

See comment above.

@beckermr
Contributor

LGTM from me modulo Mike's comments. Thanks a bunch for doing this, Walt! (Hammer) (huge fan)

@waltaskew
Contributor Author

CHANGELOG changed and docstrings shortened. Anything else?

Contributor

@mheilman mheilman left a comment

LGTM

@waltaskew waltaskew merged commit 3c90d1e into civisanalytics:master Jan 26, 2018
@waltaskew waltaskew deleted the support-sample-weights branch January 26, 2018 18:26
@beckermr beckermr mentioned this pull request Jan 26, 2018