Improve train_test_split #12

cancan101 · 2014-12-29T05:05:02Z

For NeuralNet:

1. Why use your own implementation of train_test_split rather than sklearn.cross_validation.train_test_split? Using sklearn's version would fix #7 (train_test_split does not work for Pandas Series with non Dense Index) because it uses safe_indexing.
2. Allow the user to pass in, through composition (like the batch iterator), a class responsible for performing the train vs. valid split. Right now the only option is to subclass NeuralNet, which is not ideal.
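To make the composition idea concrete, here is a minimal sketch of what an injected splitter could look like. Every name in it (`HoldoutSplit`, `FixedValidSplit`, the `train_split` argument, and the stand-in `Net` class) is hypothetical and chosen for illustration; none of it is the existing NeuralNet API.

```python
class HoldoutSplit:
    """Hypothetical splitter: holds out the last `eval_size` fraction of samples."""

    def __init__(self, eval_size=0.2):
        self.eval_size = eval_size

    def __call__(self, X, y):
        n_valid = int(round(len(X) * self.eval_size))
        cut = len(X) - n_valid
        return X[:cut], X[cut:], y[:cut], y[cut:]


class FixedValidSplit:
    """Hypothetical splitter: always returns a validation set chosen by the caller."""

    def __init__(self, X_valid, y_valid):
        self.X_valid = X_valid
        self.y_valid = y_valid

    def __call__(self, X, y):
        return X, self.X_valid, y, self.y_valid


class Net:
    """Stand-in for NeuralNet (not the real class); it only shows the wiring."""

    def __init__(self, train_split=None):
        # The splitter is injected, much like a batch iterator would be.
        self.train_split = train_split or HoldoutSplit()

    def fit(self, X, y):
        X_train, X_valid, y_train, y_valid = self.train_split(X, y)
        # Real training would happen here; this sketch only records the sizes.
        self.n_train_ = len(X_train)
        self.n_valid_ = len(X_valid)
        return self


X, y = list(range(8)), [0, 1] * 4
holdout_net = Net(train_split=HoldoutSplit(eval_size=0.25)).fit(X, y)
fixed_net = Net(train_split=FixedValidSplit([100, 101], [0, 1])).fit(X, y)
```

With this shape, swapping the split strategy means passing a different object rather than subclassing the net.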

cancan101 · 2014-12-31T16:11:46Z

I am currently using this:

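import numpy as np
from nolearn.lasagne import NeuralNet  # assuming this is the NeuralNet under discussion

class NeuralNetFix(NeuralNet):
    def train_test_split(self, X, y, eval_size):
        # The caller does the split, so no eval_size may be requested.
        assert eval_size is None
        X_train = X
        y_train = y
        # The validation data is expected to be set on the instance beforehand.
        X_valid = self.X_valid
        y_valid = self.y_valid
        if not self.regression and self.use_label_encoder:
            # Encode the validation labels with the net's label encoder.
            y_valid = self.enc_.transform(y_valid).astype(np.int32)
        return X_train, X_valid, y_train, y_valid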

I then handle the train_test_split myself.

UPDATE: I no longer read X_valid and y_valid from self.

dnouri · 2015-01-01T20:44:58Z

@cancan101 To answer your first question: You'll notice that NeuralNet.train_test_split uses StratifiedKFold for classification tasks. That's different to what sklearn's train_test_split would give you.
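The difference can be illustrated with a small pure-Python sketch. It is not the StratifiedKFold implementation and not sklearn's train_test_split; `plain_split` and `stratified_split` are hypothetical helpers, and the plain split is left unshuffled to keep the example deterministic (a shuffled one would preserve class ratios only on average).

```python
from collections import Counter, defaultdict


def plain_split(y, eval_size):
    """Hold out the last `eval_size` fraction of indices, ignoring labels."""
    indices = list(range(len(y)))
    cut = len(y) - int(round(len(y) * eval_size))
    return indices[:cut], indices[cut:]


def stratified_split(y, eval_size):
    """Hold out `eval_size` of each class separately, so class ratios are kept."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    train, valid = [], []
    for label_indices in by_class.values():
        cut = len(label_indices) - int(round(len(label_indices) * eval_size))
        train.extend(label_indices[:cut])
        valid.extend(label_indices[cut:])
    return sorted(train), sorted(valid)


y = [0] * 8 + [1] * 2  # imbalanced labels: 80% class 0, 20% class 1

_, valid_plain = plain_split(y, eval_size=0.5)
_, valid_strat = stratified_split(y, eval_size=0.5)

plain_counts = Counter(y[i] for i in valid_plain)
strat_counts = Counter(y[i] for i in valid_strat)
```

Here the stratified validation set keeps the 4:1 class ratio of the full label set, while the plain one ends up with 3:2.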

But maybe the stratified split isn't all that important, and we can just use an overridable train_test_split (component) by default.

dnouri · 2015-02-19T22:52:17Z

I think we can merge this discussion with #42, and close this one. Also consider #45 when thinking about a better way to override the train_test_split method.

cancan101 mentioned this issue Dec 30, 2014

Added forced_even to BatchIterator ensure equal batch size #11

Merged

cancan101 mentioned this issue Feb 11, 2015

Option to make the nets more deterministic #26

Closed

dnouri closed this as completed Feb 19, 2015
