
Problem migrating from DBN to lasagne NeuralNet: NaN for each epoch #35

Closed
bmilde opened this issue Feb 5, 2015 · 3 comments

bmilde commented Feb 5, 2015

Lasagne looks fantastic, thanks for integrating it into nolearn! However, I have trouble transitioning from nolearn's DBN to the new lasagne NeuralNet.

Here is what happens:

Done loading and transforming data, traindata size: 83.5334777832 MB
Distribution of classes in train data:
[[ 0.00000000e+00 5.82160000e+04]
[ 1.00000000e+00 5.12730000e+04]] 2
conf: momentum: 0.01 self.learn_rates: 0.01
fitting classifier... nolearn
InputLayer (None, 200) produces 200 outputs
DenseLayer (None, 50) produces 50 outputs
DenseLayer (None, 2) produces 2 outputs

Epoch Train loss Valid loss Train / Val Valid acc Dur
1 nan nan nan 45.05% 0.7s
2 nan nan nan 46.47% 0.6s
3 nan nan nan 45.77% 0.6s
4 nan nan nan 47.06% 0.6s
5 nan nan nan 47.07% 0.7s
6 nan nan nan 47.06% 0.7s
7 nan nan nan 47.08% 0.7s
8 nan nan nan 53.71% 0.7s
9 nan nan nan 47.05% 0.6s
10 nan nan nan 47.05% 0.6s
11 nan nan nan 47.05% 0.6s
12 nan nan nan 47.05% 0.7s
13 nan nan nan 47.05% 0.6s
14 nan nan nan 47.05% 0.6s

I tried fiddling with different learning rates (1, 0.1, 0.01, ... 0.0000001, even 0.0), momentum rates, different optimisers (sgd, nesterov, rmsprop ... every method that lasagne offers), input sizes, numbers of hidden units, and one or two hidden layers, all to no avail.

The mnist example from lasagne runs fine though.

Here is my DBN code, which runs fine on the same data and produces models with >90% accuracy (on an audio gender detection task):

            clf = DBN([X_train.shape[1], self.hid_layer_units, self.hid_layer_units, self._no_classes],
                    dropouts=self.dropouts,
                    learn_rates=self.learn_rates,
                    learn_rates_pretrain=self.learn_rates_pretrain,
                    minibatch_size=self.minibatch_size,
                    learn_rate_decays=self.learn_rate_decays,
                    learn_rate_minimums=self.learn_rate_minimums,
                    epochs_pretrain=self.pretrainepochs,
                    epochs=self.epochs,
                    momentum=self.momentum,
                    real_valued_vis=True,
                    use_re_lu=True,
                    verbose=1)

I've translated that into:

            clf = NeuralNet(
                    layers=[  # three layers: one hidden layer
                            ('input', layers.InputLayer),
                            ('hidden', layers.DenseLayer),
                            #('hidden', layers.DenseLayer),
                            ('output', layers.DenseLayer),
                            ],
                    # layer parameters:
                    input_shape=(None, X_train.shape[1]),
                    hidden_num_units=self.hid_layer_units,
                    output_num_units=self._no_classes,
                    output_nonlinearity=None,

                    eval_size=0.1,

                    # optimization method:
                    update=sgd,
                    update_learning_rate=self.learn_rates,
                    #update_momentum=self.momentum,

                    regression=False,
                    max_epochs=self.epochs,
                    verbose=1,
                    )

Is there anything obvious that I've missed here? How can I debug this?
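
As a first, generic debugging step (plain numpy checks, not specific to nolearn; y_train is a hypothetical name for the label array passed to fit), it can help to rule out problems in the data itself before suspecting the network:

    import numpy as np

    # neither of these should print True
    print np.isnan(X_train).any(), np.isinf(X_train).any()
    # dtype should be float32 and the value range should be reasonable
    print X_train.dtype, X_train.min(), X_train.max()
    # both classes should be present in the labels
    print np.bincount(y_train)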

bmilde (Author) commented Feb 6, 2015

I have put together this simple example fitting nolearn's NeuralNet on MNIST, which also doesn't work on my machine (nan for the losses, and valid accuracy does not improve). Could you try to run it?

from lasagne import layers
from lasagne import init

from lasagne.updates import sgd, nesterov_momentum
from nolearn.lasagne import NeuralNet

import numpy as np

from sklearn.datasets import fetch_mldata
from sklearn.utils import shuffle

DATA_PATH = '~/data'

mnist = fetch_mldata('MNIST original', data_home=DATA_PATH)

train = mnist.data[:60000].astype(np.float32)
train_labels = mnist.target[:60000].astype(np.int32)

train, train_labels = shuffle(train, train_labels, random_state=42)

print 'train.shape:', train.shape, 'train.dtype:', train.dtype, 'train_labels.dtype:', train_labels.dtype

clf = NeuralNet(
    layers=[
        ('input', layers.InputLayer),
        ('hidden', layers.DenseLayer),
        ('output', layers.DenseLayer),
        ],
    input_shape=(None, train.shape[1]),
    hidden_num_units=100,
    output_num_units=10,
    output_nonlinearity=None,

    update=nesterov_momentum,
    #update=sgd,
    update_learning_rate=0.01,
    update_momentum=0.9,

    regression=False,
    max_epochs=1000,
    verbose=1,

    #W=init.Uniform()

    )

clf.fit(train, train_labels)

dnouri (Owner) commented Feb 8, 2015

I think what you're missing is output_nonlinearity=lasagne.nonlinearities.softmax. Sorry for the lack of proper documentation on this. But there's an MNIST example included in the tests if you want to have a look: https://github.com/dnouri/nolearn/blob/master/nolearn/tests/test_lasagne.py#L41-L91
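
For reference, a minimal sketch of that fix applied to the MNIST snippet from the previous comment (only the softmax import and the output_nonlinearity line change; everything else is as posted above):

    from lasagne import layers
    from lasagne.nonlinearities import softmax
    from lasagne.updates import nesterov_momentum
    from nolearn.lasagne import NeuralNet

    clf = NeuralNet(
        layers=[
            ('input', layers.InputLayer),
            ('hidden', layers.DenseLayer),
            ('output', layers.DenseLayer),
            ],
        input_shape=(None, train.shape[1]),
        hidden_num_units=100,
        output_num_units=10,
        # softmax maps the output layer to class probabilities, which the
        # classification objective expects; leaving it at None is what
        # produced the nan losses in this thread
        output_nonlinearity=softmax,

        update=nesterov_momentum,
        update_learning_rate=0.01,
        update_momentum=0.9,

        regression=False,
        max_epochs=1000,
        verbose=1,
        )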

bmilde (Author) commented Feb 9, 2015

Ah, thanks, yes that was it! Apparently I looked into every other parameter besides output_nonlinearity... thanks again!
