At least one user got trapped by setting target_var to a vector for a regression task, and then passing it to lasagne.objectives.squared_error along with the network predictions for a network of a single output unit: https://groups.google.com/forum/#!topic/lasagne-users/DsSY1cHCC2M
Due to #715, Theano knows that the network predictions are a column vector (a matrix with broadcast pattern (False, True)), so it will happily broadcast the network predictions and the targets into a square matrix of shape (batchsize, batchsize) and compute the squared difference between every prediction and every target, rather than between corresponding predictions and targets only. The mean of the loss is still a scalar, so training will appear to work, but the network will not learn anything.
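To make the trap concrete, here is a minimal sketch that uses NumPy as a stand-in for Theano; this assumes that NumPy's broadcasting of a (batchsize, 1) array against a (batchsize,) array mirrors what Theano does once it knows the predictions are a broadcastable column. The variable names are illustrative only and do not come from Lasagne.

```python
import numpy as np

# Predictions from a network with a single output unit: shape (batchsize, 1).
predictions = np.array([[1.0], [2.0], [3.0], [4.0]])
# Targets given as a plain vector: shape (batchsize,).
targets = np.array([1.0, 2.0, 3.0, 4.0])

# The trap: (4, 1) against (4,) broadcasts to (4, 4), so every prediction
# is compared with every target.
pairwise = (predictions - targets) ** 2

# The intended computation: reshape the targets into a column first, so
# only corresponding predictions and targets are compared.
elementwise = (predictions - targets[:, np.newaxis]) ** 2

print(pairwise.shape, pairwise.mean())
print(elementwise.shape, elementwise.mean())
```

In this sketch the predictions match the targets exactly, yet the mean of the broadcast version is 2.5 rather than 0, which illustrates why the scalar loss looks plausible while carrying the wrong training signal.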
We should catch this case and warn the user to use a column vector for the squared error targets.
We should catch this case and warn the user to use a column vector for the squared error targets.
Maybe it's even better to just support this case, and turn a 1D target vector into a column vector if needed. This closes the trap and also avoids the problem that the log sigmoid stabilization is not applied for binary classification (if we encourage users to use a target vector for single outputs). PR in #770.
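The idea of turning a 1D target vector into a column when needed could look roughly like the following. This is only a sketch in NumPy under assumptions, not the code from the PR: the helper name `as_column_if_needed` is hypothetical, and the local `squared_error` is a stand-in for the objective rather than `lasagne.objectives.squared_error` itself.

```python
import numpy as np

def as_column_if_needed(predictions, targets):
    """Hypothetical helper: reshape 1D targets to a column if the
    predictions are a single-column matrix; otherwise leave them alone."""
    if predictions.ndim == 2 and predictions.shape[1] == 1 and targets.ndim == 1:
        return targets[:, np.newaxis]
    return targets

def squared_error(predictions, targets):
    # Stand-in objective: align the targets first, then compare elementwise.
    targets = as_column_if_needed(predictions, targets)
    return (predictions - targets) ** 2

predictions = np.array([[0.5], [1.5], [2.5]])  # shape (3, 1)
targets = np.array([1.0, 1.0, 1.0])            # shape (3,)

loss = squared_error(predictions, targets)
print(loss.shape)   # one loss value per example, not a square matrix
print(loss.ravel())
```

Targets that already have a matching shape pass through unchanged, so existing code that supplies column targets would behave as before under this approach.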