
Allow vector targets for column vector predictions in objectives #770

Merged 1 commit into Lasagne:master on Nov 27, 2016

Conversation

f0k (Member) commented Nov 20, 2016

This changes the behaviour of lasagne.objectives.* for a special case: if the predictions are a column vector (i.e., a matrix that is broadcastable in the second dimension) and the targets are a 1D vector, the targets are implicitly turned into a column vector as well. This avoids the current trap (see #715): combining a network with a single output unit and a T.vector() target broadcasts predictions and targets against each other, so every prediction is compared against every target. Closes #755, but we need to discuss whether we want this. I'd argue that using T.vector() for regression or binary classification targets feels so natural that we should just support it. The alternative would be to warn the user to use a T.matrix() or T.col() instead.

Trapped users:
https://groups.google.com/forum/#!topic/lasagne-users/DsSY1cHCC2M
https://groups.google.com/forum/#!topic/lasagne-users/_JAgvyhpf7Q
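The trap can be sketched with NumPy, whose broadcasting rules match Theano's here. The shapes below are illustrative; in Lasagne the (n, 1) column would come from a network with a single output unit and the 1D vector from a T.vector() target.

```python
import numpy as np

# A single-output network yields a column of predictions, shape (n, 1);
# targets arrive as a 1D vector, shape (n,).
predictions = np.array([[0.1], [0.9], [0.4]])   # shape (3, 1)
targets = np.array([0.0, 1.0, 0.0])             # shape (3,)

# Broadcasting pads (3,) to (1, 3), so (3, 1) - (1, 3) compares ALL
# predictions against ALL targets and silently yields an (n, n) loss.
trap = (predictions - targets) ** 2
print(trap.shape)    # (3, 3)

# What this PR does implicitly: turn the targets into a column vector
# too, so the loss is elementwise as intended.
fixed = (predictions - targets[:, np.newaxis]) ** 2
print(fixed.shape)   # (3, 1)
```

The mean of `trap` still looks like a plausible loss value, which is why users hit this silently rather than getting an error.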

benanne (Member) commented Nov 27, 2016

Looks good. I think this is a very common mistake, so I'm in favour. Can you think of any reasons why we wouldn't want this? Are there any use cases where the current behaviour is desired, or does it complicate things when one wants to achieve the current behaviour at all? If there are any, they are probably very rare.

It adds a little bit of complexity, but it's only one additional line and the purpose is clear. So I have no objections. Feel free to merge :)

f0k (Member, Author) commented Nov 27, 2016

> Are there any use cases where the current behaviour is desired, or does it complicate things when one wants to achieve the current behaviour at all?

No, not really. To achieve the current behaviour, you would need to define the target vector as a T.row(), dimshuffle it accordingly, or use e.g. T.nnet.binary_crossentropy() directly. But this is only needed for special use cases, for example when you want to define some kind of contrastive loss (where you'd contrast the main diagonal against the off-diagonal elements). I think closing the trap is more important.

> Feel free to merge :)

Thank you!

@f0k f0k merged commit 9886da2 into Lasagne:master Nov 27, 2016
@f0k f0k deleted the vector-targets branch November 27, 2016 20:05