This repository has been archived by the owner. It is now read-only.

Implement loss function for multi label classification tasks #2724

Merged
merged 4 commits into master from treo_lossmultilabel Mar 13, 2018

Conversation

@treo
Member

treo commented Mar 10, 2018

This implements the loss function from this paper:

Min-Ling Zhang and Zhi-Hua Zhou, "Multilabel Neural Networks with Applications to Functional Genomics and Text Categorization," in IEEE Transactions on Knowledge and Data Engineering, vol. 18, no. 10, pp. 1338-1351, Oct. 2006.

It is useful as a loss function for multi label classification tasks.
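For context, the paper's loss pairs every relevant label of an example with every irrelevant one and penalizes the score gap exponentially. A minimal plain-Java sketch of the per-example formula (names and the plain-array form are illustrative; the actual PR works on ND4J INDArrays):

```java
// Sketch of the per-example BP-MLL loss from Zhang & Zhou (2006):
// loss = 1/(|Y|*|Ybar|) * sum over (k in Y, l in Ybar) of exp(-(c_k - c_l)),
// where Y are the relevant labels, Ybar the irrelevant ones, c the scores.
public class BpMllSketch {
    static double loss(double[] labels, double[] scores) {
        double sum = 0.0;
        int pos = 0, neg = 0;
        for (double v : labels) { if (v == 1.0) pos++; else neg++; }
        for (int k = 0; k < labels.length; k++) {
            if (labels[k] != 1.0) continue;        // k ranges over relevant labels
            for (int l = 0; l < labels.length; l++) {
                if (labels[l] == 1.0) continue;    // l ranges over irrelevant labels
                sum += Math.exp(-(scores[k] - scores[l]));
            }
        }
        return sum / (pos * neg);                  // normalize by the pair count
    }
}
```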

@treo treo requested a review from AlexDBlack Mar 10, 2018

@saudet

Member

saudet commented Mar 11, 2018

@sshepel Did you set some commands to force Jenkins to rebuild the branch without having to push new commits?

@sshepel

Contributor

sshepel commented Mar 11, 2018

> @sshepel Did you set some commands to force Jenkins to rebuild the branch without having to push new commits?

@saudet there is a button in the Jenkins UI to do that. I have already provided @treo with all of the required info; no additional forcing logic was added.

@AlexDBlack
Member

AlexDBlack left a comment

Some issues/questions noted, nothing huge.
Main thing I'd like to see here would be a parallel PR for DL4J that adds this to the gradient checks here: https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j-core/src/test/java/org/deeplearning4j/gradientcheck/LossFunctionGradientCheck.java

if (scoreOutput != null) {
if (mask != null) {
final INDArray perLabel = classificationDifferences.sum(0);
LossUtil.applyMask(perLabel, mask);


@AlexDBlack

AlexDBlack Mar 12, 2018

Member

This line looks incorrect at first glance... seems like you are applying the full (all examples) mask on a single row?


@treo

treo Mar 13, 2018

Member

You've got a point here. I've been moving the masking code around to find a proper place for it, and missed calling getRow here.
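In plain-array terms, the fix under discussion is just about which slice of the mask multiplies the per-label values. A hypothetical sketch (array shapes and names assumed, not taken from the PR):

```java
// Hypothetical illustration of the fix: the mask has one row per example,
// so the per-label vector of example i must be multiplied elementwise by
// mask row i, not by the full (all-examples) mask.
public class MaskRowSketch {
    static double[] applyMaskRow(double[] perLabel, double[][] mask, int example) {
        double[] out = new double[perLabel.length];
        for (int j = 0; j < perLabel.length; j++) {
            out[j] = perLabel[j] * mask[example][j];  // single mask row only
        }
        return out;
    }
}
```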



int examples = positive.size(0);
for (int i = 0; i < examples; i++) {


@AlexDBlack

AlexDBlack Mar 12, 2018

Member

I haven't checked the math - but I assume there's no way to do this without the loop?


@treo

treo Mar 13, 2018

Member

I've failed to find a way that works without the loop. Every method I could come up with that skips the loop also calculates a lot of unnecessary results. The speed hit isn't as bad as it may look: it runs at about 85% to 90% of the speed of BinaryXENT (e.g. 130 batches per second vs. 150 batches per second).

If you have an idea how it could be done any faster, or any more vectorized, I'd love to hear it :)
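To make the trade-off concrete: each example has its own split into relevant and irrelevant labels, so the pairwise matrix differs per example, which is what resists a single vectorized op. A plain-array sketch of the batched loop (illustrative only, not the PR's ND4J code):

```java
// Sketch of why the per-example loop is hard to avoid: the set of
// (relevant, irrelevant) label pairs differs per example, so a fully
// vectorized version would compute many pairs that are then discarded.
public class BatchLoopSketch {
    static double batchLoss(double[][] labels, double[][] scores) {
        double total = 0.0;
        for (int i = 0; i < labels.length; i++) {   // one pass per example
            double sum = 0.0;
            int pos = 0, neg = 0;
            for (double v : labels[i]) { if (v == 1.0) pos++; else neg++; }
            for (int k = 0; k < labels[i].length; k++) {
                if (labels[i][k] != 1.0) continue;
                for (int l = 0; l < labels[i].length; l++) {
                    if (labels[i][l] == 1.0) continue;
                    sum += Math.exp(-(scores[i][k] - scores[i][l]));
                }
            }
            total += sum / (pos * neg);             // per-example normalization
        }
        return total;
    }
}
```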


final INDArray locPositive = positive.getRow(i);
final INDArray locNegative = negative.getRow(i);
final INDArray locNormFactor = normFactor.getScalar(i);


@AlexDBlack

AlexDBlack Mar 12, 2018

Member

Why not getDouble? Scalar in INDArray seems to provide (minor) overhead with no benefit?


@treo

treo Mar 13, 2018

Member

I was trying to keep the off-heap -> heap -> off-heap data movement to a minimum. But now that you've asked, I actually looked at what getScalar does and saw that it just calls getDouble internally... so I'll change that.

}

if (gradientOutput != null) {
gradientOutput.getRow(i).assign(classificationDifferences.sum(0).add(classificationDifferences.sum(1).transposei().negi()));


@AlexDBlack

AlexDBlack Mar 12, 2018

Member

Minor optimization: sum(0).addi
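For the record, the quoted line combines the column sums and the negated row sums of the pairwise difference matrix into one gradient row; the in-place addi just avoids allocating an extra copy. A plain-array sketch of what that expression computes (the matrix is assumed square here purely for brevity; names are illustrative):

```java
// Sketch of the quoted gradient line: for a pairwise difference matrix D,
// gradient[j] = (sum of column j of D) - (sum of row j of D), i.e.
// D.sum(0) plus the negated transpose of D.sum(1).
public class GradRowSketch {
    static double[] gradientRow(double[][] d) {
        int n = d.length;
        double[] grad = new double[n];
        for (int k = 0; k < n; k++) {
            for (int l = 0; l < n; l++) {
                grad[l] += d[k][l];   // column sums: sum over axis 0
                grad[k] -= d[k][l];   // minus row sums: sum over axis 1
            }
        }
        return grad;
    }
}
```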


@Override
public List<SDVariable> doDiff(List<SDVariable> f1) {
return null;


@AlexDBlack

AlexDBlack Mar 12, 2018

Member

I'd rather an UnsupportedOperationException


@treo

treo Mar 13, 2018

Member

I've copied that over from the other loss functions. I'm not quite sure what exactly is expected here, so I've left it the way it is in the others.
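For reference, the pattern the reviewer is suggesting, with the ND4J SDVariable type replaced by Object so the sketch stands alone:

```java
import java.util.List;

// Self-contained illustration of the reviewer's suggestion: fail loudly
// with UnsupportedOperationException rather than silently returning null.
// SDVariable is replaced by Object here so the sketch has no ND4J dependency.
public class DoDiffSketch {
    static List<Object> doDiff(List<Object> f1) {
        throw new UnsupportedOperationException(
                "Differentiation is not implemented for this loss function");
    }
}
```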

@treo treo force-pushed the treo_lossmultilabel branch from 140aec0 to 3aabade Mar 13, 2018

treo added a commit to deeplearning4j/deeplearning4j that referenced this pull request Mar 13, 2018

@AlexDBlack
Member

AlexDBlack left a comment

LGTM 👍

@AlexDBlack AlexDBlack merged commit a1540de into master Mar 13, 2018

1 check was pending

continuous-integration/jenkins/pr-merge This commit is being built

@AlexDBlack AlexDBlack deleted the treo_lossmultilabel branch Mar 13, 2018
