This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Can I set instance weight when training? #7375

Open
regzhuce opened this issue Aug 8, 2017 · 14 comments

@regzhuce

regzhuce commented Aug 8, 2017

Is there any way that I can set a weight for every instance when I train the model?
I just cannot find any doc about this.

@jeremiedb
Contributor

The strategy I've used is to build a custom loss function with the MakeLoss operator and feed it the weights. For example:
loss = MakeLoss(weight * mx.symbol.square(label-pred))
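To make the effect concrete, here is a minimal plain-Python sketch (no MXNet; the function name and values are just for illustration) of what that weighted squared loss computes per sample: the weight simply scales each sample's contribution to the loss and to the gradient.

```python
# Per-instance weighted squared loss, sketched in plain Python.
# Illustrates what MakeLoss(weight * square(label - pred)) computes.

def weighted_sq_loss(labels, preds, weights):
    """Per-sample loss w_i * (y_i - p_i)^2 and its gradient wrt p_i."""
    losses = [w * (y - p) ** 2 for y, p, w in zip(labels, preds, weights)]
    grads = [-2.0 * w * (y - p) for y, p, w in zip(labels, preds, weights)]
    return losses, grads

labels  = [1.0, 0.0, 1.0]
preds   = [0.8, 0.3, 0.5]
weights = [1.0, 1.0, 5.0]   # third sample counts 5x in the loss

losses, grads = weighted_sq_loss(labels, preds, weights)
# The heavily weighted sample dominates both the total loss and the gradient.
```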

@regzhuce
Author

regzhuce commented Aug 8, 2017

It's strange that, when predicting, I have to feed a constant weight to the model, and I get the loss value rather than the prediction.

@jeremiedb
Contributor

The default behavior when calling predict on a model built with MakeLoss is to return the output of the last layer, which is the one where the loss function is defined. You can either back out the actual predictions from the loss, knowing the labels and weights, or, more simply, get the output of the layer just before MakeLoss, where the predictions are defined.

@regzhuce
Author

regzhuce commented Aug 8, 2017

That's rather awkward. Hopefully a more graceful approach can be encapsulated in the API. That would be very nice.

@thirdwing thirdwing added the R label Aug 8, 2017
@thirdwing
Contributor

@regzhuce
Author

@thirdwing Thanks
Any proposals for my problem?

@thirdwing
Contributor

thirdwing commented Aug 11, 2017 via email

@regzhuce
Author

Say I have lots of samples, but they are not all equally important. I want to assign every sample an importance, i.e. an instance weight.

@VGalata

VGalata commented Dec 1, 2017

I am also interested in how to weight the samples. I have a binary classification problem and I want to give the samples or classes different weights. Is there no other way than using a custom loss function?

Unfortunately, the tutorial on custom loss functions is not sufficient to see how to use them in a different setup. Is there a way to fully replace mx.symbol.<...>Output without needing additional steps afterwards to get the predictions? I would like to measure model performance on a validation set during training; thus, I need the predictions during training, and I do not know how to get them if I use MakeLoss.

Any help is highly appreciated!

@VGalata

VGalata commented Dec 6, 2017

@regzhuce : Probably this could help you:

I finally figured out how to use class weights for a (binary) classification problem, though I still do not know how to reproduce the behavior of a mx.symbol.<...>Output layer, i.e. returning both the loss gradient and the prediction. However, here is my code for a weighted version of cross-entropy with two classes:

# ... other layers, last layer's name is 'last_layer'
# Fully connected layer with 2 nodes
fc_last <- mx.symbol.FullyConnected(data=last_layer, num_hidden=2, name='lastfullyconnected')
# Label variable
label   <- mx.symbol.Variable(name='label')
# Softmax
softmax <- mx.symbol.softmax(data=fc_last, name='softmax', axis=1)
# Weighted cross-entropy
# label_weight in (0, 1), 1e-6 is added to avoid log(0)
nn_out  <- mx.symbol.MakeLoss(
    -(1 - label_weight) * (1 - label) * mx.symbol.log(mx.symbol.Reshape(mx.symbol.slice_axis(softmax, axis=1, begin=0, end=1), shape=0) + 1e-6) -
        label_weight    *      label  * mx.symbol.log(mx.symbol.Reshape(mx.symbol.slice_axis(softmax, axis=1, begin=1, end=2), shape=0) + 1e-6),
    name='weightedcrossentropy'
)

After training, the same approach can be used to obtain predictions as described in this example for a regression task.
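To sanity-check the formula above numerically, here is the same weighted binary cross-entropy written in plain Python (no MXNet; the function name is just for illustration). With label_weight = 0.5 it reduces to half the usual cross-entropy.

```python
import math

def weighted_binary_ce(p1, label, label_weight, eps=1e-6):
    """Weighted binary cross-entropy matching the symbol above.

    p1 is the softmax probability of class 1; label is 0 or 1.
    label_weight in (0, 1) weights class 1, (1 - label_weight) class 0.
    eps avoids log(0), matching the 1e-6 term in the symbol.
    """
    p0 = 1.0 - p1
    return (-(1.0 - label_weight) * (1.0 - label) * math.log(p0 + eps)
            - label_weight * label * math.log(p1 + eps))

# Increasing label_weight increases the penalty on class-1 errors.
```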

@thirdwing: It would be nice to get a confirmation whether this is a valid example for using class weights on softmax output as, unfortunately, there is no tutorial for this case.

@piyushghai
Contributor

@regzhuce Hope your question was answered by the above comment.

@sandeep-krishnamurthy Can you please close this issue ?

@zeakey
Contributor

zeakey commented Nov 20, 2019

I face the same problem.
I think the situation @regzhuce mentioned can be abstracted as: manually assigning weights to the losses of different samples.

In the mxnet.sym API documentation for SoftmaxOutput (http://beta.mxnet.io/r/api/mx.symbol.SoftmaxOutput.html), I cannot find a proper solution.

I would have to implement this idea myself in the symbolic API.

@bricksdont

Same problem here. I am looking for a drop-in replacement for mx.sym.SoftmaxOutput that somehow allows weighting examples in a batch individually. Something like

mx.sym.WeightedSoftmaxOutput(data=logits,
                             label=labels,
                             weights=weights,
                             ignore_label=ignore_label,
                             use_ignore=True,
                             normalization=normalization,
                             smooth_alpha=smooth_alpha,
                             name=name)

@thirdwing Why did you tag this issue with R?

@piyushghai the example given by VGalata is not exactly what this issue is about: it uses class weights rather than instance weights.

@bricksdont

bricksdont commented Mar 12, 2020

Here is a gist with an actual implementation of a batch-weighted cross-entropy loss that I believe can replace the default SoftmaxOutput, though it will be less efficient, for instance when label smoothing is used:

https://gist.github.com/bricksdont/812b4d6a21ab045da771560ec9af8c11
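For reference, the core idea of a batch-weighted softmax cross-entropy can be sketched in plain Python (stdlib only; the gist itself uses MXNet symbols, and the names here are just illustrative): each example's cross-entropy term is multiplied by its own weight before averaging, and examples matching ignore_label are skipped, loosely mirroring SoftmaxOutput's use_ignore behavior.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # subtract max for stability
    s = sum(exps)
    return [e / s for e in exps]

def batch_weighted_ce(batch_logits, labels, weights, ignore_label=None):
    """Weighted mean of per-instance cross-entropy over a batch.

    Each example's -log p(label) is scaled by its weight; examples whose
    label equals ignore_label contribute nothing to loss or normalization.
    """
    total, norm = 0.0, 0.0
    for logits, y, w in zip(batch_logits, labels, weights):
        if ignore_label is not None and y == ignore_label:
            continue
        p = softmax(logits)[y]
        total += -w * math.log(p)
        norm += w
    return total / norm if norm > 0 else 0.0
```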
