This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

Class weight support #57

Closed
vinhqdang opened this issue Dec 28, 2015 · 13 comments

@vinhqdang

Hi,

I am using skflow.ops.dnn to classify a two-class dataset (True and False). The percentage of True examples is very small, so I have an imbalanced dataset.

It seems to me that one way to resolve the issue is to use class weights. However, looking at the implementation of skflow.ops.dnn, I do not see how I could use weighted classes with a DNN.

Is it possible to do that with skflow, or is there another technique in skflow to deal with the imbalanced-dataset problem?

Thanks

@ilblackdragon
Contributor

Usually, there are two ways to handle imbalanced data:

  1. Oversample the under-represented class.
     e.g. you can just copy each positive record roughly false_rate/positive_rate times.
  2. Class weights.
     This really just needs to be implemented in the loss function - if you are interested, you need to change skflow.ops.losses_ops.softmax_classifier to take a loss_weight tensor as an argument and do something like this at line 37:
     xent = tf.mul(xent, loss_weight)

and then pass it through models.logistic_regression (see the sketch below).
If you want to try it on your case and send a PR, that would be greatly appreciated :) Otherwise, I'll try adding this later this week.
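
A minimal sketch of what that change could look like (illustrative function and argument names, old TF 0.x-era API with tf.mul and positional softmax_cross_entropy_with_logits; this is not the actual skflow source):

```python
import tensorflow as tf

# Sketch only: softmax_cross_entropy_with_logits returns one loss value per
# example, so loss_weight needs shape [batch_size] (or be a scalar) for the
# element-wise tf.mul below to line up.
def softmax_classifier_with_weights(logits, one_hot_labels, loss_weight=None):
    xent = tf.nn.softmax_cross_entropy_with_logits(logits, one_hot_labels)
    if loss_weight is not None:
        xent = tf.mul(xent, loss_weight)  # the change suggested above
    return tf.reduce_mean(xent, name="loss")
```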

ilblackdragon changed the title from "Weighted classes with DNN?" to "Class weight support" on Dec 29, 2015
ilblackdragon added this to the 0.1 milestone on Dec 29, 2015
@lopuhin
Contributor

lopuhin commented Feb 2, 2016

@vinhqdang
Author

Thanks @lopuhin, so what is the correct way to use it (in the case of an unbalanced dataset with 90% class A and 10% class B)?

@lopuhin
Contributor

lopuhin commented Feb 2, 2016

@vinhqdang sorry, I was wrong - I don't think the existing implementation is correct, because softmax_cross_entropy_with_logits already returns a loss for each example in the mini-batch. For now you can try to over-sample class B in the training data.
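
For a 90/10 split like the one above, a simple way to over-sample is to repeat the class-B rows before calling fit; a rough numpy sketch (function and argument names are made up for illustration, not part of skflow):

```python
import numpy as np

def oversample_minority(X, y, minority_label=1, times=8):
    """Repeat each minority-class row `times` extra times and reshuffle."""
    minority_idx = np.where(y == minority_label)[0]
    # with a 90/10 split, times=8 makes the two classes roughly balanced
    extra_idx = np.repeat(minority_idx, times)
    X_over = np.concatenate([X, X[extra_idx]])
    y_over = np.concatenate([y, y[extra_idx]])
    perm = np.random.permutation(len(y_over))  # keep minibatches mixed
    return X_over[perm], y_over[perm]
```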

@lopuhin
Contributor

lopuhin commented Feb 2, 2016

And I am not sure it is possible to implement in terms of the existing TensorFlow loss functions. It seems one would need to define a loss function similar to softmax_cross_entropy_with_logits from scratch.

@lopuhin
Contributor

lopuhin commented Feb 2, 2016

Ah, no, it should be possible - we just need to multiply xent by weights that depend on the labels in the minibatch. Sorry for the noise :)
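
Something along those lines, as a rough sketch (assumes one-hot labels and a [n_classes] weight vector; names are illustrative and this is not the skflow implementation):

```python
import tensorflow as tf

def class_weighted_xent(logits, one_hot_labels, class_weight):
    # pick each example's weight from the weight of its true class:
    # [batch_size, n_classes] * [n_classes] -> reduce_sum -> [batch_size]
    example_weight = tf.reduce_sum(one_hot_labels * class_weight, 1)
    xent = tf.nn.softmax_cross_entropy_with_logits(logits, one_hot_labels)
    return tf.reduce_mean(tf.mul(xent, example_weight))
```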

@ilblackdragon
Contributor

@lopuhin It's possible, and as you mentioned, https://github.com/tensorflow/skflow/blob/master/skflow/ops/losses_ops.py#L52 partially implements this already (it multiplies the xent for each class by that class's weight). The only missing piece is passing it from the estimator (e.g. TFLinearClassifier(..., class_weights={1: 0.9, 0: 0.1})) to the models and losses.

I haven't thought of a good interface for this yet (right now it would need to be an argument to every model function).

@ilblackdragon
Contributor

What's currently there can be used by creating an explicit TF constant and initializing it with your weights:

def my_model(X, y):
    class_weight = tf.constant([0.9, 0.1])
    return skflow.models.logistic_regression(X, y, class_weight=class_weight)

estimator = skflow.TensorFlowEstimator(model_fn=my_model, n_classes=2, ...other args...)

@lopuhin
Contributor

lopuhin commented Feb 2, 2016

That's what I thought @ilblackdragon, but for me it fails with tensorflow.python.framework.errors.InvalidArgumentError: Incompatible shapes: [32] vs. [2] here https://github.com/tensorflow/skflow/blob/master/skflow/ops/losses_ops.py#L53, because xent is a tensor of shape [32], which is the number of examples in the minibatch. Maybe I'm using it wrong, though.

@ilblackdragon
Contributor

@lopuhin, you are right, softmax_cross_entropy_with_logits returns just [batch_size] values... so at that point it is already too late to apply the class weights. I'll take a look at how to get around that.

ilblackdragon added a commit that referenced this issue Feb 3, 2016
…he math should work as -weight[class]*x[class] + log( sum ( exp weighted x))
@ilblackdragon
Contributor

OK, so instead I moved it up to multiply the logits.
I think the math still works out; I just need to try it on some imbalanced dataset and then add initialization from the constructor. @lopuhin let me know what you think.
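
Roughly, the logit-scaling variant looks like this (a sketch of the idea, not the exact commit; names are illustrative):

```python
import tensorflow as tf

def softmax_classifier_scaled_logits(logits, one_hot_labels, class_weight=None):
    if class_weight is not None:
        # class_weight has shape [n_classes] and broadcasts across the batch,
        # so the weighting happens before the softmax cross-entropy
        logits = tf.mul(logits, class_weight)
    xent = tf.nn.softmax_cross_entropy_with_logits(logits, one_hot_labels)
    return tf.reduce_mean(xent)
```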

@lopuhin
Contributor

lopuhin commented Feb 3, 2016

@ilblackdragon for me a more natural solution would be something like this: lopuhin@5c97849 - here I apply a weight to the xent of each example depending on what its label is. I think this is mathematically different from scaling the logits. But I am still learning, so take this with a grain of salt please :)

@ilblackdragon
Contributor

@lopuhin Changing the weights inside the cross-entropy adjusts the relative importance of all the classes to each other (skewing the distribution), whereas your option only adjusts the weight of one class. But your option may work in practice. I'll double-check with a few people what the best way is.
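
For reference, one rough way to write down the two variants for an example whose true class is c, with z the logits and w the per-class weights (the first line matches the formula in the commit message above):

```latex
L_{\text{scaled logits}} = -\,w_c z_c + \log \sum_j \exp(w_j z_j)
\qquad
L_{\text{weighted xent}} = w_c \Bigl( -\,z_c + \log \sum_j \exp(z_j) \Bigr)
```

The first changes the shape of the softmax distribution itself, while the second only rescales each example's contribution to the loss.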
