Per-example weights #2616

EdeMeijer · 2017-01-04T13:01:05Z

I was discussing the option of a per-example weights feature on gitter with @AlexDBlack, but decided to continue it in an issue.

So, I would like to be able to use per-example weights in order to boost the importance of certain examples. One option is to use oversampling, but that has some drawbacks:

- Only effectively allowing integer weights
- Makes training slower (more artificial examples), especially if you need big weights
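
To make the comparison concrete, here is a minimal sketch (plain Python, not DL4J code) of why weighting and oversampling agree for integer weights, and why only weighting extends to fractional ones. The squared-error loss and the normalization by the sum of weights are assumptions chosen for illustration; a real implementation might normalize by minibatch size instead.

```python
def squared_error(pred, label):
    """Per-example loss; squared error is just an illustrative choice."""
    return (pred - label) ** 2


def weighted_loss(preds, labels, weights):
    """Weighted average of per-example losses (normalized by total weight)."""
    total = sum(w * squared_error(p, y) for p, y, w in zip(preds, labels, weights))
    return total / sum(weights)


def oversampled_loss(preds, labels, counts):
    """Plain mean loss after repeating example i exactly counts[i] times."""
    losses = []
    for p, y, c in zip(preds, labels, counts):
        losses.extend([squared_error(p, y)] * c)
    return sum(losses) / len(losses)


preds = [0.9, 0.2, 0.6]
labels = [1.0, 0.0, 1.0]

# Integer weights: both approaches give the same loss, but oversampling
# has to process 5 examples instead of 3.
integer_weights = [1, 1, 3]
loss_weighted = weighted_loss(preds, labels, integer_weights)
loss_oversampled = oversampled_loss(preds, labels, integer_weights)

# Fractional weights: only the weighted version can express this.
loss_fractional = weighted_loss(preds, labels, [1, 1, 2.5])
```

With a weight of 1000 on one example, oversampling would add 999 copies of it to every epoch, while the weighted loss still touches each example once.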

Just for fun I tried implementing example weights and making them work with computation graphs, which wasn't too hard. A POC can be found here (nd4j) and here (dl4j). However, after a little discussion, Alex figured it might be easier to implement this feature by extending the masking infrastructure, and to introduce an additional feature while we're at it: per-output-timestep masking and weighting. This could be done by adding a dimension to the output mask when the user wants to specify weights for individual time steps.

So, what do you guys think?

AlexDBlack · 2017-01-04T13:09:59Z

Right, my current thinking on the subject is basically that we can implement per-example (and per-time-step) weights by extending the current masking infrastructure.
Currently masks are either 0 or 1 - but in principle could be any arbitrary weighting value.

Per-output (as opposed to per-example) weights could be implemented, along with per-output masks, as follows:
- if mask rank == labels rank: it's per-output masking
- if mask rank == labels rank - 1: it's per-example/per-time-step masking (like what we have now: a 2d mask array for per-time-step masking of a 3d array)
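
The rank rule above can be sketched in plain Python with nested lists. This is only an illustration under stated assumptions, not DL4J code: `rank` and `apply_mask` are hypothetical helpers, and the sketch covers only the feed-forward case where the per-output loss is 2d (`[example][output]`), so a 2d mask is per-output and a 1d mask is per-example.

```python
def rank(array):
    """Number of nested list levels (assumes a regular, non-empty array)."""
    r = 0
    while isinstance(array, list):
        r += 1
        array = array[0]
    return r


def apply_mask(per_output_loss, mask):
    """Scale a 2d loss array [example][output] by a mask of weights.

    mask rank == loss rank      -> one weight per output value
    mask rank == loss rank - 1  -> one weight per example, shared by its outputs
    """
    loss_rank = rank(per_output_loss)
    mask_rank = rank(mask)
    if loss_rank != 2:
        raise ValueError("this sketch only handles 2d loss arrays")
    if mask_rank == loss_rank:
        return [[l * m for l, m in zip(loss_row, mask_row)]
                for loss_row, mask_row in zip(per_output_loss, mask)]
    if mask_rank == loss_rank - 1:
        return [[l * m for l in loss_row]
                for loss_row, m in zip(per_output_loss, mask)]
    raise ValueError("mask rank must equal loss rank or loss rank - 1")


loss = [[1.0, 2.0],
        [3.0, 4.0]]

# Per-example weights: halve example 0, double example 1.
per_example = apply_mask(loss, [0.5, 2.0])

# Per-output 0/1 mask: ignore output 1 of example 0 and output 0 of example 1.
per_output = apply_mask(loss, [[1, 0], [0, 1]])
```

A 0/1 mask is just the special case where every weight is either 0 or 1, which is why the same array could carry both meanings.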

Per output weighting is useful for some cases like missing values, some RL cases, and some architectures like this: http://arxiv.org/abs/1603.00806

One question of course is how other DL platforms handle this - it might be good to review that too.

EdeMeijer · 2017-01-04T14:48:42Z

There's also the recently added feature of specifying per-output weighting in the loss functions. Technically speaking, this could also be encoded in the output masks as they are right now, by using a mask of rank (labels rank - 1) that holds weights.

fac2003 · 2017-01-13T13:36:14Z

We also need per-example weights. They are useful when you have vastly imbalanced datasets (1 positive for every 1000 negatives, for instance). I am looking forward to the feature.
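
As an illustration of the imbalanced case, the sketch below derives per-example weights from class frequencies so that each class contributes the same total weight. Inverse-frequency weighting is one common heuristic, not something proposed in this thread, and `inverse_frequency_weights` is a hypothetical helper name.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """One weight per example: n_total / (n_classes * count_of_its_class)."""
    counts = Counter(labels)
    n_total = len(labels)
    n_classes = len(counts)
    return [n_total / (n_classes * counts[y]) for y in labels]


# 1 positive for every 1000 negatives, as in the example above.
labels = [1] + [0] * 1000
weights = inverse_frequency_weights(labels)

# Total weight carried by each class after reweighting.
positive_total = sum(w for w, y in zip(weights, labels) if y == 1)
negative_total = sum(w for w, y in zip(weights, labels) if y == 0)
```

The resulting weights are not integers (about 500.5 for the positive example and about 0.5 for each negative), which oversampling could only approximate.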

On the other hand, I personally prefer not to mix the roles of methods and data attributes. Mixing masking and weighting is likely to be confusing. From a user's point of view, I would prefer distinct methods (one for output weights, one for masks), each with clear documentation, even if you store the data in the INDArray in the mask fields internally.

You could make it so that clients can only call one of the methods. When reading the code, it will be clear what the intent is from the name of the method without having to think about the rank of the parameter.
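
A rough sketch of that idea, in plain Python rather than the DL4J API: `LabeledBatch`, `set_label_mask`, and `set_label_weights` are hypothetical names invented for illustration. Both setters write to the same internal field, but each validates its own kind of input, and a batch accepts only one of them.

```python
class LabeledBatch:
    """Hypothetical container for illustration; not part of DL4J."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels
        self._label_mask = None  # shared internal storage for both roles
        self._mask_kind = None   # "mask" or "weights", once one is set

    def set_label_mask(self, mask):
        """Binary mask: 1 keeps an example, 0 excludes it from the loss."""
        if self._mask_kind == "weights":
            raise ValueError("weights already set; use only one of the two methods")
        if any(m not in (0, 1) for m in mask):
            raise ValueError("mask values must be 0 or 1")
        self._label_mask = list(mask)
        self._mask_kind = "mask"

    def set_label_weights(self, weights):
        """Arbitrary non-negative per-example weights."""
        if self._mask_kind == "mask":
            raise ValueError("mask already set; use only one of the two methods")
        if any(w < 0 for w in weights):
            raise ValueError("weights must be non-negative")
        self._label_mask = list(weights)
        self._mask_kind = "weights"


batch = LabeledBatch(features=[[0.1], [0.7]], labels=[0, 1])
batch.set_label_weights([0.5, 2.0])  # intent is clear from the method name
```

Calling `batch.set_label_mask([1, 0])` afterwards would raise a `ValueError`, so the reader of the calling code never has to infer intent from the rank or values of the array.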

cacophany53 · 2020-06-10T04:02:14Z

As it stands, is it currently possible to specify per-sample weights by using mask values other than 0 or 1 (i.e. 0.5 for half the weight of a normal sample, 2 for twice the weight)?

EdeMeijer mentioned this issue Jan 9, 2017

Support per-example and per-timestep weights #2659

Closed

eraly added the Enhancement New features and other enhancements label Mar 16, 2017

eraly mentioned this issue Apr 26, 2018

MiniBatchFileDataSetIterator forgets about masks of underlying DataSet. #3445

Closed

AlexDBlack pushed a commit that referenced this issue May 21, 2018

don't try the same name twise (#2616)

5be6ac3

agibsonccc closed this as completed Oct 31, 2022
