LabelModel produces equal probability for labeled data #1422

Closed
jamie0725 opened this issue Aug 21, 2019 · 16 comments · Fixed by #1444
jamie0725 commented Aug 21, 2019

Issue description

I am using Snorkel to create binary text classification training examples with 9 labeling functions. However, I find that some data points, after training the label model, receive a probabilistic label with equal probabilities (i.e. [0.5 0.5]), even though they only receive labels from a single class from the labeling functions (e.g. [-1 0 -1 -1 0 -1 -1 -1 -1], i.e. only class 0 or ABSTAIN). Why is that?

In addition, I find that setting verbose=True when defining the LabelModel does not print any logging information.

Lastly, if producing the label [0.5 0.5] is normal behavior, then these points should also be removed when filtering out unlabeled data points, because they do not contribute to training a classifier, and for a classifier that does not support probabilistic labels (e.g. FastText), taking the argmax will always yield class 0 (which is undesired).
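
To make the argmax point concrete, here is a minimal NumPy sketch (not Snorkel code; the probability values are made up) showing that a tied [0.5 0.5] label always collapses to class 0 under argmax, and one hypothetical way such tied rows could be filtered out along with the unlabeled ones:

import numpy as np

probs = np.array([0.5, 0.5])
print(np.argmax(probs))  # 0 -- ties always break to the first entry, i.e. class 0

# Hypothetical filtering of perfectly tied rows before training a
# hard-label classifier such as FastText:
probs_train = np.array([[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]])
keep = ~np.isclose(probs_train.max(axis=1), 0.5)  # drop rows with a 0.5/0.5 tie
print(probs_train[keep])  # only the two confident rows remain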

Code example/repro steps

Below is my code for defining and fitting the LabelModel:

print('==================================================')
print('Training label model...')
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train=L_train, n_epochs=10000, lr=0.001, log_freq=100, seed=1)
print('Done!')

Below is some of the output I print for one of the training data points:

Check results for data in the training set:  a_text_in_the_training_set_but_I_removed_here
        * Output of L_train for this data point is: [-1  0 -1 -1  0 -1 -1 -1 -1]
        * Output of probs_train for this data point is: [0.5 0.5]
        * Output of probs_train_filtered for this data point is: [0.5 0.5]
        * Output of preds_train_filtered for this data point is: 0

Expected behavior

I expect that if a data point receives labels from only a single class, it should not get equal probabilities for both classes after the label model is trained.

System info

  • How you installed Snorkel (conda, pip, source): pip
  • Build command you used (if compiling from source):
  • OS: Windows
  • Python version: 3.7
  • Snorkel version: 0.9
  • Versions of any other relevant libraries:
paroma commented Aug 21, 2019

Thanks for pointing this out!

  • For logging, the default logging level only shows errors and warnings. The training statements are recorded with logging.info() and therefore don't show up even with verbose=True. To see these messages, you can add the following to the top of the file you're calling LabelModel from:
import logging
logging.basicConfig(level=logging.INFO)
  • I tested with a few L matrices where a datapoint either receives a label from one class or an abstain, and I was unable to reproduce the issue. Is there a way you can share your L matrix with us? One thing you can check (now that logging works!) is whether the loss is changing with the learning rate you're using.

  • In terms of filtering out examples with equal probabilities for all classes, calling label_model.predict(L, tie_break_policy="abstain") will return -1 for datapoints with equal probabilities across the different classes. Options for tie_break_policy are described here.

Example:

import numpy as np
from snorkel.labeling import LabelModel  # import path as of Snorkel 0.9; may differ in later versions

label_model = LabelModel(cardinality=2, verbose=True)
L = np.array([[0, 1], [0, 1]])
label_model.fit(L)

label_model.predict_proba(L)
# array([[0.5, 0.5], [0.5, 0.5]])

label_model.predict(L, tie_break_policy="abstain")
# array([-1, -1])

jamie0725 commented Aug 22, 2019

I took a look at the logs, and the results are very strange:

Training label model...
INFO:root:Computing O...
INFO:root:Estimating \mu...
INFO:root:[0 epochs]: TRAIN:[loss=0.003]
INFO:root:[100 epochs]: TRAIN:[loss=0.000]
INFO:root:[200 epochs]: TRAIN:[loss=0.000]
INFO:root:[300 epochs]: TRAIN:[loss=0.000]
INFO:root:[400 epochs]: TRAIN:[loss=0.000]
INFO:root:[500 epochs]: TRAIN:[loss=0.000]
INFO:root:[600 epochs]: TRAIN:[loss=0.000]
INFO:root:[700 epochs]: TRAIN:[loss=0.000]
INFO:root:[800 epochs]: TRAIN:[loss=0.000]
INFO:root:[900 epochs]: TRAIN:[loss=0.000]
INFO:root:Finished Training
Done!

So the loss drops to 0 from around epoch 100 onward.

vtang13 commented Aug 23, 2019

I have a similar issue with probs_train outputting [0.5, 0.5]. The probability for a 1 label never goes above 0.5.

Prob LFs
[0.5, 0.5] [1, -1, -1, -1, 1, -1, -1]
[0.5, 0.5] [1, -1, -1, -1, -1, -1, -1]
[0.743774, 0.256226] [-1, -1, -1, -1, -1, 0, -1]

Here are the logs for training the LabelModel with roughly 10,000 data points.

INFO:root:Computing O...
INFO:root:Estimating \mu...
INFO:root:[0 epochs]: TRAIN:[loss=0.030]
INFO:root:[100 epochs]: TRAIN:[loss=0.004]
INFO:root:[200 epochs]: TRAIN:[loss=0.000]
INFO:root:[300 epochs]: TRAIN:[loss=0.000]
INFO:root:[400 epochs]: TRAIN:[loss=0.000]
INFO:root:[500 epochs]: TRAIN:[loss=0.000]
INFO:root:[600 epochs]: TRAIN:[loss=0.000]
INFO:root:[700 epochs]: TRAIN:[loss=0.000]
INFO:root:[800 epochs]: TRAIN:[loss=0.000]
INFO:root:[900 epochs]: TRAIN:[loss=0.000]
INFO:root:Finished Training

paroma commented Aug 23, 2019

@vtang13 can you provide an example of the L matrix that will help reproduce this error? Can you also print L.shape to make sure its shape is (n, m), where m is the number of LFs?

I tried creating a matrix with the LFs that you have and did not get the same results:

L = np.array([[1, -1, -1, -1, 1, -1, -1], 
              [1, -1, -1, -1, -1, -1, -1], 
              [-1, -1, -1, -1, -1, 0, -1]])

label_model = LabelModel(cardinality=2)
label_model.fit(L)
label_model.predict_proba(L)

# array([[0.019964  , 0.980036  ],
#       [0.1229839 , 0.8770161 ],
#       [0.98669756, 0.01330244]])

vtang13 commented Aug 23, 2019

Hi @paroma

>> L_train.shape
(9671, 7)

It looks like the issue may occur when there are too many negative examples. It doesn't occur when the dataset is more balanced. Below is a subset of my L_train with more balanced positive/negative examples.

L_train = np.asarray([
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1,  0, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1,  0, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1,  0, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [-1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1,  1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1,  1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1,  1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1, -1, -1, -1],
    [ 1, -1, -1, -1,  1, -1, -1],
])

Training results:

>> label_model.fit(L_train=L_train, n_epochs=1000, lr=0.001, log_freq=100, seed=123)
INFO:root:Computing O...
INFO:root:Estimating \mu...
INFO:root:[0 epochs]: TRAIN:[loss=0.098]
INFO:root:[100 epochs]: TRAIN:[loss=0.014]
INFO:root:[200 epochs]: TRAIN:[loss=0.004]
INFO:root:[300 epochs]: TRAIN:[loss=0.003]
INFO:root:[400 epochs]: TRAIN:[loss=0.002]
INFO:root:[500 epochs]: TRAIN:[loss=0.001]
INFO:root:[600 epochs]: TRAIN:[loss=0.001]
INFO:root:[700 epochs]: TRAIN:[loss=0.001]
INFO:root:[800 epochs]: TRAIN:[loss=0.001]
INFO:root:[900 epochs]: TRAIN:[loss=0.001]
INFO:root:Finished Training
probs LFs
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.95779496 0.04220504] [-1 -1 -1 -1 -1 0 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.95779496 0.04220504] [-1 -1 -1 -1 -1 0 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.95779496 0.04220504] [-1 -1 -1 -1 -1 0 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.5 0.5] [-1 -1 -1 -1 -1 -1 -1]
[0.00618187 0.99381813] [ 1 -1 -1 -1 1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.00618187 0.99381813] [ 1 -1 -1 -1 1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.00618187 0.99381813] [ 1 -1 -1 -1 1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.17381922 0.82618078] [ 1 -1 -1 -1 -1 -1 -1]
[0.00618187 0.99381813] [ 1 -1 -1 -1 1 -1 -1]

paroma commented Aug 23, 2019

This looks like the expected output! If you think it is an issue related to class balance, you can try passing Y_dev or class_balance to the fit() method to see if that helps with the estimates. See examples and usage here.
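
For example, a minimal sketch of what that could look like (the class_balance values below are placeholders, not estimates from your data, and Y_dev stands for a small array of gold dev-set labels if you have one):

# Sketch only: substitute your own estimate of [P(Y=0), P(Y=1)]
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train=L_train, n_epochs=1000, lr=0.001,
                class_balance=[0.9, 0.1], seed=123)

# Alternatively, with a small labeled dev set Y_dev, the balance can be
# estimated from it:
# label_model.fit(L_train=L_train, Y_dev=Y_dev, n_epochs=1000, lr=0.001)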

vtang13 commented Aug 23, 2019

Thanks @paroma. I don't have labeled data yet, but I will try those parameters once I do.

Here is the full L_train to reproduce the issue: https://pastebin.com/02mEznra

paroma commented Aug 23, 2019

Thank you for access to the L matrix! Looking at the full L_train and running LFAnalysis(L_train).lf_summary() suggests a few things:

[Screenshot: output of LFAnalysis(L_train).lf_summary(), showing per-LF polarity, coverage, overlaps, and conflicts]

  • The LFs at indices 1, 2, and 3 always abstain. They provide no information for the LabelModel to learn from.
  • Other than the LF at index 5, the LFs have very low coverage, labeling only 3-15 datapoints out of almost 10,000.
  • The lack of significant overlap, the absence of conflict among the LFs, and the low overall coverage mean the LabelModel cannot learn accurate weights for the LFs. Therefore, for the 17 datapoints that do receive 1-2 positive labels, the LabelModel defaults to the class balance because it has little confidence in the assigned labels.
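
For reference, a minimal sketch of how the summary above was produced (assuming L_train is the (9671, 7) matrix from the pastebin link):

from snorkel.labeling import LFAnalysis

summary = LFAnalysis(L_train).lf_summary()
print(summary)  # per-LF polarity, coverage, overlaps, and conflicts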

vtang13 commented Aug 26, 2019

That's very informative. Thanks @paroma for looking into this.

ajratner commented Sep 1, 2019

Just quickly tacking onto the great answer from @paroma:

One thing we are working on is making sure that the LabelModel always reverts to sensible defaults, even when it's in a setting outside of where our theory tells us it should work (e.g. like the above, for reasons @paroma detailed).

For example, in your setting we probably would want to have a higher prior weight on the LFs over the class balance... we'll iterate here and push some updates soon!

ajratner commented Sep 1, 2019

(Note: Marking "feature request" for our reference, as technically the LabelModel seems to be working fine... but we need to add the feature of more sensible defaults in settings such as these)

xsway commented Sep 2, 2019

Hi, I have observed the same behavior of the LabelModel: one LF (out of 15) that was highly accurate but had low coverage was ignored (resulting in [0.5 0.5] probabilities).

In fact, I wanted to ask whether it would be possible to specify some priors for the LFs by hand in the LabelModel. In my case, I'm 99% sure that this LF gives the correct label even though it has low coverage, and I'd like the LabelModel not to lose this knowledge.

ajratner assigned ajratner and unassigned paroma Sep 4, 2019
ajratner commented Sep 4, 2019

Hi @s2948044 @vtang13 @xsway, thanks first of all for bringing this issue to light in such detail, and thanks @vtang13 for sharing your label matrix!

It turns out that this is actually a bug due to incorrect parameter clamping. I've corrected it (it's a one-line fix, more or less) and confirmed the fix on @vtang13's label matrix and on a new synthetic test that replicates the problem and verifies the solution. The PR is being submitted right now! Thank you all for the great help here!!

Just in case anyone is curious... the LabelModel learns the conditional probabilities of the labeling functions outputting certain labels given the (unobserved or latent) true labels, i.e. P(\lambda = x | Y = y). We also clamp these estimates to some min/max values to guard against pathological errors. But, as one might guess, when P(\lambda = x | Y = y) is really small (for example, in the common sparse-label setting where LFs mostly abstain), clamping at too high a minimum value messes everything up: we end up saving clamped parameters that say these sparse LFs are equally likely to be wrong as right! Luckily, this is fixed by changing the clamping parameter, which is now a kwarg in LabelModel.fit and defaults to a much smaller value.
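
To illustrate the effect (a toy NumPy sketch, not the actual LabelModel internals; the numbers and clamp bounds are made up):

import numpy as np

# For a sparse LF that fires on ~0.1% of points, the true conditional
# probabilities P(lf = label | Y = y) are tiny for both values of y...
mu_true = np.array([0.0015, 0.0005])  # e.g. [P(lf=1 | Y=1), P(lf=1 | Y=0)]

# ...so clamping with too high a floor erases the 3x difference between
# "fires on the right class" and "fires on the wrong class":
print(np.clip(mu_true, 0.01, 0.99))      # [0.01 0.01] -> LF looks uninformative

# A much smaller floor preserves the ratio, so the LF stays informative:
print(np.clip(mu_true, 1e-6, 1 - 1e-6))  # [0.0015 0.0005]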

Note that what @paroma said above about the LabelModel not being able to learn much if the LFs don't label and overlap enough is still true, but it shouldn't result in a nonsensical output like the one you were all observing. This should be fixed once the PR is merged in today :)

Note also that we'll be steadily pushing extensions, improvements, and additions to the LabelModel, so stay tuned! For example, @xsway, your request for specifying per-LF priors is on our list (it was in an old branch somewhere; we just need to port it).

xsway commented Sep 5, 2019

Thanks a lot for addressing this issue so fast! I checked the updated code and it now works as expected: I get > 0.5 probability for the positive class in cases where a low-coverage LF fired (and where I previously got 0.5).

vtang13 commented Sep 5, 2019

Thanks @ajratner for the detailed explanation and speedy resolution!

ajratner commented Sep 5, 2019

Thanks for pointing this out and helping us to improve the new version! :)
