
Ignore abstains in Scorer, change LabelModel default tie break policy #1450

Merged 4 commits into master from tie-break on Sep 10, 2019

Conversation

@paroma paroma (Contributor) commented Sep 5, 2019

Description of proposed changes

  • Change Scorer default to ignore abstains in preds
  • Change LabelModel tie break policy default to abstain (instead of random); both new defaults are sketched below
  • Log warning when calling LabelModel score() function
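
A minimal sketch of the two new defaults (illustrative values, not code from this PR; the Scorer calls follow the same API as the tests further down in this thread):

import numpy as np
from snorkel.analysis import Scorer

golds = np.array([1, 0, 1, 1, 0])
preds = np.array([-1, -1, 1, 1, 0])  # -1 denotes abstain

# New Scorer default: abstained preds are ignored, so metrics are
# computed over the 3 non-abstain data points only.
scorer = Scorer(metrics=["accuracy"])
scorer.score(golds, preds)  # -> {'accuracy': 1.0}

# New LabelModel default: tied votes are broken by abstaining (-1)
# rather than by a random choice, i.e. predict()/score() behave as if
# tie_break_policy="abstain" were passed.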

Test plan

  • Add tests in LabelModel for predict() and score() functions related to abstain default
  • Add test for Scorer to check abstains ignored in preds by default

Checklist

  • I have read the CONTRIBUTING document.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@codecov codecov bot commented Sep 5, 2019

Codecov Report

Merging #1450 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master    #1450   +/-   ##
=======================================
  Coverage   97.58%   97.58%           
=======================================
  Files          55       55           
  Lines        2029     2029           
  Branches      334      334           
=======================================
  Hits         1980     1980           
  Misses         22       22           
  Partials       27       27
Impacted Files                           Coverage Δ
snorkel/labeling/model/label_model.py    95.76% <ø> (ø) ⬆️
snorkel/analysis/scorer.py               100% <100%> (ø) ⬆️

@paroma paroma requested review from vincentschen, ajratner, henryre and bhancock8 and removed request for ajratner and henryre September 5, 2019 23:17
@@ -472,6 +472,11 @@ def score(
        >>> label_model.score(L, Y=np.array([1, 1, 1]), metrics=["f1"])
        {'f1': 0.8}
        """
        if tie_break_policy == "abstain":  # pragma: no cover
            logging.warning(
                "Metrics calculated over datapoints with non-abstain labels only"
            )
Member commented:
Nit: we've been using data points (2 words)

# Test abstain=-1 for preds and gold
abstain_preds = np.array([-1, -1, 1, 1, 0])
abstain_probs = np.array([0.5, 0.5, 0.9, 0.7, 0.4])
results = scorer.score(golds, abstain_preds, abstain_probs)
Member commented:
No need to pass in probs here. They're optional, and you're only calculating accuracy in this scorer, which just requires golds and preds.
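
A sketch of the simplified call the reviewer is suggesting (same arrays as the test snippet above):

results = scorer.score(golds, abstain_preds)  # probs omitted: accuracy needs only golds and preds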

@@ -209,6 +209,13 @@ def test_predict_proba(self):
        np.testing.assert_array_almost_equal(probs, true_probs)

    def test_predict(self):
        L = np.array([[-1, 1, 0], [0, -1, 1], [1, 0, -1]])
Member commented:
Can we add a simple comment here noting that this test confirms that 3 LFs counteracting one another result in tie votes, and therefore abstains on all points?
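
A sketch of the kind of comment being requested (wording is illustrative, not from the PR):

# Each data point gets exactly one vote for each class (the third LF
# abstains), so every point is a tie vote; under the new
# tie_break_policy="abstain" default, predict() abstains (-1) on all points.
L = np.array([[-1, 1, 0], [0, -1, 1], [1, 0, -1]])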

@bhancock8 bhancock8 (Member) left a comment:
Changes lgtm! Ship it!

@paroma paroma merged commit 9af1c77 into master Sep 10, 2019
@paroma paroma deleted the tie-break branch September 10, 2019 18:30