Catch class balance errors and test L matrix edge cases #1449

Merged
merged 6 commits into from
Sep 6, 2019

Conversation

paroma
Contributor

@paroma paroma commented Sep 5, 2019

Description of proposed changes

  • Add tests for checking LabelModel behavior with limited conflict and overlap in L matrix
  • Catch errors in class balance format in LabelModel

Related issue(s)

None

Test plan

  • Add tests for class balance changes
  • Add tests for L matrices with limited overlap or conflict
  • Fix existing test to change 0 to -1 for abstains
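The abstain change in the last item can be illustrated with a small sketch. It assumes the older convention was 0 for abstain with classes numbered 1..k, and the newer one is -1 for abstain with classes numbered 0..k-1; `remap_abstains` is a hypothetical helper written for this illustration, not part of the PR.

```python
import numpy as np

ABSTAIN = -1  # abstain marker assumed by this sketch


def remap_abstains(L_old):
    """Shift a label matrix from (0 = abstain, classes 1..k)
    to (-1 = abstain, classes 0..k-1). Hypothetical helper."""
    L_old = np.asarray(L_old)
    if L_old.min() < 0:
        raise ValueError("Expected non-negative labels with 0 meaning abstain.")
    # Subtracting 1 maps 0 -> -1 (abstain) and class c -> c - 1.
    return L_old - 1


L_old = np.array([[1, 0, 2], [0, 0, 1]])
L_new = remap_abstains(L_old)
print(L_new)
```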

Checklist

  • I have read the CONTRIBUTING document.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@codecov

codecov bot commented Sep 5, 2019

Codecov Report

Merging #1449 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1449      +/-   ##
==========================================
+ Coverage   97.55%   97.55%   +<.01%     
==========================================
  Files          55       55              
  Lines        2002     2008       +6     
  Branches      328      331       +3     
==========================================
+ Hits         1953     1959       +6     
  Misses         22       22              
  Partials       27       27

Impacted Files                           Coverage Δ
snorkel/labeling/model/label_model.py    95.35% <100%> (+0.1%) ⬆️

        with self.assertRaisesRegex(ValueError, "L_train should have at least 3"):
            label_model.fit(L, n_epochs=1)

    def test_mv_default(self):
Member

what change is this testing? do we actually add mv as a default anywhere here?

Contributor Author

it checks that for degenerate L matrices (i.e. low overlap and no conflicts), the label model internally defaults to the predictions that majority vote (MV) would assign. it doesn't explicitly call majority vote anywhere
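A minimal sketch of what "degenerate" means here, assuming -1 marks an abstain and classes are 0..k-1. The helpers `majority_vote`, `row_overlaps`, and `row_conflicts` are hypothetical and written only for this illustration; they are not the code under test in this PR.

```python
import numpy as np

ABSTAIN = -1  # abstain marker assumed by this sketch


def majority_vote(L, cardinality):
    """Per-row majority vote; abstain on rows with no votes or a tie."""
    L = np.asarray(L)
    # counts[i, k] = number of labeling functions voting class k on row i
    counts = np.stack([(L == k).sum(axis=1) for k in range(cardinality)], axis=1)
    top = counts.max(axis=1)
    preds = counts.argmax(axis=1)
    tied = (counts == top[:, None]).sum(axis=1) > 1
    preds[(top == 0) | tied] = ABSTAIN
    return preds


def row_overlaps(L):
    """True for rows where more than one labeling function voted."""
    return (np.asarray(L) != ABSTAIN).sum(axis=1) > 1


def row_conflicts(L):
    """True for rows where the non-abstain votes disagree."""
    L = np.asarray(L)
    return np.array([len(set(row[row != ABSTAIN].tolist())) > 1 for row in L])


# Low overlap (only the last row has two votes) and no conflicts.
L = np.array(
    [
        [0, -1, -1],
        [-1, 1, -1],
        [-1, -1, 0],
        [1, 1, -1],
    ]
)
mv_preds = majority_vote(L, cardinality=2)
print(mv_preds, row_overlaps(L), row_conflicts(L))
```

On a matrix like this there is essentially nothing for a label model to learn from agreements and disagreements, so comparing its predictions against `mv_preds` is a reasonable sanity check.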

        label_model._set_class_balance(class_balance=class_balance, Y_dev=Y_dev)

        Y_dev_one_class = np.array([0, 0, 0])
        with self.assertRaisesRegex(ValueError, "Y_dev has"):
Member

nit: change substring to "Does not match LabelModel cardinality"
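For context on this nit: `assertRaisesRegex` uses a regular expression search against the string form of the exception, so any distinctive substring of the message will match. The sketch below shows the suggested substring matching the message from this diff; `check_cardinality` and the test class are hypothetical stand-ins, not the PR's code.

```python
import unittest


def check_cardinality(n_classes, cardinality):
    """Hypothetical stand-in that raises the message under discussion."""
    if n_classes != cardinality:
        raise ValueError(
            f"Y_dev has {n_classes} class(es). "
            f"Does not match LabelModel cardinality {cardinality}."
        )


class CardinalityMessageTest(unittest.TestCase):
    def test_message_substring(self):
        # The pattern is searched for anywhere in str(exception).
        with self.assertRaisesRegex(
            ValueError, "Does not match LabelModel cardinality"
        ):
            check_cardinality(n_classes=1, cardinality=2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CardinalityMessageTest)
result = unittest.TextTestRunner().run(suite)
```

Matching on the part of the message that names the actual problem keeps the test meaningful even if the prefix of the message is reworded later.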

            )
        if len(self.p) != self.cardinality:
            raise ValueError(
                f"Y_dev has {len(self.p)} class(es). Does not match LabelModel cardinality {self.cardinality}."
Member

this could also be because class_balance is the wrong size

Member

this error is also confusing for the Y_dev case. what's happening is that we're saying some of the class priors are 0 (by virtue of those classes not being in Y_dev), which is the same condition as the above ValueError. rec: we raise a separate message under each branch of the if/else block: under "if class_balance is not None:", an error if it's the wrong size, then something like "Class balance prior is 0" in the other two cases.
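One way to read this recommendation, as a rough sketch only: `resolve_class_balance` below is a hypothetical standalone function, not the `_set_class_balance` method in the PR, and the exact error wording is an assumption. It raises a size error for a mis-sized class_balance and a separate "prior is 0" error whenever a class ends up with zero prior.

```python
import numpy as np


def resolve_class_balance(cardinality, class_balance=None, Y_dev=None):
    """Hypothetical sketch: return class priors with branch-specific errors."""
    if class_balance is not None:
        p = np.asarray(class_balance, dtype=float)
        if len(p) != cardinality:
            raise ValueError(
                f"class_balance has {len(p)} entries. "
                f"Does not match LabelModel cardinality {cardinality}."
            )
    elif Y_dev is not None:
        Y_dev = np.asarray(Y_dev)
        if Y_dev.size == 0:
            raise ValueError("Y_dev is empty.")
        # minlength keeps a zero count for classes missing from Y_dev
        counts = np.bincount(Y_dev, minlength=cardinality)
        if len(counts) != cardinality:
            raise ValueError(
                f"Y_dev contains labels outside 0..{cardinality - 1}."
            )
        p = counts / counts.sum()
    else:
        # No information given: fall back to a uniform prior.
        p = np.full(cardinality, 1.0 / cardinality)

    if np.any(p == 0):
        missing = np.flatnonzero(p == 0).tolist()
        raise ValueError(f"Class balance prior is 0 for class(es) {missing}.")
    return p
```

With this split, `resolve_class_balance(2, Y_dev=[0, 0, 0])` reports a zero prior for class 1 rather than a cardinality mismatch, which is closer to what actually went wrong.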

Member

@henryre henryre left a comment

:shipit:

@paroma paroma merged commit d6e79d1 into master Sep 6, 2019
@paroma paroma deleted the lm-degen branch September 6, 2019 01:35
2 participants