
Update GridSearch and add to tutorial #439

Merged: 12 commits merged into dev on Sep 27, 2016
Conversation

@henryre (Member) commented Sep 24, 2016

Berg gets back on the board

Fixes #436

@ajratner (Contributor) left a comment:

Nice! Two overall comments (besides inline):

  1. Need documentation of the optional params in the doc string so we know what they are :)
  2. Let's just make this the default in the tutorial?

@@ -97,7 +97,7 @@ def predict(self, X, b=0.5):
         """Return numpy array of elements in {-1,0,1} based on predicted marginal probabilities."""
         return np.array([1 if p > b else -1 if p < b else 0 for p in self.marginals(X)])

-    def score(self, X_test, L_test, gold_candidate_set, b=0.5, set_unlabeled_as_neg=True):
+    def score(self, X_test, L_test, gold_candidate_set, b=0.5, set_unlabeled_as_neg=True, disp=True):
@ajratner (Contributor):

display for readability?

def search_space(self):
return product(*self.param_val_ranges)

def fit(self, X_test, L_test, gold_candidate_set, b=0.5, set_unlabeled_as_neg=True, **model_hyperparams):
@ajratner (Contributor) commented Sep 24, 2016:

Some minor things:

  • I would call e.g. X_validation to make it clear that they shouldn't use the test set here?
  • L_test is a vector of labels, so should be e.g. validation_labels (upper case single letter reserved for matrices)

p, r, f1 = scores[:3]
tp, fp, tn, fn = self.model.score(X_test, L_test, gold_candidate_set, b, set_unlabeled_as_neg, disp=False)
p, r = float(len(tp)) / (len(tp) + len(fp)), float(len(tp)) / (len(tp) + len(fn))
f1 = 2.0 * (p * r) / (p + r) if (p + r) > 0 else 0
@ajratner (Contributor):

We should put these as standard utility fns
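One possible shape for such a shared utility (a minimal sketch; the function name and signature are assumptions, not the project's actual API):

```python
def binary_scores_from_counts(ntp, nfp, ntn, nfn):
    """Precision, recall, and F1 from binary confusion-matrix counts."""
    prec = ntp / float(ntp + nfp) if ntp + nfp > 0 else 0.0
    rec = ntp / float(ntp + nfn) if ntp + nfn > 0 else 0.0
    f1 = 2.0 * prec * rec / (prec + rec) if prec + rec > 0 else 0.0
    return prec, rec, f1
```

Both the model `score` methods and `GridSearch.fit` could then compute their metrics from the tp/fp/tn/fn set sizes through this one function instead of repeating the formulas.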




class RandomSearch(GridSearch):
@ajratner (Contributor):

Minor: I was thinking RandomSearch would search over a continuous space of param values, rather than filtering a set of given discrete values as here; I think that would be more convenient, because then the user only needs to specify a range rather than a set of values?
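A continuous-range variant along the lines suggested here might look like this sketch (class name, constructor arguments, and `search_space` shape are assumptions mirroring the quoted `GridSearch.search_space`, not the code actually merged):

```python
import random


class RandomSearchSketch(object):
    """Sample n points uniformly from continuous (lo, hi) parameter ranges,
    instead of filtering a fixed set of discrete values."""

    def __init__(self, param_ranges, n=10, seed=None):
        self.param_ranges = param_ranges  # list of (lo, hi) tuples, one per param
        self.n = n
        self.rng = random.Random(seed)

    def search_space(self):
        # One tuple of sampled param values per search point
        return [tuple(self.rng.uniform(lo, hi) for lo, hi in self.param_ranges)
                for _ in range(self.n)]
```

The user then specifies only a `(lo, hi)` range per hyperparameter, and the number of trials is decoupled from the granularity of any grid.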

@henryre (Member, Author):

Shhhhhhhhh values

@ajratner (Contributor):

Well played

@henryre (Member, Author) commented Sep 25, 2016

Ok @ajratner, changes made

@ajratner (Contributor):

@henryre looks good, added a bit of text in tutorial. Two questions remaining:

  • What's the default if they don't have any dev labels? What were you doing before? (We can discuss this in person)
  • Did you check to make sure adding this in preserves or improves the tutorial score (if not, we need to debug)?

@henryre (Member, Author) commented Sep 26, 2016

@ajratner

  • Let's add the no-labels option later; it needs a discussion about best practices. The previous solution was: pick a random hold-out set, train the models, and select the w such that sqrt(np.mean((p - 0.5) ** 2)) is maximized, where p are the predicted marginals for the hold-out set (i.e., this measures how spread out the marginals are)
  • 90% certain it does improve performance (from memory), but will double check shortly

Update: 0.4 F1 from non-grid-search tutorial, 0.667 for grid-search tutorial
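For reference, the previous no-labels heuristic described above could be sketched as follows (function names are hypothetical; `candidates` stands in for trained weight vectors paired with their hold-out marginals):

```python
import numpy as np


def marginal_spread(p):
    """RMS distance of predicted marginals from 0.5 (higher = more spread out)."""
    p = np.asarray(p, dtype=float)
    return float(np.sqrt(np.mean((p - 0.5) ** 2)))


def select_weights(candidates):
    """Pick the w whose hold-out marginals are most spread out.

    candidates: iterable of (w, marginals) pairs.
    """
    return max(candidates, key=lambda wm: marginal_spread(wm[1]))[0]
```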

@henryre (Member, Author) commented Sep 26, 2016

lol progressbar broke the build

@stephenbach (Contributor):

Might be helpful. Same issue: keras-team/keras#2110

@ajratner (Contributor):

So let's just put in how to proceed without grid search (if you don't have a labeled dev set); this will preempt a lot of questions!

@henryre (Member, Author) commented Sep 27, 2016

Dev set F1: 0.4 -> 0.57
Test set F1: 0.7 -> 0.63

@ajratner (Contributor):

LGTM; will merge in after Travis passes. We should merge into both dev and master

@ajratner ajratner merged commit 3b8154b into dev Sep 27, 2016
@ajratner ajratner deleted the grid-search branch September 27, 2016 07:29