# Testing `TFNoiseAwareModel` in Jupyter Notebook

We'll start by testing the `TextRNN` model on a categorical problem from `tutorials/crowdsourcing`. In particular, we'll test for (a) basic performance and (b) proper construction and re-construction of the TF computation graph, both (i) after repeated notebook calls and (ii) when using `GridSearch`.

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

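import os
# Point Snorkel at the SQLite database in the current working directory;
# this is set before `snorkel` is imported below.
os.environ['SNORKELDB'] = 'sqlite:///{0}{1}crowdsourcing.db'.format(os.getcwd(), os.sep)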

from snorkel import SnorkelSession
session = SnorkelSession()

### Load candidates and training marginals

In [2]:
from snorkel.models import candidate_subclass
from snorkel.contrib.models.text import RawText
Tweet = candidate_subclass('Tweet', ['tweet'], cardinality=5)
train_tweets = session.query(Tweet).filter(Tweet.split == 0).order_by(Tweet.id).all()
len(train_tweets)

568

In [6]:
from snorkel.annotations import load_marginals
train_marginals = load_marginals(session, train_tweets, split=0)
train_marginals.shape

(568, 5)
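
As a quick sanity check on the shape above, each row of a categorical marginals matrix should be a probability distribution over the 5 classes. The sketch below is not part of the original notebook: `check_marginals` is a hypothetical helper, it assumes the marginals are available as a dense NumPy array, and it runs on a synthetic array rather than the real `train_marginals`.

```python
import numpy as np


def check_marginals(marginals, cardinality, atol=1e-6):
    """Hypothetical helper: validate an (n_candidates, cardinality) marginals array."""
    marginals = np.asarray(marginals, dtype=float)
    # One row per candidate, one column per class.
    if marginals.ndim != 2 or marginals.shape[1] != cardinality:
        raise ValueError("expected shape (n, %d), got %r" % (cardinality, marginals.shape))
    # Every entry must be a valid probability.
    if np.any(marginals < -atol) or np.any(marginals > 1.0 + atol):
        raise ValueError("marginals must lie in [0, 1]")
    # Every row must sum to 1 (within tolerance).
    if not np.allclose(marginals.sum(axis=1), 1.0, atol=atol):
        raise ValueError("each row of marginals must sum to 1")
    return True


# Synthetic stand-in with the same shape as the output above (not real data).
rng = np.random.RandomState(1701)
demo = rng.dirichlet(np.ones(5), size=568)
check_marginals(demo, cardinality=5)
```

In the notebook itself, the equivalent call would be `check_marginals(train_marginals, cardinality=5)`, assuming `load_marginals` returned a dense array.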

### Train basic LSTM

In [8]:
from snorkel.contrib.rnn import TextRNN

train_kwargs = {
    'lr':         0.01,
    'dim':        100,
    'n_epochs':   50,
    'dropout':    0.2,
    'print_freq': 5
}
lstm = TextRNN(seed=1701, n_threads=None)
lstm.train(train_tweets, train_marginals, **train_kwargs)

[textRNN] Dimension=100  LR=0.01
[textRNN] Begin preprocessing
[textRNN] Preprocessing done (1.43s)
[textRNN] Training model
[textRNN] #examples=568  #epochs=50  batch size=256
[textRNN] Epoch 0 (1.24s)	Average loss=1.543260
[textRNN] Epoch 5 (3.00s)	Average loss=0.209871
[textRNN] Epoch 10 (4.74s)	Average loss=0.045068
[textRNN] Epoch 15 (6.50s)	Average loss=0.034457
[textRNN] Epoch 20 (8.67s)	Average loss=0.041505
[textRNN] Epoch 25 (10.53s)	Average loss=0.030682
[textRNN] Epoch 30 (12.24s)	Average loss=0.029904
[textRNN] Epoch 35 (14.00s)	Average loss=0.028201
[textRNN] Epoch 40 (15.74s)	Average loss=0.026489
[textRNN] Epoch 45 (17.58s)	Average loss=0.029394
[textRNN] Epoch 49 (19.09s)	Average loss=0.025847
[textRNN] Training done (19.09s)


In [10]:
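import numpy as np
# Evaluate on the held-out test split (split == 1).
test_tweets = session.query(Tweet).filter(Tweet.split == 1).order_by(Tweet.id).all()
test_labels = np.load('crowdsourcing_test_labels.npy')
correct, incorrect = lstm.score(session, test_tweets, test_labels)
# Accuracy is the fraction of scored candidates that were predicted correctly.
acc = len(correct) / float(len(correct) + len(incorrect))
# Basic performance check: require better than 60% test accuracy.
assert acc > 0.60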

Accuracy: 0.6875
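
The accuracy computation in the cell above divides by the total number of scored candidates, so it would fail with a `ZeroDivisionError` if the test split were empty. A minimal sketch of a guarded version follows; `accuracy_from_sets` is a hypothetical helper, and the counts are illustrative rather than taken from the notebook's data.

```python
def accuracy_from_sets(correct, incorrect):
    """Hypothetical helper: accuracy from collections of correct and incorrect candidates."""
    n_total = len(correct) + len(incorrect)
    if n_total == 0:
        # Nothing was scored, so accuracy is undefined.
        raise ValueError("no scored candidates")
    return len(correct) / float(n_total)


# Illustrative counts only: 3 correct and 1 incorrect prediction.
demo_correct = {"a", "b", "c"}
demo_incorrect = {"d"}
demo_acc = accuracy_from_sets(demo_correct, demo_incorrect)
print("Accuracy: {0}".format(demo_acc))
```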


### Run `GridSearch`