# An Introduction to WISER, Part 2: Generative Models

In this part of the tutorial, we will take the results of the labeling functions from part 1 and learn a generative model that combines them.

We will start by reloading the data with the labeling function outputs from part 1.

## Reloading Data

In [1]:
import pickle

with open('output/tmp/train_data.p', 'rb') as f:
    train_data = pickle.load(f)

with open('output/tmp/dev_data.p', 'rb') as f:
    dev_data = pickle.load(f)
    
with open('output/tmp/test_data.p', 'rb') as f:
    test_data = pickle.load(f)

## Reinspecting Data

We can view the data again with all of the tagging rule annotations.

In [2]:
from wiser.viewer import Viewer

Viewer(dev_data, height=100)

# Generative Model

To aggregate the outputs of the tagging and linking rules, we need to train a generative model. We begin by loading the label-to-index dictionaries used by the generative and discriminative models, along with the sets of all tagging and linking rules.

In [3]:
from wiser.generative import get_label_to_ix, get_rules

gen_label_to_ix, disc_label_to_ix = get_label_to_ix(train_data)
tagging_rules, linking_rules = get_rules(train_data)
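
For intuition, a label-to-index dictionary of this kind could be built as in the toy sketch below. The helper `build_label_to_ix`, the toy votes, and the convention of reserving index 0 for an abstain label `ABS` are assumptions made for illustration; they are not taken from WISER's implementation of `get_label_to_ix`.

```python
def build_label_to_ix(label_sequences, abstain="ABS"):
    """Map each distinct label to an integer, reserving index 0 for abstentions.

    Hypothetical helper for illustration only; not part of WISER.
    """
    label_to_ix = {abstain: 0}
    for sequence in label_sequences:
        for label in sequence:
            if label not in label_to_ix:
                label_to_ix[label] = len(label_to_ix)
    return label_to_ix


# Toy rule votes for two short sentences (made-up data).
toy_votes = [["O", "I-PER", "ABS"], ["I-PER", "O", "I-LOC"]]
toy_label_to_ix = build_label_to_ix(toy_votes)

# Number of real classes, not counting the abstain entry.
toy_num_classes = len(toy_label_to_ix) - 1
print(toy_label_to_ix, toy_num_classes)
```

Under this assumed convention, subtracting one from the dictionary size gives the number of classes a model has to predict.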

## Defining a Generative Model

We now need to declare a generative model. In this tutorial, we will use the *linked HMM*, a model that uses linking rules to model dependencies between adjacent tokens. Other generative models are available in `labelmodels`. For more details on generative models and their hyperparameters, please refer to our paper.

In [4]:
from labelmodels import LinkedHMM, NaiveBayes

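# Positional arguments: number of classes, number of tagging rules,
# number of linking rules. The "- 1" appears to exclude the abstain
# entry in gen_label_to_ix.
link_hmm = LinkedHMM(
    len(gen_label_to_ix) - 1,
    len(tagging_rules),
    len(linking_rules),
    init_acc=0.9,
    acc_prior=100,
    balance_prior=500,
)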
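
To make the role of linking rules concrete, here is a deliberately simplified sketch. It is not the linked HMM: instead of learning rule accuracies, it only copies a label onto the next token when a link vote says the two tokens belong together. The function `propagate_with_links`, the `ABS` abstain marker, and the toy votes are all hypothetical.

```python
ABSTAIN = "ABS"


def propagate_with_links(tag_votes, link_votes):
    """Fill abstained tokens by copying the previous token's label across a link.

    tag_votes:  one label (or ABSTAIN) per token, as a tagging rule might emit.
    link_votes: link_votes[i] is True when token i is voted to share the label
                of token i - 1; link_votes[0] is ignored.
    """
    labels = list(tag_votes)
    for i in range(1, len(labels)):
        if labels[i] == ABSTAIN and link_votes[i] and labels[i - 1] != ABSTAIN:
            labels[i] = labels[i - 1]
    return labels


tokens = ["Barack", "Obama", "visited", "Paris"]
tag_votes = ["I-PER", ABSTAIN, "O", ABSTAIN]   # tagging rule abstains on "Obama" and "Paris"
link_votes = [False, True, False, False]       # "Obama" is linked to "Barack"

linked_labels = propagate_with_links(tag_votes, link_votes)
print(list(zip(tokens, linked_labels)))
```

The linked HMM treats both kinds of votes as noisy evidence and estimates how much to trust each rule, rather than applying them deterministically as this sketch does.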

## Training a Generative Model

Once we're done creating our generative model, we're ready to begin training!

In [5]:
from wiser.generative import train_generative_model
from labelmodels import LearningConfig

config = LearningConfig()
p, r, f1 = train_generative_model(link_hmm, train_data, dev_data, epochs=1, label_to_ix=gen_label_to_ix, config=config)

## Evaluating a Generative Model

We can evaluate the performance of any generative model using the function `evaluate_generative_model` from `wiser.generative`. Here, we evaluate our *linked HMM* on the development set.

In [6]:
from wiser.generative import evaluate_generative_model

evaluate_generative_model(model=link_hmm, data=dev_data, label_to_ix=gen_label_to_ix)

| | TP | FP | FN | P | R | F1 |
|---|---|---|---|---|---|---|
| Predictions | 735 | 240 | 877 | 0.7538 | 0.456 | 0.5682 |
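
The P, R, and F1 columns follow from the TP, FP, and FN counts through the standard definitions of precision, recall, and F1. The sketch below recomputes them from the counts reported above. The helper `precision_recall_f1` is written here for illustration and is not part of WISER.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts, guarding against zero division."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1_score = 2 * precision * recall / (precision + recall)
    else:
        f1_score = 0.0
    return precision, recall, f1_score


# Counts taken from the evaluation output above.
precision, recall, f1_score = precision_recall_f1(tp=735, fp=240, fn=877)
print(round(precision, 4), round(recall, 4), round(f1_score, 4))
# prints: 0.7538 0.456 0.5682
```

The gap between precision and recall shows that the model's predictions are mostly correct when it makes them, but that it misses more than half of the true positives.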


## Saving the Output of the Generative Model

After training your generative model, you need to save its probabilistic training labels. We will use these labels in the next part of the tutorial to train a recurrent neural network.

In [7]:
from wiser.data import save_label_distribution
from wiser.eval import get_generative_model_inputs

inputs = get_generative_model_inputs(train_data, gen_label_to_ix)
p_unary, p_pairwise = link_hmm.get_label_distribution(*inputs)
save_label_distribution('output/generative/link_hmm/train_data.p', train_data, p_unary, p_pairwise, gen_label_to_ix, disc_label_to_ix)
save_label_distribution('output/generative/link_hmm/dev_data.p', dev_data)
save_label_distribution('output/generative/link_hmm/test_data.p', test_data)
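
To illustrate what "probabilistic labels" means, the toy sketch below represents each token's label as a distribution over classes rather than a single hard label. The class names, the values, and the one-row-per-token layout are assumptions for illustration; the actual structure of `p_unary` returned by `labelmodels` may differ.

```python
# One row per token, one column per class (made-up values).
toy_classes = ["O", "I-PER", "I-LOC"]
toy_p_unary = [
    [0.90, 0.05, 0.05],  # confident "O"
    [0.20, 0.70, 0.10],  # likely "I-PER"
    [0.40, 0.35, 0.25],  # uncertain token
]

# Each row is a probability distribution, so it should sum to 1.
for row in toy_p_unary:
    assert abs(sum(row) - 1.0) < 1e-9

# Collapsing to hard labels keeps only the most probable class per token
# and discards the uncertainty that soft labels preserve.
hard_labels = [
    toy_classes[max(range(len(row)), key=row.__getitem__)]
    for row in toy_p_unary
]
print(hard_labels)
```

Training on the full distributions lets a downstream model weight uncertain tokens, such as the third row here, less heavily than confident ones.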