# Transform the data to work with Snorkel: Part 2 - Event Role

Here we will do most of the work creating a labeling model that assigns labels to argument roles in event mentions.
We need to create a row for each pair of trigger and entity mention.

For this we need to create two additional columns:
- trigger_id
- argument_id

Everything else we can pull from the other columns using Snorkel preprocessor functions.
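
The pairing step can be sketched as follows. This is a minimal illustration, not the code behind `pipeline.build_event_role_examples`: the function name `build_pair_rows`, the `entities` key, and the toy document are assumptions made for the example.

```python
from itertools import product


def build_pair_rows(doc):
    """Return one row (a dict) per (trigger, entity) pair in a document.

    Hypothetical helper: the real pipeline may structure documents differently.
    """
    triggers = [e for e in doc["entities"] if e["entity_type"] == "trigger"]
    rows = []
    for trigger, argument in product(triggers, doc["entities"]):
        if trigger["id"] == argument["id"]:
            continue  # a trigger is never paired with itself
        row = dict(doc)  # shallow copy so every row keeps the document columns
        row["trigger_id"] = trigger["id"]
        row["argument_id"] = argument["id"]
        rows.append(row)
    return rows


# Toy document (made up for illustration).
doc = {
    "id": "doc-1",
    "text": "A1 gesperrt wegen Unfall",
    "entities": [
        {"id": "e1", "text": "A1", "entity_type": "location_route"},
        {"id": "e2", "text": "gesperrt", "entity_type": "trigger"},
        {"id": "e3", "text": "Unfall", "entity_type": "trigger"},
    ],
}
rows = build_pair_rows(doc)
pairs = [(row["trigger_id"], row["argument_id"]) for row in rows]
print(pairs)
```

Other triggers are kept as argument candidates because the `cause` role in the table below is filled by a trigger.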

In [1]:
import sys
sys.path.append("../")
import warnings
import pickle
from pathlib import Path
from wsee.utils import utils
from wsee.data import pipeline

warnings.filterwarnings(action='once')
DATA_DIR = '../data/daystream_corpus'  # replace path to corpus

### SD4M Relation/Event Arguments

| Number | Code       | Description                                                                 |
|--------|------------|-----------------------------------------------------------------------------|
| -1     | ABSTAIN    | No vote, for Labeling Functions                                             |
| 0      | location   | Required argument for all events denoting the location.                     |
| 1      | delay      | Optional argument denoting the delay associated with the event.             |
| 2      | direction  | Optional argument denoting the direction associated with the event.         |
| 3      | start_loc  | Optional argument denoting the starting location associated with the event. |
| 4      | end_loc    | Optional argument denoting the ending location associated with the event.   |
| 5      | start_date | Optional argument denoting the start date associated with the event.        |
| 6      | end_date   | Optional argument denoting the end date associated with the event.          |
| 7      | cause      | Optional argument (trigger) denoting the cause associated with the event.   |
| 8      | jam_length | Optional argument denoting the jam length of a traffic jam event.           |
| 9      | route      | Optional argument denoting the route affected by a canceled stop event.     |
| 10     | no_arg     | No argument relation with the specified trigger.                            |
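
As a small sketch, the table can be expressed as a lookup between role names and label numbers. The names `ROLE_NAMES`, `ROLE_TO_ID`, and `role_name` are hypothetical and only used here; the notebook itself imports `ROLE_LABELS` from `wsee`.

```python
ABSTAIN = -1  # "no vote" value returned by labeling functions

# Order follows the "Number" column of the table above.
ROLE_NAMES = [
    "location", "delay", "direction", "start_loc", "end_loc",
    "start_date", "end_date", "cause", "jam_length", "route", "no_arg",
]
ROLE_TO_ID = {name: idx for idx, name in enumerate(ROLE_NAMES)}


def role_name(label):
    """Map a label number back to its code, including the abstain value."""
    return "ABSTAIN" if label == ABSTAIN else ROLE_NAMES[label]


print(ROLE_TO_ID["cause"], role_name(10), role_name(ABSTAIN))
```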

In [2]:
loaded_data = pipeline.load_data(DATA_DIR)
sd_train = loaded_data['train']
sd_dev = loaded_data['dev']
sd_test = loaded_data['test']

daystream = loaded_data['daystream']

INFO:wsee:Reading train data from: ../data/daystream_corpus/train/train_with_events_and_defaults.jsonl
INFO:wsee:Reading dev data from: ../data/daystream_corpus/dev/dev_with_events_and_defaults.jsonl
INFO:wsee:Reading test data from: ../data/daystream_corpus/test/test_with_events_and_defaults.jsonl
INFO:wsee:Reading daystream data from: ../data/daystream_corpus/daystream.jsonl


## Step 1: Create one row for each trigger-entity pair (event role)

In [3]:
dataframe_file = DATA_DIR + '/pickled_sd_train_role_examples'
pickled_dataframe_file = Path(dataframe_file + '.pkl')

df_sd_train = None
Y_sd_train = None

if pickled_dataframe_file.exists():
    with open(pickled_dataframe_file, 'rb') as pickled_dataframe:
        df_sd_train, Y_sd_train = pickle.load(pickled_dataframe)
else:
    df_sd_train, Y_sd_train = pipeline.build_event_role_examples(sd_train)
    with open(pickled_dataframe_file, 'wb') as pickled_dataframe:
        pickle.dump((df_sd_train, Y_sd_train), pickled_dataframe)

In [4]:
dataframe_file = DATA_DIR + '/pickled_sd_dev_role_examples'
pickled_dataframe_file = Path(dataframe_file + '.pkl')

df_sd_dev = None
Y_sd_dev = None

if pickled_dataframe_file.exists():
    with open(pickled_dataframe_file, 'rb') as pickled_dataframe:
        df_sd_dev, Y_sd_dev = pickle.load(pickled_dataframe)
else:
    df_sd_dev, Y_sd_dev = pipeline.build_event_role_examples(sd_dev)
    with open(pickled_dataframe_file, 'wb') as pickled_dataframe:
        pickle.dump((df_sd_dev, Y_sd_dev), pickled_dataframe)
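
The load-or-build pattern used in the two cells above recurs throughout this notebook. One way to factor it out is sketched below; `load_or_build` is a hypothetical helper, not part of `wsee`, and the demo uses a temporary directory and a stand-in build function.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Load a pickled result if it exists, otherwise build and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with open(cache_path, "rb") as cache_file:
            return pickle.load(cache_file)
    result = build_fn()
    with open(cache_path, "wb") as cache_file:
        pickle.dump(result, cache_file)
    return result


calls = []


def expensive_build():
    """Stand-in for a slow step such as building the role examples."""
    calls.append(1)
    return ([{"trigger_id": "e2", "argument_id": "e1"}], [0])


with tempfile.TemporaryDirectory() as tmp_dir:
    cache = Path(tmp_dir) / "role_examples.pkl"
    first = load_or_build(cache, expensive_build)   # builds and writes the cache
    second = load_or_build(cache, expensive_build)  # reads the cache

print(len(calls))
```

With such a helper, each cell would reduce to a single call, for example passing `lambda: pipeline.build_event_role_examples(sd_dev)` as `build_fn`.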

In [5]:
from wsee import ROLE_LABELS
print(ROLE_LABELS)

['location', 'delay', 'direction', 'start_loc', 'end_loc', 'start_date', 'end_date', 'cause', 'jam_length', 'route', 'no_arg']


## Step 2: Explore the data

In [6]:
from wsee.preprocessors.preprocessors import *
from wsee.data import explore

We can apply all our preprocessors on our data and see if we can find something interesting for our labeling functions. Let's first sample the SD4M training data, which is labeled.

In [7]:
dataframe_file = DATA_DIR + '/pickled_labeled_sd4m_role_examples'
pickled_dataframe_file = Path(dataframe_file + '.pkl')

labeled_sd4m_roles = None

if pickled_dataframe_file.exists():
    with open(pickled_dataframe_file, 'rb') as pickled_dataframe:
        labeled_sd4m_roles = pickle.load(pickled_dataframe)
else:
    labeled_sd4m_roles = explore.add_labels(df_sd_train, Y_sd_train)
    labeled_sd4m_roles = explore.apply_preprocessors(labeled_sd4m_roles, [pre_between_tokens, pre_between_distance])
    labeled_sd4m_roles = explore.add_event_types(labeled_sd4m_roles)
    labeled_sd4m_roles = explore.add_event_arg_roles(labeled_sd4m_roles)
    with open(pickled_dataframe_file, 'wb') as pickled_dataframe:
        pickle.dump(labeled_sd4m_roles, pickled_dataframe)

Let's first take a look at the trigger and argument text, and the entity types!

In [8]:
import pandas as pd
pd.set_option('display.max_colwidth', None)  # show full cell contents; pandas < 1.0 expects -1 instead of None

In [9]:
explore.sample_data(labeled_sd4m_roles[labeled_sd4m_roles['label']==6], sample_size=2, columns=['text', 'between_tokens', 'trigger', 'argument', 'between_distance', 'label', 'event_types', 'event_arg_roles'])

Unnamed: 0,text,between_tokens,trigger,argument,between_distance,label,event_types,event_arg_roles
2970,RT @DB_Info: Ersatzverkehr auf der Linie RE 4 zwischen Torgelow und Ueckermünde Stadthafen vom 2. September bis 12. D... http://t.co/pkt1ws…\n,"[auf, der, Linie, RE, 4, zwischen, Torgelow, und, Ueckermünde, Stadthafen, vom, 2, ., September, bis]","{'id': 'c/99a77412-27d3-4fa2-9f28-4d676d60fb43', 'text': 'Ersatzverkehr', 'entity_type': 'trigger', 'start': 3, 'end': 4, 'char_start': 13, 'char_end': 26}","{'id': 'c/6d34afb8-fd81-4569-81ed-fec32557ab2b', 'text': '12.', 'entity_type': 'date', 'start': 19, 'end': 21, 'char_start': 112, 'char_end': 115}",15,6,"[(Ersatzverkehr, (13, 26), 5)]","[((Ersatzverkehr, (13, 26), 5), (RE 4, location_route, (41, 45)), 0), ((Ersatzverkehr, (13, 26), 5), (Torgelow, location_stop, (55, 63)), 3), ((Ersatzverkehr, (13, 26), 5), (Ueckermünde Stadthafen, location_stop, (68, 90)), 4), ((Ersatzverkehr, (13, 26), 5), (2. September, date, (95, 107)), 5), ((Ersatzverkehr, (13, 26), 5), (12., date, (112, 115)), 6)]"
317,■ #Hamburg: Die Bahrenfelder Chaussee ist stadteinwärts zwischen Theodorstraße und Von-Sauer-Straße bis Ende Juli gesperrt. ...\n,[],"{'id': 'c/181d08f1-287f-42c8-8358-fb3010989025', 'text': 'gesperrt', 'entity_type': 'trigger', 'start': 19, 'end': 20, 'char_start': 115, 'char_end': 123}","{'id': 'c/94a5ee82-de5b-471d-9b5f-7335d208c0ff', 'text': 'Ende Juli', 'entity_type': 'date', 'start': 17, 'end': 19, 'char_start': 105, 'char_end': 114}",0,6,"[(gesperrt, (115, 123), 4)]","[((gesperrt, (115, 123), 4), (Bahrenfelder Chaussee, location_street, (17, 38)), 0), ((gesperrt, (115, 123), 4), (Theodorstraße, location_street, (66, 79)), 3), ((gesperrt, (115, 123), 4), (Von-Sauer-Straße, location_street, (84, 100)), 4), ((gesperrt, (115, 123), 4), (Ende Juli, date, (105, 114)), 6)]"


Now we can count the instances per role class.

In [10]:
n = 100
filtered_sd4m_roles = labeled_sd4m_roles[labeled_sd4m_roles['label'] != 10]
class_pairs = {}
print(f"Number of event-roles: {len(labeled_sd4m_roles)}\n")
for idx, class_name in enumerate(ROLE_LABELS):
    class_sd4m_roles = labeled_sd4m_roles[labeled_sd4m_roles['label'] == idx]
    print(f"{class_name}: {len(class_sd4m_roles)} instances")

Number of event-roles: 7285

location: 571 instances
delay: 87 instances
direction: 277 instances
start_loc: 377 instances
end_loc: 352 instances
start_date: 35 instances
end_date: 41 instances
cause: 103 instances
jam_length: 135 instances
route: 23 instances
no_arg: 5284 instances
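
The counts above already hint at a strong class imbalance. The short sketch below turns them into shares; the numbers are copied from the output above, and the variable names are only for illustration.

```python
# Instance counts per role class, copied from the cell output above.
counts = {
    "location": 571, "delay": 87, "direction": 277, "start_loc": 377,
    "end_loc": 352, "start_date": 35, "end_date": 41, "cause": 103,
    "jam_length": 135, "route": 23, "no_arg": 5284,
}
total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<11}{share:6.1%}")
```

Roughly 73% of all trigger-entity pairs are `no_arg`, which is why the evaluation later reports metrics both with and without this majority class.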


## Step 3: Evaluate the labeling functions on the SD4M training data

In [11]:
from wsee.labeling import event_argument_role_lfs as role_lfs

### Apply the labeling functions

In [12]:
from snorkel.labeling import PandasLFApplier
from wsee.data.pipeline import get_role_list_lfs

lfs = get_role_list_lfs()

applier = PandasLFApplier(lfs)

In [13]:
L_sd_train = applier.apply(df_sd_train)

100%|██████████| 7285/7285 [01:02<00:00, 116.76it/s]


In [14]:
from snorkel.labeling import LFAnalysis

LFAnalysis(L_sd_train, lfs).lf_summary(Y_sd_train)

Unnamed: 0,j,Polarity,Coverage,Overlaps,Conflicts,Correct,Incorrect,Emp. Acc.
lf_location_adjacent_markers,0,[0],0.007412,0.003844,0.0,47,7,0.87037
lf_location_adjacent_trigger_verb,1,[0],0.003157,0.002471,0.000549,22,1,0.956522
lf_location_beginning_street_stop_route,2,[0],0.027454,0.027316,0.0,190,10,0.95
lf_location_first_sentence_street_stop_route,3,[0],0.05779,0.05779,0.000412,388,33,0.921615
lf_location_first_sentence_priorities,4,[0],0.063555,0.059437,0.000412,411,52,0.887689
lf_delay_event_sentence,5,[1],0.013315,0.006314,0.001098,83,14,0.85567
lf_delay_preceding_arg,6,[1],0.002608,0.002608,0.0,19,0,1.0
lf_delay_preceding_trigger,7,[1],0.002745,0.002745,0.000137,20,0,1.0
lf_direction_markers,8,[2],0.040082,0.038161,0.000549,249,43,0.85274
lf_direction_markers_order,9,[2],0.036925,0.036925,0.000549,238,31,0.884758
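
To make the columns of this summary concrete, the sketch below computes per-LF coverage and empirical accuracy from a toy label matrix. It is an illustration of the definitions, not Snorkel's `LFAnalysis` code; the function name and the toy data are assumptions. For example, the first row of the table gives 47 / (47 + 7) ≈ 0.870 as empirical accuracy.

```python
import numpy as np

ABSTAIN = -1


def lf_coverage_and_accuracy(L, Y):
    """Per-LF coverage and empirical accuracy.

    L: (n_examples, n_lfs) label matrix with ABSTAIN for "no vote".
    Y: (n_examples,) gold labels.
    """
    L = np.asarray(L)
    Y = np.asarray(Y)
    voted = L != ABSTAIN
    correct = (voted & (L == Y[:, None])).sum(axis=0)
    n_votes = voted.sum(axis=0)
    coverage = n_votes / L.shape[0]
    # LFs that never vote get accuracy 0 instead of a division by zero.
    accuracy = np.divide(
        correct, n_votes, out=np.zeros(L.shape[1]), where=n_votes > 0
    )
    return coverage, accuracy


# Toy data: 4 examples, 2 labeling functions.
L_toy = [[0, -1], [0, 1], [-1, 1], [0, -1]]
Y_toy = [0, 1, 1, 2]
coverage, accuracy = lf_coverage_and_accuracy(L_toy, Y_toy)
print(coverage, accuracy)
```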


## Step 4: Error Analysis

In [15]:
from wsee.labeling import error_analysis

In [16]:
relevant_rows = labeled_sd4m_roles.iloc[L_sd_train[:, 0] == 0]
print(len(relevant_rows))
relevant_rows.sample()[['text', 'trigger', 'argument', 'label', 'event_types', 'event_arg_roles']]

54


Unnamed: 0,text,trigger,argument,label,event_types,event_arg_roles
4768,"von Samstag, 12. März, 5.35 Uhr bis Sonntag, 13. März, 23.55 Uhr<br />\n<br />\nMeldung:<br />\nDie EN-Züge halten nicht in Mainz Hbf und Frankfurt (M) Flughafen Regionalbf.<br />\n<br />Grund:<br />\nGleiserneuerung in Rüsselsheim<br />\n<br />Link zur detaillierten Meldung: <br />\n<a href= / ><br />\nLink zum kompletten PDF-Dokument: <br />\n<a href=target=_blank>(132 kB)<br /><br />------------------<br /><br />\n","{'id': 'c/e872b6f9-ee92-421d-b6ac-9b516204130c', 'text': 'halten nicht', 'entity_type': 'trigger', 'start': 27, 'end': 29, 'char_start': 105, 'char_end': 117}","{'id': 'c/88cde47b-6b9d-4ad8-a94d-10566fe7df6c', 'text': 'Mainz Hbf', 'entity_type': 'location_stop', 'start': 30, 'end': 32, 'char_start': 121, 'char_end': 130}",0,"[(halten nicht, (105, 117), 2)]","[((halten nicht, (105, 117), 2), (EN-Züge, location_route, (97, 104)), 9), ((halten nicht, (105, 117), 2), (Mainz Hbf, location_stop, (121, 130)), 0), ((halten nicht, (105, 117), 2), (Frankfurt (M) Flughafen Regionalbf, location_stop, (135, 169)), 0)]"


In [17]:
error_analysis.sample_fp(labeled_df=labeled_sd4m_roles, lf_outputs=L_sd_train, lf_index=36, label_of_interest=10, sample_size=1)[['between_tokens', 'trigger', 'argument', 'somajo_doc', 'label', 'event_types', 'event_arg_roles']]

Unnamed: 0,between_tokens,trigger,argument,somajo_doc,label,event_types,event_arg_roles
3511,"[in, Düsseldorf, Flugh, ., Terminal, ist, die, Strecke, der, #S11]","{'id': 'c/60aa18dc-ba83-4e47-ad21-bc9b9f10f97e', 'text': 'gesperrt', 'entity_type': 'trigger', 'start': 13, 'end': 14, 'char_start': 89, 'char_end': 97}","{'id': 'c/b2aebcd4-58a2-453e-b51e-59bdc453a9b2', 'text': 'polizeilicher Ermittlung', 'entity_type': 'trigger', 'start': 1, 'end': 3, 'char_start': 9, 'char_end': 33}","{'doc': [[Aufgrund, polizeilicher, Ermittlung, in, Düsseldorf, Flugh, .], [Terminal, ist, die, Strecke, der, #S11, gesperrt, ., https://t.co/txnhoB02BN, #bahn, #NW]], 'tokens': ['Aufgrund', 'polizeilicher', 'Ermittlung', 'in', 'Düsseldorf', 'Flugh', '.', 'Terminal', 'ist', 'die', 'Strecke', 'der', '#S11', 'gesperrt', '.', 'https://t.co/txnhoB02BN', '#bahn', '#NW'], 'sentences': [{'text': 'Aufgrund polizeilicher Ermittlung in Düsseldorf Flugh.', 'start': 0, 'end': 7, 'char_start': 0, 'char_end': 54}, {'text': 'Terminal ist die Strecke der #S11 gesperrt. https://t.co/txnhoB02BN #bahn #NW', 'start': 7, 'end': 18, 'char_start': 55, 'char_end': 132}]}",7,"[(polizeilicher Ermittlung, (9, 33), 7), (gesperrt, (89, 97), 1)]","[((gesperrt, (89, 97), 1), (polizeilicher Ermittlung, trigger, (9, 33)), 7), ((gesperrt, (89, 97), 1), (#S11, location_route, (84, 88)), 0)]"


In [18]:
error_analysis.sample_abstained_instances(labeled_df=labeled_sd4m_roles, lf_outputs=L_sd_train, lf_index=19, label_of_interest=5, sample_size=1)[['text', 'between_tokens', 'trigger', 'argument', 'label', 'event_types', 'event_arg_roles']]

Unnamed: 0,text,between_tokens,trigger,argument,label,event_types,event_arg_roles
2358,Verspätungen #SBahnStgt um 20:30\nS1 5 +5\nS2 9 +1\nS3 19 +4\nS4 11 +4\nS5 5\nS6 5 +4\nS60 0\nS? 3 +3\nhttps://t.co/9fBVIDuTku\n,"[#SBahnStgt, um]","{'id': 'c/668c10fa-d621-4818-a656-633382de6cd8', 'text': 'Verspätungen', 'entity_type': 'trigger', 'start': 0, 'end': 1, 'char_start': 0, 'char_end': 12}","{'id': 'c/c9e60444-39d1-4ccd-a346-1872135b4697', 'text': '20:30', 'entity_type': 'date', 'start': 3, 'end': 4, 'char_start': 27, 'char_end': 32}",5,"[(Verspätungen, (0, 12), 3)]","[((Verspätungen, (0, 12), 3), (20:30, date, (27, 32)), 5), ((Verspätungen, (0, 12), 3), (S1, location_route, (33, 35)), 0), ((Verspätungen, (0, 12), 3), (5 +5, duration, (36, 40)), 1), ((Verspätungen, (0, 12), 3), (S2, location_route, (41, 43)), 0), ((Verspätungen, (0, 12), 3), (9 +1, duration, (44, 48)), 1), ((Verspätungen, (0, 12), 3), (S3, location_route, (49, 51)), 0), ((Verspätungen, (0, 12), 3), (19 +4, duration, (52, 57)), 1), ((Verspätungen, (0, 12), 3), (S4, location_route, (58, 60)), 0), ((Verspätungen, (0, 12), 3), (11 +4, duration, (61, 66)), 1), ((Verspätungen, (0, 12), 3), (S5, location_route, (67, 69)), 0), ((Verspätungen, (0, 12), 3), (5, duration, (70, 71)), 1), ((Verspätungen, (0, 12), 3), (S6, location_route, (72, 74)), 0), ((Verspätungen, (0, 12), 3), (5 +4, duration, (75, 79)), 1)]"


In [19]:
error_analysis.sample_abstained_instances(labeled_df=labeled_sd4m_roles, lf_outputs=L_sd_train, lf_index=0, label_of_interest=0, sample_size=1)[['text', 'between_tokens', 'trigger', 'argument', 'label', 'event_types']]

Unnamed: 0,text,between_tokens,trigger,argument,label,event_types
3539,Lichtenberger Straße ab Montag gesperrt https://t.co/XayNfKGNHW https://t.co/vGnz2ynIlA\n,"[ab, Montag]","{'id': 'c/06fe009a-7484-479e-917d-35df775024ef', 'text': 'gesperrt', 'entity_type': 'trigger', 'start': 4, 'end': 5, 'char_start': 31, 'char_end': 39}","{'id': 'c/bfe85722-3bbe-42ea-9551-465c0ad633db', 'text': 'Lichtenberger Straße', 'entity_type': 'location_street', 'start': 0, 'end': 2, 'char_start': 0, 'char_end': 20}",0,"[(gesperrt, (31, 39), 4)]"
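
The two sampling helpers used above can be thought of as boolean masks over the label matrix. The sketch below is an assumption about their selection logic, inferred from the sampled rows (a false positive is a row where the LF voted for the label of interest but the gold label differs; an abstained instance is a row with that gold label on which the LF did not vote). It is not the `error_analysis` implementation.

```python
import numpy as np

ABSTAIN = -1


def false_positive_mask(L, Y, lf_index, label_of_interest):
    """Rows where the LF voted for the label but the gold label differs."""
    votes = np.asarray(L)[:, lf_index]
    return (votes == label_of_interest) & (np.asarray(Y) != label_of_interest)


def abstained_mask(L, Y, lf_index, label_of_interest):
    """Rows with the gold label of interest on which the LF abstained."""
    votes = np.asarray(L)[:, lf_index]
    return (votes == ABSTAIN) & (np.asarray(Y) == label_of_interest)


# Toy data: 4 examples, 2 labeling functions.
L_toy = np.array([[10, -1], [10, 0], [-1, 0], [7, -1]])
Y_toy = np.array([10, 7, 0, 7])

fp_rows = np.flatnonzero(false_positive_mask(L_toy, Y_toy, lf_index=0, label_of_interest=10))
missed_rows = np.flatnonzero(abstained_mask(L_toy, Y_toy, lf_index=0, label_of_interest=0))
print(fp_rows, missed_rows)
```

Such a mask can be passed to `DataFrame.iloc`, as done with `L_sd_train[:, 0] == 0` in cell 16.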


## Step 5: Train the Label model and label the data

### Train the label model

In [20]:
dataframe_file = DATA_DIR + '/pickled_daystream_role_examples'
pickled_dataframe_file = Path(dataframe_file + '.pkl')

df_daystream = None
Y_daystream = None

if pickled_dataframe_file.exists():
    with open(pickled_dataframe_file, 'rb') as pickled_dataframe:
        df_daystream, Y_daystream = pickle.load(pickled_dataframe)
else:
    df_daystream, Y_daystream = pipeline.build_event_role_examples(daystream)
    with open(pickled_dataframe_file, 'wb') as pickled_dataframe:
        pickle.dump((df_daystream, Y_daystream), pickled_dataframe)
if 'event_roles' in df_daystream:
    df_daystream.drop('event_roles', axis=1, inplace=True)

In [21]:
L_daystream = applier.apply(df_daystream)

100%|██████████| 47376/47376 [05:07<00:00, 153.88it/s]


In [22]:
from snorkel.labeling import LFAnalysis
LFAnalysis(L_daystream, lfs).lf_summary()

Unnamed: 0,j,Polarity,Coverage,Overlaps,Conflicts
lf_location_adjacent_markers,0,[0],0.00496,0.004264,2.1e-05
lf_location_adjacent_trigger_verb,1,[0],0.000697,0.000443,0.000127
lf_location_beginning_street_stop_route,2,[0],0.003441,0.003441,2.1e-05
lf_location_first_sentence_street_stop_route,3,[0],0.028643,0.028643,6.3e-05
lf_location_first_sentence_priorities,4,[0],0.030902,0.028981,6.3e-05
lf_delay_event_sentence,5,[1],0.004053,0.002132,0.0
lf_delay_preceding_arg,6,[1],0.000887,0.000887,0.0
lf_delay_preceding_trigger,7,[1],0.001245,0.001245,0.0
lf_direction_markers,8,[2],0.00553,0.004116,0.00019
lf_direction_markers_order,9,[2],0.003567,0.003567,0.000127


In [23]:
# Labeling functions summary statistics over all LFs

print(f"Number of role labeling functions: \t{len(get_role_list_lfs())}")
print(f"Coverage: \t{LFAnalysis(L_daystream, lfs).label_coverage()}")  # percentage of objects that had at least one label
print(f"Overlap: \t{LFAnalysis(L_daystream, lfs).label_overlap()}")  # percentage of objects with more than one label
print(f"Conflicts: \t{LFAnalysis(L_daystream, lfs).label_conflict()}")  # percentage of objects with conflicting labels

Number of role labeling functions: 	42
Coverage: 	0.8263466734211415
Overlap: 	0.5286220871327254
Conflicts: 	0.0014142181695373185
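
The three dataset-level numbers follow directly from the label matrix. The sketch below spells out the definitions given in the code comments above on a toy matrix; it is a simplified illustration, not Snorkel's implementation.

```python
import numpy as np

ABSTAIN = -1


def dataset_level_stats(L):
    """Fraction of rows with >= 1 vote, >= 2 votes, and disagreeing votes."""
    L = np.asarray(L)
    n_votes = (L != ABSTAIN).sum(axis=1)
    coverage = float((n_votes >= 1).mean())
    overlap = float((n_votes >= 2).mean())
    conflict = float(
        np.mean([len(set(row[row != ABSTAIN].tolist())) > 1 for row in L])
    )
    return coverage, overlap, conflict


# Toy data: one vote, two agreeing votes, two conflicting votes, no votes.
L_toy = [[0, -1, -1], [0, 0, -1], [0, 1, -1], [-1, -1, -1]]
coverage, overlap, conflict = dataset_level_stats(L_toy)
print(coverage, overlap, conflict)
```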


In [24]:
from snorkel.labeling import LabelModel

daystream_model = LabelModel(cardinality=11, verbose=True)
daystream_model.fit(L_train=L_daystream, n_epochs=5000, log_freq=500, seed=12345, Y_dev=Y_sd_train)

INFO:root:Computing O...
INFO:root:Estimating \mu...
INFO:root:[0 epochs]: TRAIN:[loss=0.259]
INFO:root:[500 epochs]: TRAIN:[loss=0.004]
INFO:root:[1000 epochs]: TRAIN:[loss=0.002]
INFO:root:[1500 epochs]: TRAIN:[loss=0.002]
INFO:root:[2000 epochs]: TRAIN:[loss=0.001]
INFO:root:[2500 epochs]: TRAIN:[loss=0.001]
INFO:root:[3000 epochs]: TRAIN:[loss=0.001]
INFO:root:[3500 epochs]: TRAIN:[loss=0.001]
INFO:root:[4000 epochs]: TRAIN:[loss=0.001]
INFO:root:[4500 epochs]: TRAIN:[loss=0.001]
INFO:root:Finished Training


### Look at label model performance

Here we evaluate the LabelModel on the SD4M development data. Because we used the SD4M training data to develop our labeling functions, our model is likely overfitted to it. The `score` function included in Snorkel is limited and better suited to a binary classification setting, so we instead compute sklearn metrics on the predictions ourselves.
For each model we will first report the metrics for all classes and then the metrics without the majority negative class.
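
To illustrate what "without the majority negative class" means for micro and macro averages, here is a dependency-free sketch. It is not the `score_model` implementation; the function names and the toy labels are assumptions. Restricting `labels` means only true positives, false positives, and false negatives of the positive classes are counted.

```python
def _prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def micro_macro(y_true, y_pred, labels):
    """Micro and macro averaged precision/recall/F1 over the given labels."""
    per_class, totals = [], [0, 0, 0]
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        per_class.append(_prf(tp, fp, fn))
        totals = [a + b for a, b in zip(totals, (tp, fp, fn))]
    names = ("precision", "recall", "f1")
    micro = dict(zip(names, _prf(*totals)))
    macro = {
        name: sum(scores[i] for scores in per_class) / len(per_class)
        for i, name in enumerate(names)
    }
    return micro, macro


# Toy data: classes 0 and 1 are positive roles, class 2 plays the role of no_arg.
y_true = [0, 0, 1, 2, 2, 2]
y_pred = [0, 2, 1, 1, 2, 2]
micro, macro = micro_macro(y_true, y_pred, labels=[0, 1])
print(micro, macro)
```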

In [25]:
from wsee.utils.scorer import score_model

positive_event_role_indices = [idx for idx, _ in enumerate(ROLE_LABELS)][:-1]

We create a MajorityLabelVoter and a LabelModel version that does not use the SD4M training data to infer a class balance prior for comparison.

In [26]:
from snorkel.labeling import MajorityLabelVoter

daystream_mlv = MajorityLabelVoter(cardinality=11, verbose=True)
daystream_without_sd4m_cb = LabelModel(cardinality=11, verbose=True)
daystream_without_sd4m_cb.fit(L_train=L_daystream, n_epochs=5000, log_freq=500, seed=12345)

INFO:root:Computing O...
INFO:root:Estimating \mu...
INFO:root:[0 epochs]: TRAIN:[loss=0.644]
INFO:root:[500 epochs]: TRAIN:[loss=0.002]
INFO:root:[1000 epochs]: TRAIN:[loss=0.001]
INFO:root:[1500 epochs]: TRAIN:[loss=0.001]
INFO:root:[2000 epochs]: TRAIN:[loss=0.001]
INFO:root:[2500 epochs]: TRAIN:[loss=0.001]
INFO:root:[3000 epochs]: TRAIN:[loss=0.001]
INFO:root:[3500 epochs]: TRAIN:[loss=0.001]
INFO:root:[4000 epochs]: TRAIN:[loss=0.000]
INFO:root:[4500 epochs]: TRAIN:[loss=0.000]
INFO:root:Finished Training


In [27]:
L_sd_dev = applier.apply(df_sd_dev)

100%|██████████| 491/491 [00:04<00:00, 113.52it/s]


#### With tie_break_policy set to "random"
Sometimes all labeling functions abstain on an instance, or their votes are tied.
Here we use the tie break policy "random", where the label model chooses among the tied options using a deterministic hash.
(When all labeling functions abstain, all classes are tied.)
Note that coverage is still calculated as usual, i.e. as the ratio of labeled data points to all data points.
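
The idea of a reproducible "random" tie break can be sketched as follows. This is a simplified illustration of the technique, not Snorkel's code: the function name is hypothetical, and hashing the row index with SHA-1 is an assumption chosen only to make the choice repeatable.

```python
import hashlib

ABSTAIN = -1


def break_ties(probs, policy="random", tol=1e-5):
    """Turn class probabilities into predictions with a tie break policy."""
    preds = []
    for i, row in enumerate(probs):
        best = max(row)
        tied = [k for k, p in enumerate(row) if best - p < tol]
        if len(tied) == 1:
            preds.append(tied[0])
        elif policy == "random":
            # Deterministic pseudo-random pick: hash the row index.
            digest = hashlib.sha1(str(i).encode("utf-8")).hexdigest()
            preds.append(tied[int(digest, 16) % len(tied)])
        elif policy == "abstain":
            preds.append(ABSTAIN)
        else:
            raise ValueError(f"Unknown tie break policy: {policy}")
    return preds


probs = [
    [0.1, 0.8, 0.1],        # clear winner
    [0.5, 0.5, 0.0],        # two-way tie
    [1 / 3, 1 / 3, 1 / 3],  # all classes tied, e.g. every LF abstained
]
random_preds = break_ties(probs, policy="random")
abstain_preds = break_ties(probs, policy="abstain")
print(random_preds, abstain_preds)
```

Because the pick depends only on the row index, repeated calls return the same predictions, and every row receives a label under the "random" policy.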

**Label Model**

In [28]:
score_model(model=daystream_model, L=L_sd_dev, Y=Y_sd_dev, tie_break_policy="random")


Unnamed: 0,Metric,Micro Average,Macro Average
0,precision,0.830957,0.737816
1,recall,0.830957,0.75477
2,f1,0.830957,0.725519
3,accuracy,0.830957,0.830957
4,coverage,1.0,1.0


In [29]:
score_model(model=daystream_model, L=L_sd_dev, Y=Y_sd_dev, tie_break_policy="random", labels=positive_event_role_indices)

Unnamed: 0,Metric,Micro Average,Macro Average
0,precision,0.789474,0.725883
1,recall,0.769231,0.743085
2,f1,0.779221,0.711639
3,accuracy,0.830957,0.830957
4,coverage,1.0,1.0


**Label model without a class balance prior inferred from SD4M training set**

In [30]:
score_model(model=daystream_without_sd4m_cb, L=L_sd_dev, Y=Y_sd_dev, tie_break_policy="random")

Unnamed: 0,Metric,Micro Average,Macro Average
0,precision,0.688391,0.469263
1,recall,0.688391,0.742119
2,f1,0.688391,0.536639
3,accuracy,0.688391,0.688391
4,coverage,0.759674,0.759674


In [31]:
score_model(model=daystream_without_sd4m_cb, L=L_sd_dev, Y=Y_sd_dev, tie_break_policy="random", labels=positive_event_role_indices)

Unnamed: 0,Metric,Micro Average,Macro Average
0,precision,0.518395,0.420876
1,recall,0.794872,0.754507
2,f1,0.62753,0.515303
3,accuracy,0.688391,0.688391
4,coverage,0.759674,0.759674


**Majority Label Voter**

In [32]:
score_model(model=daystream_mlv, L=L_sd_dev, Y=Y_sd_dev, tie_break_policy="random")

Unnamed: 0,Metric,Micro Average,Macro Average
0,precision,0.694501,0.482654
1,recall,0.694501,0.833643
2,f1,0.694501,0.557398
3,accuracy,0.694501,0.694501
4,coverage,0.763747,0.763747


In [33]:
score_model(model=daystream_mlv, L=L_sd_dev, Y=Y_sd_dev, tie_break_policy="random", labels=positive_event_role_indices)

Unnamed: 0,Metric,Micro Average,Macro Average
0,precision,0.525253,0.435558
1,recall,0.8,0.854507
2,f1,0.634146,0.537628
3,accuracy,0.694501,0.694501
4,coverage,0.763747,0.763747


### Do predictions on the daystream data

In [34]:
daystream_probs = daystream_model.predict_proba(L=L_daystream)

In the proposed workflow one would filter out all data points that were not labeled by any of the labeling functions.
In the actual pipeline we instead multiply the probabilities of abstained data points by zero so that they look like padding instances when fed into the end model.
We propose this workaround because examples that are filtered out here would be treated as negative examples by default in the end model.
We also cannot afford to filter out a whole document, and potentially lose valuable training examples, just because one trigger/role example was not labeled.
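
A minimal sketch of this zeroing step is shown below. It is an assumption about what `utils.zero_out_abstains` does, based only on the description above; the name `zero_out_abstains_sketch` and the toy arrays are hypothetical.

```python
import numpy as np

ABSTAIN = -1


def zero_out_abstains_sketch(probs, L):
    """Set the probability row to all zeros where every LF abstained."""
    probs = np.array(probs, dtype=float)  # copy, so the input stays untouched
    all_abstain = (np.asarray(L) == ABSTAIN).all(axis=1)
    probs[all_abstain] = 0.0
    return probs


# Toy data: the second example received no vote from any labeling function.
probs_toy = [[0.9, 0.1], [0.5, 0.5]]
L_toy = [[0, -1], [-1, -1]]
zeroed = zero_out_abstains_sketch(probs_toy, L_toy)
print(zeroed)
```

The row stays in place, so the document it belongs to remains complete, while the all-zero probabilities mark the example as unlabeled.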

In [35]:
labeled_daystream_with_abstains = pipeline.merge_event_role_examples(df_daystream, utils.zero_out_abstains(daystream_probs, L_daystream))
labeled_daystream_with_abstains.reset_index(level=0).to_json(DATA_DIR + "/save_daystreamv6_roles_with_abstains.jsonl", orient='records', lines=True, force_ascii=False)

INFO:wsee:Merging event role examples that belong to the same document


## Step 6: Daystream Snorkel Labeling Check

To look at the daystream labeling it would be best to remove the abstains.

In [36]:
from snorkel.labeling import filter_unlabeled_dataframe

df_daystream_filtered, probs_daystream_filtered = filter_unlabeled_dataframe(
    X=df_daystream, y=daystream_probs, L=L_daystream
)
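
Conceptually, the filtering keeps only rows with at least one non-abstain vote. The sketch below illustrates this on toy data with a plain list standing in for the dataframe; it is not Snorkel's `filter_unlabeled_dataframe`, and the function name is hypothetical.

```python
import numpy as np

ABSTAIN = -1


def filter_unlabeled(rows, probs, L):
    """Keep rows (and their probabilities) that received at least one vote."""
    mask = (np.asarray(L) != ABSTAIN).any(axis=1)
    kept_rows = [row for row, keep in zip(rows, mask) if keep]
    return kept_rows, np.asarray(probs)[mask]


rows_toy = ["pair a", "pair b", "pair c"]
probs_toy = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
L_toy = [[0, -1], [-1, -1], [-1, 1]]  # "pair b" got no votes

kept_rows, kept_probs = filter_unlabeled(rows_toy, probs_toy, L_toy)
most_probable = kept_probs.argmax(axis=1).tolist()
print(kept_rows, most_probable)
```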

In [37]:
# Work on an explicit copy so the new columns are not assigned to a slice of df_daystream
df_daystream_filtered = df_daystream_filtered.copy()
df_daystream_filtered['role_probs'] = list(probs_daystream_filtered)
df_daystream_filtered['most_probable_class'] = [ROLE_LABELS[label_idx] for label_idx in probs_daystream_filtered.argmax(axis=1)]
df_daystream_filtered['max_class_prob'] = ["{:.2f}".format(class_prob) for class_prob in probs_daystream_filtered.max(axis=1)]


In [38]:
for role_class in ROLE_LABELS:
    print(f"{role_class}: {len(df_daystream_filtered[df_daystream_filtered['most_probable_class'] == role_class])} instances")

location: 1573 instances
delay: 192 instances
direction: 259 instances
start_loc: 620 instances
end_loc: 374 instances
start_date: 578 instances
end_date: 98 instances
cause: 194 instances
jam_length: 22 instances
route: 40 instances
no_arg: 35199 instances


In [39]:
df_daystream_filtered[df_daystream_filtered['most_probable_class'] == 'route'].sample(1)[['text', 'trigger', 'argument', 'most_probable_class', 'max_class_prob', 'role_probs']]

Unnamed: 0,text,trigger,argument,most_probable_class,max_class_prob,role_probs
24133,Update! #RE2 Bahnhof #FrankfurtFlughafenRegionalbf jetzt doch komplett gesperrt. Es kann zu kurzfristigen Umleitungen über den Fernbahnhof kommen. Bitte Reiseverbindung vor Abfahrt prüfen.,"{'id': 'c/4f507898-f2d1-43ec-8848-9e02a6649c7e', 'text': 'gesperrt', 'entity_type': 'trigger', 'start': 8, 'end': 9, 'char_start': 71, 'char_end': 79}","{'id': 'c/d3619f13-e1b7-4793-bbb6-1f61a708b1bb', 'text': '#RE2', 'entity_type': 'location_route', 'start': 2, 'end': 3, 'char_start': 8, 'char_end': 12}",route,1.0,"[2.4448004486308372e-05, 1.383903928250819e-07, 4.3938042657787925e-07, 5.6005805894121545e-11, 1.2557914136490533e-10, 3.3197433240426747e-09, 1.0169574821978974e-08, 2.2496540019897624e-07, 5.10450336571369e-07, 0.999974215211592, 9.926462407221225e-09]"
