### Snorkel Python Tutorial
+ Labeling Function: for labeling unlabeled data
+ Transformation Function: for data augmentation
+ Slicing Function: for selecting important subsets of the data


##### Main Fxns
+ Labeling Function
  - A labeling function is a function that outputs a label for some subset of the training dataset.
+ Transformation Functions
  - Used for data augmentation: the strategy of artificially enlarging an existing labeled training dataset by creating transformed copies of its data points.
+ Slicing Function
  - Slicing functions (SFs) handle the reality that many datasets have certain subsets, or slices, that are more important than others. In Snorkel, we can write SFs to (a) monitor specific slices and (b) improve model performance over them by adding representational capacity targeted on a per-slice basis.
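
To make the three function types concrete, here is a minimal plain-Python sketch. It deliberately does not use Snorkel's decorators, and the function names (`label_is_question`, `expand_contraction`, `is_short`) are hypothetical, chosen only for illustration.

```python
ABSTAIN, QUOTE, QUESTION = -1, 0, 1

def label_is_question(text: str) -> int:
    """Labeling function: vote QUESTION for text ending in '?', else abstain."""
    return QUESTION if text.strip().endswith("?") else ABSTAIN

def expand_contraction(text: str) -> str:
    """Transformation function: return a variant of the text that keeps its label."""
    return text.replace("What's", "What is")

def is_short(text: str) -> bool:
    """Slicing function: flag a subset (here, short sentences) worth monitoring."""
    return len(text.split()) < 6

sentence = "What's the closest thing to real magic? "
print(label_is_question(sentence))       # 1
print(expand_contraction(sentence))
print(is_short("Knowledge is power."))   # True
```

Each function takes one data point and returns a label, a new data point, or a membership flag respectively, which is the basic contract behind the three Snorkel concepts.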


#### Installation
```bash
pip install snorkel
```
#### Main Features
+ labeling functions (LFs) in Snorkel: noisy, programmatic rules and heuristics that assign labels to unlabeled training data

In [None]:
data = ["What would you name your boat if you had one? ",
"What's the closest thing to real magic? ",
"Who is the messiest person you know? ",
"What will finally break the internet? ",
"What's the most useless talent you have? ",
"What would be on the gag reel of your life? ",
"Where is the worst smelling place you've been?",
"What Secret Do You Have That No One Else Knows Except Your Sibling/S?"
"What Did You Think Was Cool Then, When You Were Young But Isn’t Cool Now?"
"When Was The Last Time You Did Something And Regret Doing It?"
"What Guilty Pleasure Makes You Feel Alive?"
"Any fool can write code that a computer can understand. Good programmers write code that humans can understand.",
"First, solve the problem. Then, write the code.",
"Experience is the name everyone gives to their mistakes.",
" In order to be irreplaceable, one must always be different",
"Java is to JavaScript what car is to Carpet.",
"Knowledge is power.",
"Sometimes it pays to stay in bed on Monday, rather than spending the rest of the week debugging Monday’s code.",
"Perfection is achieved not when there is nothing more to add, but rather when there is nothing more to take away.", 
"Ruby is rubbish! PHP is phpantastic!",
" Code is like humor. When you have to explain it, it’s bad.",
"Fix the cause, not the symptom.",
"Optimism is an occupational hazard of programming: feedback is the treatment. " ,
"When to use iterative development? You should use iterative development only on projects that you want to succeed.",
"Simplicity is the soul of efficiency.",
"Before software can be reusable it first has to be usable.",
"Make it work, make it right, make it fast.",
"Programmer: A machine that turns coffee into code.",
"Computers are fast; programmers keep it slow.",
"When I wrote this code, only God and I understood what I did. Now only God knows.",
"A son asked his father (a programmer) why the sun rises in the east, and sets in the west. His response? It works, don’t touch!",
"How many programmers does it take to change a light bulb? None, that’s a hardware problem.",
"Programming is like sex: One mistake and you have to support it for the rest of your life.",
"Programming can be fun, and so can cryptography; however, they should not be combined.",
"Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning.",
"Copy-and-Paste was programmed by programmers for programmers actually.",
"Always code as if the person who ends up maintaining your code will be a violent psychopath who knows where you live.",
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.",
"Algorithm: Word used by programmers when they don’t want to explain what they did.",
"Software and cathedrals are much the same — first we build them, then we pray.",
"There are two ways to write error-free programs; only the third works.",
"If debugging is the process of removing bugs, then programming must be the process of putting them in.",
"99 little bugs in the code. 99 little bugs in the code. Take one down, patch it around. 127 little bugs in the code …",
"Remember that there is no code faster than no code.",
"One man’s crappy software is another man’s full-time job.",
"No code has zero defects.",
"A good programmer is someone who always looks both ways before crossing a one-way street.",
"Deleted code is debugged code.",
"Don’t worry if it doesn’t work right. If everything did, you’d be out of a job.",
"It’s not a bug — it’s an undocumented feature.",
"It works on my machine.",
"It compiles; ship it.",
"There is no Ctrl-Z in life.",
"Whitespace is never white.",
"What’s your favorite way to spend a day off?",
"What type of music are you into?",
"What was the best vacation you ever took and why?",
"Where’s the next place on your travel bucket list and why?",
"What are your hobbies, and how did you get into them?",
"What was your favorite age growing up?",
"Was the last thing you read?",
"Would you say you’re more of an extrovert or an introvert?",
"What's your favorite ice cream topping?",
"What was the last TV show you binge-watched?",
"Are you into podcasts or do you only listen to music?",
"Do you have a favorite holiday? Why or why not?",
"If you could only eat one food for the rest of your life, what would it be?",
"Do you like going to the movies or prefer watching at home?",
"What’s your favorite sleeping position?",
"What’s your go-to guilty pleasure?",
"In the summer, would you rather go to the beach or go camping?",
"What’s your favorite quote from a TV show/movie/book?",
"How old were you when you had your first celebrity crush, and who was it?",
"What's one thing that can instantly make your day better?",
"Do you have any pet peeves",
"What’s your favorite thing about your current job?",
"What annoys you most?",
"What’s the career highlight you’re most proud of?",
"Do you think you’ll stay in your current gig awhile? Why or why not?",
"What type of role do you want to take on after this one?",
"Are you more of a work to live or a live to work type of person?",
"Does your job make you feel happy and fulfilled? Why or why not?",
"How would your 10-year-old self react to what you do now?",
"What do you remember most about your first job?",
"How old were you when you started working?",
"What’s the worst job you’ve ever had?",
"What originally got you interested in your current field of work?",
"Have you ever had a side hustle or considered having one?",
"What’s your favorite part of the workday?",
"What’s the best career decision you’ve ever made?",
"What’s the worst career decision you’ve ever made?",
"Do you consider yourself good at networking?"]

In [None]:
# Load EDA Pkgs
import pandas as pd
import numpy as np

In [None]:
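# Length of data
# Note: four strings in the list above have no trailing comma, so Python joins each
# with the next literal; the list therefore has 88 entries rather than 92.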
len(data)

88

In [None]:
# Shuffle data
import random


In [None]:
random.shuffle(data)

In [None]:
data

['When I wrote this code, only God and I understood what I did. Now only God knows.',
 'A good programmer is someone who always looks both ways before crossing a one-way street.',
 'If you could only eat one food for the rest of your life, what would it be?',
 'Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning.',
 'What would you name your boat if you had one? ',
 'Fix the cause, not the symptom.',
 'Whitespace is never white.',
 'Knowledge is power.',
 'What do you remember most about your first job?',
 'Algorithm: Word used by programmers when they don’t want to explain what they did.',
 'Deleted code is debugged code.',
 'Don’t worry if it doesn’t work right. If everything did, you’d be out of a job.',
 'It’s not a bug — it’s an undocumented feature.',
 "Where is the worst smelling place you've been?",
 'What Secret Do You Have T

In [None]:
# Convert to DF
df = pd.DataFrame({'sentences':data})

In [None]:
df.head()

|   | sentences |
|---|-----------|
| 0 | Debugging is twice as hard as writing the code... |
| 1 | What's the closest thing to real magic? |
| 2 | What was your favorite age growing up? |
| 3 | How old were you when you started working? |
| 4 | Does your job make you feel happy and fulfille... |


In [None]:
df.to_csv("unlabelled_sentences.csv",index=False)

In [None]:
# Split dataset
from sklearn.model_selection import train_test_split
df_train, df_test = train_test_split(df, train_size = 0.5)

#### Using Snorkel to Label our Dataset
+ Labeling functions (LFs) in Snorkel: noisy, programmatic rules and heuristics that assign labels to unlabeled training data

#### Common Types of Labeling Functions
+ Keyword searches: looking for specific words in a sentence
+ Pattern matching: looking for specific syntactic patterns
+ Third-party models: using a pre-trained model (usually a model for a different task than the one at hand)
+ Distant supervision: using an external knowledge base
+ Crowdworker labels: treating each crowdworker as a black-box function that assigns labels to subsets of the data
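
Keyword and pattern lookups are demonstrated later in this tutorial; distant supervision is not, so here is a rough plain-Python sketch of the idea. The `KNOWN_QUOTES` set stands in for an external knowledge base and is entirely hypothetical, as is the function name.

```python
ABSTAIN, QUOTE, QUESTION = -1, 0, 1

# Hypothetical external knowledge base: sentences already known to be quotes.
KNOWN_QUOTES = {
    "knowledge is power.",
    "fix the cause, not the symptom.",
}

def lf_known_quote(text: str) -> int:
    """Distant supervision: vote QUOTE when the text appears in the knowledge base."""
    return QUOTE if text.strip().lower() in KNOWN_QUOTES else ABSTAIN

print(lf_known_quote("Knowledge is power."))               # 0
print(lf_known_quote("What type of music are you into?"))  # -1
```

The function abstains on anything the knowledge base does not cover, leaving those data points to other labeling functions.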

In [None]:
df.shape

(88, 1)

In [None]:
!pip install snorkel

Collecting snorkel
  Downloading snorkel-0.9.7-py3-none-any.whl (145 kB)
Collecting munkres>=1.0.6
  Downloading munkres-1.1.4-py2.py3-none-any.whl (7.0 kB)
Collecting networkx<2.4,>=2.2
  Downloading networkx-2.3.zip (1.7 MB)
Collecting tensorboard<2.0.0,>=1.14.0
  Downloading tensorboard-1.15.0-py3-none-any.whl (3.8 MB)
Building wheels for collected packages: networkx
  Building wheel for networkx (setup.py) ... done
  Created wheel for networkx: filename=networkx-2.3-py2.py3-none-any.whl size=1556009 sha256=73022dcd880fd81db8cdb997cc858bb1f58b4b49e20a89ccec6215e8b2b8b3e8
  Stored in directory: /root/.cache/pip/wheels/44/e6/b8/4efaab31158e9e9ca9ed80b11f6b11130bac9a9672b3cbbeaf
Successfully built networkx
Installing collected packages: tensorboard, networkx, munkres, snorkel
  Attempting uninst

In [None]:
# Create our labeling fxn
from snorkel.labeling import labeling_function
from snorkel.labeling import PandasLFApplier
from snorkel.labeling import LFAnalysis

In [None]:
# Define Constants
ABSTAIN = -1
QUOTE = 0
QUESTION = 1

In [None]:
@labeling_function()
def lf_contains_5Ws(x):
    # Return QUESTION if any of "what|why|when|where|who|which" occurs in the sentence
    # (case-insensitive substring match), otherwise QUOTE. This LF never abstains.
    keywords = "what|why|when|where|who|which".split("|")
    return QUESTION if any(word in x['sentences'].lower() for word in keywords) else QUOTE

In [None]:
# Using Keyword Lookup: Method 1
@labeling_function()
def lf_keyword_lookup(x):
  keywords =  "what|why|when|how|where|which|who|whose".split("|")
  return QUESTION if any(word in x.sentences.lower() for word in keywords) else ABSTAIN
    

In [None]:
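# Keyword lookup: Method 2
@labeling_function()
def lf_contains_questions(x):
    # Intended to return QUESTION if any of "what|why|when|how|where|which|who|whose"
    # occurs in the sentence, otherwise ABSTAIN.
    # Caution: both branches return inside the loop, so it exits on the first
    # keyword and effectively only tests for "what". Moving `return ABSTAIN`
    # after the loop would check every keyword.
    for word in "what|why|when|how|where|which|who|whose".split("|"):
        if word in x.sentences.lower():
            return QUESTION
        else:
            return ABSTAIN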
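
The effect of returning inside the loop is easy to see in isolation. The following plain-Python sketch (no Snorkel decorator; `first_keyword_only` and `any_keyword` are hypothetical names) contrasts the loop as written above with a variant that abstains only after every keyword has been tried.

```python
ABSTAIN, QUESTION = -1, 1
KEYWORDS = "what|why|when|how|where|which|who|whose".split("|")

def first_keyword_only(text: str) -> int:
    # Mirrors the loop above: both branches return, so only "what" is tested.
    for word in KEYWORDS:
        if word in text.lower():
            return QUESTION
        else:
            return ABSTAIN
    return ABSTAIN

def any_keyword(text: str) -> int:
    # Abstain only after every keyword has been checked.
    for word in KEYWORDS:
        if word in text.lower():
            return QUESTION
    return ABSTAIN

sentence = "Where is the worst smelling place you've been?"
print(first_keyword_only(sentence))  # -1
print(any_keyword(sentence))         # 1
```

With the second form, this labeling function would behave like Method 1, so the two would no longer provide different signals.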

In [None]:
# Regex Lookup/Pattern Lookup
import re

@labeling_function()
def regex_check_out(x):
    # Vote QUESTION when "what" occurs anywhere in the sentence (case-insensitive)
    return QUESTION if re.search(r"what", x.sentences, flags=re.I) else ABSTAIN
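
Substring checks such as `"how" in text` also fire inside longer words. As a possible refinement (not part of the original notebook), a regex with word boundaries matches the question words only as whole words. The name `lf_question_word` is hypothetical and the function is shown without the Snorkel decorator.

```python
import re

ABSTAIN, QUESTION = -1, 1

# \b anchors each keyword at word boundaries, so "how" no longer matches "Show".
QUESTION_WORDS = re.compile(r"\b(what|why|when|how|where|which|who|whose)\b", re.I)

def lf_question_word(text: str) -> int:
    return QUESTION if QUESTION_WORDS.search(text) else ABSTAIN

print("how" in "Show me the code.".lower())    # True
print(lf_question_word("Show me the code."))   # -1
print(lf_question_word("How old were you?"))   # 1
```

This does not stop quotes such as "When I wrote this code..." from being labeled QUESTION; it only removes matches that occur inside other words.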

In [None]:
### Apply on Pandas
lfs = [lf_contains_questions,lf_keyword_lookup,lf_contains_5Ws]

applier = PandasLFApplier(lfs=lfs)
L_train = applier.apply(df=df_train)

100%|██████████| 44/44 [00:00<00:00, 4570.77it/s]


#### Narrative
+ PandasLFApplier.apply() produces a label matrix, a fundamental concept in Snorkel.
+ The label matrix is a NumPy array.
+ The matrix consists of one column for each LF and one row for each data point.
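
As a rough illustration of that layout, the sketch below builds a small label matrix by hand using nested lists instead of a NumPy array. The three labeling functions and the sentence sample are simplified stand-ins, not the LFs defined above.

```python
ABSTAIN, QUOTE, QUESTION = -1, 0, 1

def lf_ends_with_qmark(text):
    return QUESTION if text.strip().endswith("?") else ABSTAIN

def lf_starts_with_what(text):
    return QUESTION if text.lower().startswith("what") else ABSTAIN

def lf_no_qmark(text):
    return QUOTE if "?" not in text else ABSTAIN

sentences = [
    "What type of music are you into?",
    "Knowledge is power.",
    "Do you have any pet peeves",
]
lfs = [lf_ends_with_qmark, lf_starts_with_what, lf_no_qmark]

# One row per sentence, one column per labeling function.
label_matrix = [[lf(s) for lf in lfs] for s in sentences]
for row in label_matrix:
    print(row)
```

The third sentence shows how a heuristic can be wrong: it is a question without a question mark, so `lf_no_qmark` votes QUOTE.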

#### Evaluate performance on training set
+ Calculate the coverage of these LFs (i.e., the percentage of the dataset that they label) 

In [58]:
# Percentage of the dataset that each LF labeled [Coverage]
coverage_questions, coverage_keyword,coverage_5w = (L_train != ABSTAIN).mean(axis=0)
print(f"questions coverage: {coverage_questions * 100:.1f}%")
print(f"keyword coverage: {coverage_keyword * 100:.1f}%")
print(f"5w coverage: {coverage_5w * 100:.1f}%")

questions coverage: 36.4%
keyword coverage: 56.8%
5w coverage: 100.0%


In [59]:
L_train

array([[-1, -1,  0],
       [ 1,  1,  1],
       [-1, -1,  0],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1,  1,  1],
       [-1, -1,  0],
       [ 1,  1,  1],
       [ 1,  1,  1],
       [-1,  1,  0],
       [-1,  1,  1],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1,  1,  1],
       [ 1,  1,  1],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1,  1,  1],
       [-1, -1,  0],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1, -1,  0],
       [-1, -1,  0],
       [ 1,  1,  1],
       [ 1,  1,  1],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1,  1,  1],
       [-1, -1,  0],
       [-1, -1,  0],
       [ 1,  1,  1],
       [-1,  1,  1],
       [-1, -1,  0],
       [-1, -1,  0],
       [ 1,  1,  1],
       [ 1,  1,  1],
       [-1, -1,  0],
       [-1, -1,  0],
       [-1, -1,  0],
       [-1,  1,  0],
       [ 1,  1,  1],
       [-1,  1,  1]])

#### Evaluation of Labeling
+ Polarity: The set of unique labels this LF outputs (excluding abstains)
+ Coverage: The fraction of the dataset the LF labelled
+ Overlaps: The fraction of the dataset where this LF and at least one other LF label
+ Conflicts: The fraction of the dataset where this LF and at least one other LF label and disagree
+ Correct: The number of data points this LF labels correctly (if gold labels are provided)
+ Incorrect: The number of data points this LF labels incorrectly (if gold labels are provided)
+ Empirical Accuracy: The empirical accuracy of this LF (if gold labels are provided)
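
The first four statistics can be computed directly from the label matrix. The following is a plain-Python sketch of those definitions on a small hypothetical matrix. It is not Snorkel's implementation, and `lf_stats` is a made-up helper name.

```python
ABSTAIN = -1

# Small hypothetical label matrix: 4 data points (rows) x 3 LFs (columns).
L = [
    [-1, -1, 0],
    [ 1,  1, 1],
    [-1,  1, 0],
    [-1,  1, 1],
]

def lf_stats(L, j):
    n = len(L)
    labeled = [row for row in L if row[j] != ABSTAIN]
    # Rows where LF j and at least one other LF both assign a label.
    overlaps = [r for r in labeled
                if any(v != ABSTAIN for k, v in enumerate(r) if k != j)]
    # Overlapping rows where some other LF's label differs from LF j's.
    conflicts = [r for r in overlaps
                 if any(v not in (ABSTAIN, r[j]) for k, v in enumerate(r) if k != j)]
    return {
        "polarity": sorted({row[j] for row in labeled}),
        "coverage": len(labeled) / n,
        "overlaps": len(overlaps) / n,
        "conflicts": len(conflicts) / n,
    }

for j in range(3):
    print(j, lf_stats(L, j))
```

The remaining three statistics (Correct, Incorrect, Empirical Accuracy) need gold labels, which this tutorial's dataset does not have.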

In [None]:
from snorkel.labeling import LFAnalysis
LFAnalysis(L=L_train, lfs=lfs).lf_summary()

|   | j | Polarity | Coverage | Overlaps | Conflicts |
|---|---|----------|----------|----------|-----------|
| lf_contains_questions | 0 | [1] | 0.363636 | 0.363636 | 0.0 |
| lf_keyword_lookup | 1 | [1] | 0.568182 | 0.568182 | 0.045455 |
| lf_contains_5Ws | 2 | [0, 1] | 1.0 | 0.568182 | 0.045455 |


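#### Narrative
+ lf_contains_5Ws reaches 100% coverage only because it never abstains. Of the two LFs that can abstain, lf_keyword_lookup (column 1) has the higher coverage, so we inspect its labels next.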


In [None]:
# Find the rows that lf_keyword_lookup (column 1) labeled as QUESTION
df_train.iloc[L_train[:, 1] == QUESTION]

|   | sentences |
|---|-----------|
| 51 | What was the last TV show you binge-watched? |
| 27 | How would your 10-year-old self react to what ... |
| 19 | Code is like humor. When you have to explain ... |
| 28 | What type of music are you into? |
| 67 | What’s the worst job you’ve ever had? |
| 71 | Programming can be fun, and so can cryptograph... |
| 76 | A good programmer is someone who always looks ... |
| 21 | What’s your favorite way to spend a day off? |
| 63 | What will finally break the internet? |
| 17 | Where is the worst smelling place you've been? |


In [None]:
# Assign Labels
df_train2 = df_train.copy()

In [None]:
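# Select the rows that lf_keyword_lookup labeled as QUESTION and store them in a new column.
# Note: this assigns the selected one-column DataFrame, so 'Labels' holds the matching
# sentence text (NaN for the other rows), not the label value.
df_train2['Labels'] = df_train.iloc[L_train[:, 1] == QUESTION]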

In [None]:
df_train2

|   | sentences | Labels |
|---|-----------|--------|
| 87 | Optimism is an occupational hazard of programm... |  |
| 51 | What was the last TV show you binge-watched? | What was the last TV show you binge-watched? |
| 50 | In order to be irreplaceable, one must always... |  |
| 29 | Deleted code is debugged code. |  |
| 27 | How would your 10-year-old self react to what ... | How would your 10-year-old self react to what ... |
| 19 | Code is like humor. When you have to explain ... | Code is like humor. When you have to explain ... |
| 11 | Make it work, make it right, make it fast. |  |
| 28 | What type of music are you into? | What type of music are you into? |
| 67 | What’s the worst job you’ve ever had? | What’s the worst job you’ve ever had? |
| 71 | Programming can be fun, and so can cryptograph... | Programming can be fun, and so can cryptograph... |


In [None]:
# Find where in the dataset it was  not labeled as Questions and assign it the new label
# df_train2['Labels'] = df_train.iloc[L_train[:, 1] == ABSTAIN]

In [None]:
### Group Datapoints by their predicted labels
### get_label_buckets(...) to group data points by their predicted label and/or true labels.

from snorkel.analysis import get_label_buckets

In [None]:
buckets = get_label_buckets(L_train[:, 0], L_train[:, 1])


In [None]:
buckets

{(-1,
  -1): array([ 1,  2,  3,  5,  8,  9, 11, 12, 13, 17, 20, 27, 28, 30, 32, 33, 35,
        37, 38, 39]),
 (-1, 1): array([ 6,  7, 16, 18, 19, 26, 29, 40, 42]),
 (1, 1): array([ 0,  4, 10, 14, 15, 21, 22, 23, 24, 25, 31, 34, 36, 41, 43])}

In [None]:
df_train.iloc[buckets[(ABSTAIN, QUESTION)]]

|   | sentences |
|---|-----------|
| 19 | Code is like humor. When you have to explain ... |
| 71 | Programming can be fun, and so can cryptograph... |
| 76 | A good programmer is someone who always looks ... |
| 17 | Where is the worst smelling place you've been? |
| 47 | A son asked his father (a programmer) why the ... |
| 5 | Do you have a favorite holiday? Why or why not? |
| 72 | Where’s the next place on your travel bucket l... |
| 80 | How many programmers does it take to change a ... |
| 39 | How old were you when you had your first celeb... |
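
Grouping by label pairs can be pictured as building a dictionary keyed by label tuples. The sketch below is a plain-Python approximation of that idea, with a hypothetical helper `label_buckets` and made-up label columns. It is not Snorkel's `get_label_buckets` implementation.

```python
from collections import defaultdict

def label_buckets(*label_columns):
    """Group row indices by the tuple of labels they received."""
    buckets = defaultdict(list)
    for i, labels in enumerate(zip(*label_columns)):
        buckets[labels].append(i)
    return dict(buckets)

# Hypothetical outputs of two labeling functions over five data points.
lf_a = [-1, 1, -1, -1, 1]
lf_b = [-1, 1,  1, -1, 1]

buckets = label_buckets(lf_a, lf_b)
print(buckets[(-1, 1)])  # rows where lf_a abstained but lf_b voted 1
```

Looking at the bucket where one LF abstains and another votes is a quick way to find sentences that only one heuristic catches.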


### Writing an LF that uses a third-party model

In [None]:
from snorkel.preprocess.nlp import SpacyPreprocessor

# The SpacyPreprocessor parses the text in text_field and
# stores the new enriched representation in doc_field
spacy = SpacyPreprocessor(text_field="sentences", doc_field="doc", memoize=True)

In [None]:
# @labeling_function(pre=[spacy])
# def has_person(x):
#     """Ham comments mention specific people and are short."""
#     if len(x.doc) < 20 and any([ent.label_ == "PERSON" for ent in x.doc.ents]):
#         return HAM
#     else:
#         return ABSTAIN

In [None]:
@labeling_function(pre=[spacy])
def lf_is_question(x):
    """Vote QUESTION if the text ends with '?' or contains an adverb (POS tag ADV)."""
    if x.doc.text.endswith('?') or any(token.pos_ == "ADV" for token in x.doc):
        return QUESTION
    else:
        return ABSTAIN

In [None]:
### Simplified
from snorkel.labeling.lf.nlp import nlp_labeling_function


# @nlp_labeling_function()
# def has_person_nlp(x):
#     """Ham comments mention specific people and are short."""
#     if len(x.doc) < 20 and any([ent.label_ == "PERSON" for ent in x.doc.ents]):
#         return HAM
#     else:
#         return ABSTAIN

In [None]:
### Apply on Pandas
lfs2 = [lf_contains_questions,lf_keyword_lookup,lf_contains_5Ws,lf_is_question]

applier2 = PandasLFApplier(lfs=lfs2)
L_train2 = applier2.apply(df=df_train)

100%|██████████| 44/44 [00:00<00:00, 3847.26it/s]


In [None]:
LFAnalysis(L=L_train2, lfs=lfs2).lf_summary()

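|   | j | Polarity | Coverage | Overlaps | Conflicts |
|---|---|----------|----------|----------|-----------|
| lf_contains_questions | 0 | [1] | 0.363636 | 0.363636 | 0.0 |
| lf_keyword_lookup | 1 | [1] | 0.568182 | 0.568182 | 0.045455 |
| lf_contains_5Ws | 2 | [0, 1] | 1.0 | 0.840909 | 0.318182 |
| lf_is_question | 3 | [1] | 0.795455 | 0.795455 | 0.318182 |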
lf_is_question,3,[1],0.795455,0.795455,0.318182


#### Converting the LF Outputs into a Single Label
+ MajorityLabelVoter:
  - A simple baseline is to take the majority vote on a per-data point basis: if more LFs voted QUESTION than QUOTE, label it QUESTION (and vice versa).

  - Issues
    + LFs have varied accuracies and coverages, yet every vote counts equally
    + LFs may be correlated, resulting in certain signals being overrepresented in a majority-vote-based model
+ LabelModel
  - Estimates the accuracy of each LF from the agreements and disagreements in the label matrix and weights the votes accordingly.
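
The majority-vote baseline is simple enough to sketch in plain Python. This is an illustration of the idea on single rows of a label matrix, assuming ties and all-abstain rows yield ABSTAIN. The helper name `majority_vote` is hypothetical and this is not Snorkel's `MajorityLabelVoter` code.

```python
from collections import Counter

ABSTAIN, QUOTE, QUESTION = -1, 0, 1

def majority_vote(row):
    """Return the most common non-abstain label, or ABSTAIN on a tie or no votes."""
    votes = Counter(label for label in row if label != ABSTAIN)
    if not votes:
        return ABSTAIN
    ranked = votes.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

print(majority_vote([-1, -1, 0]))  # 0
print(majority_vote([1, 1, 1]))    # 1
print(majority_vote([-1, 1, 0]))   # -1
```

The last row is a one-to-one tie between QUESTION and QUOTE, which is exactly the case where weighting LFs by estimated accuracy can help.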

In [45]:
from snorkel.labeling.model import MajorityLabelVoter

majority_model = MajorityLabelVoter()
preds_train = majority_model.predict(L=L_train)

In [46]:
preds_train

array([ 0,  1,  0,  0,  1,  1,  0,  1,  1, -1,  1,  0,  1,  0,  1,  1,  1,
        0,  1,  1,  0,  0,  1,  0,  0,  1,  1,  0,  1,  1,  0,  0,  1,  1,
        0,  0,  1,  1,  0,  0,  0, -1,  1,  1])

#### Using LabelModel to Pick the Best

In [None]:
from snorkel.labeling.model import LabelModel

In [None]:
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train=L_train, n_epochs=500, log_freq=100, seed=123)

In [47]:
res = label_model.predict(L_train)

In [56]:
# Predict again; with tie_break_policy="abstain", ties are returned as ABSTAIN (-1)
df_train2["new_label"] = label_model.predict(L=L_train, tie_break_policy="abstain")

In [48]:
res

array([0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 0,
       1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1])

In [51]:
df_train2['final_label'] = res

In [57]:
df_train2

|   | sentences | Labels | final_label | new_label |
|---|-----------|--------|-------------|-----------|
| 87 | Optimism is an occupational hazard of programm... |  | 0 | 0 |
| 51 | What was the last TV show you binge-watched? | What was the last TV show you binge-watched? | 1 | 1 |
| 50 | In order to be irreplaceable, one must always... |  | 0 | 0 |
| 29 | Deleted code is debugged code. |  | 0 | 0 |
| 27 | How would your 10-year-old self react to what ... | How would your 10-year-old self react to what ... | 1 | 1 |
| 19 | Code is like humor. When you have to explain ... | Code is like humor. When you have to explain ... | 1 | 1 |
| 11 | Make it work, make it right, make it fast. |  | 0 | 0 |
| 28 | What type of music are you into? | What type of music are you into? | 1 | 1 |
| 67 | What’s the worst job you’ve ever had? | What’s the worst job you’ve ever had? | 1 | 1 |
| 71 | Programming can be fun, and so can cryptograph... | Programming can be fun, and so can cryptograph... | 0 | 0 |


#### Narrative
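+ On the rows shown, the LabelModel labels match what we would expect for most sentences, but not all: row 19, a quote that happens to contain "when", is labeled QUESTION. Without gold labels we cannot quantify the accuracy.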
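
One way to back a claim like this with a number is to hand-label a few sentences and compare them with the predictions. The sketch below shows the arithmetic in plain Python. The `preds` and `gold` values are hypothetical, not taken from this dataset, and `accuracy` is a made-up helper.

```python
ABSTAIN, QUOTE, QUESTION = -1, 0, 1

def accuracy(preds, gold):
    """Fraction of non-abstained predictions that match the gold labels."""
    scored = [(p, g) for p, g in zip(preds, gold) if p != ABSTAIN]
    if not scored:
        return 0.0
    return sum(p == g for p, g in scored) / len(scored)

# Hypothetical predictions for five sentences and their hand-assigned gold labels.
preds = [QUOTE, QUESTION, QUESTION, ABSTAIN, QUOTE]
gold  = [QUOTE, QUESTION, QUOTE,    QUOTE,   QUOTE]

print(f"accuracy: {accuracy(preds, gold):.2f}")  # accuracy: 0.75
```

Abstained predictions are skipped here; counting them as errors instead is an equally valid choice and gives a lower score.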

In [53]:
# Check Accuracy
# majority_acc = majority_model.score(L=L_test, Y=Y_test, tie_break_policy="random")[
#     "accuracy"
# ]
# print(f"{'Majority Vote Accuracy:':<25} {majority_acc * 100:.1f}%")

# label_model_acc = label_model.score(L=L_test, Y=Y_test, tie_break_policy="random")[
#     "accuracy"
# ]
# print(f"{'Label Model Accuracy:':<25} {label_model_acc * 100:.1f}%")

In [None]:
#### Summary Code

# Import Pkgs
import re
from snorkel.labeling import labeling_function,PandasLFApplier,LFAnalysis
from snorkel.labeling.model import MajorityLabelVoter,LabelModel

## Define Constants
ABSTAIN = -1
QUOTE = 0
QUESTION = 1

## Create Labeling Functions
# Using Keyword Lookup: Method 1
@labeling_function()
def lf_keyword_lookup(x):
  keywords =  "what|why|when|how|where|which|who|whose".split("|")
  return QUESTION if any(word in x.sentences.lower() for word in keywords) else ABSTAIN


# Using Regex Lookup (requires the re module)
@labeling_function()
def lf_regex_contains_what(x):
    return QUESTION if re.search(r"what", x.sentences, flags=re.I) else ABSTAIN


## Apply on Pandas
lfs = [lf_keyword_lookup,lf_regex_contains_what]
applier = PandasLFApplier(lfs=lfs)
L_train = applier.apply(df=df_train)

### Evaluate Labeling Performance
LFAnalysis(L=L_train, lfs=lfs).lf_summary()

# Build Model
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train=L_train, n_epochs=500, log_freq=100, seed=123)

# Make Prediction
# With tie_break_policy="abstain", ties are returned as ABSTAIN (-1)
df_train2 = df_train.copy()
df_train2["new_label"] = label_model.predict(L=L_train, tie_break_policy="abstain")

In [60]:
#### Thanks For Watching
#### Jesus Saves @JCharisTech
#### Jesse E.Agbe(JCharis)