# Chemical-Disease Relation (CDR) Tutorial

In this example, we'll write an application to extract *mentions of* **chemical-induced-disease relationships** from PubMed abstracts, as per the [BioCreative CDR Challenge](http://www.biocreative.org/resources/corpora/biocreative-v-cdr-corpus/). This tutorial shows off some of the more advanced features of Snorkel, so we'll assume you've followed the Intro tutorial.

Let's start by reloading from the last notebook.

In [1]:
%load_ext autoreload
%autoreload 2

%matplotlib inline

from snorkel import SnorkelSession

session = SnorkelSession()

In [None]:
from snorkel.models import Candidate, candidate_subclass

BiomarkerCondition = candidate_subclass('BiomarkerCondition', ['biomarker', 'condition'])

train_cands = session.query(BiomarkerCondition).filter(BiomarkerCondition.split == 0).all()
dev_cands = session.query(BiomarkerCondition).filter(BiomarkerCondition.split == 1).all()
print(train_cands)

[BiomarkerCondition(Span("EGFR", sentence=4224, chars=[85,88], words=[17,17]), Span("lung cancer", sentence=4224, chars=[156,166], words=[37,38])), BiomarkerCondition(Span("EGFR", sentence=4224, chars=[134,137], words=[32,32]), Span("lung cancer", sentence=4224, chars=[156,166], words=[37,38])), BiomarkerCondition(Span("NRAS", sentence=4224, chars=[120,123], words=[26,26]), Span("lung cancer", sentence=4224, chars=[156,166], words=[37,38])), BiomarkerCondition(Span("NRF2", sentence=4224, chars=[42,45], words=[7,7]), Span("lung cancer", sentence=4224, chars=[156,166], words=[37,38])), BiomarkerCondition(Span("BRAF", sentence=4224, chars=[74,77], words=[13,13]), Span("lung cancer", sentence=4224, chars=[156,166], words=[37,38])), BiomarkerCondition(Span("ALK", sentence=4224, chars=[95,97], words=[20,20]), Span("lung cancer", sentence=4224, chars=[156,166], words=[37,38])), BiomarkerCondition(Span("KRAS", sentence=4224, chars=[127,130], words=[29,29]), Span("lung cancer", sentence=4224, c

# Part III: Writing LFs

This tutorial features some more advanced LFs than the intro tutorial, with more focus on distant supervision and dependencies between LFs.

### Distant supervision approaches

We'll use the [Comparative Toxicogenomics Database](http://ctdbase.org/) (CTD) for distant supervision. The CTD lists chemical-condition entity pairs under three categories: therapy, marker, and unspecified. Therapy means the chemical treats the condition, marker means the chemical is typically present with the condition, and unspecified is...unspecified. We can write LFs based on these categories.
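
As a minimal sketch of what such a distant-supervision LF could look like: the `ctd_pairs` dictionary, its placeholder entries, and the `LF_in_ctd` helper below are hypothetical stand-ins (a real run would load the pairs from the CTD files), candidates are reduced to plain `(chemical, condition)` string tuples rather than Snorkel candidate objects, and the sign assigned to each category is an assumption, not a rule taken from the CTD.

```python
# Hypothetical in-memory stand-in for the CTD: (chemical, condition) -> category.
ctd_pairs = {
    ("chemical_a", "condition_x"): "marker",
    ("chemical_b", "condition_y"): "therapy",
    ("chemical_c", "condition_z"): "unspecified",
}

# Assumed vote per category: +1 = relation holds, -1 = relation does not hold.
CATEGORY_LABELS = {"marker": 1, "therapy": -1, "unspecified": -1}

def LF_in_ctd(pair):
    """Vote from the pair's CTD category; abstain (0) if the pair is not listed."""
    key = (pair[0].lower(), pair[1].lower())
    category = ctd_pairs.get(key)
    return CATEGORY_LABELS.get(category, 0)
```

Splitting this into one LF per category (rather than one combined LF) would let the generative model learn a separate accuracy for each category.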

### Text pattern approaches

Now we'll use some LF helpers to create LFs based on indicative text patterns. We came up with these rules by using the viewer to examine training candidates and noting frequent patterns.

In [None]:
import re
#from snorkel.lf_terms import *
from snorkel.lf_helpers import get_doc_candidate_spans
from snorkel.lf_helpers import get_sent_candidate_spans
from snorkel.lf_helpers import get_left_tokens, get_right_tokens

#umls_dict              = load_umls_dictionary()
#chemicals              = load_chemdner_dictionary()
#abbrv2text, text2abbrv = load_specialist_abbreviations()

keyWords = ["associate", "express", "marker", "biomarker", "elevated", "decreased",
            "correlation", "correlates", "found", "diagnose", "variant", "appear",
            "connect", "relate", "exhibit", "indicate", "signify", "show", "demonstrate",
            "reveal", "suggest", "evidence", "elevation", "indication", "diagnosis",
            "variation", "modification", "suggestion", "link", "derivation", "denote",
            "denotation", "demonstration", "magnification", "depression", "boost", "level",
            "advance", "augmentation", "lessening", "enhancement", "expression", "buildup",
            "diminishing", "diminishment", "reduction", "drop", "dwindling", "lowering"]

negationWords = ["not", "nor", "neither"]

# Document-level LFs:
#--------------------
#def LF_undefined_abbreviation(c):
#    '''Candidate is a known abbreviation, but no corresponding full name in document'''
#    doc_spans = get_doc_candidate_spans(c)
#    phrase = c[0].get_span().lower()
#    mentions = set([s.get_span().lower() for s in doc_spans])
#    if len(phrase) > 1 and phrase in abbrv2text and not set(abbrv2text[phrase].keys()).intersection(mentions):
#        return -1
    
# Sentence-level LFs:
#---------------------
def LF_contiguous_mentions(c):
    '''Contiguous candidates are likely wrong'''
    neighbor_spans = get_sent_candidate_spans(c)
    start, end = c[0].get_word_start(), c[0].get_word_end()
    for s in neighbor_spans:
        if s.get_word_end() + 1 == start or s.get_word_start() - 1 == end:
            return -1
    return 0

# Mention-level LFs:
#-------------------
def LF_tumors_growths(c):
    '''<TYPE> tumor, polyp, cyst, or similar growth'''
    phrase = " ".join(c[0].get_attrib_tokens('lemmas'))
    return 1 if re.search(r"^(\w* ){0,2}(['] )*(tumor|tumour|polyp|pilomatricoma|cyst|lipoma)$", phrase) else 0

def LF_cancer(c):
    '''<TYPE> cancer'''
    phrase = " ".join(c[0].get_attrib_tokens('lemmas'))
    return 1 if re.search(r"\w* cancer", phrase) else 0

def LF_disease_syndrome(c):
    '''<TYPE> disease or <TYPE> syndrome'''
    phrase = " ".join(c[0].get_attrib_tokens('lemmas'))
    return 1 if re.search(r"\w* (disease|syndrome)+", phrase) else 0

#def LF_indicators(c):
#    '''Indicator words'''
#    return 1 if " ".join(c[0].get_attrib_tokens()).lower() in indicators else 0

#def LF_common_disease(c):
#    '''Common disease'''
#    return 1 if " ".join(c[0].get_attrib_tokens()).lower() in common_disease else 0

#def LF_common_disease_acronyms(c):
#    '''Common disease acronyms'''
#    return 1 if " ".join(c[0].get_attrib_tokens()) in common_disease_acronyms else 0

def LF_deficiency_of(c):
    '''deficiency of <TYPE>'''
    phrase = " ".join(c[0].get_attrib_tokens()).lower()
    return 1 if phrase.endswith('deficiency') or phrase.startswith('deficiency') or phrase.endswith('dysfunction') else 0

#def LF_positive_indicator(c):
#    flag = False
#    for i in c[0].get_attrib_tokens():
#        if i.lower() in positive_indicator:
#            flag = True
#            break
#    return 1 if flag else 0

def LF_left_positive_argument(c):
    '''One or two modifier words followed by a disease head noun (infection, lesion, carcinoma, ...)'''
    phrase = " ".join(c[0].get_attrib_tokens('lemmas')).lower()
    pattern = r"(\w+ ){1,2}(infection|lesion|neoplasm|attack|defect|anomaly|abnormality|degeneration|carcinoma|lymphoma|tumor|tumour|deficiency|malignancy|hypoplasia|disorder|deafness|weakness|condition|dysfunction|dystrophy)$"
    return 1 if re.search(pattern, phrase) else 0

def LF_right_negative_argument(c):
    '''Phrase starting with "history of", "mitochondrial", or "amino acid"'''
    # Note: despite its name, this LF votes +1 on a match.
    phrase = " ".join(c[0].get_attrib_tokens('lemmas')).lower()
    pattern = r"^(history of|mitochondrial|amino acid)( \w+){1,2}"
    return 1 if re.search(pattern, phrase) else 0

def LF_medical_afixes(c):
    '''Common medical suffixes (-pathy, -itis, -oma, ...) or the prefixes hyper-/hypo-'''
    pattern = r"(\w+(pathy|stasis|trophy|plasia|itis|osis|oma|asis|asia)$|^(hyper|hypo)\w+)"
    phrase = " ".join(c[0].get_attrib_tokens('lemmas')).lower()
    return 1 if re.search(pattern, phrase) else 0

#def LF_adj_diseases(c):
#    return 1 if ' '.join(c[0].get_attrib_tokens()) in adj_diseases else 0


# Dictionary LFs:
#----------------
#def LF_SNOWMED_CT_sign_or_symptom(c):
#    return 1 if c[0].get_span() in umls_dict["snomedct"]["sign_or_symptom"] else 0

#def LF_SNOWMED_CT_disease_or_syndrome(c):
#    return 1 if c[0].get_span() in umls_dict["snomedct"]["disease_or_syndrome"] else 0

#def LF_MESH_disease_or_syndrome(c):
#    return 1 if c[0].get_span() in umls_dict["mesh"]["disease_or_syndrome"] else 0

#def LF_MESH_sign_or_symptom(c):
#    return 1 if c[0].get_span() in umls_dict["mesh"]["sign_or_symptom"] else 0


# Negative LFs:
#--------------
#def LF_organs(c):
#    phrase = " ".join(c[0].get_attrib_tokens()).lower()
#    return -1 if phrase in organs else 0      

#def LF_chemical_name(c):
#    phrase = " ".join(c[0].get_attrib_tokens())
#    return -1 if phrase in chemicals and not phrase.isupper() else 0

#def LF_bodysym(c):
#    phrase = " ".join(c[0].get_attrib_tokens()).lower()
#    return -1 if phrase in bodysym else 0  

def LF_protein_chemical_abbrv(c):
    '''Gene/protein/chemical name (any mention containing a digit)'''
    lemma = " ".join(c[0].get_attrib_tokens('lemmas'))
    return -1 if re.search(r"\d+", lemma) else 0

def LF_base_pair_seq(c):
    '''Base-pair sequence made up only of the letters G, A, C, T'''
    lemma = " ".join(c[0].get_attrib_tokens('lemmas'))
    return -1 if re.search(r"^[GACT]{2,}$", lemma) else 0

#def LF_too_vague(c):
#    phrase = " ".join(c[0].get_attrib_tokens('lemmas')).lower()
#    phrase_ = " ".join(c[0].get_attrib_tokens()).lower()
#    return -1 if phrase in vague or phrase_ in vague else 0


#COMMENTED OUT BECAUSE BREAKS CURRENT SNORKEL CODE BASE
# def LF_neg_surfix(c):
#     terms = ['deficiency', 'the', 'the', 'of', 'to', 'a']
#     rw = get_right_tokens(c, window=1, attrib='lemmas')
#     if len(rw) > 0 and rw[0].lower() in terms:
#         return -1
#     return 0

#def LF_non_common_disease(c):
#    '''Non common diseases'''
#    return -1 if " ".join(c[0].get_attrib_tokens()).lower() in non_common_disease else 0

#def LF_non_disease_acronyms(c):
#    '''Non common disease acronyms'''
#    return -1 if " ".join(c[0].get_attrib_tokens()) in non_disease_acronyms else 0

def LF_pos_in(c):
    '''Candidates beginning with a preposition or subordinating conjunction'''
    pos_tags = c[0].get_attrib_tokens('pos_tags')
    return -1 if "IN" in pos_tags[0:1] else 0


#def LF_right_window_incomplete(c):
#    return -1 if right_terms.intersection(get_right_tokens(c, window=2, attrib='lemmas')) else 0

#def LF_negative_indicator(c):
#    flag = False
#    for i in c[0].get_attrib_tokens():
#        if i.lower() in negative_indicator:
#            flag = True
#            break
#    return -1 if flag else 0

x = '''
# extra custom
#--------------
def presenceOfNot(m):
    for word in negationWords:
        if (word in m[0].get_right_tokens('lemmas', 20)) and (word in m.pre_window2('lemmas', 20)):
            return True
    return False
# 1
def LF_remove_same_word(m):
    if(m.mention1(attribute='words')[0] == m.mention2(attribute='words')[0]):
        return -1
    
def LF_distance(m):
    print "FIRST"
    print type(m)
    # if 'neuroendocrine' in m.lemmas:
    #     print m.lemmas
    # print m.dep_labels
    distance = abs(m.e2_idxs[0] - m.e1_idxs[0])
    count = 0
    for lemma in m.lemmas:
        if lemma == ',':
            count += 1
    if count > 1 and ',' in m.pre_window1('lemmas', 1):
        print m
        return 0
    if distance == 0:
        return -1
    if distance < 8:
        # print "RETURNING ONE"
        return 0
    else:
        return -1
    
def LF_roman_numeral(m):
    biomarker = (m.mention1(attribute='words')[0])
    unicodedata.normalize('NFKD', biomarker).encode('ascii','ignore')
    if re.match(r'((?<=\s)|(?<=^))(M{1,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})|M{0,4}(CM|CD|D?C{1,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})|M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{1,3})(IX|IV|V?I{0,3})|M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{1,3}))(?=\s)',
                biomarker):
        print "MATCHED ROMAN"
        print m.mention1(attribute='words')
        return -1

# 4
def LF_marker(m):
    return 1 if ( ('marker' in m[0].get_attrib_tokens('lemmas', 6) or 'biomarker' in m.post_window1('lemmas', 6)) and (
        'marker' in m.pre_window2('lemmas', 6) or 'biomarker' in m.pre_window2('lemmas', 6)) ) or (('marker' in m.pre_window1('lemmas', 6) or 'biomarker' in m.pre_window1('lemmas', 6)) and (
        'marker' in m.post_window2('lemmas', 6) or 'biomarker' in m.post_window2('lemmas', 6)))  else 0

# 9 (-1 if biomarker is confused with a name of a person)
def LF_People(m):
    return -1 if ('NNP' in m.mention1(attribute='poses')) else 0
# 51
def LF_possible(m):
    return -1 if ('possible' in m.pre_window1('lemmas', 20)) else 0
# 52
def LF_explore(m):
    return -1 if ('explore' in m.pre_window1('lemmas', 20)) else 0
# 53
def LF_key(m):
    # print m.pre_window1('lemmas', 20)
    return -1 if ('abbreviation' in m.pre_window1('lemmas', 20) or (
        'word' in m.pre_window1('lemmas', 20) and 'key' in m.pre_window1('lemmas', 20))) else 0
# 54
def LF_investigate(m):
    return -1 if ('investigate' in m.pre_window1('lemmas', 20)) else 0
# 55
def LF_yetToBeConfirmed(m):
    return -1 if ('yet' in m.post_window1('lemmas', 20) and 'to' in m.post_window1('lemmas', 20) and 'be' in m.post_window1(
        'lemmas', 20) and 'confirmed' in m.post_window1('lemmas', 20)) else 0
# 56
def LF_notAssociated(m):
    return -1 if ('not' in m.post_window1('lemmas', 20) and 'associated' in m.post_window2('lemmas', 20)) else 0
# 56
def LF_notRelated(m):
    return -1 if ('not' in m.post_window1('lemmas', 20) and 'related' in m.post_window2('lemmas', 20)) else 0
# 57
def LF_doesNotShow(m):
    return -1 if (
        'does' in m.post_window1('lemmas', 20) and 'not' in m.post_window1('lemmas', 20) and 'show' in m.post_window2(
            'lemmas', 20)) else 0
# 58
def LF_notLinked(m):
    return -1 if ('not' in m.post_window1('lemmas', 20) and 'linked' in m.post_window2('lemmas', 20)) else 0
# 59
def LF_notCorrelated(m):
    return -1 if ('not' in m.post_window1('lemmas', 20) and 'correlated' in m.post_window2('lemmas', 20)) else 0
# 60
def LF_disprove(m):
    return -1 if ('disprove' in m.post_window1('lemmas', 20)) else 0
# 62
def LF_doesNotSignify(m):
    return -1 if (
        'does' in m.post_window1('lemmas', 20) and 'not' in m.post_window1('lemmas', 20) and 'signify' in m.post_window(
            'lemmas', 20)) else 0
# 63
def LF_doesNotIndicate(m):
    print "SECOND"
    return -1 if (
        'does' in m.post_window1('lemmas', 20) and 'not' in m.post_window1('lemmas', 20) and 'indicate' in m.post_window(
            'lemmas', 20)) else 0
# 64
def LF_doesNotImply(m):
    print "THIRD"
    return -1 if (
        'does' in m.post_window1('lemmas', 20) and 'not' in m.post_window1('lemmas', 20) and 'imply' in m.post_window(
            'lemmas', 20)) else 0
# 65
def LF_studies(m):
    return 1 if (
        'studies' in m.pre_window1('lemmas', 20) and 'have' in m.pre_window1('lemmas', 20) and'reported' in m.pre_window1(
            'lemmas', 20)) else 0
# 66
def LF_studies2(m):
    return 1 if (
        'studies' in m.pre_window1('lemmas', 20) and 'have' in m.pre_window1('lemmas', 20) and 'disclosed' in m.pre_window1(
            'lemmas', 20)) else 0
# 67
def LF_studies3(m):
    return 1 if (
        'studies' in m.pre_window1('lemmas', 20) and 'have' in m.pre_window1('lemmas', 20) and'disclosed' in m.pre_window1('lemmas', 20)) else 0
# 68
def LF_studies4(m):
    return 1 if (
        'studies' in m.pre_window1('lemmas', 20) and 'have' in m.pre_window1('lemmas', 20) and 'expressed' in m.pre_window1(
            'lemmas', 20)) else 0
# 69
def LF_interesting(m):
    return 1 if (
        'is' in m.post_window1('lemmas', 20) and 'an' in m.post_window1('lemmas', 20) and 'interesting' in m.post_window1(
            'lemmas', 20) and 'target' in m.post_window1('lemmas', 20) and 'is' in m.pre_window2('lemmas', 20) and 'an' in
        m.pre_window2('lemmas', 20) and 'interesting' in m.pre_window2('lemmas', 20) and 'target' in m.pre_window2(
            'lemmas', 20)) else 0
# 70
def LF_discussion(m):
    return 1 if (
        'discussion' in m.pre_window1('lemmas', 20)) else 0
# 71
def LF_conclusion(m):
    if ('conclusion' in m.pre_window1('lemmas', 20) or 'conclusion' in m.pre_window2('lemmas', 20)):
        # print "FOUND"
        return 1
    else:
        return 0
# 72
def LF_recently(m):
    return 1 if (
        'recently' in m.pre_window1('lemmas', 20) or 'recently' in m.post_window1('lemmas', 20)) else 0
# 73
def LF_induced(m):
    return 1 if (
        'induced' in m.post_window1('lemmas', 20) and 'induced' in m.pre_window2('lemmas', 20)) else 0
# 74
def LF_treatment(m):
    return 1 if (
        'treatment' in m.pre_window1('lemmas', 20) or 'treatment' in m.post_window1('lemmas', 20)) else 0
# 75
def LF_auxpass(m):
    if not ('auxpass' and 'aux') in (m.post_window1('dep_labels', 20) and m.pre_window2('dep_labels', 20)):
        return -1
    else:
        return 0
# 75
def LF_inbetween(m):
    # with open('diseaseDatabase.pickle', 'rb') as f:
    #     diseaseDictionary = pickle.load(f)
    # with open('diseaseAbbreviationsDatabase.pickle', 'rb') as f:
    #     diseaseAbb = pickle.load(f)
    # with open('markerData.pickle', 'rb') as f:
    #     markerDatabase = pickle.load(f)
    # for marker in markerDatabase:
    #     if(marker in list[m.e1_idxs[0] : m.e2_idxs[0]]):
    #         return -1
    # for disease in diseaseDictionary:
    #     if (disease in list[m.e1_idxs[0]: m.e2_idxs[0]]):
    #         return -1
    # for disease in diseaseAbb:
    #     if (marker in list[m.e1_idxs[0]: m.e2_idxs[0]]):
    #         return -1
    return 0
# 76
def LF_patientsWith(m):
    return 1 if ('patient' in m.post_window2('lemmas', 3)) and ('with' in m.post_window2('lemmas',2)) else 0
# 77
def LF_isaBiomarker(m):
    post_window1_lemmas = m.post_window1('lemmas',20)
    pre_window2_lemmas = m.pre_window2('lemmas',20)
    if ('biomarker' in post_window1_lemmas and 'biomarker' in pre_window2_lemmas) or ('marker' in post_window1_lemmas and 'marker' in pre_window2_lemmas) or ('indicator' in post_window1_lemmas and 'indicator' in pre_window2_lemmas):
        marker_idx_post_window1 = -1
        markers = ['biomarker','marker','indicator']
        for marker in markers:
            try:
                # print post_window1_lemmas
                findMarker = post_window1_lemmas.index(marker)
                if not findMarker == -1:
                    marker_idx_post_window1 = findMarker
                    print marker
            except:
                pass
        if 'cop' in m.post_window1('dep_labels',20):
            try:
                cop_idx_post_window1 = m.post_window1('dep_labels',20).index('cop')
            except:
                pass
            
            print "MarkerIdx:"
            print marker_idx_post_window1
            print "ROOTIdx:"
            try:
                print  m.post_window1('dep_labels',marker_idx_post_window1)
                print  m.post_window1('dep_labels',marker_idx_post_window1).index('ROOT')
            except:
                pass
            print '\n'
            
            return 1 if ('nsubj' in m.mention1(attribute='dep_labels')) and (marker_idx_post_window1-cop_idx_post_window1 < 4)  else 0
    return 0
# 78
def LF_suspect(m):
    return -1 if ('suspect' in m.pre_window1('lemmas', 20) or 'suspect' in m.post_window1('lemmas', 20)) else 0
# 79
def LF_mark(m):
    return -1 if ( 'vmod' in m.post_window1('dep_labels', 20) and 'mark' in m.post_window1('dep_labels', 20) or'vmod' in m.pre_window1('dep_labels', 20) and 'mark' in m.pre_window1('dep_labels', 20)) else 0
'''
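
Before running pattern LFs over real candidates, it can help to sanity-check the regular expressions on plain strings. The sketch below copies three patterns from the mention-level LFs above; the `label_phrase` helper and the test phrases are illustrative assumptions, not part of Snorkel, and the phrases stand in for the space-joined lemmas that the real LFs read from a candidate.

```python
import re

# Patterns copied from the mention-level LFs above, written as raw strings.
TUMOR = r"^(\w* ){0,2}(['] )*(tumor|tumour|polyp|pilomatricoma|cyst|lipoma)$"
AFFIX = r"(\w+(pathy|stasis|trophy|plasia|itis|osis|oma|asis|asia)$|^(hyper|hypo)\w+)"
BASE_PAIRS = r"^[GACT]{2,}$"

def label_phrase(pattern, phrase, label):
    """Return `label` if `pattern` matches `phrase`, otherwise abstain with 0."""
    return label if re.search(pattern, phrase) else 0

# (pattern, phrase, label voted on a match); the phrases are made-up test inputs.
checks = [
    (TUMOR, "brain tumor", 1),
    (TUMOR, "tumor suppressor gene", 1),
    (AFFIX, "cardiomyopathy", 1),
    (AFFIX, "hypertension", 1),
    (BASE_PAIRS, "GATTACA", -1),
    (BASE_PAIRS, "EGFR", -1),
]
results = [label_phrase(pattern, phrase, label) for pattern, phrase, label in checks]
# results == [1, 0, 1, 1, -1, 0]
```

The second and last checks abstain: the tumor pattern is anchored at the end of the phrase, and "EGFR" contains letters outside G, A, C, T.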

### Composite LFs

The following LFs take some of the strongest distant supervision and text pattern LFs, and combine them to form more specific LFs. These LFs introduce some obvious dependencies within the LF set, which we will model later.
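
One way to build such a composite is a small combinator that only votes when two existing LFs agree. This is a sketch under assumptions: `LF_both` and the two toy LFs are hypothetical, and they operate on plain strings instead of Snorkel candidates so that the example runs on its own.

```python
def LF_both(lf_a, lf_b, label):
    """Build an LF that votes `label` only when lf_a and lf_b both vote `label`."""
    def composite(c):
        return label if lf_a(c) == label and lf_b(c) == label else 0
    # Give the composite a readable name so it can be told apart in LF statistics.
    composite.__name__ = "{}_and_{}".format(lf_a.__name__, lf_b.__name__)
    return composite

# Toy LFs over plain strings (stand-ins for LFs over candidates).
def LF_mentions_cancer(phrase):
    return 1 if "cancer" in phrase else 0

def LF_has_type_prefix(phrase):
    return 1 if len(phrase.split()) > 1 else 0

LF_typed_cancer = LF_both(LF_mentions_cancer, LF_has_type_prefix, 1)
```

Because `LF_typed_cancer` can only fire when both of its parts fire, its output is correlated with theirs by construction, which is exactly the kind of dependency the generative model needs to know about.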

### Rules based on context hierarchy

These last two rules make use of the context hierarchy. The first checks whether some other chemical mention is much closer to the candidate's disease mention than the candidate's own chemical mention is. The second does the analogous check for diseases.
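
The "closer mention" idea can be sketched with plain word indices. Everything here is an assumption made for illustration: the function name, the use of a single word index per mention, and the choice to read "much closer" as "less than half the distance". A real LF would take these positions from the candidate's spans and from the other spans in its sentence.

```python
def LF_closer_mention(disease_idx, chemical_idx, other_chemical_idxs):
    """Vote -1 if another chemical lies within half the candidate's chemical-disease distance."""
    dist = abs(disease_idx - chemical_idx)
    for idx in other_chemical_idxs:
        # "Much closer" is taken to mean strictly less than half the distance (assumption).
        if abs(disease_idx - idx) * 2 < dist:
            return -1
    return 0
```

For example, with the disease at word 20 and the candidate's chemical at word 4, another chemical at word 17 triggers a -1 vote, while one at word 10 does not.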

### Running the LFs on the training set

In [None]:
LFs = [LF_contiguous_mentions, LF_tumors_growths, LF_cancer, LF_disease_syndrome, LF_deficiency_of, LF_left_positive_argument, LF_right_negative_argument, LF_medical_afixes, LF_protein_chemical_abbrv, LF_base_pair_seq, LF_pos_in]

In [None]:
from snorkel.annotations import LabelAnnotator
labeler = LabelAnnotator(lfs=LFs)

In [None]:
%time L_train = labeler.apply(split=0)
L_train

In [None]:
L_train.lf_stats(session)
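
To make the columns of the statistics table concrete, here is a hand-rolled sketch that computes coverage, overlaps, and conflicts on a tiny dense label matrix. It is not Snorkel's implementation; it assumes the usual definitions (coverage: fraction of candidates the LF labels; overlaps: fraction it labels together with at least one other LF; conflicts: fraction where another LF gives a different non-zero label), and the `lf_summary` name and toy matrix are made up.

```python
def lf_summary(L):
    """Per-LF coverage, overlap and conflict rates for a dense label matrix (rows = candidates)."""
    n = float(len(L))
    stats = []
    for j in range(len(L[0])):
        covered = overlaps = conflicts = 0
        for row in L:
            if row[j] == 0:
                continue  # LF j abstains on this candidate
            covered += 1
            others = [v for k, v in enumerate(row) if k != j and v != 0]
            if others:
                overlaps += 1
            if any(v != row[j] for v in others):
                conflicts += 1
        stats.append({"coverage": covered / n,
                      "overlaps": overlaps / n,
                      "conflicts": conflicts / n})
    return stats

# Toy matrix: 4 candidates, 3 LFs, labels in {-1, 0, 1}.
L_toy = [
    [1, 1, 0],
    [1, -1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
toy_stats = lf_summary(L_toy)
```

Here the first LF covers three of four candidates, overlaps on two, and conflicts on one; the third LF never fires, so all of its rates are zero.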

# Part IV: Training the generative model

As mentioned above, we want to include the dependencies between our LFs when training the generative model. Snorkel makes it easy to do this! `DependencySelector` runs a fast structure learning algorithm over the matrix of LF outputs to identify a set of likely dependencies. We can see that these match up with our prior knowledge. For example, it identified a "reinforcing" dependency between `LF_c_induced_d` and `LF_ctd_marker_induce`. Recall that we constructed the latter using the former.

In [None]:
from snorkel.learning.structure import DependencySelector
ds = DependencySelector()
deps = ds.select(L_train, threshold=0.1)
len(deps)

In [None]:
deps
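
For intuition about what a "reinforcing" dependency looks like in the label matrix, one crude check is how often two LFs agree on the candidates they both label. This is explicitly not the structure learning algorithm that `DependencySelector` runs; the `agreement_rate` helper and the toy matrix are hypothetical and only illustrate the kind of signal involved.

```python
def agreement_rate(L, i, j):
    """Share of candidates labeled by both LF i and LF j on which they agree (None if no overlap)."""
    both = [(row[i], row[j]) for row in L if row[i] != 0 and row[j] != 0]
    if not both:
        return None
    return sum(1 for a, b in both if a == b) / float(len(both))

# Toy matrix: rows are candidates, columns are LFs.
L_pairs = [
    [1, 1, 0],
    [1, 1, -1],
    [0, 1, 1],
    [-1, -1, 0],
]
```

In this toy matrix the first two LFs agree on every candidate they share, which is what one would expect if one of them was built from the other.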

In [None]:
deps = set()

Now we'll train the generative model, passing the dependency set via the `deps` argument. Because the previous cell resets `deps` to an empty set, this run models no dependencies; skip that cell to use the learned ones. We'll also model LF propensity here, unlike the intro tutorial: in addition to learning the accuracies of the LFs, the model also learns how likely each LF is to label an example.

In [None]:
from snorkel.learning import GenerativeModel

gen_model = GenerativeModel(lf_propensity=True)
gen_model.train(
    L_train, deps=deps, decay=0.95, step_size=0.1/L_train.shape[0], reg_param=0.0
)

In [None]:
train_marginals = gen_model.marginals(L_train)

In [None]:
import matplotlib.pyplot as plt
plt.hist(train_marginals, bins=20)
plt.show()

In [None]:
gen_model.learned_lf_stats()

In [None]:
from snorkel.annotations import save_marginals
save_marginals(session, L_train, train_marginals)

### Checking performance against development set labels

Finally, we'll run the labeler on the development set, load in some external labels, then evaluate the LF performance. The external labels are applied via a small script for convenience. It maps the document-level relation annotations found in the CDR file to mention-level labels. Note that these will not be perfect, although they are pretty good. If we wanted to keep iterating, we could use `snorkel.lf_helpers.test_LF` against the dev set, or look at some false positive and false negative candidates.

In [None]:
from load_external_annotations_new import load_external_labels
load_external_labels(session, BiomarkerCondition, annotator_name='gold', label_fname='articles/output_spans_gold.tsv')

In [None]:
from snorkel.annotations import load_gold_labels
L_gold_dev = load_gold_labels(session, annotator_name='gold', split=1)
L_gold_dev
print(L_gold_dev.get_candidate(session, 99)[0])

In [22]:
L_dev = labeler.apply_existing(split=1)

Clearing existing...
Running UDF...

In [None]:
_ = gen_model.score(session, L_dev, L_gold_dev)
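
To see how metrics of this kind relate to the marginals, here is a small sketch that computes precision, recall, and F1 for the positive class by hand. It assumes the standard definitions, a 0.5 decision threshold, and gold labels in {-1, 1}; the `prf1` helper and the toy numbers are illustrative, not what `gen_model.score` runs internally.

```python
def prf1(marginals, gold, threshold=0.5):
    """Precision, recall and F1 for the positive class after thresholding the marginals."""
    pred = [1 if p > threshold else -1 for p in marginals]
    tp = sum(1 for p, g in zip(pred, gold) if p == 1 and g == 1)
    fp = sum(1 for p, g in zip(pred, gold) if p == 1 and g == -1)
    fn = sum(1 for p, g in zip(pred, gold) if p == -1 and g == 1)
    precision = tp / float(tp + fp) if tp + fp else 0.0
    recall = tp / float(tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy marginals and gold labels for four candidates.
toy_scores = prf1([0.9, 0.8, 0.3, 0.6], [1, -1, 1, 1])
```

With these toy inputs there are two true positives, one false positive, and one false negative, so precision, recall, and F1 all come out to 2/3.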

In [None]:
L_dev.lf_stats(session, L_gold_dev, gen_model.learned_lf_stats()['Accuracy'])