# spaCy/medspaCy IAA

## Resources

Prodigy forum answer about IAA for spans: https://support.prodi.gy/t/proper-way-to-calculate-inter-annotator-agreement-for-spans-ner/5760

spaCy `Scorer` object: https://spacy.io/api/scorer

## End Goal

### Functionality

Provide a collection of methods to evaluate inter-annotator agreement (IAA) between _n_ arbitrary spaCy `Doc` objects. Also provide methods that aid error analysis, such as listing the differences between docs.

Priorities:
* Pairwise F1
    * configurable strict/loose matching
    * configurable inclusion of labels/attributes (calculate just span vs span+class agreement)

* Importable Python files
    * reasonable docstrings on methods/classes
    
* Unit tests
    * add CI to repo for automated testing later

Extra features:
* List of differences between docs
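
The difference listing could start from something like the sketch below. It assumes each annotation has already been reduced to `(start_char, end_char, label)` tuples, for example `[(e.start_char, e.end_char, e.label_) for e in doc.ents]`. The name `span_differences` is a hypothetical helper, not an existing spaCy or medspaCy function, and it only compares exact span+label matches.

```python
def span_differences(spans_a, spans_b):
    """Return spans found only in A, only in B, and in both (exact span+label match)."""
    set_a, set_b = set(spans_a), set(spans_b)
    return {
        "only_a": sorted(set_a - set_b),
        "only_b": sorted(set_b - set_a),
        "both": sorted(set_a & set_b),
    }

# Illustrative (start_char, end_char, label) tuples for two annotators
annotator_a = [(8, 14, "PERSON"), (32, 36, "GPE"), (40, 51, "PERSON")]
annotator_b = [(32, 36, "GPE"), (40, 51, "GPE")]

diff = span_differences(annotator_a, annotator_b)
for key, spans in diff.items():
    print(key, spans)
```

A span with the same offsets but a different label shows up in both `only_a` and `only_b`, which makes label disagreements easy to spot during error analysis.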

Expected challenges:
* spaCy's scorer functions are useful, but they _only_ do strict (exact-boundary) span matching
* Fewer resources are available for comparing three or more docs at once
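
One possible way to handle three or more annotators is to average the pairwise F1 over every pair. The sketch below is only one option under stated assumptions: each annotator is represented as a set of `(start_char, end_char, label)` tuples, matching is strict, and the helper names `f1_from_sets` and `mean_pairwise_f1` are hypothetical. Returning 0.0 when both sets are empty is a convention chosen here, not something the notes above specify.

```python
from itertools import combinations

def f1_from_sets(spans_a, spans_b):
    """Strict F1 between two sets of (start, end, label) tuples."""
    tp = len(spans_a & spans_b)
    fp = len(spans_b - spans_a)
    fn = len(spans_a - spans_b)
    denom = 2 * tp + fp + fn
    # F1 is undefined when both sets are empty; 0.0 is used by convention
    return 2 * tp / denom if denom else 0.0

def mean_pairwise_f1(annotations):
    """Average F1 over every unordered pair of annotators."""
    pairs = list(combinations(annotations, 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(f1_from_sets(a, b) for a, b in pairs) / len(pairs)

# Three hypothetical annotators of the same text
annotations = [
    {(0, 4, "GPE"), (10, 15, "PERSON")},
    {(0, 4, "GPE"), (10, 15, "PERSON")},
    {(0, 4, "GPE")},
]
score = mean_pairwise_f1(annotations)
print(round(score, 3))
```

Because F1 is symmetric in the two annotators (swapping them swaps false positives and false negatives), unordered pairs are enough.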


In [3]:
import spacy

In [5]:
#!python -m spacy download en_core_web_md

Collecting en-core-web-md==3.1.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_md-3.1.0/en_core_web_md-3.1.0-py3-none-any.whl (45.4 MB)
Installing collected packages: en-core-web-md
Successfully installed en-core-web-md-3.1.0
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_md')


In [6]:
nlp = spacy.load("en_core_web_sm")
nlp2 = spacy.load("en_core_web_md")

In [7]:
doc = nlp("this is a test document made in utah or mississippi, or salt lake city.")

In [8]:
doc.ents

(utah, mississippi)

In [9]:
doc2 = nlp2("this is a test document made in utah or mississippi, or salt lake city.")

In [10]:
doc2.ents

(utah, mississippi, lake city)

In [40]:
from spacy.tokens import Span

# Hand-built entity spans (token offsets) used as a second annotation of doc
spand = [Span(doc, 2, 4, label="PERSON"), Span(doc, 7, 8, label="GPE"), Span(doc, 9, 10, label="PERSON"), Span(doc, 13, 14, label="GPE")]
print(spand[0].text, spand[0].label_)

# Add the span to the doc's entities
doc.ents = spand

# Print entities' text and labels
print([(ent.text, ent.label_) for ent in doc.ents])

a test PERSON
[('a test', 'PERSON'), ('utah', 'GPE'), ('mississippi', 'PERSON'), ('lake', 'GPE')]


In [45]:
tp,fp,fn = agreement(doc,doc2,1,1)
print(tp,fp,fn)

2 1 2


In [15]:
# To make the code more adaptable to cases where one entity overlaps several others, and to keep it
# transparent and testable, overlaps() returns a mapping of which entities were matched to which.
# agreement() then parses that mapping to count the valid overlaps.

# This makes the code a little harder to follow, but the intermediate mapping can be inspected directly.

# Note that it may be slightly more efficient to accumulate tp, fp, fn as running counters inside the loop
# rather than iterating over the resulting mapping a second time.

def overlaps(doc1_ents, doc2_ents, labels):
    '''Finds overlapping entities between two spaCy documents. Also requires matching labels if labels=1.
    
    Return:
        Two dictionaries with the mapping of matching entity indices
        (the first keyed by doc1 indices, the second keyed by doc2 indices):
            keys: entity index from one annotation
            value: list of matched entity indices from the other annotation
        
        Ex: "{1 : [2] , 3 : [4,5]}" means that entity 1 from doc1 matches entity 2 in doc2, and entity 3 in doc1 matches 
        entities 4 and 5 from doc2.
    '''
    
    doc1_matches = dict()
    doc2_matches = dict()

    for index1, ent1 in enumerate(doc1_ents):
        for index2, ent2 in enumerate(doc2_ents):
            # end_char is exclusive, so strict inequalities keep merely adjacent spans from counting as overlapping
            spans_overlap = (ent1.end_char > ent2.start_char) and (ent1.start_char < ent2.end_char)
            labels_agree = (not labels) or (ent1.label_ == ent2.label_)
            if spans_overlap and labels_agree:
                doc1_matches.setdefault(index1, []).append(index2)
                doc2_matches.setdefault(index2, []).append(index1)
                
    return doc1_matches, doc2_matches
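
One detail worth checking in any overlap test like this is boundary handling. spaCy's `end_char` is exclusive (`doc.text[span.start_char:span.end_char]` is the span text), so two spans that merely touch share no characters. A minimal sketch on plain integer offsets, using a hypothetical helper name `half_open_overlap`:

```python
def half_open_overlap(start1, end1, start2, end2):
    """True when two half-open character ranges [start, end) share at least one character."""
    return start1 < end2 and start2 < end1

text = "utah or mississippi"
# "utah" is text[0:4], " or" is text[4:7], "utah or" is text[0:7]
print(half_open_overlap(0, 4, 0, 7))  # "utah" vs "utah or": shared characters
print(half_open_overlap(0, 4, 4, 7))  # "utah" vs " or": touching only
```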
    

In [17]:
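def exact_match(ent1, ent2):
    '''calculate whether two ents have exact overlap (identical character boundaries)
    returns bool
    '''
    return (ent1.start_char == ent2.start_char) and (ent1.end_char == ent2.end_char)

In [20]:
def agreement(doc1, doc2, loose=1, labels=1):
    '''Calculates confusion matrix for agreement between two documents.
    
       loose:  1 counts any overlap as a match, 0 requires identical boundaries
       labels: 1 also requires matching labels, 0 compares spans only
    
       returns true positive, false positive, and false negative
    '''
    doc1_ents = doc1.ents
    doc2_ents = doc2.ents
    
    if loose:
        doc1_matches, doc2_matches = overlaps(doc1_ents, doc2_ents, labels)
    else:
        # Strict matching: build the same index mappings as overlaps(), but only for identical boundaries
        doc1_matches, doc2_matches = dict(), dict()
        for index1, ent1 in enumerate(doc1_ents):
            for index2, ent2 in enumerate(doc2_ents):
                if exact_match(ent1, ent2) and ((not labels) or (ent1.label_ == ent2.label_)):
                    doc1_matches.setdefault(index1, []).append(index2)
                    doc2_matches.setdefault(index2, []).append(index1)
    
    doc1_match_num = len(doc1_matches.keys())
    doc2_match_num = len(doc2_matches.keys())
    
    # A doc2 entity matched by several doc1 entities should only count once
    duplicate_matches = 0
    for value in doc2_matches.values():
        duplicate_matches += len(value) - 1
    
    tp = doc1_match_num - duplicate_matches #How many entity indices from doc1 matched, minus duplicated matches
    fp = len(doc2_ents) - doc2_match_num #How many entities from doc2 that didn't match
    fn = len(doc1_ents) - doc1_match_num #How many entities from doc1 that didn't match
    
    return (tp, fp, fn)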
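
To see how the counts fall out of the two mappings, the arithmetic above can be isolated into a small function that works on plain dictionaries. This is an illustrative sketch; `counts_from_matches` is a hypothetical name, and the first mapping is the one implied by the example docs earlier in the notebook (loose matching with labels).

```python
def counts_from_matches(doc1_matches, doc2_matches, n_doc1, n_doc2):
    """Derive (tp, fp, fn) from the two index mappings, mirroring the arithmetic above."""
    # Each doc2 entity matched by k doc1 entities contributes k - 1 duplicates
    duplicate_matches = sum(len(v) - 1 for v in doc2_matches.values())
    tp = len(doc1_matches) - duplicate_matches
    fp = n_doc2 - len(doc2_matches)
    fn = n_doc1 - len(doc1_matches)
    return tp, fp, fn

# Example docs: 4 entities in doc, 3 in doc2; 'utah' and 'lake' / 'lake city' match
simple = counts_from_matches({1: [0], 3: [2]}, {0: [1], 2: [3]}, n_doc1=4, n_doc2=3)

# Edge case: one doc2 entity overlaps two separate doc1 entities
merged = counts_from_matches({0: [0], 1: [0]}, {0: [0, 1]}, n_doc1=2, n_doc2=1)

print(simple, merged)
```

In the edge case the two doc1 entities collapse into a single true positive, which is the behaviour the duplicate correction is designed to produce.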


In [2]:
from quicksectx import IntervalNode, IntervalTree, Interval
tree = IntervalTree()
tree.add(1, 3, 100)
tree.add(3, 7, 110)
tree.add(2, 5, 120)
tree.add(4, 6, 130)
print(tree.pretty_print())

Inv(3, 7, d=110)
l:  Inv(1, 3, d=100)
r:    Inv(2, 5, d=120)
r:  Inv(4, 6, d=130)
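
An interval tree is one way to avoid comparing every entity pair. As a simpler alternative (not the quicksectx approach), the sketch below assumes both span lists are sorted by start offset and contain no internal overlaps, which holds for `doc.ents`. Under that assumption a single two-pointer pass finds every overlapping pair. The function name `sorted_overlaps` and the `(start, end)` tuple representation are hypothetical.

```python
def sorted_overlaps(spans_a, spans_b):
    """Return (index_a, index_b) pairs of overlapping half-open (start, end) spans.

    Assumes each list is sorted by start and has no overlaps within itself.
    """
    pairs = []
    i = j = 0
    while i < len(spans_a) and j < len(spans_b):
        a_start, a_end = spans_a[i]
        b_start, b_end = spans_b[j]
        if a_start < b_end and b_start < a_end:
            pairs.append((i, j))
        # Advance whichever span ends first; it cannot overlap anything later
        if a_end <= b_end:
            i += 1
        else:
            j += 1
    return pairs

# Character offsets of the entities in the example sentence
doc_spans = [(8, 14), (32, 36), (40, 51), (61, 65)]   # a test, utah, mississippi, lake
doc2_spans = [(32, 36), (40, 51), (61, 70)]           # utah, mississippi, lake city
pairs = sorted_overlaps(doc_spans, doc2_spans)
print(pairs)
```

This pass ignores labels; a label check could be applied to each returned pair afterwards.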



In [None]:
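def pairwise_f1(tp, fp, fn):
    '''calculate f1 with given true positive, false positive, and false negative values'''
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No entities in either doc: F1 is undefined, so return 0.0 by convention
        return 0.0
    f1 = 2 * tp / denominator
    return f1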
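
As a worked check, the counts printed for the example docs earlier in the notebook (`2 1 2`, i.e. TP = 2, FP = 1, FN = 2) give the following, using the standard definitions of precision, recall, and F1:

```latex
\begin{align*}
P   &= \frac{TP}{TP + FP} = \frac{2}{2 + 1} = \frac{2}{3} \\
R   &= \frac{TP}{TP + FN} = \frac{2}{2 + 2} = \frac{1}{2} \\
F_1 &= \frac{2PR}{P + R} = \frac{2\,TP}{2\,TP + FP + FN} = \frac{4}{7} \approx 0.571
\end{align*}
```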

In [None]:
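def corpus_agreement(docs1, docs2, loose=1, labels=1):
    '''calculate f1 over an entire corpus of documents (counts are summed across docs before scoring)'''
    corpus_tp, corpus_fp, corpus_fn = (0, 0, 0)
    
    # docs1 and docs2 are parallel lists: docs1[i] and docs2[i] annotate the same text
    for doc1, doc2 in zip(docs1, docs2):
        tp, fp, fn = agreement(doc1, doc2, loose, labels)
        corpus_tp += tp
        corpus_fp += fp
        corpus_fn += fn
    
    # Score the corpus totals, not the counts from the last document
    return pairwise_f1(corpus_tp, corpus_fp, corpus_fn)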
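
Summing counts across documents before scoring gives a micro-averaged F1, where long documents with many entities dominate. Averaging per-document F1 scores (macro-averaging) is an alternative that weights every document equally. The sketch below contrasts the two on hypothetical per-document counts; the helper `f1` is defined locally so the block is self-contained.

```python
def f1(tp, fp, fn):
    """F1 from counts; returns 0.0 by convention when there are no entities at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Hypothetical (tp, fp, fn) counts for two documents
per_doc_counts = [(2, 1, 2), (10, 0, 0)]

# Micro: sum each count over the corpus, then score once
micro = f1(*(sum(column) for column in zip(*per_doc_counts)))

# Macro: score each document, then average the scores
macro = sum(f1(*counts) for counts in per_doc_counts) / len(per_doc_counts)

print(round(micro, 3), round(macro, 3))
```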