## NER GS Overlap

This notebook measures the overlap between the entities in each benchmark-annotated NER gold standard (GS) and the entities in our un-typed NER GS.

For each benchmark-annotated GS, we summarize:
- the total number of entities it contains
- the number of its entities that match or partially match an entity in our GS, regardless of label
- the overlap: the sum of the matches and partial matches divided by the total number of entities in our un-typed GS

To do this, we use the NER evaluation kit developed by davidsbatista (https://github.com/davidsbatista/NER-Evaluation/tree/master), the same kit used in our NER evaluation.
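
The three summary statistics can be illustrated with a minimal, self-contained sketch. This is not the eval kit's implementation: the `(start, end)` span representation and the helper names `classify_span` and `overlap_summary` are assumptions made for illustration, and the spans are toy values rather than FAA data.

```python
from typing import Dict, List, Tuple

# Hypothetical span representation: (start, end) token offsets, end inclusive.
Span = Tuple[int, int]


def classify_span(pred: Span, gold_spans: List[Span]) -> str:
    """Classify one benchmark span against our GS spans, ignoring labels."""
    if pred in gold_spans:
        return "match"  # identical boundaries
    p_start, p_end = pred
    for g_start, g_end in gold_spans:
        # Two inclusive ranges overlap if each starts before the other ends.
        if p_start <= g_end and g_start <= p_end:
            return "partial"
    return "none"


def overlap_summary(pred_spans: List[Span], gold_spans: List[Span]) -> Dict[str, float]:
    """Compute total, match, partial, and overlap as defined in the list above."""
    labels = [classify_span(p, gold_spans) for p in pred_spans]
    match = labels.count("match")
    partial = labels.count("partial")
    # Overlap is normalized by the size of our un-typed GS, not the benchmark GS.
    overlap = (match + partial) / len(gold_spans) if gold_spans else 0.0
    return {"total": len(pred_spans), "match": match, "partial": partial, "overlap": overlap}


# Toy example: four entities in our GS, three in a benchmark-annotated GS.
our_gs = [(0, 1), (4, 6), (9, 9), (12, 13)]
benchmark_gs = [(0, 1), (5, 6), (20, 21)]

summary = overlap_summary(benchmark_gs, our_gs)
print(summary)
```

In this toy case one benchmark span matches exactly, one overlaps a GS span only partially, and one has no counterpart, so the overlap is 2 / 4. The real kit aligns entities per document and may count edge cases differently, so treat the sketch only as a reading aid for the definitions.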

In [26]:
import sys
sys.path.append("../evaluations/automatic/NER-Evaluation")
sys.path.append("../evaluations/automatic")
from ner_semeval import get_faa_tokenized, get_true_pred_ents, check_named_entities, eval
import pandas as pd

In [27]:
conll_tags = ['PER', 'ORG', 'MISC', 'LOC']
ace_nltk_tags = ['PER','ORG','LOC','FAC','GPE'] # RESTRICTED SET
ace_tags = ['PER','ORG','LOC','FAC','GPE','VEHICLE','WEAPON']
on_tags = ['PER','ORG','LOC','FAC','GPE','PRODUCT','NORP','QUANTITY','EVENT','WORK_OF_ART','CARDINAL','DATE','PERCENT','TIME','ORDINAL','MONEY','LAW','LANGUAGE']

In [29]:
bench_data = {
    'Conll-2003': {'tags': conll_tags, 'path': 'processed/ner_conll.csv'},
    'ACE-2005': {'tags': ace_tags, 'path': 'processed/ner_ace.csv'},
    'ACE Phase 1': {'tags': ace_nltk_tags, 'path': 'processed/ner_ace_nltk.csv'},
    'OntoNotes 5.0': {'tags': on_tags, 'path': 'processed/ner_on.csv'},
}

In [30]:
for benchmark in bench_data:
    # Mostly copied from ner_semeval.main: #####
    
    faa = get_faa_tokenized('../data/FAA_data/faa.conll')
    
    utfaa = pd.read_csv('processed/ner.csv')
    utfaa['labels'] = ['ORG']*len(utfaa) # dummy labels to not break the script
    bench = pd.read_csv(bench_data[benchmark]['path'])
    
    all_true_ents, all_pred_ents = get_true_pred_ents(utfaa, bench, faa)
    
    # Check that true and pred ents were processed without error
    for named_entities, df in zip([all_true_ents, all_pred_ents],[utfaa, bench]):
        probs = check_named_entities(named_entities, utfaa['id'].unique(), df)
        if len(probs) > 0:
            print(f"Warning: The following mentions could not be matched to span indices in documents. Ignore if none of these are present in GS: {probs}")
    
    tags = bench_data[benchmark]['tags']
    
    results = eval(all_true_ents, all_pred_ents, tags)
    
    ###################
    
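    # Use results['partial'] to get stats:
    partial_results = results['partial']
    bench_data[benchmark]['total'] = f"{partial_results['actual']}"
    bench_data[benchmark]['match'] = f"{partial_results['correct']}"
    bench_data[benchmark]['partial'] = f"{partial_results['partial']}"
    # Overlap = (matches + partial matches) / entities in our un-typed GS.
    # The original line referenced undefined names `match` and `partial`.
    overlap = (partial_results['correct'] + partial_results['partial']) / partial_results['possible']
    bench_data[benchmark]['overlap'] = f"{overlap:.2}"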

In [31]:
# Print out:

print("|               | Total | Match | Partial | Overlap |")
print("|---------------|-------|-------|---------|---------|")
for benchmark in bench_data:
    print(f"| {benchmark:13} | {bench_data[benchmark]['total']:6}| {bench_data[benchmark]['match']:6}| {bench_data[benchmark]['partial']:8}| {bench_data[benchmark]['overlap']:8}|")

|               | Total | Match | Partial | Overlap |
|---------------|-------|-------|---------|---------|
| Conll-2003    | 44    | 36    | 8       | 0.12    |
| ACE-2005      | 195   | 133   | 54      | 0.11    |
| ACE Phase 1   | 122   | 89    | 26      | 0.12    |
| OntoNotes 5.0 | 61    | 52    | 9       | 0.12    |
