### Load functions

In [25]:
%run "C:/Users/asclo/Desktop/HHS/NIH Dashboard/Python Notebooks/Final Deliverables/FunctionsForAnalysis.py"

### Load Test Data

In [3]:
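import pandas as pd  # harmless if FunctionsForAnalysis.py has already imported it

# Read the abstracts and rename the columns to text_id / text
abstract = pd.read_csv('C:/Users/asclo/Desktop/HHS/NIH Dashboard/Python Notebooks/Final Deliverables/All pmid abstracts from hpo annotations.csv', sep=',', index_col=False)
abstract = abstract.rename(columns={'Pubmed_ID': 'text_id', 'Abstract': 'text'})
abstract['text_id'] = abstract['text_id'].astype(str)
# Keep only the first two abstracts for this example
abstract = abstract.iloc[0:2, :]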

### Load HPO Annotation File

In [10]:
hpo_annotations = get_hpo_annotations_and_clean()

### Load graph data


In [5]:
g = hpo_hierarchy_graph_load()
phenotypic_abnormality_list = graph_phenotypic_abnormality(g)

### Setup for Analysis

The inputs to one_annotation_direct_matching and one_annotation_matching_relatives should contain only the annotations you want tested directly against each other.

In [15]:
# SETUP FOR FUNCTION

#Annotations to test
one_mondo_annotation = get_one_mondo_annotation(abstract.iloc[0, :]['text'], abstract.iloc[0, :]['text_id'], min_word_length = 4, longest_only = 'true', 
                        include_abbreviation = 'false', include_acronym = 'false', include_numbers = 'false')

id_ref = 'PMID:' + one_mondo_annotation['text_id']

one_mondo_annotation = one_mondo_annotation[one_mondo_annotation['id'].str.contains('HP:')]
one_mondo_annotation = one_mondo_annotation[['id', 'text_id']]
one_mondo_annotation.columns = ['hpo','text_id']

#known annotations
hpo_annotations = hpo_annotations[hpo_annotations['reference'].isin(id_ref)]

# Direct Matching Example
one_annotation_direct_matching tests one set of annotations against one set of known annotations.  In the example set up above, we look at a single PubMed ID that appears in the HPO annotations.

This function returns a dataset that contains exact matches, test annotations that have no match, and known annotations that have no match.  Together these cover every row produced by combining the two datasets.

**Exact Matches:**  These are HPO codes that are in both the known annotations and the group of HPO codes to test.

**Test Set Annotations With No Match:**  These HPO codes are in the test set (in this case Mondo), but are NOT in the list of known HPO codes.

**Known Annotations with No Match:** These HPO codes are in the known annotations, but were not captured when the abstract was run through the Mondo annotator.
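
The three categories can be illustrated with plain Python sets. This is only a minimal sketch of the idea, not the implementation of one_annotation_direct_matching: the helper name `classify_direct` is hypothetical, the test codes are taken from the example output in this notebook, and `HP:9999999` is a made-up placeholder for a known annotation the annotator missed.

```python
def classify_direct(known, test):
    """Split HPO codes into the three direct-matching categories."""
    known, test = set(known), set(test)
    return {
        "exact_match": sorted(known & test),
        "test_set_annotations_with_no_match": sorted(test - known),
        "known_annotations_with_no_match": sorted(known - test),
    }

# Toy inputs (assumptions for illustration only).
known_codes = ["HP:0010864", "HP:9999999"]  # HP:9999999 is a placeholder
test_codes = ["HP:0010864", "HP:0032320", "HP:0001417"]

result = classify_direct(known_codes, test_codes)
for category, codes in result.items():
    print(category, codes)
```

Every code lands in exactly one category, which is why the three indicator columns in the output are mutually exclusive.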


In [12]:
direct_example = one_annotation_direct_matching(hpo_annotations, one_mondo_annotation, graph_network = g)
direct_example

| Unnamed: 0 | uniqueid | hpo | text_id | exact_match | test_set_annotations_with_no_match | known_annotations_with_no_match |
|---|---|---|---|---|---|---|
| 7 | HP:0010864PMID:17088400 | HP:0010864 | 17088400.0 | 1 | 0.0 | 0 |
| 0 | | HP:0032320 | 17088400.0 | 0 | 1.0 | 0 |
| 8 | | HP:0001417 | 17088400.0 | 0 | 1.0 | 0 |
| 11 | | HP:0001290 | 17088400.0 | 0 | 1.0 | 0 |
| 13 | | HP:0002205 | 17088400.0 | 0 | 1.0 | 0 |
| 29 | | HP:0001249 | 17088400.0 | 0 | 1.0 | 0 |
| 83 | | HP:0008947 | 17088400.0 | 0 | 1.0 | 0 |
| 87 | | HP:0001344 | 17088400.0 | 0 | 1.0 | 0 |
| 89 | | HP:0001250 | 17088400.0 | 0 | 1.0 | 0 |
| 92 | | HP:0001257 | 17088400.0 | 0 | 1.0 | 0 |


# Direct and Relative Matching Example

one_annotation_matching_relatives tests one set of annotations against one set of known annotations.  In this example, we look at a single PubMed ID that appears in the HPO annotations.  This is the same data as in the Direct Matching Example above.

This function returns a dataset that contains exact matches, matches on either the parent or child level, test annotations that have no match, and known annotations that have no match.  Together these cover every row produced by combining the two datasets.

**Exact Matches:**  These are HPO codes that are in both the known annotations and the group of HPO codes to test.

**Relative Matches:** The column relative_match indicates whether the original_hpo in the test dataset has matched an HPO code from the known annotations.  In this case, HP:0002191 is either a parent or child of HP:0001257.

**Test Set Annotations With No Match:**  These HPO codes are in the test set (in this case Mondo), but are NOT in the list of known HPO codes.

**Known Annotations with No Match:** These HPO codes are in the known annotations, but were not captured when the abstract was run through the Mondo annotator.
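
The relative-matching idea can be sketched without the real HPO graph. The code below is an illustration under stated assumptions, not the implementation of one_annotation_matching_relatives: the helper names are hypothetical, the hierarchy is a toy list of (child, parent) edges whose direction is assumed, only direct parents and children count as relatives, and each known code is matched at most once.

```python
def direct_relatives(code, edges):
    """Return the direct parents and children of `code` in (child, parent) edges."""
    parents = {parent for child, parent in edges if child == code}
    children = {child for child, parent in edges if parent == code}
    return parents | children

def classify_with_relatives(known, test, edges):
    """Exact matches first, then parent/child matches for the leftovers."""
    known, test = set(known), set(test)
    exact = known & test
    unmatched_known = known - exact
    relative = {}  # original test code -> known code it is related to
    for code in sorted(test - exact):
        hits = direct_relatives(code, edges) & unmatched_known
        if hits:
            matched = sorted(hits)[0]
            relative[code] = matched
            unmatched_known.discard(matched)  # assumption: one match per known code
    return {
        "exact_match": sorted(exact),
        "relative_match": relative,
        "test_set_annotations_with_no_match": sorted(test - exact - set(relative)),
        "known_annotations_with_no_match": sorted(unmatched_known),
    }

# Toy hierarchy: the edge direction is assumed; the notebook only states
# that the two codes are parent and child in one direction or the other.
edges = [("HP:0002191", "HP:0001257")]
known_codes = ["HP:0010864", "HP:0002191"]
test_codes = ["HP:0010864", "HP:0001257", "HP:0032320"]

result = classify_with_relatives(known_codes, test_codes, edges)
print(result)
```

Here HP:0001257 plays the role of original_hpo and HP:0002191 the matched known code, mirroring the relative_match row in the output.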


In [16]:
relative_example = one_annotation_matching_relatives(hpo_annotations, one_mondo_annotation, graph_network = g)
relative_example

| Unnamed: 0 | uniqueid | hpo | original_hpo | text_id | exact_match | relative_match | test_set_annotations_with_no_match | known_annotations_with_no_match |
|---|---|---|---|---|---|---|---|---|
| 7 | HP:0010864PMID:17088400 | HP:0010864 | | 17088400.0 | 1 | 0.0 | 0.0 | 0 |
| 4 | HP:0002191PMID:17088400 | HP:0002191 | HP:0001257 | 17088400.0 | 0 | 1.0 | 0.0 | 0 |
| 0 | | HP:0032320 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 8 | | HP:0001417 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 11 | | HP:0001290 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 13 | | HP:0002205 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 29 | | HP:0001249 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 83 | | HP:0008947 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 87 | | HP:0001344 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |
| 89 | | HP:0001250 | | 17088400.0 | 0 | 0.0 | 1.0 | 0 |


# Scoring
The current scoring is fairly straightforward because we have a binary classification (the HPO code either is or is not in the known annotations) and our classifier does not output a level of certainty. Typically, a classification algorithm gives its prediction as a probability.

Our test is a little counterintuitive since we are testing two distinct samples against each other.  This means that every right answer changes the sample size of the combined data.  However, what remains clear is that Precision, Recall, and the F1 score are the important measures.
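
The sample-size effect can be checked with simple arithmetic. The sketch below assumes that each match collapses one test row and one known row into a single combined row; the function name `combined_rows` is hypothetical. The counts are the ones reported in the scoring outputs later in this notebook (13 annotations to test, 8 known annotations, and 1 or 2 matches).

```python
def combined_rows(n_test, n_known, n_matched):
    """Rows in the combined dataset when each match merges two rows into one."""
    return n_test + n_known - n_matched

direct_total = combined_rows(13, 8, 1)    # direct matching: 1 match
relative_total = combined_rows(13, 8, 2)  # relative matching: 2 matches

print(direct_total)    # 20
print(relative_total)  # 19
```

These totals agree with the sums of True_Positive, False_Positive, and False_Negative in the two scoring outputs (1 + 12 + 7 and 2 + 11 + 6).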

### Explanation of Measurements Outlined Below

**Annotations to Test:**  This is the number of annotations from MetaMap or Mondo that can be matched against the gold standard list.

**Known Annotations:** This is the number of annotations in the gold standard.

**Accurately Predicted:**  This is the number of annotations to test that are in the known annotations

**Additional Measures**

**Precision:** (Accurately Predicted / Annotations to Test) Precision is the ratio of correctly predicted positive observations to the total predicted positive observations.  A high precision indicates a low false positive rate.  This is the key measure for us since our test dataset consists entirely of positive observations (i.e. a list of HPO codes we believe exist in the annotated text).  *In other words, this is the percentage of the tested annotations that are correct.*

**Recall:** (Accurately Predicted / Known Annotations) Recall is the ratio of correctly predicted positive observations to all observations that are actually positive. *This is the percentage of known annotations that were found.*

**F1 Score:** The F1 Score is the harmonic mean of Precision and Recall, 2 * Precision * Recall / (Precision + Recall). Therefore, this score takes both false positives and false negatives into account. In other words, a high F1 score means that both the percentage of accurate tested annotations and the percentage of known annotations found are high.

**Confusion Matrix:**  True_positive, False_positive, False_Negative, and True_Negative make up the Confusion matrix for the scoring.
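
The measures above reduce to a few lines of arithmetic. The following is a minimal sketch, not the notebook's scoring function: the name `score` and the returned keys are hypothetical, and true negatives are left out because the notebook's outputs report them as 0. The inputs are the direct-comparison counts from the scoring output in this notebook.

```python
def score(n_test, n_known, n_correct):
    """Precision, recall, F1, and confusion counts from three totals."""
    precision = n_correct / n_test if n_test else 0.0
    recall = n_correct / n_known if n_known else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "true_positive": n_correct,
        "false_positive": n_test - n_correct,   # tested but not known
        "false_negative": n_known - n_correct,  # known but not found
    }

# Direct comparison: 13 annotations to test, 8 known, 1 accurately predicted.
direct = score(13, 8, 1)
print({name: round(value, 6) for name, value in direct.items()})
```

With these inputs the sketch gives a precision of 1/13, a recall of 1/8, and an F1 score of 2/21, which round to the 0.076923, 0.125, and 0.095238 shown in the direct scoring output.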



### Scoring Direct Comparison

In [26]:
scoring(direct_example)

| Unnamed: 0 | ScoringType | Annotations_to_Test | Known_Annotations | Accurately_Predicted | Precision | Recall | F1_Score | True_Positive | False_Positive | False_Negative | True_Negative |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Direct | 13 | 8 | 1 | 0.076923 | 0.125 | 0.095238 | 1 | 12 | 7 | 0 |


### Scoring Relative Comparison

In [27]:
scoring(relative_example)

| Unnamed: 0 | ScoringType | Annotations_to_Test | Known_Annotations | Accurately_Predicted | Precision | Recall | F1_Score | True_Positive | False_Positive | False_Negative | True_Negative |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Relative | 13 | 8 | 2 | 0.153846 | 0.25 | 0.190476 | 2 | 11 | 6 | 0 |
