# Voting Systems

In this notebook we compare different voting systems for producing the final prediction. Our models were trained on the blind test data, which is not biased by the training process, and we use the dev-test data to test them.

## Load Test Data and Models

In [56]:
from read_write_files import read_json,save_json,get_parser_paths
from helper_functions import get_sense_lists,align_parsers_to_gold
import numpy as np
import conll16st.scorer as scorer
import random
from collections import Counter

In [80]:
sense_model_path = "data/project_files/test/sense_model.json"
total_alignment_path = "data/project_files/blind/total_alignment.json"
test_data_path = "data/gold_standard/blind/relations.json"

In [81]:
total_alignments = read_json(total_alignment_path)
sense_model = read_json(sense_model_path)
test_data = read_json(test_data_path)

In [16]:
gold_senses,parser_senses,parser_names = get_sense_lists(total_alignments)

## Voting Systems

In [17]:
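def voting(parser_preds,parser_names,model,voting_algorithm):
    # parser_preds holds one list of senses per parser; zip(*...) regroups
    # them into one tuple per relation. parser_names is currently unused.
    new_senses = []
    
    for predictions in zip(*parser_preds):
        # One voting result per relation, in relation order.
        new_senses.append(voting_algorithm(predictions,model))
        
    return new_senses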
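
The regrouping step inside `voting` can be seen in isolation in the following minimal sketch. The sense lists are hypothetical toy values, not output of `get_sense_lists`.

```python
# Hypothetical predictions: one list per parser, one entry per relation.
parser_preds = [
    ["Expansion.Conjunction", "EntRel", "None"],               # parser 0
    ["Expansion.Conjunction", "Comparison.Contrast", "None"],  # parser 1
]

# zip(*...) transposes the lists: one tuple per relation,
# holding the prediction of every parser for that relation.
per_relation = list(zip(*parser_preds))

for relation_index, predictions in enumerate(per_relation):
    print(relation_index, predictions)
```

Each tuple is what a voting algorithm receives as its `predictions` argument, so the position inside the tuple identifies the parser.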

In [75]:
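def best_wins_voting(predictions,model):
    # Score every parser by the F1 the sense model records for the sense it
    # predicted; "None" or a sense missing from the model scores 0.
    probs = []
    sense_predictions = []
    for ind,pred in enumerate(predictions):
        sense_predictions += [pred]
        sense_dic = model[ind]["sense_pred"]
        if (pred == "None") or (pred not in sense_dic):
            probs += [0]
        else:
            probs += [sense_dic[pred]["f1"]]
            
    # Index of the parser with the highest score (the first one on ties).
    result = np.argmax(probs)
    
    # No parser produced a scorable sense: return -1 so the relation is skipped.
    if sum(probs) == 0:
        result = -1
    #    sense_counter = Counter(sense_predictions)
    #    best_sense = sense_counter.most_common(1)[0][0]
    #    probs
    
    return result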
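
To make the selection rule concrete, here is a small self-contained sketch of the same logic in plain Python. The function name `pick_best_parser` and the F1 values in `toy_model` are hypothetical; only the layout `model[parser_index]["sense_pred"][sense]["f1"]` is taken from the cell above.

```python
def pick_best_parser(predictions, model):
    """Return the index of the parser whose predicted sense has the highest
    recorded F1, or -1 if no prediction can be scored."""
    scores = []
    for ind, pred in enumerate(predictions):
        sense_scores = model[ind]["sense_pred"]
        if pred == "None" or pred not in sense_scores:
            scores.append(0)
        else:
            scores.append(sense_scores[pred]["f1"])
    if sum(scores) == 0:
        return -1
    # First index with the maximal score, like np.argmax.
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical per-parser sense statistics (values invented for illustration).
toy_model = [
    {"sense_pred": {"EntRel": {"f1": 0.40}, "Comparison.Contrast": {"f1": 0.20}}},
    {"sense_pred": {"EntRel": {"f1": 0.30}, "Comparison.Contrast": {"f1": 0.55}}},
]

# Parser 0 says EntRel (0.40), parser 1 says Comparison.Contrast (0.55).
winner = pick_best_parser(("EntRel", "Comparison.Contrast"), toy_model)

# Neither prediction can be scored, so no parser is chosen.
no_vote = pick_best_parser(("None", "Temporal.Synchrony"), toy_model)
```

Here parser 1 wins the first relation because its F1 for the sense it predicted is higher, even though the two parsers disagree on the sense itself.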

In [76]:
best_parser_indexes = voting(parser_senses,parser_names,sense_model,best_wins_voting)
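
Since the notebook sets out to compare voting systems and already imports `Counter`, a simple majority vote is one possible alternative. The sketch below is an assumption about how such a variant could look, not a method evaluated in this notebook; the name `majority_voting` is hypothetical, and the unused `model` parameter only keeps the signature compatible with `voting`.

```python
from collections import Counter


def majority_voting(predictions, model=None):
    """Return the index of the first parser that predicted the most common
    sense, or -1 if every parser predicted "None"."""
    votes = [pred for pred in predictions if pred != "None"]
    if not votes:
        return -1
    # most_common(1) yields [(sense, count)]; keep only the sense.
    best_sense = Counter(votes).most_common(1)[0][0]
    # Map the winning sense back to a parser index, as best_wins_voting does.
    return list(predictions).index(best_sense)


majority_index = majority_voting(
    ("EntRel", "Expansion.Conjunction", "Expansion.Conjunction")
)
```

Under these assumptions it could be passed to `voting` in place of `best_wins_voting`. Ties are resolved by whatever order `Counter.most_common` returns, which a real comparison would want to handle explicitly.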

## Exchange new Attributes in Relation File

This step is intended for sense evaluation only: the argument spans are taken from the gold file so that gold and predicted relations map onto each other unambiguously, and only the sense is exchanged.

In [86]:
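def exchange_sense_values(alignment_list,best_relation_indexes):
    new_relations = []
    
    for best_parser,alignments in zip(best_relation_indexes,alignment_list):
        # -1 means no parser had a usable sense, so the relation is skipped.
        if best_parser != -1:
            # As written, the winning parser's whole relation is taken over;
            # the commented lines would keep the gold relation and swap only the sense.
            new_rel = alignments["parsers"][best_parser]
            
            #best_sense = best_parser_result["Sense"][0]
            #new_rel = alignments["gold"].copy()
            #new_rel["Sense"] = best_sense
        
            new_relations.append(new_rel)
        
    return new_relations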
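
The commented-out lines in `exchange_sense_values` point at a variant that keeps the gold relation and replaces only its sense. A minimal sketch of that variant follows. It assumes that each alignment has a `"gold"` relation and a `"parsers"` list, and that `"Sense"` is a list of strings; the function name, IDs and toy data are hypothetical.

```python
import copy


def exchange_sense_in_gold(alignment_list, best_relation_indexes):
    """Copy each gold relation and overwrite its sense with the sense
    predicted by the winning parser; skip relations without a winner."""
    new_relations = []
    for best_parser, alignments in zip(best_relation_indexes, alignment_list):
        if best_parser == -1:
            continue
        best_sense = alignments["parsers"][best_parser]["Sense"][0]
        # Deep copy so the gold data itself is never modified.
        new_rel = copy.deepcopy(alignments["gold"])
        new_rel["Sense"] = [best_sense]
        new_relations.append(new_rel)
    return new_relations


# Hypothetical alignments, reduced to the fields used above.
toy_alignments = [
    {"gold": {"ID": 1, "Sense": ["EntRel"]},
     "parsers": [{"ID": 7, "Sense": ["Expansion.Conjunction"]}]},
    {"gold": {"ID": 2, "Sense": ["EntRel"]},
     "parsers": [{"ID": 8, "Sense": ["EntRel"]}]},
]

result = exchange_sense_in_gold(toy_alignments, [0, -1])
```

With this variant the argument spans in the output are identical to the gold spans, so differences in the scorer output would come from the sense alone.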

In [87]:
new_relations = exchange_sense_values(total_alignments,best_parser_indexes)

In [88]:
scorer.evaluate(test_data,new_relations)

Explicit connectives         : Precision 0.8403 Recall 0.7572 F1 0.7966
Arg 1 extractor              : Precision 0.8224 Recall 0.8197 F1 0.8210
Arg 2 extractor              : Precision 0.8108 Recall 0.8081 F1 0.8094
Arg1 Arg2 extractor combined : Precision 0.6988 Recall 0.6964 F1 0.6976
Sense classification--------------
*Micro-Average                    precision 0.2988	recall 0.2978	F1 0.2983
Comparison.Concession             precision 1.0000	recall 0.0000	F1 0.0000
Comparison.Contrast               precision 0.0965	recall 0.4074	F1 0.1560
Contingency.Cause.Reason          precision 0.1935	recall 0.0800	F1 0.1132
Contingency.Cause.Result          precision 0.5000	recall 0.0204	F1 0.0392
Contingency.Condition             precision 0.5312	recall 0.6538	F1 0.5862
EntRel                            precision 0.2736	recall 0.1450	F1 0.1895
Expansion.Alternative             precision 1.0000	recall 0.3333	F1 0.5000
Expansion.Conjunction             precision 0.3260	recall 0.5920	F1 0.4205
Ex

(<conll16st.confusion_matrix.ConfusionMatrix at 0x1085e2e90>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x10bc23050>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x105642a10>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x1083de710>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x1083de8d0>,
 0.2988,
 0.2978,
 0.2983)

In [89]:
scorer.evaluate(test_data,read_json("data/submissions/sense_only/blind/oslopots/output/output.json"))

Explicit connectives         : Precision 1.0000 Recall 0.9874 F1 0.9937
Arg 1 extractor              : Precision 1.0000 Recall 1.0000 F1 1.0000
Arg 2 extractor              : Precision 1.0000 Recall 1.0000 F1 1.0000
Arg1 Arg2 extractor combined : Precision 1.0000 Recall 1.0000 F1 1.0000
Sense classification--------------
*Micro-Average                    precision 0.5360	recall 0.5352	F1 0.5356
Comparison.Concession             precision 1.0000	recall 0.0660	F1 0.1239
Comparison.Contrast               precision 0.2347	recall 0.4182	F1 0.3007
Contingency.Cause.Reason          precision 0.4706	recall 0.4384	F1 0.4539
Contingency.Cause.Result          precision 0.5455	recall 0.2449	F1 0.3380
Contingency.Condition             precision 0.9286	recall 1.0000	F1 0.9630
EntRel                            precision 0.3824	recall 0.7150	F1 0.4983
Expansion.Alternative             precision 1.0000	recall 0.3333	F1 0.5000
Expansion.Conjunction             precision 0.6512	recall 0.7377	F1 0.6918
Ex

(<conll16st.confusion_matrix.ConfusionMatrix at 0x10c16fd50>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x1083dec50>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x1083de9d0>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x1083de850>,
 <conll16st.confusion_matrix.ConfusionMatrix at 0x1083deb90>,
 0.536,
 0.5352,
 0.5356)
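
As a sanity check on the two micro-averaged results above, F1 can be recomputed from the printed precision and recall as their harmonic mean. The helper name `f1_score` is hypothetical and not part of the `conll16st` scorer; the input values are copied from the two outputs.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Micro-averaged sense scores printed by the two scorer runs above.
voting_f1 = round(f1_score(0.2988, 0.2978), 4)
oslopots_f1 = round(f1_score(0.5360, 0.5352), 4)

print(voting_f1, oslopots_f1)
```

Both values agree with the reported 0.2983 and 0.5356. The same formula explains why a class with precision 1.0000 and recall 0.0000 ends up with an F1 of 0.0000.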