In [None]:
import pandas as pd, numpy as np

## Find standalone models with highest accuracy from all experiments
### Top 3

1. SVMSMOTE - RF: 87.4% acc
2. SNV_AUGMENTEDV3 - RF: 86.7% acc
3. RAW - Ensemble [rf, knn, xgb, svmrbf, nb]: 86% acc (poor recall 67.2%)

## Find standalone models with highest recall from all experiments
This one was a bit tricky: many models, such as GNB and SVM-Sig, produced very high neoplasia recall but very low overall accuracy. Hence, I filtered out any model with less than 75% accuracy.
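The filter described above can be sketched as follows. The model scores here are hypothetical placeholders, not values from the experiments; only the 75% accuracy cut-off comes from the text.

```python
# Hypothetical (model, accuracy, recall) rows; the numbers are illustrative only.
scores = [
    ("GNB", 0.55, 0.95),
    ("SVM-Sig", 0.60, 0.93),
    ("CART", 0.78, 0.83),
    ("kNN", 0.80, 0.81),
    ("RF", 0.87, 0.74),
]

MIN_ACCURACY = 0.75  # cut-off used in the text

# Keep only models that clear the accuracy floor, then rank them by recall.
eligible = [row for row in scores if row[1] >= MIN_ACCURACY]
top_by_recall = sorted(eligible, key=lambda row: row[2], reverse=True)[:3]

for name, acc, rec in top_by_recall:
    print(f"{name}: {rec:.1%} rec ({acc:.1%} acc)")
```

Without the accuracy floor, the two high-recall but low-accuracy rows would take the top two places.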

### Top 3

1. FS_AUGMENTED - CART: 82.8% rec
2. SNV_AUGMENTEDV3 - kNN: 81% rec
3. SNV_FS_BALANCED - kNN: 78.9% rec



### Honorable mentions
4. AUGMENTEDV3 - RF & kNN: 75.9% rec (RF is the 2nd-highest-scoring standalone model in acc at 86.7%; kNN has 80.4% acc)
5. KMEANSSMOTE - kNN: 75.9% rec (83.9% acc)
6. SVMSMOTE - RF: 74% rec (highest-scoring standalone model in acc at 87.4%)


In [36]:
# Rank experiments by top-6 accuracy and recall, then by a recall-weighted rank sum.
import json

def sort_scores(filename='metrics/scoreboard.json', recall_weight=3):
    # Load the scoreboard; each entry holds at least {'top6': {'accuracy': ..., 'recall': ...}}.
    with open(filename, 'r') as file:
        file_data = json.load(file)

    # Order experiments from best to worst on each metric.
    acc_sort = dict(sorted(file_data.items(), key=lambda item: item[1]['top6']['accuracy'], reverse=True))
    rec_sort = dict(sorted(file_data.items(), key=lambda item: item[1]['top6']['recall'], reverse=True))

    with open('metrics/top6acc.json', 'w') as new_file:
        json.dump(acc_sort, new_file, indent=4)
    acc_keys = list(acc_sort.keys())
    print('acc:', acc_keys)

    with open('metrics/top6rec.json', 'w') as new_file:
        json.dump(rec_sort, new_file, indent=4)
    rec_keys = list(rec_sort.keys())
    print('rec:', rec_keys)

    # Weighted rank sum per experiment (lower is better);
    # the recall rank counts recall_weight times as much as the accuracy rank.
    dic = {}
    for key in acc_keys:
        acc_rank = acc_keys.index(key) + 1
        rec_rank = rec_keys.index(key) + 1
        dic[key] = [acc_rank + recall_weight * rec_rank, {'acc_rank': acc_rank, 'rec_rank': rec_rank}]

    # sorted() is stable, so tied experiments keep their accuracy order.
    sorted_dic = dict(sorted(dic.items(), key=lambda item: item[1][0]))
    with open('metrics/rank_sums.json', 'w') as rank:
        json.dump(sorted_dic, rank, indent=4)

    print(f"dic: \n{json.dumps(dic)} \nlen(keys): {len(acc_keys)}")
    print(f"sorted_dic keys: {list(sorted_dic.keys())}")

sort_scores()

acc: ['snv_FS_balanced', 'svmsmote', 'snv_FS_svmsmote', 'raw', 'kmeanssmote', 'balanced', 'snv_balanced', 'augmentedv2', 'feature_select', 'bordersmote', 'smote', 'snv_svmsmote', 'adasynsmote', 'augmented', 'augmentedv3', 'snv_raw', 'snv_augmentedv3', 'my_balancedv2', 'augmented_FS', 'augmentedv2_FS', 'augmentedv3_FS', 'feature_selectv2']
rec: ['augmentedv3_FS', 'snv_augmentedv3', 'augmentedv3', 'augmented_FS', 'bordersmote', 'snv_FS_svmsmote', 'my_balancedv2', 'svmsmote', 'augmentedv2', 'augmented', 'adasynsmote', 'snv_svmsmote', 'kmeanssmote', 'augmentedv2_FS', 'raw', 'smote', 'snv_raw', 'feature_select', 'feature_selectv2', 'balanced', 'snv_FS_balanced', 'snv_balanced']
dic: 
{"snv_FS_balanced": [64, {"acc_rank": 1, "rec_rank": 21}], "svmsmote": [26, {"acc_rank": 2, "rec_rank": 8}], "snv_FS_svmsmote": [21, {"acc_rank": 3, "rec_rank": 6}], "raw": [49, {"acc_rank": 4, "rec_rank": 15}], "kmeanssmote": [44, {"acc_rank": 5, "rec_rank": 13}], "balanced": [66, {"acc_rank": 6, "rec_rank": 2

# Find best overall experiments (based on top 6 accuracy and recall)
The gap between the #1 accuracy-ranked experiment and last place is 0.83 - 0.775 = 0.055, i.e. 5.5 percentage points of accuracy.

The gap between the #1 recall-ranked experiment and last place is 0.741 - 0.439 = 0.302, i.e. 30.2 percentage points of recall.

Hence, while the rank sum is useful, weighting accuracy rank and recall rank equally isn't fair: recall varies far more between experiments than accuracy does. I added a 3x weight multiplier to the recall rank to give it higher precedence.
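As a quick check of the arithmetic, the two spreads and the weighted score can be computed directly. This is a minimal sketch: the best and worst scores and the example ranks are copied from this notebook, while the helper name `rank_score` is made up for illustration. The ratio of the spreads is printed for context only.

```python
# Best and worst top-6 scores on each metric, as quoted in the text.
acc_best, acc_worst = 0.83, 0.775
rec_best, rec_worst = 0.741, 0.439

acc_spread = acc_best - acc_worst
rec_spread = rec_best - rec_worst
print(f"accuracy spread: {acc_spread:.3f}")
print(f"recall spread:   {rec_spread:.3f}")
print(f"ratio (recall / accuracy): {rec_spread / acc_spread:.1f}")

RECALL_WEIGHT = 3  # multiplier applied to the recall rank


def rank_score(acc_rank, rec_rank, recall_weight=RECALL_WEIGHT):
    """Weighted rank sum; lower is better. (Hypothetical helper name.)"""
    return acc_rank + recall_weight * rec_rank


# Ranks taken from the acc/rec orderings printed by sort_scores().
print("snv_FS_balanced:", rank_score(1, 21))   # best accuracy, poor recall
print("snv_augmentedv3:", rank_score(17, 2))   # modest accuracy, strong recall
```

With the recall weight, an experiment that ranks first on accuracy but near the bottom on recall scores far worse than one with the opposite profile.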

With the recall rank weighted 3x, the experiments now rank as follows:
1. `snv_FS_svmsmote`
2. `snv_augmentedv3`
3. `augmentedv3`
4. `augmentedv3_FS`
5. `bordersmote`
6. `svmsmote`
7. `augmented_FS`
8. `augmentedv2`
9. `my_balancedv2`
10. `kmeanssmote`
11. `augmented`
12. `adasynsmote`
13. `snv_svmsmote`
14. `raw`
15. `smote`
16. `augmentedv2_FS`
17. `feature_select`
18. `snv_FS_balanced`
19. `balanced`
20. `snv_raw`
21. `snv_balanced`
22. `feature_selectv2`
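This ranking can be reproduced without the JSON files, using only the two orderings printed in the output above. A minimal sketch: the experiment names are copied from that output, and the variable names are chosen here for illustration.

```python
# Experiment names in rank order, copied from the acc/rec output above.
acc_order = [
    'snv_FS_balanced', 'svmsmote', 'snv_FS_svmsmote', 'raw', 'kmeanssmote',
    'balanced', 'snv_balanced', 'augmentedv2', 'feature_select', 'bordersmote',
    'smote', 'snv_svmsmote', 'adasynsmote', 'augmented', 'augmentedv3',
    'snv_raw', 'snv_augmentedv3', 'my_balancedv2', 'augmented_FS',
    'augmentedv2_FS', 'augmentedv3_FS', 'feature_selectv2',
]
rec_order = [
    'augmentedv3_FS', 'snv_augmentedv3', 'augmentedv3', 'augmented_FS',
    'bordersmote', 'snv_FS_svmsmote', 'my_balancedv2', 'svmsmote',
    'augmentedv2', 'augmented', 'adasynsmote', 'snv_svmsmote', 'kmeanssmote',
    'augmentedv2_FS', 'raw', 'smote', 'snv_raw', 'feature_select',
    'feature_selectv2', 'balanced', 'snv_FS_balanced', 'snv_balanced',
]

RECALL_WEIGHT = 3

# Rank = 1-based position in each ordering.
acc_rank = {name: i for i, name in enumerate(acc_order, start=1)}
rec_rank = {name: i for i, name in enumerate(rec_order, start=1)}

# Weighted rank sum per experiment (lower is better).
weighted = {name: acc_rank[name] + RECALL_WEIGHT * rec_rank[name] for name in acc_order}

# sorted() is stable, so tied experiments keep their accuracy order.
ranking = sorted(weighted, key=weighted.get)

for place, name in enumerate(ranking, start=1):
    print(f"{place:2d}. {name} ({weighted[name]})")
```

Two pairs tie on the weighted sum (`augmentedv3` with `augmentedv3_FS`, and `kmeanssmote` with `augmented`); in each pair the experiment with the better accuracy rank is listed first.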

Next, I will tune the best overall experiments with Bayesian optimisation to try to improve the models further.

I might also tweak class weights to get better recall.
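One common starting point for class weights is inverse class frequency, followed by extra emphasis on the class whose recall matters most. The sketch below is plain Python under stated assumptions: the binary labels, class counts, and boost factor are all hypothetical. The resulting dict has the shape that scikit-learn classifiers such as `RandomForestClassifier` accept through their `class_weight` parameter.

```python
from collections import Counter

# Hypothetical labels: 0 = non-neoplasia, 1 = neoplasia (counts are made up).
y = [0] * 300 + [1] * 100

counts = Counter(y)
n_samples, n_classes = len(y), len(counts)

# Inverse-frequency ("balanced") weights: n_samples / (n_classes * class_count).
balanced = {label: n_samples / (n_classes * count) for label, count in counts.items()}

NEOPLASIA_BOOST = 1.5  # hypothetical extra emphasis on the neoplasia class
class_weight = dict(balanced)
class_weight[1] *= NEOPLASIA_BOOST

print("balanced:", balanced)
print("boosted: ", class_weight)
```

Raising the neoplasia weight typically trades some overall accuracy for recall, so the boost factor is something to tune rather than fix in advance.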

Then I will find the best ensemble model.

Implementation will end there, with 2 models: one that optimises accuracy, and one that optimises recall (as well as accuracy, but with class weights tweaked to favour neoplasia recall). In my recommendations for further research, I will suggest investigating how to strike a balance such that the high recall of the neoplasia-recall model is gained without sacrificing overall accuracy.