<center><img src="logo.png" /></center>

# <center>Avey: A High-Level Overview</center>

## Table of Contents

* [Introduction](#intro)
* [Cases](#cases)
* [Metrics and Calculations](#metrics)
* [Results](#results)

<p>


## Introduction<a class="anchor" id="intro"></a>
<p>This notebook is a supplement to the <a href="https://www.medrxiv.org/content/10.1101/2022.03.08.22272076v1.full">paper</a> we are submitting. It shares all the cases we analysed (after manual cleaning and matching) together with their results. It also includes results from multiple experiments, only a few of which are discussed in the paper.</p>


#### Load data

In [1]:
# EXPERIMENT_TAG = 'avey'
EXPERIMENT_TAG = 'harvard'

In [2]:
# load data

import json
import pandas as pd
from collections import defaultdict
import math
import numpy as np
import os

def loadData(fileName):
    '''Loading data from result files'''
    with open(f'{fileName}.json', 'r', encoding='utf-8') as file:
        data = json.load(file)
        return data

def normalize(cases):
    '''Make all the ddx lists of a case the same length by padding with None'''
    for case in cases.values():
        maxLen = max(len(result) for result in case.values())
        for result in case.values():
            result += [None]*(maxLen-len(result))

        assert len(set(len(result) for result in case.values())) == 1
    
    return cases

def getDataframe(case):
    '''Convert each test case into a dataframe'''
    caseLen = len(next(iter(case.values())))
    if EXPERIMENT_TAG == 'avey':
        gold = ['gold','gold_old']
    else:
        gold = ['gold']
    return pd.DataFrame(
        case,
        columns=[*gold,
        *sorted([key for key in case.keys() if 'gold' not in key and 'doctor' not in key]),
        *sorted([key for key in case.keys() if 'gold' not in key and 'doctor' in key]),
        ],
        index= list(range(1,1+caseLen))
        )


In [3]:
# We need to make all the differentials of the same length to ease comparison
# We pad the lists with None
data = loadData('data/allResults')
if EXPERIMENT_TAG != 'harvard':
    dataOld = loadData('data/allResults_old')

for caseNum in data:
    for app in data[caseNum]:
        # removing empty strings
        data[caseNum][app] = [r for r in data[caseNum][app] if r]

    # adding old results
    if EXPERIMENT_TAG != 'harvard' and caseNum in dataOld:
        for app in dataOld[caseNum]:
            data[caseNum][f"{app}_old"] = [r for r in dataOld[caseNum][app] if r]

normalizedData = normalize(data)
# Harvard vignettes use ids above 500; the Avey experiment uses ids up to 500
if EXPERIMENT_TAG == 'harvard':
    cases = {int(caseId): getDataframe(case) for caseId, case in normalizedData.items() if int(caseId)>500}
else:
    cases = {int(caseId): getDataframe(case) for caseId, case in normalizedData.items() if int(caseId)<=500}

display(f'We have {len(cases)} cases in the experiment.')

'We have 44 cases in the experiment.'
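The padding and column-ordering steps can be illustrated on a toy case. This is a minimal sketch, not the notebook's own code: `pad_to_same_length`, `column_order` and `toy_case` are hypothetical names, the toy values are illustrative only, and the gold columns are found by name rather than hard-coded as in `getDataframe`.

```python
def pad_to_same_length(case):
    """Return a copy of one case with every list padded with None to the longest length."""
    max_len = max(len(ddx) for ddx in case.values())
    return {source: ddx + [None] * (max_len - len(ddx)) for source, ddx in case.items()}


def column_order(case):
    """Gold columns first, then apps, then doctors, each group sorted by name."""
    gold = sorted(k for k in case if "gold" in k)
    apps = sorted(k for k in case if "gold" not in k and "doctor" not in k)
    doctors = sorted(k for k in case if "gold" not in k and "doctor" in k)
    return [*gold, *apps, *doctors]


# Hypothetical toy case: keys mimic the real column names, values are illustrative.
toy_case = {
    "doctor_TH": ["acute pharyngitis"],
    "Avey": ["acute pharyngitis", "diphtheria", "infectious mononucleosis"],
    "gold": ["acute pharyngitis"],
    "Ada": ["tonsillitis", "scarlet fever"],
}

padded = pad_to_same_length(toy_case)
ordered = column_order(padded)

for name in ordered:
    print(name, padded[name])
```

Unlike `normalize`, which extends the lists in place, this sketch builds new lists, so the input dictionary is left untouched.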

In [4]:
caseClassificationOld = loadData("data/case-classification_old")
failures_temp = (
    pd.read_excel("data/AppTest-2024-02-06.xlsx")
    .dropna(subset=["invalid_code"])
    .groupby(["medical_app", "invalid_code"])["case_study"]
    .agg(set)
    .reset_index()
).set_index(["medical_app", "invalid_code"]).to_dict('index')

failures = defaultdict(dict)
for (app, err), failedCases in  failures_temp.items():
    failures[app][err] = failedCases['case_study']

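for app in failures:
    # Every error code other than NDX means the session itself failed
    failures[app]["session failed"] = set().union(
        *(failedCases for err, failedCases in failures[app].items() if err != 'NDX')
    )
    # NDX means the session ran but returned no disease; default to an empty set
    # so that an app without NDX rows does not raise a KeyError
    failures[app]["no disease found"] = failures[app].pop("NDX", set())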

failures = dict(failures)

if EXPERIMENT_TAG == 'avey':
    # Use a fresh name here: `data` already holds the loaded results
    for app, appFailures in caseClassificationOld['apps'].items():
        failures[app] = appFailures
# failures

In [5]:
df = pd.read_excel(
    "data/Results - Apps performance - team copy.xlsx",
    sheet_name=(
        "Harvard cases"
        if EXPERIMENT_TAG == "harvard"
        else "Detailed diseases coverage - 50"
    ),
    index_col=1,
).dropna(subset=["Common vs Less common"])["Common vs Less common"]

df.index = [int(str(i).replace('#','')) for i in df.index]
caseClassification = {
    'common':set(),
    'less common':set(),
    'apps':failures,
}
for case_num, typ in df.items():
    if typ == 'Common':
        caseClassification['common'].add(case_num)
    else:
        caseClassification['less common'].add(case_num)

assert set(cases.keys()).issubset(
    caseClassification["common"] | caseClassification["less common"]
), f'Vignette results have extra ids that are not in classification {set(cases.keys()) - (caseClassification["common"] | caseClassification["less common"])}'

print(f'We have {len(caseClassification["common"])} common diseases and {len(caseClassification["less common"])} less common diseases')
display('App v Error code: Case Count')
pd.DataFrame({app: {err: len(v) for err, v in errs.items()} for app, errs in caseClassification["apps"].items()}).transpose()

We have 38 common diseases and 7 less common diseases


'App v Error code: Case Count'

App,MAC,OTH,session failed,no disease found,A01,CNA,A16,A18
Avey,3.0,1.0,4.0,29.0,,,,
Avey v2,,1.0,1.0,13.0,,,,
Buoy,7.0,7.0,23.0,3.0,7.0,2.0,,
Healthily,,,64.0,18.0,,5.0,59.0,
K Health,10.0,3.0,95.0,1.0,,19.0,,63.0
Mediktor,1.0,,1.0,41.0,,,,
Symptomate,1.0,,9.0,17.0,,,,8.0
WebMD,,,6.0,1.0,5.0,1.0,,
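The way error codes are folded into the two summary columns can be sketched in plain Python. This is a hedged illustration, not the notebook's pipeline: the `rows` below are hypothetical rather than taken from the spreadsheet, and a nested `defaultdict` stands in for the `groupby(...).agg(set)` step. Every code other than `NDX` counts toward "session failed", and `NDX` is renamed to "no disease found". In the table above, for instance, Buoy's 23 session failures equal the sum of its MAC, OTH, A01 and CNA counts (7 + 7 + 7 + 2), which is what a union of non-overlapping sets yields.

```python
from collections import defaultdict

# Hypothetical (app, invalid_code, case_study) rows, for illustration only.
rows = [
    ("App A", "MAC", 501),
    ("App A", "OTH", 502),
    ("App A", "NDX", 503),
    ("App B", "NDX", 501),
    ("App B", "NDX", 504),
]

# Group case ids by app and error code.
by_code = defaultdict(lambda: defaultdict(set))
for app, code, case_id in rows:
    by_code[app][code].add(case_id)

summary = {}
for app, codes in by_code.items():
    # Union of the cases under every code except NDX.
    session_failed = set()
    for code, case_ids in codes.items():
        if code != "NDX":
            session_failed |= case_ids
    summary[app] = {
        **{code: ids for code, ids in codes.items() if code != "NDX"},
        "session failed": session_failed,
        # .get with a default keeps this safe for apps that have no NDX rows.
        "no disease found": codes.get("NDX", set()),
    }

for app, errs in summary.items():
    print(app, {err: len(ids) for err, ids in errs.items()})
```

Because the groups are sets, a case that appears under two different codes for the same app is counted once in "session failed".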


## TODO: NEED TO REMOVE DUPLICATE TESTS AND VERIFY THE NUMBER OF ERRORS

<p>

### Let us have a look at all the cases. <a class="anchor" id="cases"></a>
Our doctors labelled each case as common or less common. Each case is also annotated with the apps that failed on it, either because the session failed or because no disease was returned.

In [6]:
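from IPython.display import display

# Make sure the output folder exists before writing one spreadsheet per case
os.makedirs(f'individual_case_outputs/{EXPERIMENT_TAG}', exist_ok=True)

for caseNum, case in cases.items():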
    isCommonString = 'common' if caseNum in caseClassification['common'] else 'less common'
    sessionFailed, noDiseaseFound = ([], [])
    for app, failedCases in caseClassification['apps'].items():
        if caseNum in failedCases['session failed']:
            sessionFailed.append(app)
        elif caseNum in failedCases['no disease found']:
            noDiseaseFound.append(app)

    print(f'Case number {caseNum} ({isCommonString})')
    if sessionFailed:
        print(
            f'Session failed to start for: {", ".join(sessionFailed)}.')
    if noDiseaseFound:
        print(
            f'No diseases were found for: {", ".join(noDiseaseFound)}.')
    display(case)
    case.to_excel(f'individual_case_outputs/{EXPERIMENT_TAG}/{caseNum}.xlsx')
    print('\n'*2)


Case number 517 (common)
Session failed to start for: Healthily, K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute pharyngitis,tosillitis,acute pharyngitis,acute pharyngitis,influenza,acute pharyngitis,,,esophageal foreign body,acute pharyngitis,influenza,tonsils,gas tonsillitis,acute pharyngitis
2,,scarlet fever,diphtheria,diphtheria,coronavirus (covid-19),influenza,,,acute pharyngitis,viral gastroenteritis,swine influenza,acute pharyngitis,,
3,,glandular fever,infectious mononucleosis,infectious mononucleosis,acute pharyngitis,,,,mononucleosis,pneumonia,acute appendicitis,,,
4,,cytomegalovirus infection,covid 19,epiglottitis,,,,,,infectious mononucleosis,bacterial pneumonia,,,
5,,,,covid 19,,,,,,scarlet fever,acute pharyngitis,,,
6,,,,,,,,,,,septicemia,,,
7,,,,,,,,,,,peritonitis,,,
8,,,,,,,,,,,pyelonephritis,,,





Case number 518 (common)
Session failed to start for: Avey.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute pharyngitis,bacterial tonsilitis,,acute pharyngitis,covid 19,acute pharyngitis,quinzy,infectious mononucleosis,acute pharyngitis,acute pharyngitis,acute pharyngitis,tonsillitis,tonsillitis/ start treatment,acute pharyngitis
2,,acute pharyngitis,,diphtheria,infectious mononucleosis,gastroesophageal reflux disease,acute pharyngitis,acute pharyngitis,infectious mononucleosis,,bacterial tonsilitis,acute pharyngitis,,
3,,glandular fever,,infectious mononucleosis,,bacterial tonsilitis,,,peritonsillar abscesses,,adenovirus infection,,,
4,,quinzy,,acute laryngitis,,,,,,,common cold,,,
5,,,,epiglottitis,,,,,,,flu,,,
6,,,,,,,,,,,infectious mononucleosis,,,
7,,,,,,,,,,,chronic sinusitis,,,





Case number 521 (common)
Session failed to start for: Avey.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,cellulitis,erythema nodosum,,osteomyelitis,covid 19,cellulitis,toxic shock syndrome,cellulitis,cellulitis,erysipelas,cellulitis,erythema nodosum.,erythema nodosum,cellulitis
2,,keg abscess,,cellulitis,dematitis,erysipelas,gangrene,deep vein thrombosis,erysipeles,deep vein thrombosis,frostbite,cellulitis,,dvt
3,,lofgren syndrome,,septic arthritis,skin rash,contact dermatitis,,,meningococcemia,cellulitis,boil,drug allergy reaction,,
4,,peripheral vascular disease,,,,deep vein thrombosis,,,,,contact dermatitis,autoimmune desease,,
5,,,,,,,,,,,lyme disease,,,





Case number 527 (common)
Session failed to start for: Healthily, K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,salmonella,acute viral gastroenteritis,salmonella,shigellosis,acute viral gastroenteritis,salmonella,,,acute hepatitis due to hepatitis a virus,salmonella,noroviruses,salmonella,salmonella,salmonella
2,,acute appendicitis,peritonitis,parasitic gastroenteritis,coronavirus disease 2019,viral gastroenteritis,,,haemolytic-uraemic syndrome - hus,stomach flu,irritable bowel syndrome,,,viral gastroenteritis
3,,campylobacter gastroenteritis,parasitic gastroenteritis,peritonitis,salmonella,,,,salmonella,,lactose intolerance,,,
4,,acute gastritis,acute mesenteric ischemia,colitis,,,,,,,influenza,,,
5,,shigella gastroenteritis,,salmonella,,,,,,,acute viral gastroenteritis,,,
6,,,,,,,,,,,salmonella,,,
7,,,,,,,,,,,giardiasis,,,





Case number 532 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute bronchitis,acute bronchitis,influenza,acute bronchitis,covid 19,acute bronchitis,acute bronchitis,upper respiratory tract infection,pneumonia,pneumonia,bacterial pneumonia,acute bronchitis,urti with post nasal drip,urti
2,,pneumonia,acute bronchitis,common cold,common cold,common cold,,pneumonia,acute bronchitis,,influenza,,,acute bronchitis
3,,flu,covid 19,influenza,acute sinusitis,covid 19,,,covid 19,,acute bronchitis,,,
4,,covid 19,common cold,covid 19,,influenza,,,tuberculosis,,asthma,,,
5,,common cold,diphtheria,bacterial tracheitis,,,,,inflenza,,common cold,,,
6,,,,,,,,,pulmonary embolism,,chronic sinusitis,,,
7,,,,,,,,,asthma,,whooping cough,,,
8,,,,,,,,,lung cancer,,lung cancer,,,





Case number 534 (common)
No diseases were found for: Avey.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute pharyngitis,acute pharyngitis,,acute laryngitis,covid 19,common cold,acute bronchitis,upper respiratory infection,covid 19,acute pharyngitis,common cold,acute tonsillitis.,acute pharyngitis,acute pharyngitis
2,,acute laryngitis,,acute pharyngitis,,influenza,acute pharyngitis,acute pharyngitis,acute pharyngitis,common cold,bacterial tonsilitis,acute pharyngitis,,
3,,covid 19,,covid 19,,acute pharyngitis,,,influenza,tension headache,adenovirus infection,,,
4,,bacterial tonsilitis,,common cold,,,,,,obstructive sleep apnea,hay fever,,,
5,,common cold,,influenza,,,,,,acute bronchitis,influenza,,,
6,,,,,,,,,,,chronic sinusitis,,,
7,,,,,,,,,,,infectious mononucleosis,,,
8,,,,,,,,,,,acute sinusitis,,,
9,,,,,,,,,,,acute laryngitis,,,





Case number 535 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,allergic rhinitis,acute sinusitis,allergic rhinitis,allergic rhinitis,allergic rhinitis
2,,non-allergic rhinitis,adenoid hypertrophy,adenoid hypertrophy,chronic sinusitis,non-allergic rhinitis,,allergic reaction,common cold - viral respiratory infection,,chronic sinusitis,,,
3,,nasal polyps,non allergic rhinitis,non allergic rhinitis,,,,,coronavirus - covid-19 - omicron - variant: b ...,,nasal polyps,,,
4,,,chronic sinusitis,chronic sinusitis,,,,,,,,,,
5,,,covid 19,upper airway obstruction,,,,,,,,,,





Case number 536 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,back pain,back pain,vertebral fracture,back pain,back pain,back pain,back pain,back pain,vertebral fracture,back pain,back pain,back pain,back pain,back pain
2,,osteomyelitis,herniated disc,vertebral fracture,,degenerative disc disease,,,back pain,,lumbar herniated disk,,,
3,,multiple myeloma,degenerative disc disease,herniated disc,,herniated disc,,,sciatica,,vertebral compression fracture (thoracic),,,
4,,scoliosis,back pain,degenerative disc disease,,,,,renal colic,,ankylosing spondylitis,,,
5,,gallstones,spinal stenosis,spinal stenosis,,,,,,,lumbar spinal stenosis,,,





Case number 538 (common)
Session failed to start for: K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,canker sore,canker sore,canker sore,canker sore,mucocele,canker sore,ulcerative colitis,,canker sore,canker sore,hand foot and mouth disease,canker sore,canker sore,canker sore
2,,behçet's syndrome,hyperangina,herpangina,recurrent oral herpes,herpes simplex,crohn's disease,,systemic lupus erythematosus,mouth ulcer after an injury,acute hiv infection,bad mouth haygen,,
3,,hyperangina,behçet syndrome,behçet syndrome,,behcet disease,,,behçet syndrome,cold sore,celiac disease,,,
4,,,inflammatory bowel disease,inflammatory bowel disease,,,,,,,crohns,,,
5,,,hiv infection,hiv infection,,,,,,,canker sore,,,
6,,,,,,,,,,,measles,,,
7,,,,,,,,,,,iron deficiency anemia,,,
8,,,,,,,,,,,strep throat,,,





Case number 503 (common)
Session failed to start for: K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,asthma,asthma,foreign body aspiration,acute bronchitis,asthma,asthma,asthma,,asthma,asthma,asthma,asthma,asthma,asthma
2,,foreign body aspiration,pneumonia,cystic fibrosis,acute bronchitis,post-infectious cough,,,acute bronchitis,,acute bronchitis,,,acute bronchitis
3,,paradoxical vocal fold motion,upper airway obstruction,asthma,,bronchospasm,,,bronchial hyperresponsiveness,,pneumonia,,,pneumonia
4,,hyperventilation syndrome,acute bronchitis,pneumonia,,acute bronchitis,,,,,respiratory syncytial virus,,,flu
5,,,chronic obstructive pulmonary disease,acute laryngitis,,,,,,,croup,,,cardiac causes
6,,,,,,,,,,,emphysema,,,
7,,,,,,,,,,,pulmonary embolism,,,
8,,,,,,,,,,,influenza,,,





Case number 504 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,copd flare (more severe),copd flare (more severe),pneumonia,copd flare (more severe),copd flare (more severe),copd flare (more severe),copd flare (more severe),copd flare (more severe),pneumonia,pneumonia,acute bronchitis,bacterial pnemonia,copd flare (more severe),copd flare (more severe)
2,,pneumonia,atelectasis,pneumonia,pneumonia,pneumonia,,common cold,acute bronchitis,,copd flare (more severe),,,flu
3,,sepsis,acute bronchitis,atelectasis,pulmonary embolism,lung cancer,,asthma,copd flare (more severe),,heart failure,,,pneumonia
4,,,asthma,asthma,,heart failure,,pneumonia,,,pneumonia,,,
5,,,covid 19,acute bronchitis,,,,,,,asthma,,,
6,,,,,,,,,,,coronavirus disease 2019,,,
7,,,,,,,,,,,pulmonary embolism,,,





Case number 523 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,influenza,influenza,influenza,influenza,common cold,influenza,influenza,influenza,influenza,acute bronchitis,influenza,influenza,influenza,influenza
2,,pneumonia,covid 19,covid 19,influenza,coronavirus disease 2019,acute bronchitis,upper respiratory infection,coronavirus disease 2019,otitis media,bacterial pneumonia,,rule out meningitis,acute bronchitis
3,,acute schistosomiasis,common cold,common cold,bacterial pneumonia,common cold,,,swine influenza,influenza,chronic sinusitis,,,
4,,acute bronchitis,pneumonia,acute bronchitis,,sinusitis,,,bronchitis,coronavirus disease 2019,acute sinusitis,,,
5,,coronavirus disease 2019,dengue fever,pneumonia,,,,,tuberculosis,acute bacterial gastroenteritis,acute appendicitis,,,
6,,,,,,,,,mononucleosis,atypical pneumonia,endocarditis,,,
7,,,,,,,,,tonsillopharyngitis,viral meningitis,,,,
8,,,,,,,,,aspiration pneumonia,laryngitis,,,,
9,,,,,,,,,community-acquired pneumonia,,,,,
10,,,,,,,,,sepsis,,,,,





Case number 524 (common)
Session failed to start for: Healthily, K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,mononucleosis,mononucleosis,tonsillopharyngitis,tonsillopharyngitis,covid 19,thyroiditis,,,mononucleosis,acute streptococcal pharyngitis,mononucleosis,mononucleosis,streptococcal pharyngitis,scarlet fever
2,,tonsilitis,diphtheria,diphtheria,influenza,viral pharyngitis,,,bacterial tonsilitis,mononucleosis,strep throat,scarlet fever,,mononucleosis
3,,influenza,mononucleosis,mononucleosis,mononucleosis,tonsillitis,,,,common cold,bacterial tonsilitis,,,
4,,,epiglottitis,epiglottitis,,mononucleosis,,,,scarlet fever,influenza,,,
5,,,covid 19,covid 19,,,,,,,acute sinusitis,,,
6,,,,,,,,,,,adenovirus infection,,,
7,,,,,,,,,,,acute laryngitis,,,
8,,,,,,,,,,,covid 19,,,





Case number 526 (common)
Session failed to start for: Healthily, K Health.
No diseases were found for: Avey, Avey v2.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,pneumonia,acute bronchitis,,,pneumonia,common cold,,,bronchitis,pneumonia,croup,pneumonia,pneumonia,pneumonia
2,,pneumonia,,,common cold,bronchitis,,,pneumonia,acute bronchitis,pneumonia,,,
3,,common cold,,,coronavirus,pneumonia,,,pulmonary tuberculosis,common cold,asthma,,,
4,,coronavirus,,,,,,,,,chronic sinusitis,,,
5,,whooping cough,,,,,,,,,bronchitis,,,
6,,,,,,,,,,,whooping cough,,,
7,,,,,,,,,,,influenza,,,
8,,,,,,,,,,,common cold,,,





Case number 543 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,acute bronchitis,viral upper respiratory infection,viral upper respiratory infection,acute sinusitis,acute sinusitis
2,,bacterial sinusitis,nonallergic rhinitis,acute bronchitis,,sinusitis,,acute sinusitis,bronchitis,viral upper respiratory infection,chronic sinusitis,bronchitis and sinusitis,,
3,,acute bronchitis,acute laryngitis,tonsillopharyngitis,,seasonal allergies,,,bronchial hyperresponsiveness,acute bacterial sinusitis,hay fever,,,
4,,,,,,,,,pneumonia,atypical pneumonia,whooping cough,,,
5,,,,,,,,,uncomplicated fever,,measles,,,
6,,,,,,,,,,,asthma,,,





Case number 545 (common)
Session failed to start for: Buoy, Healthily, K Health, Symptomate.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,vomiting,pediatric meningitis,sepsis,sepsis,,acute bacterial gastroenteritis,,,covid 19,,middle ear infection,vomiting,gastroenteritis,acute gastritis
2,,paediatric pneumonia,infective endocarditis,infective endocarditis,,food poisoning,,,sepsis,,bacterial pneumonia,,uti,meningitis
3,,viral stomach bug in children,adrenal insufficiency,adrenal insufficiency,,pyloric stenosis,,,,,influenza (flu),,urti,
4,,roseola infantum,herpetic withlow,herpetic withlow,,acute appendicitis,,,,,strep throat,,meningitis,
5,,ecoli gastroenteritis,herpes simplex,herpes simplex,,,,,,,viral gastroenteritis,,,
6,,,,,,,,,,,intestinal obstruction,,,
7,,,,,,,,,,,viral pneumonia,,,





Case number 501 (less common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute liver failure,hemorrhagic stroke,encephalopathy,transient ischemic attack,acute liver failure,migraine,cerebral stroke,dehydration,acute liver failure,acute liver failure,drug allergy,acute liver failure,acute liver failure,acute liver failure
2,,mini stroke,cerebral stroke,cerebral stroke,low blood sodium,medication side effects,mini stroke,anxiety disorder,delirium,dehydration,cerebral thrombosis,,acute hepatitis,paracetamol poisoning
3,,brainstem infarction,rocky mountain spotted fever,meningitis,,brain tumor,,transient ischemic attack,,severe arrhythmia,cerebral stroke,,,sepsis
4,,acute liver failure,meningitis,,,,,meningitis,,alcohol withdrawal,brain aneurysm,,,
5,,wernicke encephalopathy,brain abscess,,,,,,,cerebral stroke,hypoglycaemia,,,
6,,,,,,,,,,hemorrhagic stroke,congestive heart failure,,,





Case number 506 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,heart attack,acute aortic regurgitation,unstable angina,stable angina,inflammation of heart muscle,angina,pulmonary embolism,heart attack,musculoskeletal chest pain,heart attack,heart attack,heart attack,heart attack,heart attack
2,,unstable angina pectoris,heart attack,acute pericarditis,angina pectoris,acute episode of chronic obstructive pulmonary...,,stable angina,pericarditis,unstable angina pectoris,broken (fractured) rib(s),,,
3,,heart attack,pericardial effusion,costochondritis,,heart diseas,,,pneumonia,,costochondritis,,,
4,,cardiac temponade,myocarditis,esophageal stricture,,,,,pneumothorax,,heartburn/gerd,,,
5,,acute respiratory failure,primary spontaneous pneumothorax,myocarditis,,,,,pulmonary embolism,,unstable angina pectoris,,,
6,,,,,,,,,costochondritis,,,,,
7,,,,,,,,,lung cancer,,,,,





Case number 502 (common)
Session failed to start for: Buoy, Healthily, K Health, Symptomate.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,appendicitis,appendicitis,peritonitis,peritonitis,,gastroenteritis,,,acute bacterial gastroenteritis,,appendicitis,secondary peritonitis due bacterial infection,appendicitis,appendicitis
2,,cryptosporidium infection,acute bacterial gastroenteritis,parasitic gastroenteritis,,gastritis,,,haemolytic-uraemic syndrome,,acute cholecystitis,appendicitis,fmf,bowel infarction
3,,campylobacter gastroenteritis,shigellosis,shigellosis,,appendicitis,,,food poisoning,,constipation,colecystitis,,bowel perforation and peritonitis
4,,peritonitis,parasatic gastroenteritis,acute bacterial gastroenteritis,,intestinal obstruction,,,irritable bowel syndrome,,gas,peptic ulcer,,
5,,acute viral gastroenteritis,inflammatory bowel disease,acute viral gastroenteritis,,,,,,,intestinal obstruction,,,





Case number 505 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,deep vein thrombosis,deep vein thrombosis,cellulitis,chronic venous insufficiency,shin splint,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis,deep vein thrombosis
2,,cellulitis,osteomyelitis,deep vein thrombosis,cellulitis,chronic venous insufficiency,,bacterial skin infection,chronic venous insufficiency,sciatica,peripheral vascular disease,,ruptured baker's cyst,cellulitis
3,,superficial thrombophlebitis,superficial thrombophlebitis,cellulitis,skin rash,bakers cyst,,erysipelas,heart failure,lumbosacral neuropathy,cellulitis,,,
4,,,deep vein thrombosis,superficial thrombophlebitis,,,,superficial thrombophlebitis,superficial thrombophlebitis,chronic venous insufficiency,superficial thrombophlebitis,,,
5,,,,lymphedema,,,,,filariasis,varicose veins,varicose vein,,,
6,,,,,,,,,peripheral vascular disease,,pulmonary embolism,,,
7,,,,,,,,,,,bakers cyst,,,
8,,,,,,,,,,,shin splints,,,





Case number 507 (less common)
Session failed to start for: Healthily, K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,hemolytic uremic syndrome,viral gastroenteritis,food poisoning,acute bacterial gastroenteritis,viral gastroenteritis,intestinal infection,,,hemolytic uremic syndrome,acute bacterial gastroenteritis,antibiotics associated diarrhea,gastroenteritis with entameba hostelitica,infectious gastroenteritis,infectious colitis
2,,campylobacter gastroenteritis,acute viral gastroenteritis,hemolytic uremic syndrome,,irritable bowel syndrome,,,gastroenteritis,hemolytic uremic syndrome,viral gastroenteritis,hemolytic uremic syndrome,,hemolytic uremic syndrome
3,,shigella gastroenteritis,hemolytic uremic syndrome,food poisoning,,inflammatory bowel disease,,,crohn disease,food allergy,salmonella,,,clostridium difficile
4,,hemolytic uremic syndrome,parasitic gastroenteritis,shingles,,hemorrhoids,,,dysentery syndrome,viral gastroenteritis,food poisoning,,,
5,,salmonella gastroenteritis,shigellosis,parasitic gastroenteritis,,food allergy,,,ulcerative colitis,,giardiasis,,,
6,,,,,,,,,complicated gastroenteritis,,food allergy,,,
7,,,,,,,,,acute pancreatitis,,hiv infection,,,
8,,,,,,,,,,,indigestion,,,
9,,,,,,,,,,,hemolytic uremic syndrome,,,





Case number 508 (common)
No diseases were found for: Mediktor.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,kidney stones,kidney stones,kidney stones,kidney stones,kidney stones,kidney stones,kidney stones,pyelonephritis,,kidney stones,urethral calculus,kidney stones,kidney stones,renal colic
2,,pyelonephritis,pyelonephritis,pyelonephritis,gastoenteritis,gastritis,,urethritis,,,diverticulitis,obstructed inguinal hernia,pyelonephritis,kidney stones
3,,ruptured renal cyst,abdominal aortic aneurysm,abdominal aortic aneurysm,food poisoning,diverticulitis,,kidney stones,,,acute appendicitis,testicular infection(orchitis),,pyelonephritis
4,,,cholelithiasis,viral hepatitis,,acute bacterial gastroenteritis,,diverticulitis,,,epididymitis,,,
5,,,non alcoholic fatty liver disease,cystitis,,peptic ulcer disease,,,,,pyelonephritis,,,
6,,,,,,,,,,,acute prostatitis,,,
7,,,,,,,,,,,bladder stone,,,





Case number 509 (less common)
No diseases were found for: Avey v2.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,malaria,ecoli gastroenteritis,primary immunodeficiency,,coronavirus (covid-19),malaria,gastroenteritis,gastroenteritis,traveler's diarrhea,influenza flu adults,influenza (flu),malaria,malaria,malaria
2,,thyrotoxic storm,myelodysplastic syndrome,,non-bacterial brain inflammation,dengue fever,,meningitis,gastroenteritis,septicemia,septicemia,dengue fever,traveller's diarrhea,infectious colitis
3,,viral gastroenteritis,leukemia,,kidney infection,zika virus,,,complicated gastroeneritis,acute appendicitis,appendicitis,yellow fever,,
4,,serotonin syndrome,myeloproliferative neoplasms,,,,,,hemolytic uremic syndrome,bacterial pneumonia,bacterial pneumonia,infectious diarrhea,,
5,,acute panic attack,lymphoma,,,,,,uncomplicated fever,peritonitis,peritonitis,,,
6,,,,,,,,,,stomach flu,,,,
7,,,,,,,,,,bacterial gastroenteritis,,,,





Case number 510 (common)
No diseases were found for: Mediktor.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,meningitis,meningitis,meningitis,meningitis,non bacterial brain inflammation,meningitis,meningitis,meningitis,,meningitis,influenza,meningitis,meningitis,meningitis
2,,migraine headache,brain abscess,brain abscess,meningitis,encephalitis,,upper respiratory infection,,,acute sinusitis,migraine attack,tonsillar abscess,
3,,,migraine headache,rocky mountain spotted fever,influenza,migraine headache,,general headache,,,bacterial pneumonia,,,
4,,,rocky mountain spotted fever,migraine headache,,viral infection,,gastroenteritis,,,viral pneumonia,,,
5,,,cerebral stroke,cerebral stroke,,,,,,,tonsillopharyngitis,,,
6,,,,,,,,,,,meningitis,,,
7,,,,,,,,,,,swine influenza,,,
8,,,,,,,,,,,migraine headache,,,





Case number 511 (common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,pneumonia,flu,pneumonia,pneumonia,pneumonia,flu,pneumonia,upper respiratory tract infection,pneumonia,pneumonia,pneumonia,pneumonia,pneumonia,pneumonia
2,,common cold,atelectasis,aspergillosis,covid 19,common cold,,chronic obstructive pulmonary disease,acute bronchitis,,flu,pulmonary edema,,
3,,covid 19,covid 19,pleural effusion,heart infection,covid 19,,pneumonia,covid 19,,acute bronchitis,,,
4,,,acute bronchitis,pulmonary embolism,,,,,tuberculosis,,asthma,,,
5,,,aspergillosis,bronchiectasis,,,,,asthma,,common cold,,,
6,,,,,,,,,influenza,,,,,
7,,,,,,,,,pulmonary embolism,,,,,
8,,,,,,,,,foreign body aspiration,,,,,
9,,,,,,,,,heart failure,,,,,





Case number 512 (less common)


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,pulmonary embolism,pulmonary embolism,pulmonary embolism,pulmonary embolism,covid 19,pulmonary embolism,pulmonary embolism,pulmonary embolism,primary spontaneous pneumothorax,pulmonary embolism,congestive heart failure,pulmonary embolism,pulmonary embolism,pulmonary embolism
2,,aspiration of oropharyngeal secretions,primary spontaneous pneumothorax,pneumonia,pulmonary embolism,heart attack,,asthma,pulmonary embolism,primary spontaneous pneumothorax,pulmonary embolism,,pericarditis,pleuritis
3,,respiratory failure,pericarditis,primary spontaneous pneumothorax,heart attack,primary spontaneous pneumothorax,,pneumonia,typical pneumonia,,asthma,,,pneumonia
4,,,pneumonia,pericarditis,,,,atrial fibrillation,,,emphysema,,,
5,,,aortic dissection,aspergillosis,,,,,,,,,,





Case number 513 (less common)
Session failed to start for: Healthily, K Health.


Rank,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,rocky mountain spotted fever,rocky mountain spotted fever,rocky mountain spotted fever,rocky mountain spotted fever,coronavirus (covid-19),meningococcal meningitis,,,influenza,bacterial meningitis,influenza (flu),rocky mountain spotted fever,rocky mountain spotted fever,urti
2,,boutonneuse fever,rubella,rubella,non-bacterial brain inflammation,viral meningitis,,,influenza a - h1n1,leptospirosis,septicemia,scarlet fever,,cellulitis
3,,chikungunya,infectious mononucleosis,lyme disease,lyme disease,lyme disease,,,coronavirus (covid-19),viral meningitis,bacterial pneumonia,autoimmune deasese,,allergic dermatitis
4,,,dengue fever,dengue fever,,,,,coronavirus - covid-19 - omicron - variant: b1...,tension headaches,swine influenza,,,
5,,,lyme disease,malaria,,,,,infection - sepsis,typhoid fever,appendicitis,,,
6,,,,,,,,,mononucleosis - infectious mononucleosis,inflammation of the brain,peritonitis,,,
7,,,,,,,,,meningococcemia,acute streptococcal pharyngitis,pyelonephritis,,,
8,,,,,,,,,,neck strain,,,,





Case number 514 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,stroke,stroke,transient ischemic attack,stroke,stroke,stroke,subarachnoid hemorrhage,transient ischemic attack,intraparenchymal cerebral hemorrhage,stroke,stroke,intracranial heamorrege,stroke,stroke
2,,,stroke,transient ischemic attack,blood vessel issue in the brain,brain tumor,,stroke,,transient ischemic attack,migraine headache,stroke,,
3,,,botulism,meningitis,,multiple sclerosis,,,,subarachnoid intracranial hemorrhage,brain aneurysm,,,
4,,,aortic dissection,botulism,,migraine headache,,,,,cerebeller hemorrhage,,,
5,,,guillain barre syndrome,hypercalcemia,,,,,,,meningitis,,,





Case number 515 (less common)
No diseases were found for: Buoy.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,tetanus,tetanus,tetanus,tetanus,,throat cancer,retropharyngeal abscess,temporomandibular joint dysfunction,tetanus,tetanus,peritonsillar abscess,aspiration pnemonia,tetanus,tetanus
2,,quinsy,,,,tetanus,,,parkinson's disease,wound dehiscence,,pulmonary embolism,,
3,,ludwig angina,,,,oropharyngeal cancer,,,,bruxism,,,,
4,,,,,,temporomandibular joint dysfunction,,,,,,,,
5,,,,,,peritonsillar abscess,,,,,,,,





Case number 516 (common)
Session failed to start for: Buoy, Healthily, K Health.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute otitis media,acute otitis media,common cold,bronchiolitis,,acute otitis media,,,common cold,common cold,common cold,acute otitis media,acute otitis media,acute otitis media
2,,common cold,influenza,influenza,,infelunza,,,acute nasal catarrh,acute otitis media,chronic sinusitis,,,urti
3,,roseola infantum,bronchiolitis,common cold,,common cold,,,coronavirus - covid-19 - omicron - variant: b ...,rsv infection,coronavirus,,,
4,,,covid 19,covid 19,,,,,,acute broncholitis,hay fever,,,
5,,,allergic rhinitis,tonsillopharyngitis,,,,,,,influanza,,,
6,,,,,,,,,,,croup,,,
7,,,,,,,,,,,whooping cough,,,
8,,,,,,,,,,,measeles,,,





Case number 519 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute sinusitis,acute sinusitis,nasal foreign body,chronic sinusitis,acute sinusitis,chronic sinusitis,acute sinusitis,acute sinusitis,acute sinusitis,acute sinusitis,acute sinusitis,acute sinusitis,acute sinusitis,acute sinusitis
2,,chronic sinusitis,adenoid hypertrophy,adenoid hypertrophy,coronavirus,acute sinusitis,,upper respiratory infection,hay fever,common cold,chronic sinusitis,,,
3,,,nasal septal hematoma,non allergic rhinitis,,allergic rhinitis,,,common cold,,coronavirus,,,
4,,,nasal septum deviation,allergic rhinitis,,,,,,,common cold,,,
5,,,chronic sinusitis,covid 19,,,,,,,hay fever,,,





Case number 520 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,back pain,acute sciatica,herniated disc,herniated disc,back pain,spinal cord injury,herniated disc,lumbar radiculopathy,sciatica,lumbosacral neuropathy,back pain,compressin peroneal nerve or injury.,sciatica with nerve compression,lumbar disc prolapse
2,,chronic lumbosacral radiculopathy,guillain-barré syndrome,gbs,lumbar radiculopathy,,back pain,back pain,spondylolisthesis,sciatica,vertebral fracture,sciatic nerve injury.,,
3,,spinal epidural abscess,transverse myelitis,transverse myelitis,,,,,sacroiliac joint dysfunction,,broken (fractured) rib(s),back pain,,
4,,,back pain,degenerative disc disease,,,,,,,osteoarthritis,,,
5,,,degenerative disc disease,back pain,,,,,,,acute necrotizing pancreatitis,,,
6,,,,,,,,,,,fractured tailbone,,,
7,,,,,,,,,,,polycystic kidney disease,,,
8,,,,,,,,,,,brusitis trochanteric,,,





Case number 525 (common)
No diseases were found for: Avey.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,peptic ulcer disease,chronic gastritis,,peptic ulcer disease,gerd,peptic ulcer disease,peptic ulcer disease,gerd,peptic ulcer disease,peptic ulcer disease,peptic ulcer disease,peptic ulcer disease,gastritis,peptic ulcer disease
2,,peptic ulcer disease,,esophagitis,inguinal hernia,gastritis,,acute bacterial gastroenteritis,acute bacterial gastroenteritis,acute bacterial gastroenteritis,heartburn/gerd,,peptic ulcer disease,gastritis
3,,epigastric pain syndrome,,acute bacterial gastroenteritis,bladder cancer,gastroesophagial reflux disease,,peptic ulcer disease,acute pancreatitis,,acute necrotizing pancreatitis,,,
4,,hiatal hernia,,upper gastrointestinal bleeding,,gallstones,,,,,appendicitis,,,
5,,,,acute cholangitis,,,,,,,heart attack,,,





Case number 528 (less common)
Session failed to start for: K Health.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,shingles,shingles,shingles,shingles,covid 19,shingles,shingles,,shingles,shingles,shingles,shingles,shingles,shingles
2,,contact dermatitis,rocky mountain spotted fever,allergic reaction,lyme disease,cellulitis,,,zika virus,,genital herpes,,,
3,,hives,allergic reaction,urticaria,anaplasmosis,contact dermatitis,,,genital herpes,,pneumonia,,,
4,,,cellulitis,herpes simplex,,costochondritis,,,chickenpox,,flu,,,
5,,,urticaria,contact dermatitis,,,,,tinea pedis,,myocardial infarction,,,
6,,,,,,,,,impetigo,,esophagitis,,,
7,,,,,,,,,monkeypox,,costochondritis,,,
8,,,,,,,,,,,pulmonary embolism,,,





Case number 529 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection,urinary tract infection
2,,bladder stone,trichomoniasis,trichomoniasis,,vaginitis,,,obstructive uropathy,,ureteral calculus,,,
3,,,vulvovaginal candidiasis,chlamydia,,,,,,,vulvovaginal candidiasis,,,
4,,,,,,,,,,,bacterial vaginosis,,,
5,,,,,,,,,,,diverticulitis,,,





Case number 530 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo,vertigo
2,,earwax blockage,vestibular neuritis,cerebral stroke,meniere disease,meniere disease,,orthostatic hypotension,common dizziness,,labyrinthitis,,,benign paroxysmal positional vertigo
3,,cataract,meniere disease,,,labyrinthitis,,,,,congestive heart failure,,,
4,,age related hearing loss,cerebral stroke,,,cardiovascular disease,,,,,myocardial infarction,,,





Case number 531 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,acute bronchitis,acute bronchitis,common cold,acute bronchitis,acute bronchitis,common cold,acute bronchitis,upper respiratory tract infection,acute bronchitis,acute bronchitis,pneumonia,acute bronchitis,acute bronchitis,acute bronchitis
2,,asthma,covid 19,bronchiectasis,covid 19,acute bronchitis,,acute sinusitis,asthma,common cold,influenza,pharengitis,,
3,,,tonsillopharyngitis,pneumonia,,postnasal drip,,pneumonia,typical pneumonia,tonsillopharyngitis,asthma,,,
4,,,atelectasis,cystic fibrosis,,,,,common cold,,common cold,,,
5,,,acute laryngitis,acute laryngitis,,,,,,,chronic sinusitis,,,





Case number 537 (common)
Session failed to start for: Healthily, K Health.
No diseases were found for: Avey, Avey v2, Symptomate.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,bee sting without anaphylaxis,irritant contact cheilitis,,,allergic reaction,hereditary angioedema,,,bee sting without anaphylaxis,,food allergy,bee sting without anaphylaxis,bee sting without anaphylaxis,bee sting without anaphylaxis
2,,melkersson-rosenthal syndrome,,,,idiopathic angioedema,,,angioedema,,hives,,,
3,,bee sting without anaphylaxis,,,,acei induced angioedema,,,cutaneous leishmaniasis,,anaphylaxis,,,
4,,food allergy,,,,allergic angioedema,,,,,hereditary angioedema,,,
5,,hereditary angioedema,,,,,,,,,acei induced angioedema,,,
6,,,,,,,,,,,williams syndrome,,,
7,,,,,,,,,,,ascaris worms,,,





Case number 539 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,candidal yeast infection,candidal yeast infection,bacterial vaginosis,bacterial vaginosis,bacterial vaginosis,candidal yeast infection,candidal yeast infection,candidal yeast infection,candidal yeast infection,candidal yeast infection,candidal yeast infection,candidal yeast infection,candidal yeast infection,candidal yeast infection
2,,allergic vulvovaginitis,trichomoniasis,trichomoniasis,trichomoniasis,bacterial vaginosis,,bacterial vaginosis,valvovaginitis,cervicitis,pinworms,,,
3,,bacterial vaginosis,candidal yeast infection,atrophic vaginitis,candidal yeast infection,hormonal changes,,vulvovaginitis,genital herpes,,bacterial vaginosis,,,
4,,,atrophic vaginitis,candidal yeast infection,,,,,,,crabs,,,
5,,,gonorrhoea,gonorrhoea,,,,,,,lichen sclerosus et atrophicus,,,
6,,,,,,,,,,,hemorrhoids,,,
7,,,,,,,,,,,scabies,,,





Case number 540 (common)
Session failed to start for: Buoy, Healthily, K Health, WebMD.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,constipation,meckels diverticulum,hirsch sprung disease,hirschsprung disease,,millk allergy,,,constipation,constipation,,constipation,cow's milk intolerance (cow milk should not be...,constipation
2,,constipation,constipation,constipation,,anal fissure,,,irritable bowel syndrome,,,,,anal fissure
3,,,celiac disease,celiac disease,,infantile colic,,,hemolytic uremic syndrome,,,,,
4,,,hypothyroidism,hypothyroidism,,,,,,,,,,





Case number 541 (common)
Session failed to start for: Healthily, K Health.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,eczema,eczema,eczema,eczema,eczema,contact dermatitis,,,eczema,non specific cough,eczema,eczema,eczema,eczema
2,,contact dermatitis,psoriasis,psoriasis,psoriasis,eczema,,,neurodermatitis,eczema,drug allergy,,,
3,,candidal intertrigo,lichen planus,lichen planus,non specific dermatitis,allergic reaction,,,,tinea corporis,urticaria,,,
4,,,contact dermatitis,contact dermatitis,,,,,,urticaria,insect bite,,,
5,,,scabies,tinea versicolor,,,,,,acute bronchitis,scabies,,,
6,,,,,,,,,,asthma,hodgkins disease,,,





Case number 542 (common)
No diseases were found for: Mediktor.


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,stye,stye,stye,stye,stye,orbital cellulitis,stye,stye,,stye,acute sinusitis,stye,stye,stye
2,,bacterial conjunctivitis,lipoma,lipoma,,stye,periorbital cellulitis,bacterial conjunctivitis,,periorbital cellulitis,bacterial conjunctivitis,calazion,blepharitis,
3,,shingles,chalazion,chalazion,,chalazion,,blepharitis,,,chronic sinusitis,,,
4,,,blepharitis,blepharitis,,,,episcleritis,,,cluster headache,,,
5,,,,,,,,,,,allergic conjunctivitis,,,





Case number 544 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection,viral upper respiratory infection
2,,,diphtheria,acute bronchitis,,allergic rhinitis,,,,acute streptococcal pharyngitis,hay fever,,,
3,,,acute bronchitis,diphtheria,,,,,,mononucleosis,bronchitis,,,
4,,,,,,,,,,,chronic sinusitis,,,





Case number 522 (common)


Unnamed: 0,gold,Ada,Avey,Avey v2,Buoy,ChatGPT - 4,Healthily,K Health,Mediktor,Symptomate,WebMD,doctor_MA,doctor_NJ,doctor_TH
1,copd flare (milder),acute bronchitis,copd flare (milder),copd flare (milder),copd flare (milder),copd flare (milder),acute bronchitis,asthma,copd flare (milder),asthma,asthma,acute bronchitis on top of chronic bronchitis,copd flare (milder),copd flare (milder)
2,,asthma,asthma,asthma,asthma,asthma,,upper respiratory infection,typical pneumonia,,copd flare (milder),copd flare (milder),acute rhinosinusitis with post nasal drip,flu
3,,occupational asthma,bronchiectasis,bronchiectasis,covid 19,lung cancer,,pneumonia,asthma,,congestive heart failure,asthma,,
4,,pneumonia,cystic fibrosis,cystic fibrosis,,chronic bronchitis,,,foreign body aspiration,,diastolic heart failure,,,
5,,covid 19,upper airway obstruction,heart failure,,emphysema,,,heart failure,,bacterial pneumonia,,,
6,,,,,,,,,lung cancer,,pulmonary embolism,,,
7,,,,,,,,,bronchial hyperresponsiveness,,influenza,,,
8,,,,,,,,,aspiration pneumonia,,,,,
9,,,,,,,,,tuberculosis,,,,,
10,,,,,,,,,biventricular heart failure,,,,,







<p>

## Let us define the metrics now. <a class="anchor" id="metrics"></a>

### Terms used
- TP: True positive (correct disease retrieved)
- TN: True negative (wrong disease **not** retrieved)
- FP: False positive (wrong disease retrieved)
- FN: False negative (correct disease **not** retrieved)
- Gold standard: the correct list of diseases, as determined by the collective intelligence of doctors

### Precision
Precision tells us how exact our results are: it indicates how many wrong diseases (false positives) are being retrieved. It is the ratio of the *number of correct diseases retrieved* to the *length of the complete list retrieved*.
$$precision = \frac{TP}{TP + FP} = \frac{TP}{\text{length of differential list}}$$

### Recall
Recall is a measure of how many of the correct diseases are being retrieved. It is the ratio of the *number of correct diseases retrieved* to the *length of the gold standard list*.
$$recall = \frac{TP}{TP + FN} = \frac{TP}{\text{length of the gold standard}}$$
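
As a worked illustration of the two ratios above, here is a minimal sketch that uses plain Python lists rather than the `pd.Series` objects used elsewhere in this notebook. The helper name `precision_recall` is hypothetical. The example lists come from case 515, where the gold standard is `tetanus` and ChatGPT - 4 returned five candidates.

```python
def precision_recall(gold, ddx):
    """Return (precision, recall) for one differential list.

    `gold` and `ddx` are plain lists; None entries are padding and are ignored.
    """
    gold = [d for d in gold if d is not None]
    ddx = [d for d in ddx if d is not None]
    tp = sum(1 for d in ddx if d in gold)          # correct diseases retrieved
    precision = tp / len(ddx) if ddx else 0.0      # empty list: precision 0
    recall = tp / len(gold)
    return precision, recall


gold = ["tetanus"]
ddx = ["throat cancer", "tetanus", "oropharyngeal cancer",
       "temporomandibular joint dysfunction", "peritonsillar abscess"]
precision, recall = precision_recall(gold, ddx)
print(precision, recall)  # 0.2 1.0
```

One of the five returned diseases is correct, so precision is 1/5, while the single gold-standard disease is retrieved, so recall is 1.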

### F1 Score
F1 score is the harmonic mean of *precision* and *recall*. It combines *precision* and *recall* into a single score for easier comparison.

Suppose $\beta$ defines how important $recall$ is relative to $precision$; then,
$$fscore_{\beta} = (1 + \beta^2)\frac{precision \cdot recall}{(\beta^2 \cdot precision) + recall}$$
Substituting $\beta = 1$,
$$fscore_{1} = \frac{2 \cdot precision \cdot recall}{ precision + recall}$$
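
To make the effect of $\beta$ concrete, the sketch below evaluates the formula for a precision of 0.2 and a recall of 1.0 (the values obtained for ChatGPT - 4 on case 515). The function name `f_beta` is hypothetical and only mirrors the formula above.

```python
import math


def f_beta(precision, recall, beta=1.0):
    """F-beta score; NaN when precision and recall are both zero."""
    if precision + recall == 0:
        return math.nan
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


p, r = 0.2, 1.0
f1 = f_beta(p, r, beta=1)
f2 = f_beta(p, r, beta=2)
print(round(f1, 3), round(f2, 3))  # 0.333 0.556
```

With $\beta = 2$ recall is weighted more heavily, so the perfect recall lifts the score above the F1 value.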

### NDCG
NDCG, or Normalized Discounted Cumulative Gain, is a measure of how accurate the ranking is. In our calculations, we use
$$DCG = \sum_{i=1}^n\frac{2^{relevance_i}-1}{\log_2(i+1)}$$
where $n$ is the number of differentials in the returned list and  
$relevance_i = |gold\ standard| - rank_{gold\ standard}(ddx[i])$ if $ddx[i]$ is present in the gold standard (with $rank$ counted from 0, so the top gold-standard disease has relevance $|gold\ standard|$), 0 otherwise.

$$NDCG = \frac{DCG_{ddx}}{DCG_{gold\ standard}}$$
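
The following is a minimal sketch of this computation on plain lists, assuming the 0-based gold-standard rank (so a gold standard of length 4 yields relevances 4, 3, 2, 1). The function name `ndcg` is hypothetical. The example again uses case 515, where `tetanus` appears second in the ChatGPT - 4 list.

```python
import math


def ndcg(gold, ddx):
    """NDCG with relevance = |gold| - (0-based rank in the gold standard)."""
    gold = [d for d in gold if d is not None]
    relevance = {d: len(gold) - i for i, d in enumerate(gold)}

    def gain(rel, position):
        # position is 1-based, matching the DCG formula
        return (2 ** rel - 1) / math.log2(position + 1)

    # DCG of the gold standard itself (the best achievable ordering)
    ideal = sum(gain(relevance[d], i + 1) for i, d in enumerate(gold))
    # DCG of the returned list; diseases outside the gold standard add 0
    actual = sum(gain(relevance[d], i + 1)
                 for i, d in enumerate(ddx) if d in relevance)
    return actual / ideal


gold = ["tetanus"]
ddx = ["throat cancer", "tetanus", "oropharyngeal cancer",
       "temporomandibular joint dysfunction", "peritonsillar abscess"]
print(round(ndcg(gold, ddx), 3))  # 0.631
```

Placing the correct disease second instead of first discounts its gain by $1/\log_2(3)$, which is why the score drops from 1 to about 0.631.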

### M Score
M Score indicates whether the top gold-standard disease appears among the first $i$ entries of the returned differential. We report $M_1$, $M_3$ and $M_5$.
$$M_i = \text{gold standard[0]} \in \text{ddx[:i]}$$

### Position
Position is the 1-based rank of gold standard[0] in the returned differential. It is undefined (NaN) when that disease is not returned.

### Length
$$length = \frac{|ddx|}{|gold\ standard|}$$
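
The three list-based measures above (M Score, Position and Length) can be sketched on plain lists as follows. The function names are hypothetical. The example uses the ChatGPT - 4 list for case 515.

```python
import math


def m_score(gold, ddx, m=1):
    """True if the top gold-standard disease is among the first m candidates."""
    return gold[0] in ddx[:m]


def position(gold, ddx):
    """1-based rank of the top gold-standard disease, NaN if it is absent."""
    return ddx.index(gold[0]) + 1 if gold[0] in ddx else math.nan


def relative_length(gold, ddx):
    """Length of the differential as a multiple of the gold-standard length."""
    n_ddx = sum(d is not None for d in ddx)    # None entries are padding
    n_gold = sum(d is not None for d in gold)
    return n_ddx / n_gold if n_ddx else math.nan


gold = ["tetanus"]
ddx = ["throat cancer", "tetanus", "oropharyngeal cancer",
       "temporomandibular joint dysfunction", "peritonsillar abscess"]
print(m_score(gold, ddx, 1), m_score(gold, ddx, 3))  # False True
print(position(gold, ddx))                           # 2
print(relative_length(gold, ddx))                    # 5.0
```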
<br>
<br>
<br>
<br>


In [7]:
import math

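def getPrecision(goldStandard: pd.Series, candidate: pd.Series) -> float:
    '''Fraction of the returned (non-null) diseases that appear in the gold standard'''
    tp = sum(int(disease in goldStandard.values and disease is not None)
             for disease in candidate)
    # tp == 0 also covers an empty candidate list, avoiding a division by zero
    return tp if tp == 0 else tp/candidate.count()


def getRecall(goldStandard: pd.Series, candidate: pd.Series) -> float:
    '''Fraction of the gold-standard diseases that appear in the returned list'''
    tp = sum(int(disease in goldStandard.values and disease is not None)
             for disease in candidate)
    return tp/goldStandard.count()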


def getFScore(precision: float, recall: float, beta: float = 1) -> float:
    return math.nan if precision+recall == 0 else \
        (1+beta**2)*precision*recall/(precision*(beta**2)+recall)


def getNDCG(goldStandard: pd.Series, candidate: pd.Series, scores) -> float:
    def discount(score: float, index: int) -> float:
        '''The index is 1 based'''
        return (math.pow(2, score)-1)/math.log2(index+1)

    maxDCG = sum(discount(scores[i], i+1) for i in range(len(scores)))

    candidateRelevance = []
    goldStandard = list(goldStandard)
    for index, disease in enumerate(candidate):
        if disease is not None and disease in goldStandard:
            candidateRelevance.append(
                discount(scores[goldStandard.index(disease)], index+1))
        else:
            candidateRelevance.append(0)

    return sum(candidateRelevance)/maxDCG


def getMScore(goldStandard: pd.Series, candidate: pd.Series, m=1) -> bool:
    return goldStandard.values[0] in candidate.values[:m]


def getPosition(goldStandard: pd.Series, candidate: pd.Series) -> float:
    return math.nan if goldStandard.values[0] not in candidate.values else\
        1 + list(candidate.values).index(goldStandard.values[0])


def getLength(goldStandard: pd.Series, candidate: pd.Series) -> float:
    return math.nan if candidate.count() == 0 else \
        candidate.count()/goldStandard.count()


def getScoresCase(id,case: pd.DataFrame) -> pd.DataFrame:
    gold = case['gold']
    if EXPERIMENT_TAG == 'avey':
        gold_old = case['gold_old']
    else:
        gold_old = gold

    scores = [
        [getPrecision(gold_old if '_old' in col else gold, case[col]) for col in case.columns if "gold" not in col],
        [getRecall(gold_old if '_old' in col else gold, case[col]) for col in case.columns if "gold" not in col],
    ]

    scores.append(
        [getFScore(prec, rec, beta=1) for prec, rec in zip(scores[0], scores[1])]
    )
    scores.append(
        [getFScore(prec, rec, beta=2) for prec, rec in zip(scores[0], scores[1])]
    )

    # relevance for a list of 4 is 4, 3, 2, 1
    # relevance for a list of 2 is 2, 1
    scores.append(
        [
            getNDCG(
                gold_old if "_old" in col else gold,
                case[col],
                list(range((gold_old if "_old" in col else gold).count(), 0, -1)),
            )
            for col in case.columns
            if "gold" not in col
        ]
    )

    for m in range(1, 6, 2):
        scores.append(
            [
                getMScore(gold_old if "_old" in col else gold, case[col], m)
                for col in case.columns
                if "gold" not in col
            ]
        )

    try:
        scores.append(
            [
                getPosition(gold_old if "_old" in col else gold, case[col])
                for col in case.columns
                if "gold" not in col
            ]
        )
    except Exception as e:
        print(id,flush=True)
        raise e

    scores.append(
        [
            getLength(gold_old if "_old" in col else gold, case[col])
            for col in case.columns
            if "gold" not in col
        ]
    )

    return pd.DataFrame(
        scores,
        columns=[col for col in case.columns if "gold" not in col],
        index=[
            "precision",
            "recall",
            "f1-score",
            "f2-score",
            "NDCG",
            "M1",
            "M3",
            "M5",
            "position",
            "length (x of gs)",
        ],
    )

scores = {}
for id, case in cases.items():
    try: scores[id] = getScoresCase(id, case)
    except Exception as e:
        print(id,flush=True)
        raise e

# scores = {id: getScoresCase(id, case) for id, case in cases.items()}

In [8]:
for id, score in scores.items():
    score.to_excel(f'individual_results/{EXPERIMENT_TAG}/{id}.xlsx')

In [9]:
print(len(scores))

44


Let us define the experiments now. We will pick which cases to compute statistics for.

In [10]:
from enum import Enum 

class DiseaseType(Enum):
    ALL = 'all'
    COMMON = 'common-diseases'
    UNCOMMON = 'less-common-diseases'

class FailureType(Enum):
    ANY = 'any-error'
    NO_DDX = 'no-disease-found'
    SES_FAIL = 'session-failed'
    NONE = 'None'

In [11]:
from collections import defaultdict

experiments = {}

def getExpName(disease_type:DiseaseType,failure_type:FailureType=FailureType.NONE,app='NA'):
    return f"disease_type={disease_type.value} failure-type-ignored={failure_type.value} for app={app}"

def addExperiment(disease_type, casesToConsider):
    # add experiment to ignore no case
    experiments[getExpName(disease_type)] = set(casesToConsider)

    caseClassificationApps = {}
    for app, failure_type in caseClassification["apps"].items():
        caseClassificationApps[app] = {
            FailureType.NO_DDX: failure_type["no disease found"],
            FailureType.SES_FAIL: failure_type["session failed"],
            FailureType.ANY: set(failure_type["no disease found"])
            | set(failure_type["session failed"]),
        }

    for app, classifications in caseClassificationApps.items():
        for failure_type, caseNums in classifications.items():

            # experiment to ignore the cases for the particular app
            experiments[getExpName(disease_type,failure_type,app)] = set(casesToConsider) - set(caseNums)

            # experiment to ignore the cases for all the apps
            if getExpName(disease_type,failure_type,'any') in experiments:
                experiments[getExpName(disease_type,failure_type,'any')] = (
                    experiments[getExpName(disease_type,failure_type,'any')]
                    - set(caseNums)
                )
            else:
                experiments[getExpName(disease_type,failure_type,'any')] = experiments[getExpName(disease_type,failure_type,app)]


addExperiment(DiseaseType.COMMON, caseClassification["common"])
addExperiment(DiseaseType.UNCOMMON, caseClassification["less common"])
addExperiment(
    DiseaseType.ALL,
    set(caseClassification["less common"]) | set(caseClassification["common"]),
)

display("The experiments we are going to conduct are:")
dict(enumerate(sorted(list(experiments.keys()))))

'The experiments we are going to conduct are:'

{0: 'disease_type=all failure-type-ignored=None for app=NA',
 1: 'disease_type=all failure-type-ignored=any-error for app=Avey',
 2: 'disease_type=all failure-type-ignored=any-error for app=Avey v2',
 3: 'disease_type=all failure-type-ignored=any-error for app=Buoy',
 4: 'disease_type=all failure-type-ignored=any-error for app=Healthily',
 5: 'disease_type=all failure-type-ignored=any-error for app=K Health',
 6: 'disease_type=all failure-type-ignored=any-error for app=Mediktor',
 7: 'disease_type=all failure-type-ignored=any-error for app=Symptomate',
 8: 'disease_type=all failure-type-ignored=any-error for app=WebMD',
 9: 'disease_type=all failure-type-ignored=any-error for app=any',
 10: 'disease_type=all failure-type-ignored=no-disease-found for app=Avey',
 11: 'disease_type=all failure-type-ignored=no-disease-found for app=Avey v2',
 12: 'disease_type=all failure-type-ignored=no-disease-found for app=Buoy',
 13: 'disease_type=all failure-type-ignored=no-disease-found for app=Healt
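
Each experiment above is a set of case numbers obtained by set difference: start from the cases of a disease type and drop those where a given app (or any app) hit a given failure. The sketch below illustrates that idea only. The helper `cases_ignoring` is hypothetical, and the small dictionaries are a hand-built subset for illustration, not the contents of `caseClassification`.

```python
# Illustrative subset only: a few common cases and their recorded failures
common_cases = {525, 529, 537, 542}
failures = {
    "Avey": {"no disease found": {525, 537}, "session failed": set()},
    "Mediktor": {"no disease found": {542}, "session failed": set()},
    "K Health": {"no disease found": set(), "session failed": {537}},
}


def cases_ignoring(cases, failures, failure_keys, app=None):
    """Drop cases where `app` (or any app, when app is None) hit a listed failure."""
    apps = failures if app is None else {app: failures[app]}
    excluded = set()
    for per_app in apps.values():
        for key in failure_keys:
            excluded |= set(per_app[key])
    return set(cases) - excluded


# Ignore cases where Avey found no disease
print(sorted(cases_ignoring(common_cases, failures,
                            ["no disease found"], app="Avey")))
# [529, 542]

# Ignore cases where any app had any error
print(sorted(cases_ignoring(common_cases, failures,
                            ["no disease found", "session failed"])))
# [529]
```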

<br>
<br>

### Results <a class="anchor" id="results"></a>

In [12]:
# Shows the cases where Avey v2 failed M1 (top gold-standard disease not ranked first)
# Do not run this after Avey v2 replaces Avey

count = 0
for id, score in scores.items():
    if not score.loc["M1","Avey v2"]:
        print(id)
        count += 1 

count

521
527
534
503
524
526
545
501
506
502
505
507
509
516
519
520
537
539
540


19

In [13]:
def getStats(scores, row: str, col: str):
    values = []
    for score in scores.values():
        try:
            if not math.isnan(score.loc[row, col]):
                values.append(score.loc[row, col])
        except KeyError as e:
            if '_old' not in col:
                display(score,row,col)
                raise e
    if not values:
        return {
            (col, "average"): 0,
            (col, "variance"): 0,
            (col, "std Dev"): 0,
        }
    
    average = sum(values)/len(values)
    variance = sum((value-average)**2 for value in values)/len(values)
    stdDev = variance**0.5
    return {
        (col,"average"): round(average, 3), 
        (col,"variance"):round(variance, 3), 
        (col,"std Dev"):round(stdDev, 3),
        }


results = {}
for label, casesToConsider in experiments.items():
    # selectedScores = {id: score for id,
    #                   score in scores.items()}
    selectedScores = {id: score for id, score in scores.items()
                      if int(id) in casesToConsider}
    
    columns = next(iter(scores.values())).columns
    index = [
            "precision",
            "recall",
            "f1-score",
            "f2-score",
            "NDCG",
            "M1",
            "M3",
            "M5",
            "position",
            "length (x of gs)",
    ]

    metrics = defaultdict(list)
    for row in index:
        for stat in [getStats(selectedScores, row, col) for col in columns]:
            for key,val in stat.items():
                metrics[key].append(val)

    averageScores = pd.DataFrame(metrics, index=[f"stats_for_{x}" for x in index])

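    # Recompute the F1/F2 averages from the averaged precision and recall
    # (per-case F-scores can be NaN) and blank out their variance and std dev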
    for col in columns:
        p = averageScores.loc["stats_for_precision", (col,'average')]
        r = averageScores.loc["stats_for_recall", (col, "average")]
        averageScores.loc["stats_for_f1-score", (col, "average")] = round(getFScore(p, r, 1), 3)
        averageScores.loc["stats_for_f2-score", (col, "average")] = round(getFScore(p, r, 2), 3)

        for key in ['variance','std Dev']:
            averageScores.loc["stats_for_f1-score", (col, key)] = np.nan
            averageScores.loc["stats_for_f2-score", (col, key)] = np.nan

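    def calcStats(doctorResults):
        def extractItem(data, col='average'):
            return data[col]

        # Copy the first column: an in-place += on a view of averageScores can
        # write the running total back into it, inflating the first doctor's
        # 'average' column.
        total = extractItem(doctorResults[0]).copy()
        for data in doctorResults[1:]:
            total += extractItem(data)
        average = (total/len(doctorResults)).round(3)
        return average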

    doctorResultsAverage = calcStats(
        [
            averageScores.loc[:, "doctor_MA"] ,
            averageScores.loc[:, "doctor_NJ"] ,
            averageScores.loc[:, "doctor_TH"]
        ]
    )

    if EXPERIMENT_TAG != 'harvard':
        doctorResultsAverageOld = calcStats(
            [
                averageScores.loc[:, "doctor_MA_old"] ,
                averageScores.loc[:, "doctor_NJ_old"] ,
                averageScores.loc[:, "doctor_TH_old"]
            ]
        )
        averageScores["average_doctor_old"] = doctorResultsAverageOld

    # averageScores.insert(
    #     loc=8, column="average_doctor",
    #     value=doctorResultsAverage,
    # )

    averageScores["average_doctor"] = doctorResultsAverage

    results[label] = averageScores
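
For reference, the per-metric statistics computed by `getStats` amount to a mean, a population variance (dividing by the number of values, not by one less) and its square root, taken over the non-NaN values. A minimal standalone sketch, with the hypothetical name `summarize`:

```python
import math


def summarize(values):
    """Mean, population variance and std dev of the non-NaN values."""
    kept = [v for v in values if not math.isnan(v)]
    if not kept:
        return {"average": 0, "variance": 0, "std Dev": 0}
    average = sum(kept) / len(kept)
    # Population variance: divide by N rather than N - 1
    variance = sum((v - average) ** 2 for v in kept) / len(kept)
    return {
        "average": round(average, 3),
        "variance": round(variance, 3),
        "std Dev": round(variance ** 0.5, 3),
    }


print(summarize([1, 2, math.nan, 3]))
# {'average': 2.0, 'variance': 0.667, 'std Dev': 0.816}
```

The NaN entry is skipped, so the statistics are taken over three values. The variance here corresponds to `statistics.pvariance` in the standard library rather than the sample variance.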

In [14]:
def displayResults(results,printNumCases=True):
    for label, result in results.items():
        if printNumCases:
            print(f'Results for experiment `{label}`, which has {len(set(experiments[label]) & set(scores.keys()))} cases, is')
        else:
            print(f'Results for experiment `{label}` is')
        display(result)
        result.to_excel(f'stats/{EXPERIMENT_TAG}/{label}.xlsx')

displayResults({key:val for key, val in results.items()})

Results for experiment `disease_type=common-diseases failure-type-ignored=None for app=NA`, which has 37 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=Avey`, which has 33 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.301,0.057,0.239,0.167,0.012,0.109,0.198,0.011,0.105,0.369,...,2.006,0.098,0.313,0.621,0.182,0.427,0.673,0.119,0.345,0.669
stats_for_recall,0.788,0.167,0.409,0.727,0.198,0.445,0.848,0.129,0.359,0.697,...,2.606,0.029,0.171,0.727,0.198,0.445,0.909,0.083,0.287,0.869
stats_for_f1-score,0.436,,,0.272,,,0.321,,,0.483,...,2.264,,,0.67,,,0.773,,,0.755
stats_for_f2-score,0.595,,,0.435,,,0.512,,,0.592,...,2.456,,,0.703,,,0.849,,,0.819
stats_for_NDCG,0.75,0.166,0.407,0.6,0.174,0.417,0.73,0.139,0.373,0.614,...,2.49,0.05,0.224,0.727,0.198,0.445,0.876,0.088,0.296,0.83
stats_for_M1,0.697,0.211,0.46,0.455,0.248,0.498,0.606,0.239,0.489,0.515,...,2.303,0.184,0.429,0.727,0.198,0.445,0.818,0.149,0.386,0.768
stats_for_M3,0.788,0.167,0.409,0.636,0.231,0.481,0.758,0.184,0.429,0.697,...,2.606,0.029,0.171,0.727,0.198,0.445,0.909,0.083,0.287,0.869
stats_for_M5,0.788,0.167,0.409,0.727,0.198,0.445,0.848,0.129,0.359,0.697,...,2.606,0.029,0.171,0.727,0.198,0.445,0.909,0.083,0.287,0.869
stats_for_position,1.154,0.207,0.455,1.708,1.123,1.06,1.643,1.444,1.202,1.435,...,3.35,0.25,0.5,1.0,0.0,0.0,1.1,0.09,0.3,1.117
stats_for_length (x of gs),3.364,1.383,1.176,4.613,0.431,0.656,4.606,0.663,0.814,2.31,...,4.606,0.768,0.876,1.303,0.393,0.627,1.636,0.777,0.881,1.535


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=any`, which has 30 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.292,0.061,0.247,0.162,0.013,0.113,0.196,0.012,0.11,0.35,...,2.034,0.099,0.315,0.633,0.199,0.446,0.662,0.12,0.347,0.678
stats_for_recall,0.767,0.179,0.423,0.7,0.21,0.458,0.833,0.139,0.373,0.667,...,2.567,0.032,0.18,0.7,0.21,0.458,0.9,0.09,0.3,0.856
stats_for_f1-score,0.423,,,0.263,,,0.317,,,0.459,...,2.266,,,0.665,,,0.763,,,0.755
stats_for_f2-score,0.579,,,0.421,,,0.505,,,0.565,...,2.436,,,0.685,,,0.84,,,0.812
stats_for_NDCG,0.725,0.175,0.419,0.561,0.174,0.417,0.703,0.145,0.38,0.588,...,2.451,0.054,0.231,0.7,0.21,0.458,0.875,0.094,0.306,0.817
stats_for_M1,0.667,0.222,0.471,0.4,0.24,0.49,0.567,0.246,0.496,0.5,...,2.266,0.196,0.442,0.7,0.21,0.458,0.833,0.139,0.373,0.755
stats_for_M3,0.767,0.179,0.423,0.6,0.24,0.49,0.733,0.196,0.442,0.667,...,2.567,0.032,0.18,0.7,0.21,0.458,0.9,0.09,0.3,0.856
stats_for_M5,0.767,0.179,0.423,0.7,0.21,0.458,0.833,0.139,0.373,0.667,...,2.567,0.032,0.18,0.7,0.21,0.458,0.9,0.09,0.3,0.856
stats_for_position,1.174,0.231,0.48,1.81,1.202,1.096,1.72,1.562,1.25,1.45,...,3.35,0.269,0.518,1.0,0.0,0.0,1.074,0.069,0.262,1.117
stats_for_length (x of gs),3.433,1.446,1.202,4.607,0.453,0.673,4.6,0.707,0.841,2.308,...,4.466,0.773,0.879,1.233,0.379,0.616,1.633,0.766,0.875,1.489
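
The f1-score and f2-score rows carry no variance or standard deviation, which is consistent with them being derived from the averaged precision and recall rather than averaged per case. A minimal sketch, assuming the standard F-beta formula (the helper name `f_beta` is hypothetical and not taken from the notebook's code), reproduces the Ada and Avey values in the table above:

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score; returns 0.0 when both inputs are zero."""
    denom = beta**2 * precision + recall
    if denom == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / denom

# Averages copied from the table above (30 cases): (precision, recall)
reported = {"Ada": (0.292, 0.767), "Avey": (0.162, 0.700)}

# (F1, F2) per app, rounded to three decimals like the table
scores = {
    app: (round(f_beta(p, r, 1), 3), round(f_beta(p, r, 2), 3))
    for app, (p, r) in reported.items()
}
print(scores)
```

The rounded results agree with the reported f1-score and f2-score entries for both apps, so the same check can be applied to any other table in this section.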


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=Avey` (35 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.3,0.052,0.228,0.157,0.013,0.113,0.183,0.012,0.109,0.357,...,2.127,0.092,0.303,0.686,0.158,0.398,0.691,0.118,0.344,0.709
stats_for_recall,0.829,0.142,0.377,0.686,0.216,0.464,0.8,0.16,0.4,0.686,...,2.685,0.028,0.167,0.8,0.16,0.4,0.914,0.078,0.28,0.895
stats_for_f1-score,0.441,,,0.256,,,0.298,,,0.47,...,2.372,,,0.739,,,0.787,,,0.791
stats_for_f2-score,0.613,,,0.41,,,0.478,,,0.579,...,2.55,,,0.774,,,0.859,,,0.85
stats_for_NDCG,0.768,0.144,0.379,0.566,0.184,0.429,0.688,0.16,0.4,0.607,...,2.576,0.046,0.215,0.789,0.16,0.399,0.883,0.084,0.289,0.859
stats_for_M1,0.686,0.216,0.464,0.429,0.245,0.495,0.571,0.245,0.495,0.514,...,2.4,0.16,0.4,0.771,0.176,0.42,0.829,0.142,0.377,0.8
stats_for_M3,0.829,0.142,0.377,0.6,0.24,0.49,0.714,0.204,0.452,0.686,...,2.685,0.028,0.167,0.8,0.16,0.4,0.914,0.078,0.28,0.895
stats_for_M5,0.829,0.142,0.377,0.686,0.216,0.464,0.8,0.16,0.4,0.686,...,2.685,0.028,0.167,0.8,0.16,0.4,0.914,0.078,0.28,0.895
stats_for_position,1.241,0.321,0.567,1.708,1.123,1.06,1.643,1.444,1.202,1.417,...,3.336,0.222,0.472,1.036,0.034,0.186,1.094,0.085,0.291,1.112
stats_for_length (x of gs),3.486,1.507,1.228,4.613,0.431,0.656,4.667,0.586,0.765,2.258,...,4.457,0.591,0.769,1.314,0.387,0.622,1.6,0.754,0.868,1.486
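
The M1, M3 and M5 rows behave like averages of a per-case 0/1 indicator: the reported variance matches p(1 - p) and the standard deviation is its square root, which points to population rather than sample statistics. A minimal sketch, assuming 24 hits out of 35 cases (a count inferred from Ada's reported M1 of 0.686, not stated in the notebook) and assuming M1 is 1 when the gold diagnosis is ranked first:

```python
from statistics import fmean, pvariance, pstdev

# Hypothetical per-case M1 indicators (assumed meaning: gold diagnosis ranked first).
hits = [1] * 24 + [0] * 11

# Population statistics, rounded to three decimals like the table
summary = (
    round(fmean(hits), 3),
    round(pvariance(hits), 3),
    round(pstdev(hits), 3),
)
print(summary)
```

Under these assumptions the three values line up with Ada's M1 average, variance and std Dev in the table above.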


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=any` (23 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.341,0.062,0.25,0.18,0.011,0.105,0.215,0.01,0.098,0.449,...,2.159,0.098,0.313,0.717,0.127,0.356,0.725,0.115,0.339,0.72
stats_for_recall,0.87,0.113,0.337,0.783,0.17,0.412,0.913,0.079,0.282,0.783,...,2.74,0.042,0.204,0.87,0.113,0.337,0.913,0.079,0.282,0.913
stats_for_f1-score,0.49,,,0.293,,,0.348,,,0.571,...,2.414,,,0.786,,,0.808,,,0.805
stats_for_f2-score,0.664,,,0.469,,,0.554,,,0.682,...,2.599,,,0.834,,,0.868,,,0.866
stats_for_NDCG,0.832,0.119,0.345,0.638,0.158,0.397,0.83,0.1,0.317,0.729,...,2.622,0.059,0.243,0.854,0.115,0.339,0.881,0.085,0.291,0.874
stats_for_M1,0.783,0.17,0.412,0.478,0.25,0.5,0.739,0.193,0.439,0.652,...,2.435,0.17,0.412,0.826,0.144,0.379,0.826,0.144,0.379,0.812
stats_for_M3,0.87,0.113,0.337,0.652,0.227,0.476,0.826,0.144,0.379,0.783,...,2.74,0.042,0.204,0.87,0.113,0.337,0.913,0.079,0.282,0.913
stats_for_M5,0.87,0.113,0.337,0.783,0.17,0.412,0.913,0.079,0.282,0.783,...,2.74,0.042,0.204,0.87,0.113,0.337,0.913,0.079,0.282,0.913
stats_for_position,1.15,0.227,0.477,1.778,1.284,1.133,1.429,1.102,1.05,1.222,...,3.372,0.267,0.516,1.05,0.048,0.218,1.095,0.086,0.294,1.124
stats_for_length (x of gs),3.261,1.584,1.259,4.571,0.531,0.728,4.565,0.767,0.876,2.174,...,4.304,0.507,0.712,1.304,0.212,0.46,1.435,0.42,0.648,1.435


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=Avey` (31 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.312,0.058,0.24,0.177,0.011,0.104,0.194,0.011,0.106,0.392,...,2.063,0.095,0.308,0.661,0.168,0.409,0.668,0.123,0.35,0.688
stats_for_recall,0.806,0.156,0.395,0.774,0.175,0.418,0.839,0.135,0.368,0.742,...,2.645,0.031,0.177,0.774,0.175,0.418,0.903,0.087,0.296,0.882
stats_for_f1-score,0.45,,,0.288,,,0.315,,,0.513,...,2.316,,,0.713,,,0.768,,,0.772
stats_for_f2-score,0.612,,,0.462,,,0.504,,,0.63,...,2.502,,,0.748,,,0.844,,,0.834
stats_for_NDCG,0.778,0.157,0.396,0.639,0.161,0.401,0.725,0.145,0.381,0.654,...,2.546,0.049,0.221,0.774,0.175,0.418,0.868,0.092,0.304,0.849
stats_for_M1,0.742,0.191,0.438,0.484,0.25,0.5,0.613,0.237,0.487,0.548,...,2.386,0.156,0.395,0.774,0.175,0.418,0.806,0.156,0.395,0.795
stats_for_M3,0.806,0.156,0.395,0.677,0.219,0.467,0.742,0.191,0.438,0.742,...,2.645,0.031,0.177,0.774,0.175,0.418,0.903,0.087,0.296,0.882
stats_for_M5,0.806,0.156,0.395,0.774,0.175,0.418,0.839,0.135,0.368,0.742,...,2.645,0.031,0.177,0.774,0.175,0.418,0.903,0.087,0.296,0.882
stats_for_position,1.12,0.186,0.431,1.708,1.123,1.06,1.654,1.534,1.239,1.435,...,3.307,0.227,0.476,1.0,0.0,0.0,1.107,0.096,0.309,1.102
stats_for_length (x of gs),3.323,1.444,1.202,4.613,0.431,0.656,4.645,0.616,0.785,2.296,...,4.549,0.631,0.794,1.323,0.412,0.642,1.645,0.81,0.9,1.516


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=any` (18 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.345,0.077,0.277,0.194,0.01,0.099,0.217,0.012,0.11,0.481,...,2.222,0.103,0.32,0.75,0.146,0.382,0.713,0.123,0.351,0.741
stats_for_recall,0.833,0.139,0.373,0.833,0.139,0.373,0.889,0.099,0.314,0.833,...,2.666,0.052,0.229,0.833,0.139,0.373,0.889,0.099,0.314,0.889
stats_for_f1-score,0.488,,,0.315,,,0.349,,,0.61,...,2.421,,,0.789,,,0.791,,,0.807
stats_for_f2-score,0.649,,,0.502,,,0.549,,,0.727,...,2.562,,,0.815,,,0.847,,,0.854
stats_for_NDCG,0.806,0.143,0.378,0.649,0.134,0.366,0.803,0.119,0.344,0.785,...,2.577,0.069,0.263,0.833,0.139,0.373,0.868,0.101,0.318,0.859
stats_for_M1,0.778,0.173,0.416,0.444,0.247,0.497,0.722,0.201,0.448,0.722,...,2.444,0.173,0.416,0.833,0.139,0.373,0.833,0.139,0.373,0.815
stats_for_M3,0.833,0.139,0.373,0.667,0.222,0.471,0.778,0.173,0.416,0.833,...,2.666,0.052,0.229,0.833,0.139,0.373,0.889,0.099,0.314,0.889
stats_for_M5,0.833,0.139,0.373,0.833,0.139,0.373,0.889,0.099,0.314,0.833,...,2.666,0.052,0.229,0.833,0.139,0.373,0.889,0.099,0.314,0.889
stats_for_position,1.133,0.249,0.499,1.933,1.396,1.181,1.5,1.375,1.173,1.2,...,3.297,0.298,0.546,1.0,0.0,0.0,1.062,0.059,0.242,1.099
stats_for_length (x of gs),3.222,1.728,1.315,4.556,0.58,0.762,4.5,0.917,0.957,2.167,...,4.0,0.469,0.685,1.167,0.139,0.373,1.389,0.349,0.591,1.333


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=Avey v2` (35 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.296,0.054,0.232,0.157,0.013,0.113,0.198,0.01,0.102,0.348,...,2.02,0.096,0.309,0.629,0.176,0.42,0.677,0.116,0.341,0.673
stats_for_recall,0.8,0.16,0.4,0.686,0.216,0.464,0.857,0.122,0.35,0.657,...,2.628,0.028,0.167,0.743,0.191,0.437,0.914,0.078,0.28,0.876
stats_for_f1-score,0.432,,,0.256,,,0.322,,,0.455,...,2.282,,,0.681,,,0.778,,,0.761
stats_for_f2-score,0.597,,,0.41,,,0.515,,,0.558,...,2.477,,,0.717,,,0.854,,,0.826
stats_for_NDCG,0.754,0.158,0.398,0.566,0.184,0.429,0.735,0.133,0.365,0.579,...,2.498,0.049,0.222,0.732,0.189,0.435,0.883,0.084,0.289,0.833
stats_for_M1,0.686,0.216,0.464,0.429,0.245,0.495,0.6,0.24,0.49,0.486,...,2.286,0.191,0.437,0.714,0.204,0.452,0.829,0.142,0.377,0.762
stats_for_M3,0.8,0.16,0.4,0.6,0.24,0.49,0.771,0.176,0.42,0.657,...,2.628,0.028,0.167,0.743,0.191,0.437,0.914,0.078,0.28,0.876
stats_for_M5,0.8,0.16,0.4,0.686,0.216,0.464,0.857,0.122,0.35,0.657,...,2.628,0.028,0.167,0.743,0.191,0.437,0.914,0.078,0.28,0.876
stats_for_position,1.179,0.218,0.467,1.708,1.123,1.06,1.633,1.366,1.169,1.435,...,3.397,0.253,0.503,1.038,0.037,0.192,1.094,0.085,0.291,1.132
stats_for_length (x of gs),3.429,1.388,1.178,4.613,0.431,0.656,4.629,0.633,0.796,2.29,...,4.6,0.74,0.86,1.314,0.387,0.622,1.629,0.748,0.865,1.533


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=Avey v2` (37 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=Avey v2` (35 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.296,0.054,0.232,0.157,0.013,0.113,0.198,0.01,0.102,0.348,...,2.02,0.096,0.309,0.629,0.176,0.42,0.677,0.116,0.341,0.673
stats_for_recall,0.8,0.16,0.4,0.686,0.216,0.464,0.857,0.122,0.35,0.657,...,2.628,0.028,0.167,0.743,0.191,0.437,0.914,0.078,0.28,0.876
stats_for_f1-score,0.432,,,0.256,,,0.322,,,0.455,...,2.282,,,0.681,,,0.778,,,0.761
stats_for_f2-score,0.597,,,0.41,,,0.515,,,0.558,...,2.477,,,0.717,,,0.854,,,0.826
stats_for_NDCG,0.754,0.158,0.398,0.566,0.184,0.429,0.735,0.133,0.365,0.579,...,2.498,0.049,0.222,0.732,0.189,0.435,0.883,0.084,0.289,0.833
stats_for_M1,0.686,0.216,0.464,0.429,0.245,0.495,0.6,0.24,0.49,0.486,...,2.286,0.191,0.437,0.714,0.204,0.452,0.829,0.142,0.377,0.762
stats_for_M3,0.8,0.16,0.4,0.6,0.24,0.49,0.771,0.176,0.42,0.657,...,2.628,0.028,0.167,0.743,0.191,0.437,0.914,0.078,0.28,0.876
stats_for_M5,0.8,0.16,0.4,0.686,0.216,0.464,0.857,0.122,0.35,0.657,...,2.628,0.028,0.167,0.743,0.191,0.437,0.914,0.078,0.28,0.876
stats_for_position,1.179,0.218,0.467,1.708,1.123,1.06,1.633,1.366,1.169,1.435,...,3.397,0.253,0.503,1.038,0.037,0.192,1.094,0.085,0.291,1.132
stats_for_length (x of gs),3.429,1.388,1.178,4.613,0.431,0.656,4.629,0.633,0.796,2.29,...,4.6,0.74,0.86,1.314,0.387,0.622,1.629,0.748,0.865,1.533


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=Buoy` (37 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=Buoy` (33 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.295,0.054,0.232,0.159,0.012,0.112,0.203,0.01,0.099,0.379,...,2.14,0.093,0.304,0.682,0.164,0.405,0.738,0.107,0.327,0.713
stats_for_recall,0.818,0.149,0.386,0.697,0.211,0.46,0.879,0.107,0.326,0.727,...,2.697,0.029,0.171,0.788,0.167,0.409,0.939,0.057,0.239,0.899
stats_for_f1-score,0.434,,,0.259,,,0.33,,,0.498,...,2.384,,,0.731,,,0.826,,,0.795
stats_for_f2-score,0.604,,,0.416,,,0.528,,,0.614,...,2.561,,,0.764,,,0.89,,,0.854
stats_for_NDCG,0.754,0.149,0.386,0.581,0.185,0.43,0.761,0.124,0.352,0.644,...,2.57,0.05,0.224,0.777,0.166,0.408,0.906,0.064,0.253,0.857
stats_for_M1,0.667,0.222,0.471,0.455,0.248,0.498,0.636,0.231,0.481,0.545,...,2.364,0.184,0.429,0.758,0.184,0.429,0.848,0.129,0.359,0.788
stats_for_M3,0.818,0.149,0.386,0.606,0.239,0.489,0.788,0.167,0.409,0.727,...,2.697,0.029,0.171,0.788,0.167,0.409,0.939,0.057,0.239,0.899
stats_for_M5,0.818,0.149,0.386,0.697,0.211,0.46,0.879,0.107,0.326,0.727,...,2.697,0.029,0.171,0.788,0.167,0.409,0.939,0.057,0.239,0.899
stats_for_position,1.259,0.34,0.583,1.696,1.168,1.081,1.621,1.408,1.187,1.417,...,3.385,0.25,0.5,1.038,0.037,0.192,1.097,0.087,0.296,1.128
stats_for_length (x of gs),3.485,1.401,1.184,4.593,0.464,0.681,4.613,0.689,0.83,2.273,...,4.333,0.602,0.776,1.212,0.167,0.409,1.515,0.735,0.857,1.444


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=Buoy` (33 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.295,0.054,0.232,0.159,0.012,0.112,0.203,0.01,0.099,0.379,...,2.14,0.093,0.304,0.682,0.164,0.405,0.738,0.107,0.327,0.713
stats_for_recall,0.818,0.149,0.386,0.697,0.211,0.46,0.879,0.107,0.326,0.727,...,2.697,0.029,0.171,0.788,0.167,0.409,0.939,0.057,0.239,0.899
stats_for_f1-score,0.434,,,0.259,,,0.33,,,0.498,...,2.384,,,0.731,,,0.826,,,0.795
stats_for_f2-score,0.604,,,0.416,,,0.528,,,0.614,...,2.561,,,0.764,,,0.89,,,0.854
stats_for_NDCG,0.754,0.149,0.386,0.581,0.185,0.43,0.761,0.124,0.352,0.644,...,2.57,0.05,0.224,0.777,0.166,0.408,0.906,0.064,0.253,0.857
stats_for_M1,0.667,0.222,0.471,0.455,0.248,0.498,0.636,0.231,0.481,0.545,...,2.364,0.184,0.429,0.758,0.184,0.429,0.848,0.129,0.359,0.788
stats_for_M3,0.818,0.149,0.386,0.606,0.239,0.489,0.788,0.167,0.409,0.727,...,2.697,0.029,0.171,0.788,0.167,0.409,0.939,0.057,0.239,0.899
stats_for_M5,0.818,0.149,0.386,0.697,0.211,0.46,0.879,0.107,0.326,0.727,...,2.697,0.029,0.171,0.788,0.167,0.409,0.939,0.057,0.239,0.899
stats_for_position,1.259,0.34,0.583,1.696,1.168,1.081,1.621,1.408,1.187,1.417,...,3.385,0.25,0.5,1.038,0.037,0.192,1.097,0.087,0.296,1.128
stats_for_length (x of gs),3.485,1.401,1.184,4.593,0.464,0.681,4.613,0.689,0.83,2.273,...,4.333,0.602,0.776,1.212,0.167,0.409,1.515,0.735,0.857,1.444


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=Healthily` (37 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=Healthily` (27 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.321,0.058,0.24,0.161,0.013,0.112,0.218,0.009,0.093,0.401,...,2.096,0.097,0.312,0.685,0.151,0.388,0.717,0.116,0.34,0.699
stats_for_recall,0.852,0.126,0.355,0.704,0.209,0.457,0.926,0.069,0.262,0.704,...,2.704,0.036,0.189,0.815,0.151,0.388,0.926,0.069,0.262,0.901
stats_for_f1-score,0.466,,,0.262,,,0.353,,,0.511,...,2.359,,,0.744,,,0.808,,,0.786
stats_for_f2-score,0.64,,,0.42,,,0.561,,,0.612,...,2.554,,,0.785,,,0.875,,,0.851
stats_for_NDCG,0.806,0.13,0.36,0.581,0.181,0.426,0.823,0.093,0.305,0.658,...,2.576,0.056,0.237,0.801,0.151,0.388,0.899,0.074,0.272,0.859
stats_for_M1,0.741,0.192,0.438,0.444,0.247,0.497,0.704,0.209,0.457,0.593,...,2.371,0.192,0.438,0.778,0.173,0.416,0.852,0.126,0.355,0.79
stats_for_M3,0.852,0.126,0.355,0.593,0.241,0.491,0.852,0.126,0.355,0.704,...,2.704,0.036,0.189,0.815,0.151,0.388,0.926,0.069,0.262,0.901
stats_for_M5,0.852,0.126,0.355,0.704,0.209,0.457,0.926,0.069,0.262,0.704,...,2.704,0.036,0.189,0.815,0.151,0.388,0.926,0.069,0.262,0.901
stats_for_position,1.174,0.231,0.48,1.737,1.247,1.116,1.48,1.05,1.024,1.211,...,3.394,0.274,0.523,1.045,0.043,0.208,1.08,0.074,0.271,1.131
stats_for_length (x of gs),3.333,1.407,1.186,4.609,0.499,0.706,4.556,0.765,0.875,2.185,...,4.482,0.667,0.816,1.259,0.192,0.438,1.556,0.84,0.916,1.494


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=Healthily` (27 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.321,0.058,0.24,0.161,0.013,0.112,0.218,0.009,0.093,0.401,...,2.096,0.097,0.312,0.685,0.151,0.388,0.717,0.116,0.34,0.699
stats_for_recall,0.852,0.126,0.355,0.704,0.209,0.457,0.926,0.069,0.262,0.704,...,2.704,0.036,0.189,0.815,0.151,0.388,0.926,0.069,0.262,0.901
stats_for_f1-score,0.466,,,0.262,,,0.353,,,0.511,...,2.359,,,0.744,,,0.808,,,0.786
stats_for_f2-score,0.64,,,0.42,,,0.561,,,0.612,...,2.554,,,0.785,,,0.875,,,0.851
stats_for_NDCG,0.806,0.13,0.36,0.581,0.181,0.426,0.823,0.093,0.305,0.658,...,2.576,0.056,0.237,0.801,0.151,0.388,0.899,0.074,0.272,0.859
stats_for_M1,0.741,0.192,0.438,0.444,0.247,0.497,0.704,0.209,0.457,0.593,...,2.371,0.192,0.438,0.778,0.173,0.416,0.852,0.126,0.355,0.79
stats_for_M3,0.852,0.126,0.355,0.593,0.241,0.491,0.852,0.126,0.355,0.704,...,2.704,0.036,0.189,0.815,0.151,0.388,0.926,0.069,0.262,0.901
stats_for_M5,0.852,0.126,0.355,0.704,0.209,0.457,0.926,0.069,0.262,0.704,...,2.704,0.036,0.189,0.815,0.151,0.388,0.926,0.069,0.262,0.901
stats_for_position,1.174,0.231,0.48,1.737,1.247,1.116,1.48,1.05,1.024,1.211,...,3.394,0.274,0.523,1.045,0.043,0.208,1.08,0.074,0.271,1.131
stats_for_length (x of gs),3.333,1.407,1.186,4.609,0.499,0.706,4.556,0.765,0.875,2.185,...,4.482,0.667,0.816,1.259,0.192,0.438,1.556,0.84,0.916,1.494


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=K Health` (37 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=K Health` (25 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.323,0.062,0.249,0.166,0.012,0.112,0.219,0.009,0.097,0.413,...,2.077,0.1,0.316,0.66,0.154,0.393,0.727,0.111,0.333,0.692
stats_for_recall,0.84,0.134,0.367,0.72,0.202,0.449,0.92,0.074,0.271,0.72,...,2.68,0.038,0.196,0.8,0.16,0.4,0.92,0.074,0.271,0.893
stats_for_f1-score,0.467,,,0.27,,,0.354,,,0.525,...,2.338,,,0.723,,,0.812,,,0.779
stats_for_f2-score,0.636,,,0.432,,,0.561,,,0.627,...,2.531,,,0.767,,,0.874,,,0.844
stats_for_NDCG,0.79,0.137,0.37,0.587,0.175,0.418,0.828,0.095,0.308,0.67,...,2.541,0.059,0.244,0.785,0.159,0.399,0.89,0.079,0.281,0.847
stats_for_M1,0.72,0.202,0.449,0.44,0.246,0.496,0.72,0.202,0.449,0.6,...,2.32,0.202,0.449,0.76,0.182,0.427,0.84,0.134,0.367,0.773
stats_for_M3,0.84,0.134,0.367,0.6,0.24,0.49,0.84,0.134,0.367,0.72,...,2.68,0.038,0.196,0.8,0.16,0.4,0.92,0.074,0.271,0.893
stats_for_M5,0.84,0.134,0.367,0.72,0.202,0.449,0.92,0.074,0.271,0.72,...,2.68,0.038,0.196,0.8,0.16,0.4,0.92,0.074,0.271,0.893
stats_for_position,1.19,0.249,0.499,1.778,1.284,1.133,1.435,1.028,1.014,1.222,...,3.429,0.29,0.538,1.05,0.048,0.218,1.087,0.079,0.282,1.143
stats_for_length (x of gs),3.32,1.498,1.224,4.571,0.531,0.728,4.52,0.81,0.9,2.2,...,4.4,0.698,0.835,1.28,0.202,0.449,1.44,0.406,0.637,1.467


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=K Health` (25 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.323,0.062,0.249,0.166,0.012,0.112,0.219,0.009,0.097,0.413,...,2.077,0.1,0.316,0.66,0.154,0.393,0.727,0.111,0.333,0.692
stats_for_recall,0.84,0.134,0.367,0.72,0.202,0.449,0.92,0.074,0.271,0.72,...,2.68,0.038,0.196,0.8,0.16,0.4,0.92,0.074,0.271,0.893
stats_for_f1-score,0.467,,,0.27,,,0.354,,,0.525,...,2.338,,,0.723,,,0.812,,,0.779
stats_for_f2-score,0.636,,,0.432,,,0.561,,,0.627,...,2.531,,,0.767,,,0.874,,,0.844
stats_for_NDCG,0.79,0.137,0.37,0.587,0.175,0.418,0.828,0.095,0.308,0.67,...,2.541,0.059,0.244,0.785,0.159,0.399,0.89,0.079,0.281,0.847
stats_for_M1,0.72,0.202,0.449,0.44,0.246,0.496,0.72,0.202,0.449,0.6,...,2.32,0.202,0.449,0.76,0.182,0.427,0.84,0.134,0.367,0.773
stats_for_M3,0.84,0.134,0.367,0.6,0.24,0.49,0.84,0.134,0.367,0.72,...,2.68,0.038,0.196,0.8,0.16,0.4,0.92,0.074,0.271,0.893
stats_for_M5,0.84,0.134,0.367,0.72,0.202,0.449,0.92,0.074,0.271,0.72,...,2.68,0.038,0.196,0.8,0.16,0.4,0.92,0.074,0.271,0.893
stats_for_position,1.19,0.249,0.499,1.778,1.284,1.133,1.435,1.028,1.014,1.222,...,3.429,0.29,0.538,1.05,0.048,0.218,1.087,0.079,0.282,1.143
stats_for_length (x of gs),3.32,1.498,1.224,4.571,0.531,0.728,4.52,0.81,0.9,2.2,...,4.4,0.698,0.835,1.28,0.202,0.449,1.44,0.406,0.637,1.467


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=Mediktor` (34 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.282,0.055,0.234,0.143,0.014,0.118,0.185,0.013,0.113,0.319,...,2.104,0.095,0.308,0.662,0.187,0.433,0.687,0.116,0.341,0.701
stats_for_recall,0.794,0.163,0.404,0.618,0.236,0.486,0.794,0.163,0.404,0.618,...,2.618,0.029,0.169,0.735,0.195,0.441,0.912,0.08,0.284,0.873
stats_for_f1-score,0.416,,,0.232,,,0.3,,,0.421,...,2.33,,,0.697,,,0.784,,,0.777
stats_for_f2-score,0.582,,,0.371,,,0.479,,,0.52,...,2.493,,,0.719,,,0.856,,,0.831
stats_for_NDCG,0.721,0.159,0.399,0.495,0.186,0.432,0.668,0.158,0.398,0.548,...,2.494,0.05,0.224,0.724,0.193,0.439,0.89,0.084,0.29,0.831
stats_for_M1,0.618,0.236,0.486,0.353,0.228,0.478,0.529,0.249,0.499,0.471,...,2.294,0.195,0.441,0.706,0.208,0.456,0.853,0.125,0.354,0.765
stats_for_M3,0.794,0.163,0.404,0.529,0.249,0.499,0.706,0.208,0.456,0.618,...,2.618,0.029,0.169,0.735,0.195,0.441,0.912,0.08,0.284,0.873
stats_for_M5,0.794,0.163,0.404,0.618,0.236,0.486,0.794,0.163,0.404,0.618,...,2.618,0.029,0.169,0.735,0.195,0.441,0.912,0.08,0.284,0.873
stats_for_position,1.296,0.357,0.597,1.81,1.202,1.096,1.704,1.468,1.212,1.429,...,3.378,0.259,0.509,1.04,0.038,0.196,1.065,0.06,0.246,1.126
stats_for_length (x of gs),3.588,1.478,1.216,4.607,0.453,0.673,4.625,0.672,0.82,2.267,...,4.382,0.717,0.847,1.235,0.356,0.597,1.588,0.713,0.844,1.461


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=Mediktor` (37 cases):


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505
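
The f1-score and f2-score rows carry no variance or std Dev entries. The printed values are consistent with applying the standard F-beta formula to the averaged precision and recall of the same column, rather than averaging per-case scores. The sketch below checks this for the Ada column of the table above. It assumes that this is how the rows were derived, and the helper name `f_beta` is ours, not taken from the notebook.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score computed from one precision/recall pair."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Ada column of the table above: average precision 0.291, average recall 0.811.
ada_f1 = round(f_beta(0.291, 0.811, beta=1), 3)
ada_f2 = round(f_beta(0.291, 0.811, beta=2), 3)
print(ada_f1, ada_f2)  # compare with the reported 0.428 and 0.597
```

Because the inputs are already rounded to three decimals, a recomputed value may differ from a reported one in the last digit for other columns.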


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=Mediktor`, which has 34 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.282,0.055,0.234,0.143,0.014,0.118,0.185,0.013,0.113,0.319,...,2.104,0.095,0.308,0.662,0.187,0.433,0.687,0.116,0.341,0.701
stats_for_recall,0.794,0.163,0.404,0.618,0.236,0.486,0.794,0.163,0.404,0.618,...,2.618,0.029,0.169,0.735,0.195,0.441,0.912,0.08,0.284,0.873
stats_for_f1-score,0.416,,,0.232,,,0.3,,,0.421,...,2.33,,,0.697,,,0.784,,,0.777
stats_for_f2-score,0.582,,,0.371,,,0.479,,,0.52,...,2.493,,,0.719,,,0.856,,,0.831
stats_for_NDCG,0.721,0.159,0.399,0.495,0.186,0.432,0.668,0.158,0.398,0.548,...,2.494,0.05,0.224,0.724,0.193,0.439,0.89,0.084,0.29,0.831
stats_for_M1,0.618,0.236,0.486,0.353,0.228,0.478,0.529,0.249,0.499,0.471,...,2.294,0.195,0.441,0.706,0.208,0.456,0.853,0.125,0.354,0.765
stats_for_M3,0.794,0.163,0.404,0.529,0.249,0.499,0.706,0.208,0.456,0.618,...,2.618,0.029,0.169,0.735,0.195,0.441,0.912,0.08,0.284,0.873
stats_for_M5,0.794,0.163,0.404,0.618,0.236,0.486,0.794,0.163,0.404,0.618,...,2.618,0.029,0.169,0.735,0.195,0.441,0.912,0.08,0.284,0.873
stats_for_position,1.296,0.357,0.597,1.81,1.202,1.096,1.704,1.468,1.212,1.429,...,3.378,0.259,0.509,1.04,0.038,0.196,1.065,0.06,0.246,1.126
stats_for_length (x of gs),3.588,1.478,1.216,4.607,0.453,0.673,4.625,0.672,0.82,2.267,...,4.382,0.717,0.847,1.235,0.356,0.597,1.588,0.713,0.844,1.461


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=Symptomate`, which has 36 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.294,0.053,0.23,0.153,0.013,0.114,0.193,0.011,0.106,0.347,...,2.047,0.095,0.309,0.639,0.175,0.419,0.686,0.116,0.341,0.682
stats_for_recall,0.806,0.157,0.396,0.667,0.222,0.471,0.833,0.139,0.373,0.667,...,2.639,0.027,0.164,0.75,0.188,0.433,0.917,0.076,0.276,0.88
stats_for_f1-score,0.431,,,0.249,,,0.313,,,0.457,...,2.304,,,0.69,,,0.785,,,0.768
stats_for_f2-score,0.598,,,0.399,,,0.501,,,0.563,...,2.493,,,0.725,,,0.859,,,0.831
stats_for_NDCG,0.751,0.154,0.393,0.55,0.187,0.433,0.715,0.144,0.38,0.591,...,2.513,0.048,0.22,0.74,0.186,0.431,0.886,0.082,0.286,0.838
stats_for_M1,0.667,0.222,0.471,0.417,0.243,0.493,0.583,0.243,0.493,0.5,...,2.305,0.188,0.433,0.722,0.201,0.448,0.833,0.139,0.373,0.768
stats_for_M3,0.806,0.157,0.396,0.583,0.243,0.493,0.75,0.188,0.433,0.667,...,2.639,0.027,0.164,0.75,0.188,0.433,0.917,0.076,0.276,0.88
stats_for_M5,0.806,0.157,0.396,0.667,0.222,0.471,0.833,0.139,0.373,0.667,...,2.639,0.027,0.164,0.75,0.188,0.433,0.917,0.076,0.276,0.88
stats_for_position,1.207,0.233,0.483,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.385,0.248,0.498,1.037,0.036,0.189,1.091,0.083,0.287,1.128
stats_for_length (x of gs),3.472,1.416,1.19,4.613,0.431,0.656,4.629,0.633,0.796,2.312,...,4.556,0.731,0.855,1.306,0.379,0.616,1.611,0.738,0.859,1.519
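
In the M1, M3 and M5 rows, the variance and std Dev columns behave as expected for a per-case hit/miss indicator: the variance matches p(1 - p) for hit rate p (the population variance), and std Dev is its square root. The sketch below reproduces the Ada M1 entries of the table above. The split of 24 hits in 36 cases is inferred from the reported average of 0.667, not read from the raw data, and treating M1 as a 0/1 value per case is an assumption.

```python
import statistics

# Hypothetical per-case outcomes: 24 hits out of 36 cases, inferred from
# the reported M1 average of 0.667 for Ada (not taken from the raw data).
hits = [1] * 24 + [0] * 12

mean = statistics.fmean(hits)
variance = statistics.pvariance(hits)  # population variance, equals p * (1 - p)
std_dev = statistics.pstdev(hits)      # square root of the population variance

print(round(mean, 3), round(variance, 3), round(std_dev, 3))
```

The sample variance (dividing by n - 1) would give about 0.229 here, so the reported 0.222 points to the population form.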


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=Symptomate`, which has 35 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.302,0.052,0.228,0.157,0.013,0.113,0.198,0.01,0.102,0.357,...,2.132,0.091,0.302,0.671,0.171,0.413,0.725,0.104,0.322,0.711
stats_for_recall,0.829,0.142,0.377,0.686,0.216,0.464,0.857,0.122,0.35,0.686,...,2.685,0.028,0.167,0.771,0.176,0.42,0.943,0.054,0.232,0.895
stats_for_f1-score,0.443,,,0.256,,,0.322,,,0.47,...,2.375,,,0.718,,,0.82,,,0.792
stats_for_f2-score,0.615,,,0.41,,,0.515,,,0.579,...,2.552,,,0.749,,,0.89,,,0.851
stats_for_NDCG,0.758,0.143,0.378,0.566,0.184,0.429,0.735,0.133,0.365,0.607,...,2.566,0.048,0.219,0.761,0.175,0.419,0.911,0.061,0.247,0.855
stats_for_M1,0.657,0.225,0.475,0.429,0.245,0.495,0.6,0.24,0.49,0.514,...,2.371,0.176,0.42,0.743,0.191,0.437,0.857,0.122,0.35,0.79
stats_for_M3,0.829,0.142,0.377,0.6,0.24,0.49,0.771,0.176,0.42,0.686,...,2.685,0.028,0.167,0.771,0.176,0.42,0.943,0.054,0.232,0.895
stats_for_M5,0.829,0.142,0.377,0.686,0.216,0.464,0.857,0.122,0.35,0.686,...,2.685,0.028,0.167,0.771,0.176,0.42,0.943,0.054,0.232,0.895
stats_for_position,1.276,0.338,0.581,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.363,0.239,0.489,1.037,0.036,0.189,1.091,0.083,0.287,1.121
stats_for_length (x of gs),3.429,1.388,1.178,4.586,0.449,0.67,4.606,0.663,0.814,2.273,...,4.314,0.588,0.767,1.2,0.16,0.4,1.543,0.705,0.84,1.438


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=Symptomate`, which has 34 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.305,0.053,0.23,0.162,0.012,0.111,0.204,0.01,0.098,0.368,...,2.107,0.092,0.303,0.662,0.172,0.415,0.717,0.104,0.323,0.702
stats_for_recall,0.824,0.145,0.381,0.706,0.208,0.456,0.882,0.104,0.322,0.706,...,2.677,0.029,0.169,0.765,0.18,0.424,0.941,0.055,0.235,0.892
stats_for_f1-score,0.445,,,0.264,,,0.331,,,0.484,...,2.356,,,0.71,,,0.814,,,0.785
stats_for_f2-score,0.615,,,0.422,,,0.53,,,0.596,...,2.538,,,0.742,,,0.886,,,0.846
stats_for_NDCG,0.765,0.145,0.381,0.583,0.179,0.424,0.757,0.121,0.348,0.625,...,2.554,0.049,0.221,0.754,0.179,0.423,0.909,0.062,0.25,0.851
stats_for_M1,0.676,0.219,0.468,0.441,0.247,0.497,0.618,0.236,0.486,0.529,...,2.353,0.18,0.424,0.735,0.195,0.441,0.853,0.125,0.354,0.784
stats_for_M3,0.824,0.145,0.381,0.618,0.236,0.486,0.794,0.163,0.404,0.706,...,2.677,0.029,0.169,0.765,0.18,0.424,0.941,0.055,0.235,0.892
stats_for_M5,0.824,0.145,0.381,0.706,0.208,0.456,0.882,0.104,0.322,0.706,...,2.677,0.029,0.169,0.765,0.18,0.424,0.941,0.055,0.235,0.892
stats_for_position,1.214,0.24,0.49,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.244,0.494,1.038,0.037,0.192,1.094,0.085,0.291,1.125
stats_for_length (x of gs),3.382,1.354,1.164,4.586,0.449,0.67,4.606,0.663,0.814,2.312,...,4.353,0.595,0.771,1.206,0.163,0.404,1.559,0.717,0.847,1.451


Results for experiment `disease_type=common-diseases failure-type-ignored=no-disease-found for app=WebMD`, which has 37 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.149,0.013,0.115,0.187,0.012,0.109,0.338,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.242,,,0.304,,,0.445,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.388,,,0.486,,,0.548,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.536,0.19,0.436,0.695,0.154,0.392,0.575,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.405,0.241,0.491,0.568,0.245,0.495,0.486,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.568,0.245,0.495,0.73,0.197,0.444,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.649,0.228,0.477,0.811,0.153,0.392,0.649,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.708,1.123,1.06,1.633,1.366,1.169,1.417,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.613,0.431,0.656,4.629,0.633,0.796,2.273,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=common-diseases failure-type-ignored=session-failed for app=WebMD`, which has 36 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.285,0.052,0.228,0.146,0.013,0.115,0.186,0.012,0.11,0.347,...,2.089,0.095,0.309,0.667,0.167,0.408,0.7,0.118,0.343,0.696
stats_for_recall,0.806,0.157,0.396,0.639,0.231,0.48,0.806,0.157,0.396,0.667,...,2.667,0.027,0.164,0.778,0.173,0.416,0.917,0.076,0.276,0.889
stats_for_f1-score,0.421,,,0.238,,,0.302,,,0.457,...,2.341,,,0.718,,,0.794,,,0.78
stats_for_f2-score,0.59,,,0.381,,,0.484,,,0.563,...,2.525,,,0.753,,,0.863,,,0.842
stats_for_NDCG,0.747,0.156,0.395,0.533,0.195,0.442,0.697,0.158,0.397,0.591,...,2.541,0.048,0.22,0.768,0.172,0.415,0.886,0.082,0.286,0.847
stats_for_M1,0.667,0.222,0.471,0.417,0.243,0.493,0.583,0.243,0.493,0.5,...,2.333,0.188,0.433,0.75,0.188,0.433,0.833,0.139,0.373,0.778
stats_for_M3,0.806,0.157,0.396,0.556,0.247,0.497,0.722,0.201,0.448,0.667,...,2.667,0.027,0.164,0.778,0.173,0.416,0.917,0.076,0.276,0.889
stats_for_M5,0.806,0.157,0.396,0.639,0.231,0.48,0.806,0.157,0.396,0.667,...,2.667,0.027,0.164,0.778,0.173,0.416,0.917,0.076,0.276,0.889
stats_for_position,1.241,0.321,0.567,1.696,1.168,1.081,1.621,1.408,1.187,1.417,...,3.384,0.248,0.498,1.036,0.034,0.186,1.091,0.083,0.287,1.128
stats_for_length (x of gs),3.556,1.414,1.189,4.633,0.432,0.657,4.647,0.64,0.8,2.273,...,4.528,0.731,0.855,1.306,0.379,0.616,1.583,0.743,0.862,1.509


Results for experiment `disease_type=common-diseases failure-type-ignored=any-error for app=WebMD`, which has 36 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.285,0.052,0.228,0.146,0.013,0.115,0.186,0.012,0.11,0.347,...,2.089,0.095,0.309,0.667,0.167,0.408,0.7,0.118,0.343,0.696
stats_for_recall,0.806,0.157,0.396,0.639,0.231,0.48,0.806,0.157,0.396,0.667,...,2.667,0.027,0.164,0.778,0.173,0.416,0.917,0.076,0.276,0.889
stats_for_f1-score,0.421,,,0.238,,,0.302,,,0.457,...,2.341,,,0.718,,,0.794,,,0.78
stats_for_f2-score,0.59,,,0.381,,,0.484,,,0.563,...,2.525,,,0.753,,,0.863,,,0.842
stats_for_NDCG,0.747,0.156,0.395,0.533,0.195,0.442,0.697,0.158,0.397,0.591,...,2.541,0.048,0.22,0.768,0.172,0.415,0.886,0.082,0.286,0.847
stats_for_M1,0.667,0.222,0.471,0.417,0.243,0.493,0.583,0.243,0.493,0.5,...,2.333,0.188,0.433,0.75,0.188,0.433,0.833,0.139,0.373,0.778
stats_for_M3,0.806,0.157,0.396,0.556,0.247,0.497,0.722,0.201,0.448,0.667,...,2.667,0.027,0.164,0.778,0.173,0.416,0.917,0.076,0.276,0.889
stats_for_M5,0.806,0.157,0.396,0.639,0.231,0.48,0.806,0.157,0.396,0.667,...,2.667,0.027,0.164,0.778,0.173,0.416,0.917,0.076,0.276,0.889
stats_for_position,1.241,0.321,0.567,1.696,1.168,1.081,1.621,1.408,1.187,1.417,...,3.384,0.248,0.498,1.036,0.034,0.186,1.091,0.083,0.287,1.128
stats_for_length (x of gs),3.556,1.414,1.189,4.633,0.432,0.657,4.647,0.64,0.8,2.273,...,4.528,0.731,0.855,1.306,0.379,0.616,1.583,0.743,0.862,1.509


Results for experiment `disease_type=less-common-diseases failure-type-ignored=None for app=NA`, which has 7 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=Avey`, which has 7 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=any`, which has 5 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.28,0.004,0.065,0.16,0.006,0.08,0.16,0.006,0.08,0.167,...,1.767,0.084,0.291,0.6,0.14,0.374,0.4,0.107,0.327,0.589
stats_for_recall,1.0,0.0,0.0,0.8,0.16,0.4,0.8,0.16,0.4,0.4,...,2.6,0.0,0.0,0.8,0.16,0.4,0.8,0.16,0.4,0.867
stats_for_f1-score,0.438,,,0.267,,,0.267,,,0.236,...,2.087,,,0.686,,,0.533,,,0.696
stats_for_f2-score,0.66,,,0.444,,,0.444,,,0.313,...,2.36,,,0.75,,,0.667,,,0.787
stats_for_NDCG,0.772,0.078,0.279,0.7,0.16,0.4,0.726,0.152,0.39,0.326,...,2.452,0.022,0.148,0.8,0.16,0.4,0.726,0.152,0.39,0.817
stats_for_M1,0.6,0.24,0.49,0.6,0.24,0.49,0.6,0.24,0.49,0.2,...,2.2,0.16,0.4,0.8,0.16,0.4,0.6,0.24,0.49,0.733
stats_for_M3,0.6,0.24,0.49,0.8,0.16,0.4,0.8,0.16,0.4,0.4,...,2.6,0.0,0.0,0.8,0.16,0.4,0.8,0.16,0.4,0.867
stats_for_M5,1.0,0.0,0.0,0.8,0.16,0.4,0.8,0.16,0.4,0.4,...,2.6,0.0,0.0,0.8,0.16,0.4,0.8,0.16,0.4,0.867
stats_for_position,2.2,2.16,1.47,1.5,0.75,0.866,1.25,0.188,0.433,1.5,...,3.45,0.16,0.4,1.0,0.0,0.0,1.25,0.188,0.433,1.15
stats_for_length (x of gs),3.8,0.96,0.98,5.0,0.0,0.0,4.6,0.64,0.8,2.4,...,5.6,0.64,0.8,1.4,0.24,0.49,2.6,0.64,0.8,1.867


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=Avey`, which has 7 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=any`, which has 4 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.217,0.019,0.136,0.3,0.17,0.412,0.3,0.17,0.412,0.208,...,1.729,0.199,0.446,0.625,0.047,0.217,0.542,0.075,0.273,0.576
stats_for_recall,0.75,0.188,0.433,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_f1-score,0.337,,,0.375,,,0.375,,,0.294,...,2.115,,,0.769,,,0.703,,,0.705
stats_for_f2-score,0.503,,,0.441,,,0.441,,,0.39,...,2.451,,,0.893,,,0.855,,,0.817
stats_for_NDCG,0.608,0.177,0.421,0.5,0.25,0.5,0.5,0.25,0.5,0.408,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M1,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.25,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M3,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M5,0.75,0.188,0.433,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_position,2.0,2.0,1.414,1.0,0.0,0.0,1.0,0.0,0.0,1.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_length (x of gs),4.0,1.0,1.0,4.0,3.0,1.732,3.0,2.667,1.633,2.667,...,6.0,1.5,1.225,1.75,0.188,0.433,2.25,0.688,0.829,2.0


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=Avey`, which has 7 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=any`, which has 2 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.267,0.004,0.067,0.1,0.01,0.1,0.1,0.01,0.1,0.417,...,1.833,0.0,0.0,0.5,0.0,0.0,0.333,0.0,0.0,0.611
stats_for_recall,1.0,0.0,0.0,0.5,0.25,0.5,0.5,0.25,0.5,1.0,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_f1-score,0.421,,,0.167,,,0.167,,,0.589,...,2.167,,,0.667,,,0.5,,,0.722
stats_for_f2-score,0.646,,,0.278,,,0.278,,,0.781,...,2.547,,,0.833,,,0.714,,,0.849
stats_for_NDCG,0.715,0.081,0.285,0.5,0.25,0.5,0.5,0.25,0.5,0.815,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_M1,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_M3,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,1.0,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_M5,1.0,0.0,0.0,0.5,0.25,0.5,0.5,0.25,0.5,1.0,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_position,2.5,2.25,1.5,1.0,0.0,0.0,1.0,0.0,0.0,1.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_length (x of gs),4.0,1.0,1.0,5.0,0.0,0.0,4.0,1.0,1.0,2.5,...,6.0,0.0,0.0,2.0,0.0,0.0,3.0,0.0,0.0,2.0


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=Avey v2`, which has 6 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.289,0.004,0.063,0.3,0.103,0.321,0.3,0.103,0.321,0.139,...,1.806,0.152,0.39,0.667,0.139,0.373,0.5,0.139,0.373,0.602
stats_for_recall,1.0,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.333,...,2.499,0.139,0.373,0.833,0.139,0.373,0.833,0.139,0.373,0.833
stats_for_f1-score,0.448,,,0.441,,,0.441,,,0.196,...,2.089,,,0.741,,,0.625,,,0.696
stats_for_f2-score,0.67,,,0.615,,,0.615,,,0.26,...,2.314,,,0.794,,,0.735,,,0.771
stats_for_NDCG,0.81,0.072,0.268,0.75,0.146,0.382,0.772,0.137,0.371,0.272,...,2.377,0.137,0.371,0.833,0.139,0.373,0.772,0.137,0.371,0.792
stats_for_M1,0.667,0.222,0.471,0.667,0.222,0.471,0.667,0.222,0.471,0.167,...,2.167,0.222,0.471,0.833,0.139,0.373,0.667,0.222,0.471,0.722
stats_for_M3,0.667,0.222,0.471,0.833,0.139,0.373,0.833,0.139,0.373,0.333,...,2.499,0.139,0.373,0.833,0.139,0.373,0.833,0.139,0.373,0.833
stats_for_M5,1.0,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.333,...,2.499,0.139,0.373,0.833,0.139,0.373,0.833,0.139,0.373,0.833
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.4,0.16,0.4,1.0,0.0,0.0,1.2,0.16,0.4,1.133
stats_for_length (x of gs),3.667,0.889,0.943,4.333,2.222,1.491,4.0,2.333,1.528,2.4,...,5.333,0.556,0.745,1.333,0.222,0.471,2.333,0.889,0.943,1.778


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=Avey v2`, which has 7 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=Avey v2`, which has 6 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.289,0.004,0.063,0.3,0.103,0.321,0.3,0.103,0.321,0.139,...,1.806,0.152,0.39,0.667,0.139,0.373,0.5,0.139,0.373,0.602
stats_for_recall,1.0,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.333,...,2.499,0.139,0.373,0.833,0.139,0.373,0.833,0.139,0.373,0.833
stats_for_f1-score,0.448,,,0.441,,,0.441,,,0.196,...,2.089,,,0.741,,,0.625,,,0.696
stats_for_f2-score,0.67,,,0.615,,,0.615,,,0.26,...,2.314,,,0.794,,,0.735,,,0.771
stats_for_NDCG,0.81,0.072,0.268,0.75,0.146,0.382,0.772,0.137,0.371,0.272,...,2.377,0.137,0.371,0.833,0.139,0.373,0.772,0.137,0.371,0.792
stats_for_M1,0.667,0.222,0.471,0.667,0.222,0.471,0.667,0.222,0.471,0.167,...,2.167,0.222,0.471,0.833,0.139,0.373,0.667,0.222,0.471,0.722
stats_for_M3,0.667,0.222,0.471,0.833,0.139,0.373,0.833,0.139,0.373,0.333,...,2.499,0.139,0.373,0.833,0.139,0.373,0.833,0.139,0.373,0.833
stats_for_M5,1.0,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.333,...,2.499,0.139,0.373,0.833,0.139,0.373,0.833,0.139,0.373,0.833
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.4,0.16,0.4,1.0,0.0,0.0,1.2,0.16,0.4,1.133
stats_for_length (x of gs),3.667,0.889,0.943,4.333,2.222,1.491,4.0,2.333,1.528,2.4,...,5.333,0.556,0.745,1.333,0.222,0.471,2.333,0.889,0.943,1.778


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=Buoy`, which has 6 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.233,0.014,0.12,0.133,0.009,0.094,0.133,0.009,0.094,0.139,...,1.681,0.107,0.328,0.583,0.118,0.344,0.417,0.09,0.3,0.56
stats_for_recall,0.833,0.139,0.373,0.667,0.222,0.471,0.667,0.222,0.471,0.333,...,2.666,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.889
stats_for_f1-score,0.364,,,0.222,,,0.222,,,0.196,...,2.052,,,0.686,,,0.556,,,0.684
stats_for_f2-score,0.55,,,0.37,,,0.37,,,0.26,...,2.375,,,0.767,,,0.694,,,0.792
stats_for_NDCG,0.644,0.148,0.384,0.583,0.201,0.449,0.605,0.2,0.447,0.272,...,2.543,0.019,0.138,0.833,0.139,0.373,0.772,0.137,0.371,0.848
stats_for_M1,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.167,...,2.333,0.139,0.373,0.833,0.139,0.373,0.667,0.222,0.471,0.778
stats_for_M3,0.5,0.25,0.5,0.667,0.222,0.471,0.667,0.222,0.471,0.333,...,2.666,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.889
stats_for_M5,0.833,0.139,0.373,0.667,0.222,0.471,0.667,0.222,0.471,0.333,...,2.666,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.889
stats_for_position,2.2,2.16,1.47,1.5,0.75,0.866,1.25,0.188,0.433,1.5,...,3.367,0.139,0.373,1.0,0.0,0.0,1.2,0.16,0.4,1.122
stats_for_length (x of gs),4.0,1.0,1.0,5.0,0.0,0.0,4.6,0.64,0.8,2.5,...,6.0,1.333,1.155,1.5,0.25,0.5,2.5,0.583,0.764,2.0
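
The `variance` and `std Dev` columns in the table above are consistent with a population variance (dividing by the number of cases) and its square root. The sketch below is only an illustration: it assumes M1 and M5 are per-case 0/1 hit indicators, and the two input lists are hypothetical, since the per-case values are not shown in this output. The helper name `summarize` is also an assumption and not part of the notebook's code.

```python
import statistics

def summarize(values):
    """Return (average, population variance, standard deviation), rounded to 3 places."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values)  # divides by n, not n - 1
    std = statistics.pstdev(values)     # square root of the population variance
    return round(mean, 3), round(var, 3), round(std, 3)

# Hypothetical per-case hit indicators: 3 hits in 6 cases.
hits_three_of_six = [1, 1, 1, 0, 0, 0]
# Hypothetical per-case hit indicators: 5 hits in 6 cases.
hits_five_of_six = [1, 1, 1, 1, 1, 0]

print(summarize(hits_three_of_six))  # (0.5, 0.25, 0.5)
print(summarize(hits_five_of_six))   # (0.833, 0.139, 0.373)
```

Under these assumptions the two triples match the Ada `stats_for_M1` and `stats_for_M5` entries of the table. A sample variance (dividing by n - 1) would give 0.3 rather than 0.25 for the first list.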


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=Buoy` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=Buoy` (6 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.233,0.014,0.12,0.133,0.009,0.094,0.133,0.009,0.094,0.139,...,1.681,0.107,0.328,0.583,0.118,0.344,0.417,0.09,0.3,0.56
stats_for_recall,0.833,0.139,0.373,0.667,0.222,0.471,0.667,0.222,0.471,0.333,...,2.666,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.889
stats_for_f1-score,0.364,,,0.222,,,0.222,,,0.196,...,2.052,,,0.686,,,0.556,,,0.684
stats_for_f2-score,0.55,,,0.37,,,0.37,,,0.26,...,2.375,,,0.767,,,0.694,,,0.792
stats_for_NDCG,0.644,0.148,0.384,0.583,0.201,0.449,0.605,0.2,0.447,0.272,...,2.543,0.019,0.138,0.833,0.139,0.373,0.772,0.137,0.371,0.848
stats_for_M1,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.167,...,2.333,0.139,0.373,0.833,0.139,0.373,0.667,0.222,0.471,0.778
stats_for_M3,0.5,0.25,0.5,0.667,0.222,0.471,0.667,0.222,0.471,0.333,...,2.666,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.889
stats_for_M5,0.833,0.139,0.373,0.667,0.222,0.471,0.667,0.222,0.471,0.333,...,2.666,0.0,0.0,0.833,0.139,0.373,0.833,0.139,0.373,0.889
stats_for_position,2.2,2.16,1.47,1.5,0.75,0.866,1.25,0.188,0.433,1.5,...,3.367,0.139,0.373,1.0,0.0,0.0,1.2,0.16,0.4,1.122
stats_for_length (x of gs),4.0,1.0,1.0,5.0,0.0,0.0,4.6,0.64,0.8,2.5,...,6.0,1.333,1.155,1.5,0.25,0.5,2.5,0.583,0.764,2.0


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=Healthily` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=Healthily` (5 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.24,0.017,0.131,0.28,0.138,0.371,0.28,0.138,0.371,0.167,...,1.983,0.19,0.436,0.7,0.06,0.245,0.633,0.093,0.306,0.661
stats_for_recall,0.8,0.16,0.4,0.6,0.24,0.49,0.6,0.24,0.49,0.4,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_f1-score,0.369,,,0.382,,,0.382,,,0.236,...,2.316,,,0.824,,,0.775,,,0.772
stats_for_f2-score,0.545,,,0.488,,,0.488,,,0.313,...,2.582,,,0.921,,,0.896,,,0.861
stats_for_NDCG,0.686,0.166,0.408,0.6,0.24,0.49,0.6,0.24,0.49,0.326,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_M1,0.6,0.24,0.49,0.6,0.24,0.49,0.6,0.24,0.49,0.2,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_M3,0.6,0.24,0.49,0.6,0.24,0.49,0.6,0.24,0.49,0.4,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_M5,0.8,0.16,0.4,0.6,0.24,0.49,0.6,0.24,0.49,0.4,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_position,1.75,1.688,1.299,1.0,0.0,0.0,1.0,0.0,0.0,1.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_length (x of gs),3.8,0.96,0.98,4.2,2.56,1.6,3.5,2.75,1.658,2.75,...,5.4,1.36,1.166,1.6,0.24,0.49,2.0,0.8,0.894,1.8


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=Healthily` (5 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.24,0.017,0.131,0.28,0.138,0.371,0.28,0.138,0.371,0.167,...,1.983,0.19,0.436,0.7,0.06,0.245,0.633,0.093,0.306,0.661
stats_for_recall,0.8,0.16,0.4,0.6,0.24,0.49,0.6,0.24,0.49,0.4,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_f1-score,0.369,,,0.382,,,0.382,,,0.236,...,2.316,,,0.824,,,0.775,,,0.772
stats_for_f2-score,0.545,,,0.488,,,0.488,,,0.313,...,2.582,,,0.921,,,0.896,,,0.861
stats_for_NDCG,0.686,0.166,0.408,0.6,0.24,0.49,0.6,0.24,0.49,0.326,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_M1,0.6,0.24,0.49,0.6,0.24,0.49,0.6,0.24,0.49,0.2,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_M3,0.6,0.24,0.49,0.6,0.24,0.49,0.6,0.24,0.49,0.4,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_M5,0.8,0.16,0.4,0.6,0.24,0.49,0.6,0.24,0.49,0.4,...,2.8,0.16,0.4,1.0,0.0,0.0,1.0,0.0,0.0,0.933
stats_for_position,1.75,1.688,1.299,1.0,0.0,0.0,1.0,0.0,0.0,1.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_length (x of gs),3.8,0.96,0.98,4.2,2.56,1.6,3.5,2.75,1.658,2.75,...,5.4,1.36,1.166,1.6,0.24,0.49,2.0,0.8,0.894,1.8


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=K Health` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=K Health` (4 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.217,0.019,0.136,0.3,0.17,0.412,0.3,0.17,0.412,0.208,...,1.729,0.199,0.446,0.625,0.047,0.217,0.542,0.075,0.273,0.576
stats_for_recall,0.75,0.188,0.433,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_f1-score,0.337,,,0.375,,,0.375,,,0.294,...,2.115,,,0.769,,,0.703,,,0.705
stats_for_f2-score,0.503,,,0.441,,,0.441,,,0.39,...,2.451,,,0.893,,,0.855,,,0.817
stats_for_NDCG,0.608,0.177,0.421,0.5,0.25,0.5,0.5,0.25,0.5,0.408,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M1,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.25,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M3,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M5,0.75,0.188,0.433,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_position,2.0,2.0,1.414,1.0,0.0,0.0,1.0,0.0,0.0,1.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_length (x of gs),4.0,1.0,1.0,4.0,3.0,1.732,3.0,2.667,1.633,2.667,...,6.0,1.5,1.225,1.75,0.188,0.433,2.25,0.688,0.829,2.0


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=K Health` (4 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.217,0.019,0.136,0.3,0.17,0.412,0.3,0.17,0.412,0.208,...,1.729,0.199,0.446,0.625,0.047,0.217,0.542,0.075,0.273,0.576
stats_for_recall,0.75,0.188,0.433,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_f1-score,0.337,,,0.375,,,0.375,,,0.294,...,2.115,,,0.769,,,0.703,,,0.705
stats_for_f2-score,0.503,,,0.441,,,0.441,,,0.39,...,2.451,,,0.893,,,0.855,,,0.817
stats_for_NDCG,0.608,0.177,0.421,0.5,0.25,0.5,0.5,0.25,0.5,0.408,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M1,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.25,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M3,0.5,0.25,0.5,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_M5,0.75,0.188,0.433,0.5,0.25,0.5,0.5,0.25,0.5,0.5,...,2.75,0.188,0.433,1.0,0.0,0.0,1.0,0.0,0.0,0.917
stats_for_position,2.0,2.0,1.414,1.0,0.0,0.0,1.0,0.0,0.0,1.5,...,3.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0
stats_for_length (x of gs),4.0,1.0,1.0,4.0,3.0,1.732,3.0,2.667,1.633,2.667,...,6.0,1.5,1.225,1.75,0.188,0.433,2.25,0.688,0.829,2.0


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=Mediktor` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=Mediktor` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=Mediktor` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=Symptomate` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=Symptomate` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=Symptomate` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=no-disease-found for app=WebMD` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=session-failed for app=WebMD` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=less-common-diseases failure-type-ignored=any-error for app=WebMD` (7 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.248,0.014,0.117,0.257,0.1,0.316,0.257,0.1,0.316,0.119,...,1.726,0.149,0.386,0.643,0.122,0.35,0.5,0.119,0.345,0.575
stats_for_recall,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_f1-score,0.385,,,0.378,,,0.378,,,0.168,...,2.061,,,0.735,,,0.632,,,0.687
stats_for_f2-score,0.575,,,0.527,,,0.527,,,0.223,...,2.337,,,0.804,,,0.75,,,0.779
stats_for_NDCG,0.694,0.142,0.377,0.643,0.194,0.44,0.662,0.191,0.437,0.233,...,2.465,0.124,0.352,0.857,0.122,0.35,0.804,0.124,0.352,0.822
stats_for_M1,0.571,0.245,0.495,0.571,0.245,0.495,0.571,0.245,0.495,0.143,...,2.285,0.204,0.452,0.857,0.122,0.35,0.714,0.204,0.452,0.762
stats_for_M3,0.571,0.245,0.495,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_M5,0.857,0.122,0.35,0.714,0.204,0.452,0.714,0.204,0.452,0.286,...,2.571,0.122,0.35,0.857,0.122,0.35,0.857,0.122,0.35,0.857
stats_for_position,2.0,2.0,1.414,1.4,0.64,0.8,1.2,0.16,0.4,1.5,...,3.334,0.139,0.373,1.0,0.0,0.0,1.167,0.139,0.373,1.111
stats_for_length (x of gs),3.857,0.98,0.99,4.429,1.959,1.4,4.0,2.333,1.528,2.5,...,5.715,1.143,1.069,1.429,0.245,0.495,2.286,0.776,0.881,1.905


Results for experiment `disease_type=all failure-type-ignored=None for app=NA` (44 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.284,0.046,0.214,0.166,0.029,0.169,0.198,0.027,0.163,0.303,...,2.018,0.106,0.326,0.648,0.166,0.407,0.664,0.121,0.348,0.673
stats_for_recall,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_f1-score,0.422,,,0.265,,,0.317,,,0.401,...,2.284,,,0.705,,,0.767,,,0.761
stats_for_f2-score,0.594,,,0.413,,,0.496,,,0.497,...,2.483,,,0.744,,,0.847,,,0.828
stats_for_NDCG,0.736,0.151,0.388,0.553,0.192,0.439,0.69,0.16,0.4,0.52,...,2.516,0.061,0.246,0.764,0.175,0.418,0.876,0.088,0.296,0.839
stats_for_M1,0.636,0.231,0.481,0.432,0.245,0.495,0.568,0.245,0.495,0.432,...,2.318,0.188,0.433,0.75,0.188,0.433,0.818,0.149,0.386,0.773
stats_for_M3,0.773,0.176,0.419,0.591,0.242,0.492,0.727,0.198,0.445,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_M5,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_position,1.389,0.682,0.826,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.479,1.029,0.029,0.169,1.1,0.09,0.3,1.122
stats_for_length (x of gs),3.568,1.382,1.175,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.705,0.808,0.899,1.318,0.353,0.594,1.705,0.799,0.894,1.568
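
The f1-score and f2-score rows carry no variance or standard deviation, which is consistent with them being derived from the averaged precision and recall rather than averaged per case. The sketch below assumes the standard F-beta formula; the function name `f_beta` is hypothetical and not taken from the notebook's code. It reproduces the Ada and Avey values of the 44-case table above from their precision and recall averages.

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Average precision and recall copied from the 44-case table.
averages = {
    "Ada": (0.284, 0.818),
    "Avey": (0.166, 0.659),
}

for name, (p, r) in averages.items():
    f1 = round(f_beta(p, r, beta=1), 3)
    f2 = round(f_beta(p, r, beta=2), 3)
    print(name, f1, f2)
# Ada 0.422 0.594
# Avey 0.265 0.413
```

With `beta=2`, recall is weighted more heavily than precision, which is why each f2-score sits closer to the recall value than the corresponding f1-score does.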


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=Avey` (40 cases):


metric,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.291,0.05,0.223,0.183,0.028,0.169,0.208,0.027,0.165,0.325,...,1.958,0.109,0.33,0.625,0.172,0.415,0.643,0.124,0.352,0.653
stats_for_recall,0.8,0.16,0.4,0.725,0.199,0.447,0.825,0.144,0.38,0.625,...,2.6,0.048,0.218,0.75,0.188,0.433,0.9,0.09,0.3,0.867
stats_for_f1-score,0.427,,,0.292,,,0.332,,,0.428,...,2.231,,,0.682,,,0.75,,,0.744
stats_for_f2-score,0.593,,,0.455,,,0.518,,,0.528,...,2.437,,,0.721,,,0.833,,,0.812
stats_for_NDCG,0.741,0.162,0.402,0.608,0.178,0.422,0.718,0.149,0.386,0.547,...,2.486,0.064,0.253,0.75,0.188,0.433,0.863,0.095,0.308,0.829
stats_for_M1,0.675,0.219,0.468,0.475,0.249,0.499,0.6,0.24,0.49,0.45,...,2.3,0.188,0.433,0.75,0.188,0.433,0.8,0.16,0.4,0.767
stats_for_M3,0.75,0.188,0.433,0.65,0.227,0.477,0.75,0.188,0.433,0.625,...,2.6,0.048,0.218,0.75,0.188,0.433,0.9,0.09,0.3,0.867
stats_for_M5,0.8,0.16,0.4,0.725,0.199,0.447,0.825,0.144,0.38,0.625,...,2.6,0.048,0.218,0.75,0.188,0.433,0.9,0.09,0.3,0.867
stats_for_position,1.312,0.652,0.808,1.655,1.054,1.026,1.576,1.275,1.129,1.44,...,3.348,0.233,0.483,1.0,0.0,0.0,1.111,0.099,0.314,1.116
stats_for_length (x of gs),3.45,1.348,1.161,4.579,0.717,0.847,4.513,0.968,0.984,2.343,...,4.8,0.849,0.922,1.325,0.369,0.608,1.75,0.838,0.915,1.6


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=any` (35 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.29,0.053,0.23,0.161,0.012,0.109,0.191,0.011,0.107,0.324,...,1.997,0.097,0.311,0.629,0.191,0.437,0.625,0.127,0.356,0.666
stats_for_recall,0.8,0.16,0.4,0.714,0.204,0.452,0.829,0.142,0.377,0.629,...,2.571,0.028,0.167,0.714,0.204,0.452,0.886,0.101,0.318,0.857
stats_for_f1-score,0.426,,,0.263,,,0.31,,,0.428,...,2.244,,,0.669,,,0.733,,,0.748
stats_for_f2-score,0.592,,,0.423,,,0.497,,,0.529,...,2.428,,,0.695,,,0.818,,,0.809
stats_for_NDCG,0.732,0.162,0.402,0.58,0.174,0.418,0.707,0.146,0.382,0.55,...,2.451,0.049,0.222,0.714,0.204,0.452,0.854,0.105,0.324,0.817
stats_for_M1,0.657,0.225,0.475,0.429,0.245,0.495,0.571,0.245,0.495,0.457,...,2.257,0.191,0.437,0.714,0.204,0.452,0.8,0.16,0.4,0.752
stats_for_M3,0.743,0.191,0.437,0.629,0.233,0.483,0.743,0.191,0.437,0.629,...,2.571,0.028,0.167,0.714,0.204,0.452,0.886,0.101,0.318,0.857
stats_for_M5,0.8,0.16,0.4,0.714,0.204,0.452,0.829,0.142,0.377,0.629,...,2.571,0.028,0.167,0.714,0.204,0.452,0.886,0.101,0.318,0.857
stats_for_position,1.357,0.73,0.854,1.76,1.142,1.069,1.655,1.398,1.183,1.455,...,3.362,0.253,0.503,1.0,0.0,0.0,1.097,0.087,0.296,1.121
stats_for_length (x of gs),3.486,1.393,1.18,4.667,0.404,0.636,4.6,0.697,0.835,2.323,...,4.628,0.754,0.868,1.257,0.362,0.602,1.771,0.862,0.928,1.543


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=Avey` (42 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.292,0.046,0.214,0.174,0.029,0.169,0.195,0.027,0.165,0.317,...,2.061,0.105,0.324,0.679,0.153,0.391,0.66,0.124,0.351,0.687
stats_for_recall,0.833,0.139,0.373,0.69,0.214,0.462,0.786,0.168,0.41,0.619,...,2.667,0.045,0.213,0.81,0.154,0.393,0.905,0.086,0.294,0.889
stats_for_f1-score,0.432,,,0.278,,,0.312,,,0.419,...,2.323,,,0.739,,,0.763,,,0.774
stats_for_f2-score,0.608,,,0.433,,,0.489,,,0.52,...,2.517,,,0.78,,,0.842,,,0.839
stats_for_NDCG,0.756,0.144,0.38,0.579,0.186,0.432,0.684,0.165,0.406,0.545,...,2.559,0.06,0.246,0.801,0.154,0.392,0.87,0.091,0.302,0.853
stats_for_M1,0.667,0.222,0.471,0.452,0.248,0.498,0.571,0.245,0.495,0.452,...,2.382,0.168,0.41,0.786,0.168,0.41,0.81,0.154,0.393,0.794
stats_for_M3,0.786,0.168,0.41,0.619,0.236,0.486,0.714,0.204,0.452,0.619,...,2.667,0.045,0.213,0.81,0.154,0.393,0.905,0.086,0.294,0.889
stats_for_M5,0.833,0.139,0.373,0.69,0.214,0.462,0.786,0.168,0.41,0.619,...,2.667,0.045,0.213,0.81,0.154,0.393,0.905,0.086,0.294,0.889
stats_for_position,1.371,0.691,0.831,1.655,1.054,1.026,1.576,1.275,1.129,1.423,...,3.334,0.21,0.458,1.029,0.029,0.169,1.105,0.094,0.307,1.111
stats_for_length (x of gs),3.548,1.438,1.199,4.579,0.717,0.847,4.564,0.913,0.955,2.297,...,4.666,0.712,0.844,1.333,0.365,0.604,1.714,0.823,0.907,1.555


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=any` (27 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.322,0.058,0.24,0.198,0.036,0.191,0.228,0.034,0.185,0.414,...,2.096,0.116,0.34,0.704,0.116,0.34,0.698,0.113,0.337,0.699
stats_for_recall,0.852,0.126,0.355,0.741,0.192,0.438,0.852,0.126,0.355,0.741,...,2.741,0.069,0.262,0.889,0.099,0.314,0.926,0.069,0.262,0.914
stats_for_f1-score,0.467,,,0.312,,,0.36,,,0.531,...,2.375,,,0.786,,,0.796,,,0.792
stats_for_f2-score,0.641,,,0.479,,,0.551,,,0.64,...,2.582,,,0.845,,,0.869,,,0.861
stats_for_NDCG,0.799,0.134,0.366,0.618,0.174,0.417,0.781,0.136,0.369,0.681,...,2.64,0.081,0.284,0.875,0.101,0.317,0.899,0.074,0.272,0.88
stats_for_M1,0.741,0.192,0.438,0.481,0.25,0.5,0.704,0.209,0.457,0.593,...,2.482,0.173,0.416,0.852,0.126,0.355,0.852,0.126,0.355,0.827
stats_for_M3,0.815,0.151,0.388,0.63,0.233,0.483,0.778,0.173,0.416,0.741,...,2.741,0.069,0.262,0.889,0.099,0.314,0.926,0.069,0.262,0.914
stats_for_M5,0.852,0.126,0.355,0.741,0.192,0.438,0.852,0.126,0.355,0.741,...,2.741,0.069,0.262,0.889,0.099,0.314,0.926,0.069,0.262,0.914
stats_for_position,1.261,0.541,0.735,1.7,1.21,1.1,1.391,1.021,1.01,1.25,...,3.322,0.24,0.49,1.042,0.04,0.2,1.08,0.074,0.271,1.107
stats_for_length (x of gs),3.37,1.567,1.252,4.48,0.97,0.985,4.385,1.237,1.112,2.231,...,4.556,0.678,0.823,1.37,0.233,0.483,1.556,0.543,0.737,1.519


Results for experiment `disease_type=all failure-type-ignored=any-error for app=Avey` (38 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.3,0.05,0.224,0.192,0.028,0.167,0.205,0.028,0.168,0.342,...,2.001,0.108,0.329,0.658,0.159,0.399,0.637,0.126,0.355,0.667
stats_for_recall,0.816,0.15,0.388,0.763,0.181,0.425,0.816,0.15,0.388,0.658,...,2.631,0.05,0.223,0.789,0.166,0.408,0.895,0.094,0.307,0.877
stats_for_f1-score,0.439,,,0.307,,,0.328,,,0.45,...,2.271,,,0.718,,,0.744,,,0.757
stats_for_f2-score,0.607,,,0.478,,,0.511,,,0.555,...,2.473,,,0.759,,,0.828,,,0.824
stats_for_NDCG,0.763,0.155,0.394,0.64,0.167,0.408,0.713,0.154,0.393,0.576,...,2.531,0.064,0.253,0.789,0.166,0.408,0.856,0.099,0.314,0.844
stats_for_M1,0.711,0.206,0.454,0.5,0.25,0.5,0.605,0.239,0.489,0.474,...,2.367,0.166,0.408,0.789,0.166,0.408,0.789,0.166,0.408,0.789
stats_for_M3,0.763,0.181,0.425,0.684,0.216,0.465,0.737,0.194,0.44,0.658,...,2.631,0.05,0.223,0.789,0.166,0.408,0.895,0.094,0.307,0.877
stats_for_M5,0.816,0.15,0.388,0.763,0.181,0.425,0.816,0.15,0.388,0.658,...,2.631,0.05,0.223,0.789,0.166,0.408,0.895,0.094,0.307,0.877
stats_for_position,1.29,0.658,0.811,1.655,1.054,1.026,1.581,1.34,1.158,1.44,...,3.312,0.212,0.461,1.0,0.0,0.0,1.118,0.104,0.322,1.104
stats_for_length (x of gs),3.421,1.402,1.184,4.579,0.717,0.847,4.541,0.951,0.975,2.333,...,4.763,0.751,0.867,1.342,0.383,0.619,1.763,0.865,0.93,1.588


Results for experiment `disease_type=all failure-type-ignored=any-error for app=any` (20 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.337,0.07,0.265,0.185,0.011,0.103,0.205,0.013,0.115,0.475,...,2.183,0.098,0.312,0.725,0.137,0.37,0.675,0.124,0.351,0.728
stats_for_recall,0.85,0.128,0.357,0.8,0.16,0.4,0.85,0.127,0.357,0.85,...,2.7,0.048,0.218,0.85,0.127,0.357,0.9,0.09,0.3,0.9
stats_for_f1-score,0.483,,,0.301,,,0.33,,,0.609,...,2.412,,,0.783,,,0.771,,,0.804
stats_for_f2-score,0.652,,,0.481,,,0.522,,,0.734,...,2.577,,,0.822,,,0.844,,,0.859
stats_for_NDCG,0.797,0.137,0.371,0.634,0.148,0.384,0.772,0.14,0.374,0.788,...,2.62,0.064,0.252,0.85,0.127,0.357,0.882,0.093,0.305,0.873
stats_for_M1,0.75,0.188,0.433,0.45,0.248,0.497,0.7,0.21,0.458,0.7,...,2.5,0.16,0.4,0.85,0.127,0.357,0.85,0.127,0.357,0.833
stats_for_M3,0.8,0.16,0.4,0.65,0.227,0.477,0.75,0.188,0.433,0.85,...,2.7,0.048,0.218,0.85,0.127,0.357,0.9,0.09,0.3,0.9
stats_for_M5,0.85,0.128,0.357,0.8,0.16,0.4,0.85,0.127,0.357,0.85,...,2.7,0.048,0.218,0.85,0.127,0.357,0.9,0.09,0.3,0.9
stats_for_position,1.294,0.678,0.824,1.875,1.359,1.166,1.471,1.308,1.144,1.235,...,3.267,0.271,0.521,1.0,0.0,0.0,1.056,0.052,0.229,1.089
stats_for_length (x of gs),3.3,1.71,1.308,4.6,0.54,0.735,4.45,0.948,0.973,2.2,...,4.2,0.44,0.663,1.25,0.188,0.433,1.55,0.548,0.74,1.4


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=Avey v2` (41 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.295,0.047,0.216,0.178,0.029,0.169,0.213,0.025,0.159,0.317,...,1.988,0.105,0.324,0.634,0.171,0.414,0.651,0.124,0.352,0.663
stats_for_recall,0.829,0.142,0.376,0.707,0.207,0.455,0.854,0.125,0.353,0.61,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_f1-score,0.435,,,0.284,,,0.341,,,0.417,...,2.254,,,0.69,,,0.756,,,0.751
stats_for_f2-score,0.609,,,0.443,,,0.533,,,0.515,...,2.453,,,0.728,,,0.837,,,0.818
stats_for_NDCG,0.762,0.146,0.382,0.593,0.182,0.427,0.74,0.134,0.366,0.534,...,2.48,0.064,0.253,0.747,0.183,0.428,0.866,0.093,0.305,0.827
stats_for_M1,0.683,0.217,0.465,0.463,0.249,0.499,0.61,0.238,0.488,0.439,...,2.269,0.196,0.443,0.732,0.196,0.443,0.805,0.157,0.396,0.756
stats_for_M3,0.78,0.171,0.414,0.634,0.232,0.482,0.78,0.171,0.414,0.61,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_M5,0.829,0.142,0.376,0.707,0.207,0.455,0.854,0.125,0.353,0.61,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_position,1.324,0.631,0.794,1.655,1.054,1.026,1.571,1.216,1.103,1.44,...,3.396,0.242,0.492,1.032,0.031,0.177,1.108,0.096,0.311,1.132
stats_for_length (x of gs),3.463,1.322,1.15,4.568,0.732,0.856,4.537,0.932,0.965,2.306,...,4.708,0.713,0.844,1.317,0.363,0.602,1.732,0.83,0.911,1.569


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=Avey v2` (44 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.284,0.046,0.214,0.166,0.029,0.169,0.198,0.027,0.163,0.303,...,2.018,0.106,0.326,0.648,0.166,0.407,0.664,0.121,0.348,0.673
stats_for_recall,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_f1-score,0.422,,,0.265,,,0.317,,,0.401,...,2.284,,,0.705,,,0.767,,,0.761
stats_for_f2-score,0.594,,,0.413,,,0.496,,,0.497,...,2.483,,,0.744,,,0.847,,,0.828
stats_for_NDCG,0.736,0.151,0.388,0.553,0.192,0.439,0.69,0.16,0.4,0.52,...,2.516,0.061,0.246,0.764,0.175,0.418,0.876,0.088,0.296,0.839
stats_for_M1,0.636,0.231,0.481,0.432,0.245,0.495,0.568,0.245,0.495,0.432,...,2.318,0.188,0.433,0.75,0.188,0.433,0.818,0.149,0.386,0.773
stats_for_M3,0.773,0.176,0.419,0.591,0.242,0.492,0.727,0.198,0.445,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_M5,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_position,1.389,0.682,0.826,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.479,1.029,0.029,0.169,1.1,0.09,0.3,1.122
stats_for_length (x of gs),3.568,1.382,1.175,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.705,0.808,0.899,1.318,0.353,0.594,1.705,0.799,0.894,1.568


Results for experiment `disease_type=all failure-type-ignored=any-error for app=Avey v2` (41 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.295,0.047,0.216,0.178,0.029,0.169,0.213,0.025,0.159,0.317,...,1.988,0.105,0.324,0.634,0.171,0.414,0.651,0.124,0.352,0.663
stats_for_recall,0.829,0.142,0.376,0.707,0.207,0.455,0.854,0.125,0.353,0.61,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_f1-score,0.435,,,0.284,,,0.341,,,0.417,...,2.254,,,0.69,,,0.756,,,0.751
stats_for_f2-score,0.609,,,0.443,,,0.533,,,0.515,...,2.453,,,0.728,,,0.837,,,0.818
stats_for_NDCG,0.762,0.146,0.382,0.593,0.182,0.427,0.74,0.134,0.366,0.534,...,2.48,0.064,0.253,0.747,0.183,0.428,0.866,0.093,0.305,0.827
stats_for_M1,0.683,0.217,0.465,0.463,0.249,0.499,0.61,0.238,0.488,0.439,...,2.269,0.196,0.443,0.732,0.196,0.443,0.805,0.157,0.396,0.756
stats_for_M3,0.78,0.171,0.414,0.634,0.232,0.482,0.78,0.171,0.414,0.61,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_M5,0.829,0.142,0.376,0.707,0.207,0.455,0.854,0.125,0.353,0.61,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_position,1.324,0.631,0.794,1.655,1.054,1.026,1.571,1.216,1.103,1.44,...,3.396,0.242,0.492,1.032,0.031,0.177,1.108,0.096,0.311,1.132
stats_for_length (x of gs),3.463,1.322,1.15,4.568,0.732,0.856,4.537,0.932,0.965,2.306,...,4.708,0.713,0.844,1.317,0.363,0.602,1.732,0.83,0.911,1.569


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=Buoy` (43 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.283,0.047,0.216,0.147,0.013,0.113,0.18,0.012,0.109,0.31,...,2.019,0.097,0.311,0.64,0.167,0.408,0.656,0.121,0.348,0.673
stats_for_recall,0.814,0.151,0.389,0.651,0.227,0.477,0.791,0.165,0.407,0.605,...,2.651,0.023,0.151,0.767,0.178,0.422,0.907,0.084,0.29,0.884
stats_for_f1-score,0.42,,,0.24,,,0.293,,,0.41,...,2.29,,,0.698,,,0.761,,,0.763
stats_for_f2-score,0.592,,,0.386,,,0.471,,,0.508,...,2.494,,,0.738,,,0.843,,,0.831
stats_for_NDCG,0.73,0.152,0.39,0.542,0.192,0.438,0.683,0.161,0.401,0.532,...,2.528,0.044,0.209,0.759,0.178,0.421,0.873,0.089,0.299,0.843
stats_for_M1,0.628,0.234,0.483,0.419,0.243,0.493,0.558,0.247,0.497,0.442,...,2.325,0.178,0.422,0.744,0.19,0.436,0.814,0.151,0.389,0.775
stats_for_M3,0.767,0.178,0.422,0.581,0.243,0.493,0.721,0.201,0.449,0.605,...,2.651,0.023,0.151,0.767,0.178,0.422,0.907,0.084,0.29,0.884
stats_for_M5,0.814,0.151,0.389,0.651,0.227,0.477,0.791,0.165,0.407,0.605,...,2.651,0.023,0.151,0.767,0.178,0.422,0.907,0.084,0.29,0.884
stats_for_position,1.4,0.697,0.835,1.679,1.075,1.037,1.588,1.242,1.115,1.423,...,3.371,0.229,0.479,1.03,0.029,0.171,1.103,0.092,0.303,1.124
stats_for_length (x of gs),3.581,1.406,1.186,4.676,0.381,0.617,4.625,0.634,0.796,2.308,...,4.721,0.824,0.908,1.326,0.359,0.599,1.721,0.806,0.898,1.574


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=Buoy` (40 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.287,0.047,0.217,0.176,0.029,0.171,0.212,0.026,0.161,0.333,...,2.068,0.105,0.324,0.675,0.157,0.396,0.697,0.117,0.342,0.689
stats_for_recall,0.825,0.144,0.38,0.7,0.21,0.458,0.85,0.128,0.357,0.65,...,2.675,0.048,0.218,0.8,0.16,0.4,0.925,0.069,0.263,0.892
stats_for_f1-score,0.426,,,0.281,,,0.339,,,0.44,...,2.33,,,0.732,,,0.795,,,0.777
stats_for_f2-score,0.6,,,0.439,,,0.531,,,0.546,...,2.524,,,0.771,,,0.868,,,0.841
stats_for_NDCG,0.744,0.148,0.385,0.592,0.187,0.432,0.743,0.137,0.37,0.572,...,2.552,0.064,0.253,0.791,0.16,0.4,0.888,0.076,0.276,0.851
stats_for_M1,0.65,0.227,0.477,0.475,0.249,0.499,0.625,0.234,0.484,0.475,...,2.35,0.188,0.433,0.775,0.174,0.418,0.825,0.144,0.38,0.783
stats_for_M3,0.775,0.174,0.418,0.625,0.234,0.484,0.775,0.174,0.418,0.65,...,2.675,0.048,0.218,0.8,0.16,0.4,0.925,0.069,0.263,0.892
stats_for_M5,0.825,0.144,0.38,0.7,0.21,0.458,0.85,0.128,0.357,0.65,...,2.675,0.048,0.218,0.8,0.16,0.4,0.925,0.069,0.263,0.892
stats_for_position,1.394,0.724,0.851,1.643,1.087,1.042,1.559,1.247,1.116,1.423,...,3.376,0.233,0.483,1.031,0.03,0.174,1.108,0.096,0.311,1.125
stats_for_length (x of gs),3.55,1.347,1.161,4.559,0.776,0.881,4.514,1.007,1.003,2.308,...,4.575,0.719,0.848,1.25,0.188,0.433,1.65,0.827,0.91,1.525


Results for experiment `disease_type=all failure-type-ignored=any-error for app=Buoy` (39 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.285,0.048,0.219,0.155,0.012,0.109,0.192,0.01,0.101,0.342,...,2.07,0.095,0.308,0.667,0.158,0.398,0.689,0.118,0.343,0.69
stats_for_recall,0.821,0.147,0.384,0.692,0.213,0.462,0.846,0.13,0.361,0.667,...,2.692,0.025,0.158,0.795,0.163,0.404,0.923,0.071,0.266,0.897
stats_for_f1-score,0.423,,,0.253,,,0.313,,,0.452,...,2.338,,,0.725,,,0.789,,,0.779
stats_for_f2-score,0.597,,,0.409,,,0.503,,,0.56,...,2.538,,,0.766,,,0.864,,,0.846
stats_for_NDCG,0.737,0.15,0.388,0.582,0.187,0.433,0.737,0.139,0.373,0.587,...,2.565,0.046,0.213,0.785,0.163,0.403,0.885,0.078,0.279,0.855
stats_for_M1,0.641,0.23,0.48,0.462,0.249,0.499,0.615,0.237,0.487,0.487,...,2.359,0.178,0.421,0.769,0.178,0.421,0.821,0.147,0.384,0.786
stats_for_M3,0.769,0.178,0.421,0.615,0.237,0.487,0.769,0.178,0.421,0.667,...,2.692,0.025,0.158,0.795,0.163,0.404,0.923,0.071,0.266,0.897
stats_for_M5,0.821,0.147,0.384,0.692,0.213,0.462,0.846,0.13,0.361,0.667,...,2.692,0.025,0.158,0.795,0.163,0.404,0.923,0.071,0.266,0.897
stats_for_position,1.406,0.741,0.861,1.667,1.111,1.054,1.576,1.275,1.129,1.423,...,3.38,0.233,0.483,1.032,0.031,0.177,1.111,0.099,0.314,1.127
stats_for_length (x of gs),3.564,1.374,1.172,4.667,0.404,0.636,4.611,0.682,0.826,2.308,...,4.59,0.735,0.857,1.256,0.191,0.437,1.667,0.838,0.915,1.53


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=Healthily` (44 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.284,0.046,0.214,0.166,0.029,0.169,0.198,0.027,0.163,0.303,...,2.018,0.106,0.326,0.648,0.166,0.407,0.664,0.121,0.348,0.673
stats_for_recall,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_f1-score,0.422,,,0.265,,,0.317,,,0.401,...,2.284,,,0.705,,,0.767,,,0.761
stats_for_f2-score,0.594,,,0.413,,,0.496,,,0.497,...,2.483,,,0.744,,,0.847,,,0.828
stats_for_NDCG,0.736,0.151,0.388,0.553,0.192,0.439,0.69,0.16,0.4,0.52,...,2.516,0.061,0.246,0.764,0.175,0.418,0.876,0.088,0.296,0.839
stats_for_M1,0.636,0.231,0.481,0.432,0.245,0.495,0.568,0.245,0.495,0.432,...,2.318,0.188,0.433,0.75,0.188,0.433,0.818,0.149,0.386,0.773
stats_for_M3,0.773,0.176,0.419,0.591,0.242,0.492,0.727,0.198,0.445,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_M5,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_position,1.389,0.682,0.826,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.479,1.029,0.029,0.169,1.1,0.09,0.3,1.122
stats_for_length (x of gs),3.568,1.382,1.175,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.705,0.808,0.899,1.318,0.353,0.594,1.705,0.799,0.894,1.568


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=Healthily` (32 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.308,0.052,0.228,0.18,0.034,0.184,0.228,0.029,0.171,0.365,...,2.08,0.112,0.335,0.688,0.137,0.37,0.704,0.113,0.336,0.693
stats_for_recall,0.844,0.132,0.363,0.688,0.215,0.464,0.875,0.109,0.331,0.656,...,2.72,0.059,0.242,0.844,0.132,0.363,0.938,0.059,0.242,0.907
stats_for_f1-score,0.451,,,0.285,,,0.362,,,0.469,...,2.356,,,0.758,,,0.804,,,0.785
stats_for_f2-score,0.626,,,0.44,,,0.558,,,0.566,...,2.561,,,0.807,,,0.88,,,0.854
stats_for_NDCG,0.787,0.137,0.371,0.584,0.19,0.436,0.788,0.122,0.35,0.606,...,2.61,0.073,0.27,0.832,0.132,0.364,0.914,0.064,0.252,0.87
stats_for_M1,0.719,0.202,0.45,0.469,0.249,0.499,0.688,0.215,0.464,0.531,...,2.437,0.188,0.433,0.812,0.152,0.39,0.875,0.109,0.331,0.812
stats_for_M3,0.812,0.152,0.39,0.594,0.241,0.491,0.812,0.152,0.39,0.656,...,2.72,0.059,0.242,0.844,0.132,0.363,0.938,0.059,0.242,0.907
stats_for_M5,0.844,0.132,0.363,0.688,0.215,0.464,0.875,0.109,0.331,0.656,...,2.72,0.059,0.242,0.844,0.132,0.363,0.938,0.059,0.242,0.907
stats_for_position,1.259,0.488,0.699,1.636,1.14,1.068,1.429,0.959,0.979,1.238,...,3.337,0.246,0.496,1.037,0.036,0.189,1.067,0.062,0.249,1.112
stats_for_length (x of gs),3.406,1.366,1.169,4.536,0.892,0.944,4.419,1.147,1.071,2.258,...,4.625,0.777,0.882,1.312,0.215,0.464,1.625,0.859,0.927,1.542


Results for experiment `disease_type=all failure-type-ignored=any-error for app=Healthily` (32 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.308,0.052,0.228,0.18,0.034,0.184,0.228,0.029,0.171,0.365,...,2.08,0.112,0.335,0.688,0.137,0.37,0.704,0.113,0.336,0.693
stats_for_recall,0.844,0.132,0.363,0.688,0.215,0.464,0.875,0.109,0.331,0.656,...,2.72,0.059,0.242,0.844,0.132,0.363,0.938,0.059,0.242,0.907
stats_for_f1-score,0.451,,,0.285,,,0.362,,,0.469,...,2.356,,,0.758,,,0.804,,,0.785
stats_for_f2-score,0.626,,,0.44,,,0.558,,,0.566,...,2.561,,,0.807,,,0.88,,,0.854
stats_for_NDCG,0.787,0.137,0.371,0.584,0.19,0.436,0.788,0.122,0.35,0.606,...,2.61,0.073,0.27,0.832,0.132,0.364,0.914,0.064,0.252,0.87
stats_for_M1,0.719,0.202,0.45,0.469,0.249,0.499,0.688,0.215,0.464,0.531,...,2.437,0.188,0.433,0.812,0.152,0.39,0.875,0.109,0.331,0.812
stats_for_M3,0.812,0.152,0.39,0.594,0.241,0.491,0.812,0.152,0.39,0.656,...,2.72,0.059,0.242,0.844,0.132,0.363,0.938,0.059,0.242,0.907
stats_for_M5,0.844,0.132,0.363,0.688,0.215,0.464,0.875,0.109,0.331,0.656,...,2.72,0.059,0.242,0.844,0.132,0.363,0.938,0.059,0.242,0.907
stats_for_position,1.259,0.488,0.699,1.636,1.14,1.068,1.429,0.959,0.979,1.238,...,3.337,0.246,0.496,1.037,0.036,0.189,1.067,0.062,0.249,1.112
stats_for_length (x of gs),3.406,1.366,1.169,4.536,0.892,0.944,4.419,1.147,1.071,2.258,...,4.625,0.777,0.882,1.312,0.215,0.464,1.625,0.859,0.927,1.542


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=K Health` (44 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.284,0.046,0.214,0.166,0.029,0.169,0.198,0.027,0.163,0.303,...,2.018,0.106,0.326,0.648,0.166,0.407,0.664,0.121,0.348,0.673
stats_for_recall,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_f1-score,0.422,,,0.265,,,0.317,,,0.401,...,2.284,,,0.705,,,0.767,,,0.761
stats_for_f2-score,0.594,,,0.413,,,0.496,,,0.497,...,2.483,,,0.744,,,0.847,,,0.828
stats_for_NDCG,0.736,0.151,0.388,0.553,0.192,0.439,0.69,0.16,0.4,0.52,...,2.516,0.061,0.246,0.764,0.175,0.418,0.876,0.088,0.296,0.839
stats_for_M1,0.636,0.231,0.481,0.432,0.245,0.495,0.568,0.245,0.495,0.432,...,2.318,0.188,0.433,0.75,0.188,0.433,0.818,0.149,0.386,0.773
stats_for_M3,0.773,0.176,0.419,0.591,0.242,0.492,0.727,0.198,0.445,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_M5,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_position,1.389,0.682,0.826,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.479,1.029,0.029,0.169,1.1,0.09,0.3,1.122
stats_for_length (x of gs),3.568,1.382,1.175,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.705,0.808,0.899,1.318,0.353,0.594,1.705,0.799,0.894,1.568


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=K Health` (29 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.309,0.057,0.24,0.184,0.036,0.191,0.23,0.032,0.18,0.385,...,2.028,0.115,0.34,0.655,0.14,0.374,0.701,0.11,0.332,0.676
stats_for_recall,0.828,0.143,0.378,0.69,0.214,0.463,0.862,0.119,0.345,0.69,...,2.69,0.064,0.253,0.828,0.143,0.378,0.931,0.064,0.253,0.897
stats_for_f1-score,0.45,,,0.291,,,0.363,,,0.494,...,2.312,,,0.731,,,0.8,,,0.771
stats_for_f2-score,0.62,,,0.445,,,0.556,,,0.596,...,2.524,,,0.786,,,0.874,,,0.841
stats_for_NDCG,0.765,0.147,0.383,0.575,0.186,0.432,0.783,0.129,0.359,0.634,...,2.571,0.079,0.28,0.815,0.143,0.378,0.906,0.069,0.264,0.857
stats_for_M1,0.69,0.214,0.463,0.448,0.247,0.497,0.69,0.214,0.463,0.552,...,2.379,0.2,0.447,0.793,0.164,0.405,0.862,0.119,0.345,0.793
stats_for_M3,0.793,0.164,0.405,0.586,0.243,0.493,0.793,0.164,0.405,0.69,...,2.69,0.064,0.253,0.828,0.143,0.378,0.931,0.064,0.253,0.897
stats_for_M5,0.828,0.143,0.378,0.69,0.214,0.463,0.862,0.119,0.345,0.69,...,2.69,0.064,0.253,0.828,0.143,0.378,0.931,0.064,0.253,0.897
stats_for_position,1.292,0.54,0.735,1.7,1.21,1.1,1.4,0.96,0.98,1.25,...,3.375,0.266,0.516,1.042,0.04,0.2,1.074,0.069,0.262,1.125
stats_for_length (x of gs),3.414,1.484,1.218,4.48,0.97,0.985,4.357,1.23,1.109,2.25,...,4.621,0.82,0.906,1.345,0.226,0.475,1.552,0.523,0.723,1.54


Results for experiment `disease_type=all failure-type-ignored=any-error for app=K Health` (29 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.309,0.057,0.24,0.184,0.036,0.191,0.23,0.032,0.18,0.385,...,2.028,0.115,0.34,0.655,0.14,0.374,0.701,0.11,0.332,0.676
stats_for_recall,0.828,0.143,0.378,0.69,0.214,0.463,0.862,0.119,0.345,0.69,...,2.69,0.064,0.253,0.828,0.143,0.378,0.931,0.064,0.253,0.897
stats_for_f1-score,0.45,,,0.291,,,0.363,,,0.494,...,2.312,,,0.731,,,0.8,,,0.771
stats_for_f2-score,0.62,,,0.445,,,0.556,,,0.596,...,2.524,,,0.786,,,0.874,,,0.841
stats_for_NDCG,0.765,0.147,0.383,0.575,0.186,0.432,0.783,0.129,0.359,0.634,...,2.571,0.079,0.28,0.815,0.143,0.378,0.906,0.069,0.264,0.857
stats_for_M1,0.69,0.214,0.463,0.448,0.247,0.497,0.69,0.214,0.463,0.552,...,2.379,0.2,0.447,0.793,0.164,0.405,0.862,0.119,0.345,0.793
stats_for_M3,0.793,0.164,0.405,0.586,0.243,0.493,0.793,0.164,0.405,0.69,...,2.69,0.064,0.253,0.828,0.143,0.378,0.931,0.064,0.253,0.897
stats_for_M5,0.828,0.143,0.378,0.69,0.214,0.463,0.862,0.119,0.345,0.69,...,2.69,0.064,0.253,0.828,0.143,0.378,0.931,0.064,0.253,0.897
stats_for_position,1.292,0.54,0.735,1.7,1.21,1.1,1.4,0.96,0.98,1.25,...,3.375,0.266,0.516,1.042,0.04,0.2,1.074,0.069,0.262,1.125
stats_for_length (x of gs),3.414,1.484,1.218,4.48,0.97,0.985,4.357,1.23,1.109,2.25,...,4.621,0.82,0.906,1.345,0.226,0.475,1.552,0.523,0.723,1.54


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=Mediktor` (41 cases):


,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
metric,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,
stats_for_precision,0.276,0.048,0.219,0.162,0.03,0.174,0.197,0.028,0.168,0.285,...,2.04,0.108,0.329,0.659,0.176,0.42,0.655,0.122,0.349,0.68
stats_for_recall,0.805,0.157,0.396,0.634,0.232,0.482,0.78,0.171,0.414,0.561,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_f1-score,0.411,,,0.258,,,0.315,,,0.378,...,2.286,,,0.704,,,0.759,,,0.762
stats_for_f2-score,0.582,,,0.401,,,0.49,,,0.47,...,2.468,,,0.734,,,0.839,,,0.823
stats_for_NDCG,0.717,0.156,0.395,0.52,0.191,0.437,0.667,0.164,0.405,0.494,...,2.489,0.064,0.253,0.747,0.183,0.428,0.875,0.092,0.303,0.83
stats_for_M1,0.61,0.238,0.488,0.39,0.238,0.488,0.537,0.249,0.499,0.415,...,2.293,0.196,0.443,0.732,0.196,0.443,0.829,0.142,0.376,0.764
stats_for_M3,0.756,0.184,0.429,0.561,0.246,0.496,0.707,0.207,0.455,0.561,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_M5,0.805,0.157,0.396,0.634,0.232,0.482,0.78,0.171,0.414,0.561,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_position,1.424,0.729,0.854,1.731,1.12,1.058,1.625,1.297,1.139,1.435,...,3.369,0.242,0.492,1.032,0.031,0.177,1.081,0.075,0.273,1.123
stats_for_length (x of gs),3.634,1.403,1.184,4.571,0.759,0.871,4.526,0.986,0.993,2.306,...,4.609,0.817,0.904,1.268,0.343,0.585,1.707,0.792,0.89,1.536


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=Mediktor`, which has 44 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.284,0.046,0.214,0.166,0.029,0.169,0.198,0.027,0.163,0.303,...,2.018,0.106,0.326,0.648,0.166,0.407,0.664,0.121,0.348,0.673
stats_for_recall,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_f1-score,0.422,,,0.265,,,0.317,,,0.401,...,2.284,,,0.705,,,0.767,,,0.761
stats_for_f2-score,0.594,,,0.413,,,0.496,,,0.497,...,2.483,,,0.744,,,0.847,,,0.828
stats_for_NDCG,0.736,0.151,0.388,0.553,0.192,0.439,0.69,0.16,0.4,0.52,...,2.516,0.061,0.246,0.764,0.175,0.418,0.876,0.088,0.296,0.839
stats_for_M1,0.636,0.231,0.481,0.432,0.245,0.495,0.568,0.245,0.495,0.432,...,2.318,0.188,0.433,0.75,0.188,0.433,0.818,0.149,0.386,0.773
stats_for_M3,0.773,0.176,0.419,0.591,0.242,0.492,0.727,0.198,0.445,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_M5,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_position,1.389,0.682,0.826,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.479,1.029,0.029,0.169,1.1,0.09,0.3,1.122
stats_for_length (x of gs),3.568,1.382,1.175,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.705,0.808,0.899,1.318,0.353,0.594,1.705,0.799,0.894,1.568


Results for experiment `disease_type=all failure-type-ignored=any-error for app=Mediktor`, which has 41 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.276,0.048,0.219,0.162,0.03,0.174,0.197,0.028,0.168,0.285,...,2.04,0.108,0.329,0.659,0.176,0.42,0.655,0.122,0.349,0.68
stats_for_recall,0.805,0.157,0.396,0.634,0.232,0.482,0.78,0.171,0.414,0.561,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_f1-score,0.411,,,0.258,,,0.315,,,0.378,...,2.286,,,0.704,,,0.759,,,0.762
stats_for_f2-score,0.582,,,0.401,,,0.49,,,0.47,...,2.468,,,0.734,,,0.839,,,0.823
stats_for_NDCG,0.717,0.156,0.395,0.52,0.191,0.437,0.667,0.164,0.405,0.494,...,2.489,0.064,0.253,0.747,0.183,0.428,0.875,0.092,0.303,0.83
stats_for_M1,0.61,0.238,0.488,0.39,0.238,0.488,0.537,0.249,0.499,0.415,...,2.293,0.196,0.443,0.732,0.196,0.443,0.829,0.142,0.376,0.764
stats_for_M3,0.756,0.184,0.429,0.561,0.246,0.496,0.707,0.207,0.455,0.561,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_M5,0.805,0.157,0.396,0.634,0.232,0.482,0.78,0.171,0.414,0.561,...,2.609,0.046,0.215,0.756,0.184,0.429,0.902,0.088,0.297,0.87
stats_for_position,1.424,0.729,0.854,1.731,1.12,1.058,1.625,1.297,1.139,1.435,...,3.369,0.242,0.492,1.032,0.031,0.177,1.081,0.075,0.273,1.123
stats_for_length (x of gs),3.634,1.403,1.184,4.571,0.759,0.871,4.526,0.986,0.993,2.306,...,4.609,0.817,0.904,1.268,0.343,0.585,1.707,0.792,0.89,1.536


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=Symptomate`, which has 43 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.286,0.047,0.216,0.17,0.029,0.169,0.203,0.026,0.162,0.31,...,1.996,0.107,0.327,0.64,0.167,0.408,0.656,0.121,0.348,0.665
stats_for_recall,0.814,0.151,0.389,0.674,0.22,0.469,0.814,0.151,0.389,0.605,...,2.627,0.044,0.211,0.767,0.178,0.422,0.907,0.084,0.29,0.876
stats_for_f1-score,0.423,,,0.272,,,0.325,,,0.41,...,2.266,,,0.698,,,0.761,,,0.755
stats_for_f2-score,0.594,,,0.423,,,0.508,,,0.508,...,2.47,,,0.738,,,0.843,,,0.823
stats_for_NDCG,0.742,0.153,0.391,0.565,0.19,0.435,0.706,0.152,0.39,0.532,...,2.505,0.062,0.248,0.759,0.178,0.421,0.873,0.089,0.299,0.835
stats_for_M1,0.651,0.227,0.477,0.442,0.247,0.497,0.581,0.243,0.493,0.442,...,2.302,0.19,0.436,0.744,0.19,0.436,0.814,0.151,0.389,0.767
stats_for_M3,0.767,0.178,0.422,0.605,0.239,0.489,0.744,0.19,0.436,0.605,...,2.627,0.044,0.211,0.767,0.178,0.422,0.907,0.084,0.29,0.876
stats_for_M5,0.814,0.151,0.389,0.674,0.22,0.469,0.814,0.151,0.389,0.605,...,2.627,0.044,0.211,0.767,0.178,0.422,0.907,0.084,0.29,0.876
stats_for_position,1.343,0.625,0.791,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.377,0.233,0.483,1.03,0.029,0.171,1.103,0.092,0.303,1.126
stats_for_length (x of gs),3.535,1.365,1.168,4.579,0.717,0.847,4.537,0.932,0.965,2.342,...,4.745,0.816,0.903,1.326,0.359,0.599,1.721,0.806,0.898,1.582


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=Symptomate`, which has 42 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.293,0.046,0.214,0.174,0.029,0.169,0.208,0.026,0.161,0.317,...,2.064,0.104,0.323,0.667,0.163,0.403,0.687,0.113,0.336,0.688
stats_for_recall,0.833,0.139,0.373,0.69,0.214,0.462,0.833,0.139,0.373,0.619,...,2.667,0.045,0.213,0.786,0.168,0.41,0.929,0.066,0.258,0.889
stats_for_f1-score,0.434,,,0.278,,,0.333,,,0.419,...,2.325,,,0.722,,,0.79,,,0.775
stats_for_f2-score,0.609,,,0.433,,,0.52,,,0.52,...,2.518,,,0.759,,,0.868,,,0.839
stats_for_NDCG,0.747,0.143,0.378,0.579,0.186,0.432,0.723,0.144,0.379,0.545,...,2.549,0.062,0.248,0.777,0.168,0.41,0.893,0.073,0.27,0.85
stats_for_M1,0.643,0.23,0.479,0.452,0.248,0.498,0.595,0.241,0.491,0.452,...,2.357,0.181,0.426,0.762,0.181,0.426,0.833,0.139,0.373,0.786
stats_for_M3,0.786,0.168,0.41,0.619,0.236,0.486,0.762,0.181,0.426,0.619,...,2.667,0.045,0.213,0.786,0.168,0.41,0.929,0.066,0.258,0.889
stats_for_M5,0.833,0.139,0.373,0.69,0.214,0.462,0.833,0.139,0.373,0.619,...,2.667,0.045,0.213,0.786,0.168,0.41,0.929,0.066,0.258,0.889
stats_for_position,1.4,0.697,0.835,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.358,0.224,0.474,1.03,0.029,0.171,1.103,0.092,0.303,1.119
stats_for_length (x of gs),3.5,1.345,1.16,4.556,0.747,0.864,4.513,0.968,0.984,2.308,...,4.548,0.706,0.84,1.238,0.181,0.426,1.667,0.794,0.891,1.516


Results for experiment `disease_type=all failure-type-ignored=any-error for app=Symptomate`, which has 41 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.295,0.047,0.216,0.178,0.029,0.169,0.213,0.025,0.159,0.325,...,2.042,0.105,0.324,0.659,0.164,0.405,0.68,0.114,0.337,0.681
stats_for_recall,0.829,0.142,0.376,0.707,0.207,0.455,0.854,0.125,0.353,0.634,...,2.658,0.046,0.215,0.78,0.171,0.414,0.927,0.068,0.26,0.886
stats_for_f1-score,0.435,,,0.284,,,0.341,,,0.43,...,2.307,,,0.714,,,0.785,,,0.769
stats_for_f2-score,0.609,,,0.443,,,0.533,,,0.533,...,2.504,,,0.752,,,0.864,,,0.835
stats_for_NDCG,0.753,0.145,0.381,0.593,0.182,0.427,0.74,0.134,0.366,0.558,...,2.538,0.063,0.251,0.771,0.171,0.413,0.891,0.075,0.273,0.846
stats_for_M1,0.659,0.225,0.474,0.463,0.249,0.499,0.61,0.238,0.488,0.463,...,2.341,0.184,0.429,0.756,0.184,0.429,0.829,0.142,0.376,0.78
stats_for_M3,0.78,0.171,0.414,0.634,0.232,0.482,0.78,0.171,0.414,0.634,...,2.658,0.046,0.215,0.78,0.171,0.414,0.927,0.068,0.26,0.886
stats_for_M5,0.829,0.142,0.376,0.707,0.207,0.455,0.854,0.125,0.353,0.634,...,2.658,0.046,0.215,0.78,0.171,0.414,0.927,0.068,0.26,0.886
stats_for_position,1.353,0.64,0.8,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.478,1.031,0.03,0.174,1.105,0.094,0.307,1.122
stats_for_length (x of gs),3.463,1.322,1.15,4.556,0.747,0.864,4.513,0.968,0.984,2.342,...,4.586,0.713,0.844,1.244,0.184,0.429,1.683,0.802,0.895,1.529


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=WebMD`, which has 44 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.284,0.046,0.214,0.166,0.029,0.169,0.198,0.027,0.163,0.303,...,2.018,0.106,0.326,0.648,0.166,0.407,0.664,0.121,0.348,0.673
stats_for_recall,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_f1-score,0.422,,,0.265,,,0.317,,,0.401,...,2.284,,,0.705,,,0.767,,,0.761
stats_for_f2-score,0.594,,,0.413,,,0.496,,,0.497,...,2.483,,,0.744,,,0.847,,,0.828
stats_for_NDCG,0.736,0.151,0.388,0.553,0.192,0.439,0.69,0.16,0.4,0.52,...,2.516,0.061,0.246,0.764,0.175,0.418,0.876,0.088,0.296,0.839
stats_for_M1,0.636,0.231,0.481,0.432,0.245,0.495,0.568,0.245,0.495,0.432,...,2.318,0.188,0.433,0.75,0.188,0.433,0.818,0.149,0.386,0.773
stats_for_M3,0.773,0.176,0.419,0.591,0.242,0.492,0.727,0.198,0.445,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_M5,0.818,0.149,0.386,0.659,0.225,0.474,0.795,0.163,0.403,0.591,...,2.637,0.043,0.208,0.773,0.176,0.419,0.909,0.083,0.287,0.879
stats_for_position,1.389,0.682,0.826,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.367,0.229,0.479,1.029,0.029,0.169,1.1,0.09,0.3,1.122
stats_for_length (x of gs),3.568,1.382,1.175,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.705,0.808,0.899,1.318,0.353,0.594,1.705,0.799,0.894,1.568


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=WebMD`, which has 43 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.279,0.046,0.214,0.164,0.029,0.17,0.197,0.027,0.164,0.31,...,2.03,0.107,0.327,0.663,0.16,0.399,0.667,0.123,0.351,0.677
stats_for_recall,0.814,0.151,0.389,0.651,0.227,0.477,0.791,0.165,0.407,0.605,...,2.651,0.044,0.211,0.791,0.165,0.407,0.907,0.084,0.29,0.884
stats_for_f1-score,0.416,,,0.262,,,0.315,,,0.41,...,2.297,,,0.721,,,0.769,,,0.766
stats_for_f2-score,0.588,,,0.408,,,0.493,,,0.508,...,2.497,,,0.762,,,0.846,,,0.832
stats_for_NDCG,0.738,0.154,0.392,0.551,0.197,0.443,0.691,0.163,0.404,0.532,...,2.528,0.062,0.248,0.782,0.165,0.406,0.873,0.089,0.299,0.843
stats_for_M1,0.651,0.227,0.477,0.442,0.247,0.497,0.581,0.243,0.493,0.442,...,2.325,0.19,0.436,0.767,0.178,0.422,0.814,0.151,0.389,0.775
stats_for_M3,0.767,0.178,0.422,0.581,0.243,0.493,0.721,0.201,0.449,0.605,...,2.651,0.044,0.211,0.791,0.165,0.407,0.907,0.084,0.29,0.884
stats_for_M5,0.814,0.151,0.389,0.651,0.227,0.477,0.791,0.165,0.407,0.605,...,2.651,0.044,0.211,0.791,0.165,0.407,0.907,0.084,0.29,0.884
stats_for_position,1.371,0.691,0.831,1.643,1.087,1.042,1.559,1.247,1.116,1.423,...,3.376,0.233,0.483,1.029,0.029,0.169,1.103,0.092,0.303,1.125
stats_for_length (x of gs),3.605,1.355,1.164,4.595,0.728,0.853,4.55,0.948,0.973,2.308,...,4.722,0.816,0.903,1.326,0.359,0.599,1.698,0.816,0.903,1.574


Results for experiment `disease_type=all failure-type-ignored=any-error for app=WebMD`, which has 43 cases, is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.279,0.046,0.214,0.164,0.029,0.17,0.197,0.027,0.164,0.31,...,2.03,0.107,0.327,0.663,0.16,0.399,0.667,0.123,0.351,0.677
stats_for_recall,0.814,0.151,0.389,0.651,0.227,0.477,0.791,0.165,0.407,0.605,...,2.651,0.044,0.211,0.791,0.165,0.407,0.907,0.084,0.29,0.884
stats_for_f1-score,0.416,,,0.262,,,0.315,,,0.41,...,2.297,,,0.721,,,0.769,,,0.766
stats_for_f2-score,0.588,,,0.408,,,0.493,,,0.508,...,2.497,,,0.762,,,0.846,,,0.832
stats_for_NDCG,0.738,0.154,0.392,0.551,0.197,0.443,0.691,0.163,0.404,0.532,...,2.528,0.062,0.248,0.782,0.165,0.406,0.873,0.089,0.299,0.843
stats_for_M1,0.651,0.227,0.477,0.442,0.247,0.497,0.581,0.243,0.493,0.442,...,2.325,0.19,0.436,0.767,0.178,0.422,0.814,0.151,0.389,0.775
stats_for_M3,0.767,0.178,0.422,0.581,0.243,0.493,0.721,0.201,0.449,0.605,...,2.651,0.044,0.211,0.791,0.165,0.407,0.907,0.084,0.29,0.884
stats_for_M5,0.814,0.151,0.389,0.651,0.227,0.477,0.791,0.165,0.407,0.605,...,2.651,0.044,0.211,0.791,0.165,0.407,0.907,0.084,0.29,0.884
stats_for_position,1.371,0.691,0.831,1.643,1.087,1.042,1.559,1.247,1.116,1.423,...,3.376,0.233,0.483,1.029,0.029,0.169,1.103,0.092,0.303,1.125
stats_for_length (x of gs),3.605,1.355,1.164,4.595,0.728,0.853,4.55,0.948,0.973,2.308,...,4.722,0.816,0.903,1.326,0.359,0.599,1.698,0.816,0.903,1.574


Next, we combine the per-app results. For each app, we take that app's own column from the experiment that ignores its failed cases and assemble these columns into a single table. The goal is to compare the apps using the best results obtained for each one.

In [15]:
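# Each new "best" experiment starts as a copy of the first result table;
# the per-app columns are then overwritten in the loop below.
bestResults = defaultdict(lambda : list(results.values())[0].copy())

# NOTE: all three "best" experiments read from the any-error experiment of
# each app, so the three tables displayed afterwards are identical.
for app in caseClassification['apps']:
    try:
        bestResults[getExpName(DiseaseType.ALL, FailureType.ANY, "best")][app] = results[
            getExpName(DiseaseType.ALL, FailureType.ANY, app)][app]
        bestResults[getExpName(DiseaseType.ALL, FailureType.SES_FAIL, "best")][app] = results[
            getExpName(DiseaseType.ALL, FailureType.ANY, app)][app]
        bestResults[getExpName(DiseaseType.ALL, FailureType.NO_DDX, "best")][app] = results[
            getExpName(DiseaseType.ALL, FailureType.ANY, app)][app]
    except Exception:
        # Show which experiment could not be combined, then re-raise with the original traceback
        display(getExpName(DiseaseType.ALL, FailureType.ANY, app))
        display(results[getExpName(DiseaseType.ALL, FailureType.ANY, app)])
        raise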
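
The combination step can be sketched on toy data. This is a minimal illustration, not the notebook's implementation: it uses plain nested dicts instead of DataFrames, and `exp_name`, `toy_results`, `combine_best`, and the app names `AppA` and `AppB` are hypothetical stand-ins.

```python
from collections import defaultdict
from copy import deepcopy


def exp_name(app):
    """Hypothetical, simplified stand-in for getExpName (any-error case only)."""
    return f"failure-type-ignored=any-error for app={app}"


# Hypothetical results: experiment name -> app -> metric -> value.
toy_results = {
    exp_name("AppA"): {"AppA": {"recall": 0.9}, "AppB": {"recall": 0.6}},
    exp_name("AppB"): {"AppA": {"recall": 0.8}, "AppB": {"recall": 0.7}},
}


def combine_best(results, apps):
    """For each app, keep its own entry from the experiment that ignores its failures."""
    first = next(iter(results.values()))
    # A new key starts as a copy of the first table, as in the cell above.
    best = defaultdict(lambda: deepcopy(first))
    for app in apps:
        best["best"][app] = deepcopy(results[exp_name(app)][app])
    return dict(best)


combined = combine_best(toy_results, ["AppA", "AppB"])
print(combined["best"])
```

Each app ends up with the value from its own experiment (0.9 for `AppA`, 0.7 for `AppB`), and the copies leave the source tables untouched.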

In [16]:
displayResults(bestResults,printNumCases=False)

Results for experiment `disease_type=all failure-type-ignored=any-error for app=best` is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.192,0.028,0.167,0.213,0.025,0.159,0.342,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.763,0.181,0.425,0.854,0.125,0.353,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.307,,,0.341,,,0.452,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.478,,,0.533,,,0.56,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.64,0.167,0.408,0.74,0.134,0.366,0.587,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.5,0.25,0.5,0.61,0.238,0.488,0.487,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.684,0.216,0.465,0.78,0.171,0.414,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.763,0.181,0.425,0.854,0.125,0.353,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=all failure-type-ignored=session-failed for app=best` is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.192,0.028,0.167,0.213,0.025,0.159,0.342,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.763,0.181,0.425,0.854,0.125,0.353,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.307,,,0.341,,,0.452,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.478,,,0.533,,,0.56,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.64,0.167,0.408,0.74,0.134,0.366,0.587,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.5,0.25,0.5,0.61,0.238,0.488,0.487,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.684,0.216,0.465,0.78,0.171,0.414,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.763,0.181,0.425,0.854,0.125,0.353,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505


Results for experiment `disease_type=all failure-type-ignored=no-disease-found for app=best` is


Unnamed: 0_level_0,Ada,Ada,Ada,Avey,Avey,Avey,Avey v2,Avey v2,Avey v2,Buoy,...,doctor_MA,doctor_MA,doctor_MA,doctor_NJ,doctor_NJ,doctor_NJ,doctor_TH,doctor_TH,doctor_TH,average_doctor
Unnamed: 0_level_1,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,average,...,average,variance,std Dev,average,variance,std Dev,average,variance,std Dev,Unnamed: 21_level_1
stats_for_precision,0.291,0.052,0.227,0.192,0.028,0.167,0.213,0.025,0.159,0.342,...,2.074,0.095,0.308,0.649,0.174,0.417,0.695,0.115,0.34,0.691
stats_for_recall,0.811,0.153,0.392,0.763,0.181,0.425,0.854,0.125,0.353,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_f1-score,0.428,,,0.307,,,0.341,,,0.452,...,2.324,,,0.699,,,0.791,,,0.775
stats_for_f2-score,0.597,,,0.478,,,0.533,,,0.56,...,2.508,,,0.733,,,0.863,,,0.836
stats_for_NDCG,0.744,0.152,0.39,0.64,0.167,0.408,0.74,0.134,0.366,0.587,...,2.526,0.047,0.218,0.747,0.183,0.428,0.889,0.08,0.282,0.842
stats_for_M1,0.649,0.228,0.477,0.5,0.25,0.5,0.61,0.238,0.488,0.487,...,2.325,0.184,0.429,0.73,0.197,0.444,0.838,0.136,0.369,0.775
stats_for_M3,0.811,0.153,0.392,0.684,0.216,0.465,0.78,0.171,0.414,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_M5,0.811,0.153,0.392,0.763,0.181,0.425,0.854,0.125,0.353,0.667,...,2.649,0.026,0.162,0.757,0.184,0.429,0.919,0.075,0.273,0.883
stats_for_position,1.267,0.329,0.573,1.655,1.054,1.026,1.571,1.216,1.103,1.423,...,3.374,0.243,0.493,1.036,0.034,0.186,1.088,0.08,0.284,1.125
stats_for_length (x of gs),3.514,1.439,1.2,4.579,0.717,0.847,4.537,0.932,0.965,2.308,...,4.514,0.722,0.85,1.297,0.371,0.609,1.595,0.728,0.853,1.505
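
The f1-score and f2-score rows carry no variance or std Dev. The reported values are consistent with applying the standard F-beta formula to the averaged precision and recall of each column; this is an inference from the numbers above, not something this section states. A minimal check, where `f_beta` is a hypothetical helper and the inputs are the Ada column of the combined table:

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; returns 0.0 when both inputs are zero."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Averaged precision and recall for Ada, taken from the combined table.
ada_precision, ada_recall = 0.291, 0.811

f1 = round(f_beta(ada_precision, ada_recall, beta=1), 3)
f2 = round(f_beta(ada_precision, ada_recall, beta=2), 3)
print(f1, f2)  # 0.428 0.597
```

Both values match the f1-score and f2-score reported for Ada in that table.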
