# Unsupervised Machine Scoring of Free Response Answers—Validated Against Law School Final Exams

This notebook implements the methods described in [Unsupervised Machine Scoring of Free Response Answers—Validated Against Law School Final Exams](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4040303), presented at the [Computational Legal Studies Conference](https://cclaw.smu.edu.sg/events/computational-legal-studies-2022), March 2022, hosted by the Center for Computational Law at Singapore Management University. Here is a link to the presentation's [slide deck](https://docs.google.com/presentation/d/15fxG3zoZSdUfmxFuVQZd2_MrDe1Z1rcsXwBDIvFIy3U/edit?usp=sharing), including image credits. 

Some of the data presented here differ slightly from that found in the version presented at the CLS Conference and in the slide deck above. This is due to additional work done in response to feedback received after presenting the paper. They now show the difference in performance between pseudo-random and machine ordering _after_ both machine and human markings are converted into z-scores. The translation of both scores into z-scores allowed the machine score to be compared with the human score using intraclass correlation ICC and Cohen's kappa. The CLS presentation compared the machine scores only to pseudo-random scores without translating the human scoring and after the machine score was transformed into a numerical score based on a standard grading scale (e.g., 90, 80, etc.). Older versions of this notebook with prior results can be found [here](https://github.com/colarusso/free-response-scoring/commits/main/Score%20Exams.ipynb).

**Paper Summary**

> This paper presents a novel method for unsupervised machine scoring of short answer and essay question responses, relying solely on a sufficiently large set of responses to a common prompt, absent the need for pre-labeled sample answers—given said prompt is of a particular character. That is, for questions where “good” answers look similar, “wrong” answers are likely to be “wrong” in different ways. Consequently, when a collection of text embeddings for responses to a common prompt are placed in an appropriate feature space, the centroid of their placements can stand in for a model answer, providing a lodestar against which to measure individual responses. This paper examines the efficacy of this method and discusses potential applications.
>
>Current methods for the automated scoring of short answer and essay questions are poorly suited to spontaneous and idiosyncratic assessments. That is, the time saved in grading must be balanced against the time required for the training of a model. This includes tasks such as the creation of pre-labeled sample answers. This limits the utility of machine grading for single classes working with novel assessments. The method described here eliminates the need for the preparation of pre-labeled sample answers. It is the author’s hope that such a method may be leveraged to reduce the time needed to grade free response questions, promoting the increased adoption of formative assessment esp. in contexts like law school instruction which traditionally have relied almost exclusively on summative assessments.
>
>Ranking by the algorithm is found to be statistically significant when compared to a pseudo-random shuffle. To determine how similar a list’s order was to that produced by a human grader, the lowest number of neighbor swaps needed to transform the ordering of these lists into that of the human ordering was calculated. For a dataset including more than one thousand student answers to a set of thirteen free response questions, drawn from six Suffolk University Law School final exams, taught by five instructors, the p-value for a paired t-test of the two populations’ swaps, with the pseudo-random group acting as the untreated group and the machine-grader acting as the treatment, came to 0.000000334, allowing us to reject the null hypothesis that the machine's ordering is equivalent to a random shuffle. Additionally, the Cohen’s d for the number of swaps between the pseudo-random ordering and machine ordering was found to be large (i.e., 1.03).

This notebook was used to obtain the results referenced above. Student exam answers and their associated grade data were acquired and this research conducted after approval was granted by Suffolk University’s Office of Research and Sponsored Programs (ORSP) which oversees all human subject research at Suffolk University.

This notebook implements the primary steps of the method described in the paper, namely:  

1. Produce an embedding for each answer that captures as much of the relevant information as possible.
2. Find the centroid for all of the embeddings in your population of answers and impute the location of a  “correct” model answer.
3. Measure the distance between each answer’s embedding and the “correct” answer (e.g., the answers’ centroid or medoid).
4. Convert the answers’ distances from the model answer into z-scores for the population of answers.
5. Translate these z-scores into some known grading scale.
6. Order the answers according to this scale.
7. Compare these orderings to random orderings by seeing how many times you have to change their rankings to obtain the same ordering as one would get if they were ordered by their human-assigned grades.

_Note: The methods described here are the subject of USPTO Patent Application Serial No. 17/246,563._

## Contents

_These, and other internal links (e.g., "back to contents"), may not work if you are viewing the preview of this notebook on GitHub.com. Links to other notebooks, external sites, and exam questions should, however, be okay._

This notebook has three sections:

- [Data: Student Exam Answers](#Data:-Student-Exam-Answers)
- [Code:Code: Libraries and Novel Functions](#Code:-Libraries-and-Novel-Functions)
- [Results](#Results)


## Data: Student Exam Answers

More than one thousand exam answers were obtained as PDF files containing answers to thirteen free response questions, drawn from six Suffolk University Law School final exams, taught by five instructors. Each PDF corresponded to a single student exam. These PDFs were parsed to extract their answers, their contents converted into XML of the following format.

```
<EXAM>
    <STUDENT id='ID'>00000000</STUDENT>
    <QUESTION id='Q1'>
        <![CDATA[
            text of written answer to question one
        ]]>
    </QUESTION>
    <QUESTION id='Q2'>
        <![CDATA[
            text of written answer to question two
        ]]>
    </QUESTION>
    <QUESTION id='Q3'>
        <![CDATA[
            text of written answer to question three
        ]]>
    </QUESTION>
</EXAM>
```

These translated files were reviewed by hand and reformatted as needed to correct any formatting errors. Each XML file was then read into a csv file for its associated exam as a single row with their columns corresponding to each of its elements (e.g., ID, Q1, Q2). Additionally, a column stating the number of words contained in each question was appended to the csv file (e.g., size_Q1, size_Q2). E.g., 

|ID|Q1|Q2|size_Q1|size_Q2|
|--|--|--|-------|-------|
|00001|text of 1's ans to q1|text of 1's ans to q2|6|6|
|00002|text of 2's ans to q1|text of 2's ans to q2|6|6|

Instructors also provided scores for each exam question. These were placed in a csv for each exam with the scores on each question associated to the exam ID. E.g., 

|ID|Q1|Q2|
|--|--|--|
|00001|96|93|
|00002|83|89|

A PDF did not extist for every exam as some exams were hand-written. Also, for the _Property Instructor A_ exam, the IDs of two PDFs did not macth any of the IDs provided in the score sheet. To address such issues, you will see that we merge the two data tables above using the default behavior of Panda's merge method (i.e., we kept only the intersection of ID, those found in both files). Consequently, we did not score the exams for which no PDF existed or those for which we could not find a score on the intructor-provided list of scores.

To see the code used to assist in this processing see the following notebook: [Prep Exams](Prep%20Exams.ipynb).

In keeping with the wishes of those instructors who provided exams, only three of the exams are available here for review (i.e., [property_instructor_A](https://colarusso.github.io/free-response-scoring/data/property_instructor_A/property_instructor_A.docx), [property_instructor_B](https://colarusso.github.io/free-response-scoring/data/property_instructor_B/property_instructor_B.docx), and [crim_instructor_E](https://colarusso.github.io/free-response-scoring/data/crim_instructor_E/crim_instructor_E.docx)). Another two are on file with the author and may be shared upon request and the assent of their authors. In keeping with the author's wishes, the remaining exam will not be shared. Subject to constraints imposed by the Family Educational Rights and Privacy Act (FERPA), at least three of the answer sets may be shared upon request. One of these is linked to an exam requiring instructor assent to be shared. I am in the process of reviewing and cleaning these data for PII. You can see those data that have been cleaned and shared without need for instructor assent [here](https://github.com/colarusso/free-response-scoring/tree/main/data) and track my progress [here](https://twitter.com/Colarusso/status/1494781654237425666?s=20&t=g0Zt5f3wDnVys6oH3nVSGQ).

Please note that the exam questiones shared at the links above are docx files and have been redacted to exclude the instructor's name and text that does not include the scored question prompts (e.g., multiple choice questions). 

To facilitate retrieval of the exam answers which were stored in various folders, a list of dictionaries is defined. Each dictionary defines the folder name where csv files can be found as well as the names of its relevant columns for later consideration (i.e., their ID and those questions to be scored). To avoid sharing of this data publicly, as seen below, these files are not included in this repository and were located outside of this repository's folder (i.e., `../data/`). 

In [1]:
exams = [
            {"folder":"../data/property_instructor_A","columns":["ID","SHORT_ANS","Q1","Q2"]},
            {"folder":"../data/property_instructor_B","columns":["ID","Q1","Q2"]},
            {"folder":"../data/environ_instructor_B","columns":["ID","Q2"]},
            {"folder":"../data/PR_instructor_C","columns":["ID","Q1","Q2"]},
            {"folder":"../data/contracts_instructor_D","columns":["ID","Q1","Q2","Q3"]},
            {"folder":"../data/crim_instructor_E","columns":["ID","Q1","Q2"]}
          ]

[back to contents](#Contents)

## Code: Libraries and Novel Functions

### Python Libraries et al.

First we'll load the following libraries. These are needed to perform our scoring. 

In [2]:
import csv
import pandas as pd
import numpy as np
from numpy import var
from math import sqrt
import spacy
from scipy import stats
from sklearn.preprocessing import normalize
from sklearn.metrics import cohen_kappa_score
import pingouin as pg

# https://spacy.io/models/en#en_core_web_lg
nlp1 = spacy.load('en_core_web_lg') 
# https://github.com/explosion/spacy-transformers/tree/88814f5f4be7f0d4c784d8500c558d9ba06b9a56
nlp2 = spacy.load("en_trf_distilbertbaseuncased_lg") 
nlp3 = spacy.load("en_trf_robertabase_lg") 

In [3]:
def score_exams(exams,model='nlp1',normv=0,score=0,goal="centroid",runs=1):

    # list of swaps needed to move from machine-scored ordering to that of the graded exams
    M = []  
    # list of swaps needed to move from machine-scored ordering to that of the graded exams
    R = []

    # for every exam in the list of exams  
    for exam in exams:

        print("\n=========================================================")
        print(exam["folder"])
        print("=========================================================\n")

        # load this exam's answer texts
        texts = pd.read_csv("%s/texts_redacted.csv"%exam["folder"], encoding="utf-8").sample(frac=1)
        # load this exam's grades
        actual = pd.read_csv("%s/actual_redacted.csv"%exam["folder"], encoding="utf-8").sample(frac=1)
        
        # print the relevant columns for this exam (i.e., ID and questions to score)
        print(exam["columns"])        

        # score the questions listed above
        output_df = grade(texts[texts["ID"].isin(actual["ID"])][exam["columns"]],model,normv,score,goal)
        
        if score==2:
            # convert human grades to z scores if score==2
            print("Computing z scores for human scoring")
            actual = z_scores_from_grades(actual)
            # create a df of the format needed for ICC
            df_output_df = output_df.copy()
            df_output_df["judge"] = "Machine"
            df_actual = actual.copy()
            df_actual["judge"] = "Human"
            df_icc = pd.concat([df_actual,df_output_df], ignore_index=True)

        # below we want the average word counts for questions
        # so we need a df with only the questions actually used here
        # i.e., the inner merge. output_df doesn't have the texts columns
        # so we'll do a second merge here real quick to facilitate that count
        texts_df = actual.merge(texts, on="ID")

        # only include grades for those questions defined above
        actual = actual[[x for x in actual.columns if x in exam["columns"]]]

        # merge the machine-scored dataframe with that of the actual grades
        df_exam = actual.merge(output_df, on="ID")
        
        # display a preview of the two scores (first few rows of the merged dataframe)
        # I'm excluding the ID here to avoid sharing IDs in the cell's output
        #display(df_exam[[x for x in list(df_exam.columns.values) if (x not in ["ID"])]].head())
        display(df_exam.head())

        print("Number of entries:",len(df_exam))

        # for each of the questions that were scored
        for qn in [x for x in exam["columns"] if (x not in ["ID"])]:
            mean_words = int(round(texts_df["size_%s"%qn].mean()))
            print("\n%s\nMean Words:"%qn,mean_words)

            # create a dataframe with the grade and machine scores for this question
            # drop any nans
            df = df_exam[["ID", "%s_x"%qn, "%s_y"%qn]].dropna()
            
            if score==2:
                # get ICC for score==2 (i.e., when both scores are z scores)
                df_icc_tmp = df_icc.copy()
                df_icc_tmp["score"] = df_icc_tmp[qn]
                df_icc_tmp = df_icc_tmp[["ID","judge","score"]]
                display(df_icc_tmp.head()) 
                icc = pg.intraclass_corr(data=df_icc_tmp, targets='ID', raters='judge', ratings='score',nan_policy='omit')
                display(icc.set_index('Type'))
                
                # Kappa isn't meant for continuous scoring, so we're binning the z score 
                # multipling by ten and rounding the product. 
                kappa = cohen_kappa_score(round(df["%s_x"%qn]*10), round(df["%s_y"%qn]*10),weights="quadratic")
                print("Cohen's kappa for rounded(z scores x 10):",kappa)

            # create an list of these scores then count the swaps needed to bring them into 
            # agreement with the grades, averaged over `runs`. Append this to our list of 
            # swaps needed to move from machine-scored ordering to that of the graded exams. 
            # we're shuffling before sorting to address the fact that the original order of
            # rows can effect the swaps as the order of actual scores could be different
            # within the bins provided by the machine grade 
            tmp = []
            for x in range(runs):
                # shuffle df then sort answers by the machine scores
                # this is needed becaue the `actual` rows can effect the number 
                # of swaps, so we want to get a represnative sample over `runs`
                # note the use of `sample.` This is where we shuffle the base df
                df = df.sample(frac=1).sort_values(by='%s_y'%qn, ascending=True)
                
                arr_ = df[df.columns[1]].values.copy()
                n = len(arr_)
                M_o = countSwaps(arr_, n)
                tmp.append(M_o)

            M_o = np.array(tmp).mean()
            M.append(M_o)

            # we're going to do the same as the above but for pseudo-random shuffles.
            # we'll shuffle and count swaps for the number of times defined in `runs`
            # and average the results 
            tmp = []
            for x in range(runs):
                # note the use of `sample.` This is where we get our pseudo-random ordering
                arr_ = df[df.columns[1]].sample(frac=1).values.copy()
                n = len(arr_)
                R_o = countSwaps(arr_, n)
                tmp.append(R_o)

            R_o = np.array(tmp).mean()
            R.append(R_o)
            
            print("Algo score, swaps needed:",M_o)
            print("Swaps needed for first 10 out of %s pseudo-random runs:"%runs)
            print(tmp[:10])
            print("Pseudo-random, avergae swaps needed:",R_o)
            
    print("\n=========================================================")
    print("Swaps (Machine):")
    print(M)
    print("N:",len(M),"\tMean: ",np.mean(M),"\tVar: ",np.var(M))
    print("Avergae swaps (Pseudo-random):")
    print(R)
    print("N:",len(R),"\tMean: ",np.mean(R),"\tVar: ",np.var(R),"\n")
    
    
    # Print out a summary table of the percent difference between pseudo-random and machine swaps
    pdiff = []
    i = 0
    for x in R:
        if x != 0 and M[i] != 0:
            pdiff_ = 100*(x-M[i])/((x+M[i])/2)
            pdiff.append(pdiff_)
            print( i+1,")",x,">",M[i],"=>",pdiff_ )
        else:
            print("error")
        i = i + 1

    print("\nAvergae percent difference between swaps needed for pseudo-random and machine sorting:")
    print(np.array(pdiff).mean(),"\n")
        
    # run t test for the two populations of swaps
    sample_n = len(M)
    print(stats.ttest_rel(M[:sample_n], R[:sample_n]))
    # return the effect size
    print("Cohen's d: ",cohend(R[:sample_n], M[:sample_n]))

In [4]:
def grade(df,model,normv,score=0,goal="centroid"):
       
    df = df.copy()    
    
    # exclude the ID column to get only the answer texts
    questions = [x for x in list(df.columns.values) if (x not in ["ID"])]
    # vectorize the text of each answer
    for x in questions:
        df["%s_vec"%x] = df[x].apply(vectorize, args=(model,normv)) 
    
    # for each question
    for x in questions:
        
        # based on the `goal` parameter grade, against either the centroid or medoid
        # that is, set either as the model answer aginst which to grade others
        if goal=="centroid":
            # centroid 
            best_ans = df["%s_vec"%x].mean()
        else:
            # medoid 
            best_ans = df["%s_vec"%x].iloc[best_answer(df,x)]

        # measure each answer's distance from the model answer above
        df["%s_dist"%x] = df["%s_vec"%x].apply(distance, args=(best_ans,))
        
        # replace the text of the answer with its z-score
        df[x] =  df["%s_vec"%x].apply(z_score, args=(df,x,best_ans))
        
        # translate the z-score into a standard grade using
        # the translation defined by `score`
        if score==0:
            df[x] =  df[x].apply(score_0)
        elif score==1:
            df[x] =  df[x].apply(score_1)
        elif score==2:
            print("Leaving z scores in place")
            
    # create output df without the vectors or distance measures
    output_df = df[[ x for x in list(df.columns.values) if ("_vec" not in x) and ("_dist" not in x) ]].copy()
            
    return output_df

In [5]:
def vectorize(row,model='nlp1',normv=0): 

    # use a method for vectorization based on the `model` parameter
    # note: some methods can work across the entire text of a qustion
    # others requier vectorization at the sentence level
    if model=='nlp1':
        nlp = nlp1
        mode = 0
    elif model=='nlp2':
        nlp = nlp2
        mode = 1
    elif model=='nlp3':
        nlp = nlp3
        mode = 1
        
    try:
        if mode == 1:
            # vectorize at the sentence level
            doc = nlp(row)
            tmp_df = pd.DataFrame({'sent' : []})
            i=0 
            for sent in doc.sents:
                sent_vec = nlp(sent.string.strip()).vector
                # if normv=1 normalize each sentence vector
                if normv==1:
                    sent_vec = norm(sent_vec)
                if i==0:
                    tmp_df = pd.DataFrame([[sent_vec]],columns=["sent"])
                else:
                    tmp_df = tmp_df.append(pd.DataFrame([[sent_vec]],columns=["sent"]), ignore_index=True)
                i+=1
            # average sentence vectors 
            output = tmp_df["sent"].mean()
        else:
            # vectorize all input text together
            output = nlp(row).vector

        # if normv=1 normalize the answer's vector
        if normv==1:
            output = norm(output)  
    
        return output
    except:
        return np.NaN
    
def norm(row):
    # normalize vector
    matrix = row.reshape(1,-1).astype(np.float64)
    return normalize(matrix, axis=1, norm='l1')[0]

In [6]:
def best_answer(df,column):
    # Return the medoid (like the centroid, but confined to a member of the group) 
    # i.e., the best answer in the group of answers.
    tmp_df = df[[ "%s_vec"%column ]].copy()
    centroid = tmp_df["%s_vec"%column].mean()
    tmp_df["%s_dist"%column] = tmp_df["%s_vec"%column].apply(distance, args=(centroid,))
    tmp_df = tmp_df[tmp_df["%s_dist"%column]==tmp_df["%s_dist"%column].min()]
    return tmp_df.index[0]

def distance(row,centroid):
    # calculate the Euclidean distance between centroid and row
    # See https://stackoverflow.com/a/1401828
    return np.linalg.norm(centroid-row)

In [7]:
def z_score(row,df,x,best):
    # calculate the z-score for a given distance
    return -1*(np.linalg.norm(best-row)-df["%s_dist"%x].mean())/df["%s_dist"%x].std()

def z_scores_from_grades(df):
    questions = [x for x in list(df.columns.values) if (x not in ["ID"])]
    for x in questions:
        df[x] = df[x].apply(z_score_tradional, args=(df,x))
    return df

def z_score_tradional(row,df,x):
    # calculate the z-score for a given score
    return ((row-df[x].mean())/df[x].std())

In [8]:
# Define grade bin boarders based on z-score
# --------------------------------------
# There's no standard mapping between z-scores and letter grades
# However, it's common for schools to set their own curve which 
# is in the same spirit. Therefor I have created a mapping where
# the average answer recieves a B. Others might choose differently

grade_bins = [
              2,    # A+
              1.5,  # A
              1,    # A-
              0.5,  # B+
              0,    # B
              -0.5, # B-
              -1,   # C+
              -1.5, # C
              -2,   # C-
              -2.5, # D+
              -3,   # D
              -3.5  # D-
             ]

# Assign numeric grades based on z-score
# -----------------------------------------
# This was the original z-score to percentage grade translation.
# I took the above and translated it into numberic grades
# with the scale bottoming out at 59 and topping out at 97
def score_0(row):
    if row >= grade_bins[0]:
        grade = 97
    elif row >= grade_bins[1]:
        grade = 93+((row-grade_bins[1])/(grade_bins[0]-grade_bins[1]))*4
    elif row >= grade_bins[2]:
        grade = 90+((row-grade_bins[2])/(grade_bins[1]-grade_bins[2]))*3
    elif row >= grade_bins[3]:
        grade = 87+((row-grade_bins[3])/(grade_bins[2]-grade_bins[3]))*3
    elif row >= grade_bins[4]:
        grade = 83+((row-grade_bins[4])/(grade_bins[3]-grade_bins[4]))*4
    elif row >= grade_bins[5]:
        grade = 80+((row-grade_bins[5])/(grade_bins[4]-grade_bins[5]))*3
    elif row >= grade_bins[6]:
        grade = 77+((row-grade_bins[6])/(grade_bins[5]-grade_bins[6]))*3
    elif row >= grade_bins[7]:
        grade = 73+((row-grade_bins[7])/(grade_bins[6]-grade_bins[7]))*4
    elif row >= grade_bins[8]:
        grade = 70+((row-grade_bins[8])/(grade_bins[7]-grade_bins[8]))*3
    elif row >= grade_bins[9]:
        grade = 67+((row-grade_bins[9])/(grade_bins[8]-grade_bins[9]))*3
    elif row >= grade_bins[10]:
        grade = 63+((row-grade_bins[10])/(grade_bins[9]-grade_bins[10]))*4
    elif row >= grade_bins[11]:
        grade = 60+(row-grade_bins[11])*(3/(grade_bins[10]-grade_bins[11]))
    else:
        grade = 59
        
    # The final output is rounded to avoid false percision. A better 
    # approach is to bin all of the scores. This is done in score_1 below
    return round(grade)

# Assign numeric grades based on z-score
# -----------------------------------------
# Upon subsequent reflection, I realized that binned scores were preferable
# to a continuous score with dropoffs at 59 and 97. The above translation,
# however, was used on the first run of this code and so is included here
# for compleatness.  
def score_1(row):
    if row >= grade_bins[0]:
        grade = 97
    elif row >= grade_bins[1]:
        grade = 93
    elif row >= grade_bins[2]:
        grade = 90
    elif row >= grade_bins[3]:
        grade = 87
    elif row >= grade_bins[4]:
        grade = 83
    elif row >= grade_bins[5]:
        grade = 80
    elif row >= grade_bins[6]:
        grade = 77
    elif row >= grade_bins[7]:
        grade = 73
    elif row >= grade_bins[8]:
        grade = 70
    elif row >= grade_bins[9]:
        grade = 67
    elif row >= grade_bins[10]:
        grade = 63
    elif row >= grade_bins[11]:
        grade = 60
    else:
        grade = 59
    return grade

In [9]:
# The following functions are used to count the number of adjacent swaps needed
# to change one ordering into another. It was written by Shivam Gupta
# See https://www.geeksforgeeks.org/number-swaps-sort-adjacent-swapping-allowed/
# -----------------------------------------------------
# python 3 program to count number of swaps required 
# to sort an array when only swapping of adjacent 
# elements is allowed. 
# include <bits/stdc++.h> 

#This function merges two sorted arrays and returns inversion count in the arrays.*/ 
def merge(arr, temp, left, mid, right): 
    inv_count = 0

    i = left #i is index for left subarray*/ 
    j = mid #i is index for right subarray*/ 
    k = left #i is index for resultant merged subarray*/ 
    while ((i <= mid - 1) and (j <= right)): 
        if (arr[i] <= arr[j]): 
            temp[k] = arr[i] 
            k += 1
            i += 1
        else: 
            temp[k] = arr[j] 
            k += 1
            j += 1

            #this is tricky -- see above explanation/ 
            # diagram for merge()*/ 
            inv_count = inv_count + (mid - i) 

    #Copy the remaining elements of left subarray 
    # (if there are any) to temp*/ 
    while (i <= mid - 1): 
        temp[k] = arr[i] 
        k += 1
        i += 1

    #Copy the remaining elements of right subarray 
    # (if there are any) to temp*/ 
    while (j <= right): 
        temp[k] = arr[j] 
        k += 1
        j += 1

    # Copy back the merged elements to original array*/ 
    for i in range(left,right+1,1): 
        arr[i] = temp[i] 

    return inv_count 

#An auxiliary recursive function that sorts the input 
# array and returns the number of inversions in the 
# array. */ 
def _mergeSort(arr, temp, left, right): 
    inv_count = 0
    if (right > left): 
        # Divide the array into two parts and call 
        #_mergeSortAndCountInv() 
        # for each of the parts */ 
        mid = int((right + left)/2) 

        #Inversion count will be sum of inversions in 
        # left-part, right-part and number of inversions 
        # in merging */ 
        inv_count = _mergeSort(arr, temp, left, mid) 
        inv_count += _mergeSort(arr, temp, mid+1, right) 

        # Merge the two parts*/ 
        inv_count += merge(arr, temp, left, mid+1, right) 

    return inv_count 

#This function sorts the input array and returns the 
#number of inversions in the array */ 
def countSwaps(arr, n): 
    temp = [0 for i in range(n)] 
    return _mergeSort(arr, temp, 0, n - 1) 

In [10]:
# The following function calculates Cohen d (the effect size)
# based on two populations. It was written by Jason Brownlee
# See https://machinelearningmastery.com/effect-size-measures-in-python/
# function to calculate Cohen's d for independent samples
def cohend(d1, d2):
    # calculate the size of samples
    n1, n2 = len(d1), len(d2)
    # calculate the variance of the samples
    s1, s2 = np.var(d1, ddof=1), np.var(d2, ddof=1)
    # calculate the pooled standard deviation
    s = np.sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    # calculate the means of the samples
    u1, u2 = np.mean(d1), np.mean(d2)
    # calculate the effect size
    return (u1 - u2) / s

[back to contents](#Contents)

# Results

A subset of the data presented in the cell below can be found in Table 1 of _Unsupervised Machine Scoring of Free Response Answers—Validated Against Law School Final Exams_. 

In [11]:
score_exams(exams,model='nlp1',normv=1,score=2,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,f346f20d-4883-4fea-af93-f716f6e99274,1.521905,-0.147495,-0.386848,1.043495,0.453994,0.028075
1,ed56f2bc-52d8-4323-8f41-2cb80548a1dc,-0.159269,0.264623,0.650466,-0.487243,-1.628164,-0.336026
2,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,-0.999856,-1.48688,-0.535036,0.636763,0.584401,0.496988
3,7b622388-4835-41b0-916e-bb4ca373d6c4,1.802101,0.264623,1.539592,0.835584,0.177698,0.284655
4,dcef7f09-f784-48cb-b65e-59d0293f4195,0.961514,0.676742,2.576906,0.462191,0.062287,0.393672


Number of entries: 81

SHORT_ANS
Mean Words: 436


Unnamed: 0,ID,judge,score
0,f346f20d-4883-4fea-af93-f716f6e99274,Human,1.521905
1,065476cf-e25b-4039-a5b0-69bfd27e61dd,Human,-0.999856
2,ed56f2bc-52d8-4323-8f41-2cb80548a1dc,Human,-0.159269
3,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,Human,-0.999856
4,7b622388-4835-41b0-916e-bb4ca373d6c4,Human,1.802101


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.426259,2.485892,80,81,3e-05,"[0.23, 0.59]"
ICC2,Single random raters,0.42424,2.455869,80,80,4.1e-05,"[0.23, 0.59]"
ICC3,Single fixed raters,0.421274,2.455869,80,80,4.1e-05,"[0.22, 0.58]"
ICC1k,Average raters absolute,0.59773,2.485892,80,81,3e-05,"[0.38, 0.74]"
ICC2k,Average random raters,0.595742,2.455869,80,80,4.1e-05,"[0.37, 0.74]"
ICC3k,Average fixed raters,0.592812,2.455869,80,80,4.1e-05,"[0.37, 0.74]"


Cohen's kappa for rounded(z scores x 10): 0.446784149383303
Algo score, swaps needed: 998.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1624, 1536, 1402, 1685, 1434, 1744, 1401, 1249, 1551, 1652]
Pseudo-random, avergae swaps needed: 1499.68528

Q1
Mean Words: 2048


Unnamed: 0,ID,judge,score
0,f346f20d-4883-4fea-af93-f716f6e99274,Human,-0.147495
1,065476cf-e25b-4039-a5b0-69bfd27e61dd,Human,-0.250525
2,ed56f2bc-52d8-4323-8f41-2cb80548a1dc,Human,0.264623
3,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,Human,-1.48688
4,7b622388-4835-41b0-916e-bb4ca373d6c4,Human,0.264623


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.506941,3.056311,80,81,5.281393e-07,"[0.33, 0.65]"
ICC2,Single random raters,0.505419,3.018623,80,80,7.604948e-07,"[0.32, 0.65]"
ICC3,Single fixed raters,0.502317,3.018623,80,80,7.604948e-07,"[0.32, 0.65]"
ICC1k,Average raters absolute,0.672808,3.056311,80,81,5.281393e-07,"[0.49, 0.79]"
ICC2k,Average random raters,0.671466,3.018623,80,80,7.604948e-07,"[0.49, 0.79]"
ICC3k,Average fixed raters,0.668723,3.018623,80,80,7.604948e-07,"[0.48, 0.79]"


Cohen's kappa for rounded(z scores x 10): 0.48698613796947476
Algo score, swaps needed: 1054.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1651, 1566, 1641, 1851, 1555, 1530, 1517, 1582, 1325, 1377]
Pseudo-random, avergae swaps needed: 1566.90241

Q2
Mean Words: 947


Unnamed: 0,ID,judge,score
0,f346f20d-4883-4fea-af93-f716f6e99274,Human,-0.386848
1,065476cf-e25b-4039-a5b0-69bfd27e61dd,Human,-0.386848
2,ed56f2bc-52d8-4323-8f41-2cb80548a1dc,Human,0.650466
3,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,Human,-0.535036
4,7b622388-4835-41b0-916e-bb4ca373d6c4,Human,1.539592


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.418784,2.441062,80,81,4.2e-05,"[0.22, 0.58]"
ICC2,Single random raters,0.41699,2.415478,80,80,5.4e-05,"[0.22, 0.58]"
ICC3,Single fixed raters,0.41443,2.415478,80,80,5.4e-05,"[0.22, 0.58]"
ICC1k,Average raters absolute,0.590342,2.441062,80,81,4.2e-05,"[0.36, 0.74]"
ICC2k,Average random raters,0.588557,2.415478,80,80,5.4e-05,"[0.36, 0.74]"
ICC3k,Average fixed raters,0.586003,2.415478,80,80,5.4e-05,"[0.36, 0.73]"


Cohen's kappa for rounded(z scores x 10): 0.457167253080513
Algo score, swaps needed: 930.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1460, 1716, 1458, 1704, 1473, 1771, 1657, 1618, 1575, 1632]
Pseudo-random, avergae swaps needed: 1560.20248

../data/property_instructor_B

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,972e8032-a55b-4c9a-a2ed-9913f124d716,0.196855,0.056514,1.228718,0.572495
1,0cd9b843-be11-4595-847f-4b89f6644253,0.314968,0.957513,-0.040083,1.398819
2,9db4494e-5ddd-4a7d-b3e9-031a33003a02,1.259871,1.858512,1.34842,1.511824
3,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,-1.811065,-1.144818,-0.192882,-0.551673
4,43da532e-504a-485a-8559-92b640834f06,-0.511823,0.056514,-0.365385,-1.464528


Number of entries: 88

Q1
Mean Words: 1431


Unnamed: 0,ID,judge,score
0,972e8032-a55b-4c9a-a2ed-9913f124d716,Human,0.196855
1,0cd9b843-be11-4595-847f-4b89f6644253,Human,0.314968
2,9db4494e-5ddd-4a7d-b3e9-031a33003a02,Human,1.259871
3,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,Human,-1.811065
4,43da532e-504a-485a-8559-92b640834f06,Human,-0.511823


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.435186,2.540989,87,88,9e-06,"[0.25, 0.59]"
ICC2,Single random raters,0.43344,2.513555,87,87,1.3e-05,"[0.25, 0.59]"
ICC3,Single fixed raters,0.430776,2.513555,87,87,1.3e-05,"[0.24, 0.59]"
ICC1k,Average raters absolute,0.606452,2.540989,87,88,9e-06,"[0.4, 0.74]"
ICC2k,Average random raters,0.604755,2.513555,87,87,1.3e-05,"[0.4, 0.74]"
ICC3k,Average fixed raters,0.602157,2.513555,87,87,1.3e-05,"[0.39, 0.74]"


Cohen's kappa for rounded(z scores x 10): 0.4602594296957241
Algo score, swaps needed: 1275.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1756, 1990, 1739, 1622, 1830, 2018, 1796, 1944, 1892, 1513]
Pseudo-random, avergae swaps needed: 1838.09328

Q2
Mean Words: 1546


Unnamed: 0,ID,judge,score
0,972e8032-a55b-4c9a-a2ed-9913f124d716,Human,0.056514
1,0cd9b843-be11-4595-847f-4b89f6644253,Human,0.957513
2,9db4494e-5ddd-4a7d-b3e9-031a33003a02,Human,1.858512
3,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,Human,-1.144818
4,43da532e-504a-485a-8559-92b640834f06,Human,0.056514


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.403045,2.350336,87,88,4.2e-05,"[0.21, 0.56]"
ICC2,Single random raters,0.401305,2.327666,87,87,5.4e-05,"[0.21, 0.56]"
ICC3,Single fixed raters,0.398978,2.327666,87,87,5.4e-05,"[0.21, 0.56]"
ICC1k,Average raters absolute,0.574529,2.350336,87,88,4.2e-05,"[0.35, 0.72]"
ICC2k,Average random raters,0.572759,2.327666,87,87,5.4e-05,"[0.35, 0.72]"
ICC3k,Average fixed raters,0.570385,2.327666,87,87,5.4e-05,"[0.34, 0.72]"


Cohen's kappa for rounded(z scores x 10): 0.4202472486054576
Algo score, swaps needed: 1224.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1793, 1597, 1899, 1601, 1798, 1784, 1459, 1663, 1785, 1709]
Pseudo-random, avergae swaps needed: 1756.86311

../data/environ_instructor_B

['ID', 'Q2']
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q2_x,Q2_y
0,3ad4aed5-023b-47c7-9ddf-3f2284a403b1,-1.853126,-1.931438
1,a0b1247e-71db-499f-a8e9-5191a179ac6c,-0.063175,-0.617613
2,740ba105-1c40-49be-b74d-e78072fdbf34,-0.421165,-1.437531
3,c1609ab5-b028-447c-8f26-60c711841a33,-1.017815,-2.002584
4,47e96791-9e80-4638-b63a-a00903ce8508,0.175485,-0.06581


Number of entries: 28

Q2
Mean Words: 2094


Unnamed: 0,ID,judge,score
0,3ad4aed5-023b-47c7-9ddf-3f2284a403b1,Human,-1.853126
1,a0b1247e-71db-499f-a8e9-5191a179ac6c,Human,-0.063175
2,740ba105-1c40-49be-b74d-e78072fdbf34,Human,-0.421165
3,c1609ab5-b028-447c-8f26-60c711841a33,Human,-1.017815
4,47e96791-9e80-4638-b63a-a00903ce8508,Human,0.175485


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.568949,3.639821,27,28,0.000549,"[0.26, 0.77]"
ICC2,Single random raters,0.566593,3.550264,27,27,0.000783,"[0.25, 0.77]"
ICC3,Single fixed raters,0.560465,3.550264,27,27,0.000783,"[0.24, 0.77]"
ICC1k,Average raters absolute,0.725261,3.639821,27,28,0.000549,"[0.41, 0.87]"
ICC2k,Average random raters,0.723344,3.550264,27,27,0.000783,"[0.4, 0.87]"
ICC3k,Average fixed raters,0.718331,3.550264,27,27,0.000783,"[0.39, 0.87]"


Cohen's kappa for rounded(z scores x 10): 0.5166799574694312
Algo score, swaps needed: 117.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[176, 126, 184, 230, 198, 205, 195, 191, 174, 168]
Pseudo-random, avergae swaps needed: 182.35202

../data/PR_instructor_C

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,28f16501-0f48-4649-9348-caf3c3ce5d09,0.36319,0.880199,0.046741,0.260597
1,f7422fbb-df4a-4788-8d19-03ca5ab472f4,-2.134996,-0.498226,-1.464614,0.17787
2,fa4c4929-ae24-4cf5-a3dd-d77f754d900e,0.696281,-0.038751,1.148659,0.36082
3,22e8b506-0aa5-400a-aca0-ad1457dc1f19,0.529736,-1.417176,-0.333251,0.194901
4,af7980d7-0f7d-4600-82c5-14f34f16f53a,-1.96845,-1.417176,-0.99874,0.041338


Number of entries: 75

Q1
Mean Words: 936


Unnamed: 0,ID,judge,score
0,28f16501-0f48-4649-9348-caf3c3ce5d09,Human,0.36319
1,f7422fbb-df4a-4788-8d19-03ca5ab472f4,Human,-2.134996
2,fa4c4929-ae24-4cf5-a3dd-d77f754d900e,Human,0.696281
3,22e8b506-0aa5-400a-aca0-ad1457dc1f19,Human,0.529736
4,af7980d7-0f7d-4600-82c5-14f34f16f53a,Human,-1.96845


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.540244,3.350137,74,75,2.086765e-07,"[0.36, 0.68]"
ICC2,Single random raters,0.538824,3.305833,74,74,3.106074e-07,"[0.36, 0.68]"
ICC3,Single fixed raters,0.535514,3.305833,74,74,3.106074e-07,"[0.35, 0.68]"
ICC1k,Average raters absolute,0.701505,3.350137,74,75,2.086765e-07,"[0.53, 0.81]"
ICC2k,Average random raters,0.700306,3.305833,74,74,3.106074e-07,"[0.52, 0.81]"
ICC3k,Average fixed raters,0.697504,3.305833,74,74,3.106074e-07,"[0.52, 0.81]"


Cohen's kappa for rounded(z scores x 10): 0.5102469877573461
Algo score, swaps needed: 903.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1418, 1213, 1350, 1136, 1529, 1347, 1411, 1248, 1310, 1204]
Pseudo-random, avergae swaps needed: 1323.80177

Q2
Mean Words: 570


Unnamed: 0,ID,judge,score
0,28f16501-0f48-4649-9348-caf3c3ce5d09,Human,0.880199
1,f7422fbb-df4a-4788-8d19-03ca5ab472f4,Human,-0.498226
2,fa4c4929-ae24-4cf5-a3dd-d77f754d900e,Human,-0.038751
3,22e8b506-0aa5-400a-aca0-ad1457dc1f19,Human,-1.417176
4,af7980d7-0f7d-4600-82c5-14f34f16f53a,Human,-1.417176


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.177572,1.431825,74,75,0.061668,"[-0.05, 0.39]"
ICC2,Single random raters,0.173036,1.412976,74,74,0.069672,"[-0.06, 0.39]"
ICC3,Single fixed raters,0.171148,1.412976,74,74,0.069672,"[-0.06, 0.38]"
ICC1k,Average raters absolute,0.301591,1.431825,74,75,0.061668,"[-0.1, 0.56]"
ICC2k,Average random raters,0.295023,1.412976,74,74,0.069672,"[-0.12, 0.56]"
ICC3k,Average fixed raters,0.292274,1.412976,74,74,0.069672,"[-0.12, 0.55]"


Cohen's kappa for rounded(z scores x 10): 0.23952975753122707
Algo score, swaps needed: 830.99856
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1286, 1353, 1162, 1355, 1181, 1096, 1154, 1424, 1124, 1215]
Pseudo-random, avergae swaps needed: 1219.98775

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,737af13d-d5ba-4cf8-beaa-63bdc403d760,-0.61557,-2.523529,-0.659332,0.363257,-0.771901,-0.055214
1,39268066-b798-49dd-8159-eda5299389ef,0.963604,-1.08724,0.022388,0.771376,0.221689,0.074232
2,da6a7c2f-35f3-476f-a98e-c2445b5f08f4,1.650202,0.759417,1.385828,0.289398,1.190295,-0.01477
3,30234d01-ef3f-41bf-ae4a-1efaba3522d8,1.718862,1.477562,-0.075,1.05068,1.379937,0.777783
4,c1b34a08-528e-4824-85ac-1d2043c2bb32,-2.194745,-2.215753,-2.120159,-0.4768,-0.283172,-0.913744


Number of entries: 78

Q1
Mean Words: 1823


Unnamed: 0,ID,judge,score
0,737af13d-d5ba-4cf8-beaa-63bdc403d760,Human,-0.61557
1,39268066-b798-49dd-8159-eda5299389ef,Human,0.963604
2,b46d83cb-4d90-49b1-a40a-9a10cc909734,Human,-1.096189
3,da6a7c2f-35f3-476f-a98e-c2445b5f08f4,Human,1.650202
4,30234d01-ef3f-41bf-ae4a-1efaba3522d8,Human,1.718862


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.277498,1.768157,77,78,0.00646,"[0.06, 0.47]"
ICC2,Single random raters,0.274216,1.7463,77,77,0.007704,"[0.05, 0.47]"
ICC3,Single fixed raters,0.271748,1.7463,77,77,0.007704,"[0.05, 0.47]"
ICC1k,Average raters absolute,0.434439,1.768157,77,78,0.00646,"[0.11, 0.64]"
ICC2k,Average random raters,0.430408,1.7463,77,77,0.007704,"[0.1, 0.64]"
ICC3k,Average fixed raters,0.427361,1.7463,77,77,0.007704,"[0.1, 0.63]"


Cohen's kappa for rounded(z scores x 10): 0.2949785232758061
Algo score, swaps needed: 1176.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1354, 1635, 1289, 1425, 1436, 1514, 1458, 1396, 1537, 1441]
Pseudo-random, avergae swaps needed: 1472.10551

Q2
Mean Words: 1050


Unnamed: 0,ID,judge,score
0,737af13d-d5ba-4cf8-beaa-63bdc403d760,Human,-2.523529
1,39268066-b798-49dd-8159-eda5299389ef,Human,-1.08724
2,b46d83cb-4d90-49b1-a40a-9a10cc909734,Human,0.041273
3,da6a7c2f-35f3-476f-a98e-c2445b5f08f4,Human,0.759417
4,30234d01-ef3f-41bf-ae4a-1efaba3522d8,Human,1.477562


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.319052,1.937081,77,78,0.002003,"[0.11, 0.5]"
ICC2,Single random raters,0.316352,1.914873,77,77,0.002425,"[0.1, 0.5]"
ICC3,Single fixed raters,0.313864,1.914873,77,77,0.002425,"[0.1, 0.5]"
ICC1k,Average raters absolute,0.483759,1.937081,77,78,0.002003,"[0.19, 0.67]"
ICC2k,Average random raters,0.48065,1.914873,77,77,0.002425,"[0.18, 0.67]"
ICC3k,Average fixed raters,0.477772,1.914873,77,77,0.002425,"[0.18, 0.67]"


Cohen's kappa for rounded(z scores x 10): 0.3509009258745911
Algo score, swaps needed: 1043.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1484, 1394, 1340, 1355, 1287, 1507, 1680, 1411, 1312, 1324]
Pseudo-random, avergae swaps needed: 1449.9545

Q3
Mean Words: 839


Unnamed: 0,ID,judge,score
0,737af13d-d5ba-4cf8-beaa-63bdc403d760,Human,-0.659332
1,39268066-b798-49dd-8159-eda5299389ef,Human,0.022388
2,b46d83cb-4d90-49b1-a40a-9a10cc909734,Human,0.606719
3,da6a7c2f-35f3-476f-a98e-c2445b5f08f4,Human,1.385828
4,30234d01-ef3f-41bf-ae4a-1efaba3522d8,Human,-0.075


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.557776,3.522595,77,78,4.057886e-08,"[0.38, 0.69]"
ICC2,Single random raters,0.556504,3.477485,77,77,6.165675e-08,"[0.38, 0.69]"
ICC3,Single fixed raters,0.553321,3.477485,77,77,6.165675e-08,"[0.38, 0.69]"
ICC1k,Average raters absolute,0.716118,3.522595,77,78,4.057886e-08,"[0.56, 0.82]"
ICC2k,Average random raters,0.715069,3.477485,77,77,6.165675e-08,"[0.55, 0.82]"
ICC3k,Average fixed raters,0.712436,3.477485,77,77,6.165675e-08,"[0.55, 0.82]"


Cohen's kappa for rounded(z scores x 10): 0.6055882812907261
Algo score, swaps needed: 783.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1354, 1360, 1544, 1309, 1322, 1425, 1494, 1427, 1371, 1442]
Pseudo-random, avergae swaps needed: 1457.60233

../data/crim_instructor_E

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,69b6ce9c-4cdb-4afd-9ace-a993777b08e1,-0.945323,-0.705991,0.137195,1.510483
1,e824f723-032e-4900-a6e5-a82e279eefc8,-1.281854,-0.705991,-0.251746,-0.178808
2,b79bcda6-c021-453b-96e9-fd6d58c20469,-0.945323,-1.284442,-0.273292,-0.944506
3,54273ba8-efd3-4d24-89aa-bb5f5cda0ee0,-0.608793,0.258094,0.841415,0.545715
4,afb1f54f-6525-497c-9642-32aca074c609,-1.618384,-1.284442,-1.047122,-0.123607


Number of entries: 92

Q1
Mean Words: 3353


Unnamed: 0,ID,judge,score
0,69b6ce9c-4cdb-4afd-9ace-a993777b08e1,Human,-0.945323
1,e824f723-032e-4900-a6e5-a82e279eefc8,Human,-1.281854
2,b79bcda6-c021-453b-96e9-fd6d58c20469,Human,-0.945323
3,54273ba8-efd3-4d24-89aa-bb5f5cda0ee0,Human,-0.608793
4,afb1f54f-6525-497c-9642-32aca074c609,Human,-1.618384


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.090017,1.197843,91,92,0.194626,"[-0.12, 0.29]"
ICC2,Single random raters,0.085672,1.185463,91,91,0.209323,"[-0.12, 0.29]"
ICC3,Single fixed raters,0.084862,1.185463,91,91,0.209323,"[-0.12, 0.28]"
ICC1k,Average raters absolute,0.165166,1.197843,91,92,0.194626,"[-0.26, 0.45]"
ICC2k,Average random raters,0.157823,1.185463,91,91,0.209323,"[-0.28, 0.44]"
ICC3k,Average fixed raters,0.156447,1.185463,91,91,0.209323,"[-0.28, 0.44]"


Cohen's kappa for rounded(z scores x 10): 0.15220449016453974
Algo score, swaps needed: 1705.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2177, 1981, 2158, 1882, 2158, 2116, 2215, 2154, 1748, 1832]
Pseudo-random, avergae swaps needed: 2045.71394

Q2
Mean Words: 2137


Unnamed: 0,ID,judge,score
0,69b6ce9c-4cdb-4afd-9ace-a993777b08e1,Human,-0.705991
1,e824f723-032e-4900-a6e5-a82e279eefc8,Human,-0.705991
2,b79bcda6-c021-453b-96e9-fd6d58c20469,Human,-1.284442
3,54273ba8-efd3-4d24-89aa-bb5f5cda0ee0,Human,0.258094
4,afb1f54f-6525-497c-9642-32aca074c609,Human,-1.284442


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.26712,1.728959,91,92,0.0047,"[0.07, 0.45]"
ICC2,Single random raters,0.264758,1.713932,91,91,0.005423,"[0.06, 0.45]"
ICC3,Single fixed raters,0.263062,1.713932,91,91,0.005423,"[0.06, 0.44]"
ICC1k,Average raters absolute,0.421617,1.728959,91,92,0.0047,"[0.13, 0.62]"
ICC2k,Average random raters,0.418669,1.713932,91,91,0.005423,"[0.12, 0.62]"
ICC3k,Average fixed raters,0.416546,1.713932,91,91,0.005423,"[0.12, 0.61]"


Cohen's kappa for rounded(z scores x 10): 0.25302929011688136
Algo score, swaps needed: 1589.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1864, 1885, 2070, 2055, 2154, 1972, 1830, 2085, 1889, 1932]
Pseudo-random, avergae swaps needed: 1994.7508

Swaps (Machine):
[998.0, 1054.0, 930.0, 1275.0, 1224.0, 117.0, 903.0, 830.99856, 1176.0, 1043.0, 783.0, 1705.0, 1589.0]
N: 13 	Mean:  1048.3075815384616 	Var:  141907.95346760284
Avergae swaps (Pseudo-random):
[1499.68528, 1566.90241, 1560.20248, 1838.09328, 1756.86311, 182.35202, 1323.80177, 1219.98775, 1472.10551, 1449.9545, 1457.60233, 2045.71394, 1994.7508]
N: 13 	Mean:  1489.8473215384618 	Var:  198489.0188468986 

1 ) 1499.68528 > 998.0 => 40.17201718865076
2 ) 1566.90241 > 1054.0 => 39.13937489950265
3 ) 1560.20248 > 930.0 => 50.61455725479801
4 ) 1838.09328 > 1275.0 => 36.17580517857146
5 ) 1756.86311 > 1224.0 => 35.75226975115942
6 ) 182.35202 > 117.0 => 43.66232103594958
7 ) 1323.80177 > 903.0 => 37.79427299449291
8 

The following cell presents the output for the parameters used when all of the exam data was first run through the above scoring method, with the exception of `runs`. To avoid biasing parameter selection, prior work on the method did NOT use the exam data used here. 

Note: the below cell has been re-run since this first run. Additionally, the current version makes use of the average number of swaps over 100,000 instances to provide a better representation of the method's performance. This is because the number of swaps can vary by small amounts for scoring methods where the granularity of the machine scores/grading scale is sufficiently coarse for exams with different human scores to find themselves with the same machine grades. That is, when sorting on the machine grade, one could get different absolute orderings based on the initial order of the list. For example: 

|ID|Q1_human|Q1_machine|
|--|--|--|
|00001|4|90|
|00002|5|90|

and

|ID|Q1_human|Q1_machine|
|--|--|--|
|00002|5|90|
|00001|4|90|

are both proper orderings based on the values of `Q1_machine` and could occur if the list before sorting had different orderings. In all cases below, the difference between swaps needed for the pseudo-random and machine orderings exhibit a significant difference.

Data from the cell above is cited in the paper so that comparisons can be made between the performance of machine, pseudo-random, and human orderings whereas the data in the cell below includes only comparisons between machine and pseudo-random orderings. 

In [12]:
score_exams(exams,model='nlp1',normv=0,score=0,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,351e0d0e-9901-4878-9d31-decad062eb91,2,21,6,90,87,81
1,0adab6a0-9008-4fdb-9fb7-a0be1edcda37,8,16,8,78,88,84
2,70d749d7-dd2e-4b83-be1a-1409b0dcb3e2,9,35,17,85,84,86
3,7a25c1e8-4bd9-438f-ad5b-393ad679a0ab,3,13,11,82,81,83
4,031928ff-6248-4987-bb30-75af9273ea92,10,41,21,83,85,85


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 989.50331
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1591, 1599, 1465, 1386, 1545, 1389, 1609, 1744, 1616, 1637]
Pseudo-random, avergae swaps needed: 1499.72726

Q1
Mean Words: 2048
Algo score, swaps needed: 1021.46179
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1688, 1418, 1513, 1558, 1453, 1742, 1635, 1270, 1463, 1713]
Pseudo-random, avergae swaps needed: 1566.17577

Q2
Mean Words: 947
Algo score, swaps needed: 829.93601
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1560, 1477, 1644, 1828, 1799, 1743, 1766, 1683, 1562, 1394]
Pseudo-random, avergae swaps needed: 1560.0728

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,ecdebb90-4db9-4967-b6f9-ea3759679fcc,13.0,16.0,90,94
1,d57e3e5c-10f4-4610-9ea6-49852423ede4,18.0,23.0,88,87
2,6f684adb-beb1-4e9c-8d7d-76e0637baf2b,14.0,18.0,85,84
3,d2a0868b-f1ef-43d6-a277-0b4b33bd78d0,11.5,21.0,82,79
4,da455e55-d2ac-4054-aae9-e62723a9dc6c,17.0,17.0,79,84


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1289.51587
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1774, 1743, 1938, 1834, 1669, 1834, 2189, 1843, 1991, 1652]
Pseudo-random, avergae swaps needed: 1837.38272

Q2
Mean Words: 1546
Algo score, swaps needed: 1319.45138
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1776, 1771, 1959, 1737, 1595, 2008, 1706, 1698, 1872, 1868]
Pseudo-random, avergae swaps needed: 1757.88976

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,eabdc089-b66a-4412-a8ca-7241ef5e59a3,31.0,73
1,981c3959-fe3b-4222-97de-742768dee8e9,23.0,80
2,f1a75c07-eda1-4552-907a-c976d9ae7c0c,42.0,82
3,a7931454-2722-4600-ad92-baccd06a38a8,42.0,96
4,8fee9d29-68ff-4b2d-b54d-58f8303249ff,29.5,91


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 121.01466
Swaps needed for first 10 out of 100000 pseudo-random runs:
[195, 184, 209, 160, 178, 147, 170, 202, 153, 162]
Pseudo-random, avergae swaps needed: 182.61788

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,66da3df6-08d8-4757-bbd0-2d5d0df848fa,9.5,5.5,74,85
1,f7422fbb-df4a-4788-8d19-03ca5ab472f4,5.0,4.0,74,85
2,6c261e5d-2212-43a3-ab4f-28bd7dd90612,9.0,3.0,88,85
3,473b0dda-6847-467e-8589-32a882dcfde1,12.0,5.0,81,83
4,836dd271-a288-4467-b688-9d61bc2b99ff,12.5,5.0,85,85


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 908.03549
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1502, 1318, 1217, 1319, 1269, 1358, 1401, 1272, 1337, 1195]
Pseudo-random, avergae swaps needed: 1324.44786

Q2
Mean Words: 570
Algo score, swaps needed: 814.05907
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1264, 1108, 1349, 1480, 1260, 963, 1205, 1043, 1220, 998]
Pseudo-random, avergae swaps needed: 1219.65644

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,737af13d-d5ba-4cf8-beaa-63bdc403d760,38,2,17,85,79,82
1,6c715ed9-9520-4c0d-8d4f-b56ceef0320a,53,43,15,81,86,81
2,bb065bd5-206a-4c0f-ba0f-f193651f5ac6,62,23,40,83,84,88
3,cbe7586a-af37-49e3-86ad-4e36380adad9,74,40,45,90,91,90
4,496aeba0-ea0b-4003-bd5b-7e4ac8bd9073,47,35,20,86,78,81


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1176.01861
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1388, 1298, 1489, 1456, 1549, 1324, 1352, 1599, 1684, 1609]
Pseudo-random, avergae swaps needed: 1471.9416

Q2
Mean Words: 1050
Algo score, swaps needed: 995.51218
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1358, 1507, 1473, 1510, 1686, 1271, 1371, 1605, 1644, 1360]
Pseudo-random, avergae swaps needed: 1449.70805

Q3
Mean Words: 839
Algo score, swaps needed: 769.55054
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1440, 1391, 1549, 1429, 1392, 1520, 1326, 1485, 1404, 1589]
Pseudo-random, avergae swaps needed: 1457.44482

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,fc6ff039-d412-461c-a096-7039bac5bd73,25.25,20.5,76,62
1,d18b3687-1b51-4f7b-b1e4-94b2918c2e1f,16.0,17.0,82,89
2,230325ce-77e5-46d5-adf5-e088088aee8f,21.0,21.0,87,81
3,da2f8241-c0df-4081-b6bc-347b35a3d470,21.0,21.0,83,83
4,0c3b2e97-82af-40a0-81c9-0e19e9ddff09,19.0,20.0,90,91


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1777.99084
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2057, 1872, 1910, 1664, 2135, 2271, 2014, 1972, 2058, 1751]
Pseudo-random, avergae swaps needed: 2045.42864

Q2
Mean Words: 2137
Algo score, swaps needed: 1635.99287
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2278, 1949, 1648, 2197, 1839, 1946, 1976, 2155, 2021, 2084]
Pseudo-random, avergae swaps needed: 1994.22284

Swaps (Machine):
[989.50331, 1021.46179, 829.93601, 1289.51587, 1319.45138, 121.01466, 908.03549, 814.05907, 1176.01861, 995.51218, 769.55054, 1777.99084, 1635.99287]
N: 13 	Mean:  1049.8494323076923 	Var:  160966.89750620702
Avergae swaps (Pseudo-random):
[1499.72726, 1566.17577, 1560.0728, 1837.38272, 1757.88976, 182.61788, 1324.44786, 1219.65644, 1471.9416, 1449.70805, 1457.44482, 2045.42864, 1994.22284]
N: 13 	Mean:  1489.7474184615385 	Var:  198364.50812390135 

1 ) 1499.72726 > 989.50331 => 40.994511006668205
2 

The following cells present a variety of parameter selections, including a mix of vector normalization, grade translations, and "best answer" selections.

In [13]:
score_exams(exams,model='nlp1',normv=1,score=0,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,a503bcd8-7709-45ef-b48f-d56131f99108,6,15,11,78,84,86
1,dcef7f09-f784-48cb-b65e-59d0293f4195,10,30,32,87,83,86
2,ccd4c7f7-60a4-48c3-9dbf-96e923a98d93,6,23,28,86,79,86
3,c74cd4a8-869a-4714-892e-be61183e7835,4,27,24,89,91,83
4,7a4dfc2e-553b-4bc6-bc18-3b0626c17a1e,5,18,18,85,88,84


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 996.46505
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1310, 1488, 1320, 1562, 1612, 1426, 1638, 1551, 1372, 1488]
Pseudo-random, avergae swaps needed: 1499.14618

Q1
Mean Words: 2048
Algo score, swaps needed: 1054.98323
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1658, 1752, 1703, 1683, 1707, 1709, 1600, 1737, 1774, 1408]
Pseudo-random, avergae swaps needed: 1566.48594

Q2
Mean Words: 947
Algo score, swaps needed: 959.91972
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1471, 1484, 1446, 1407, 1472, 1561, 1721, 1494, 1540, 1534]
Pseudo-random, avergae swaps needed: 1560.00533

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,8c9de0d1-eff5-4b5c-b1ff-7cecd2bb6c34,12.0,12.0,65,80
1,1b15cca5-b292-4c29-8e83-384ed76e7b77,20.0,18.0,79,83
2,d9863d9e-0473-482d-ae21-bcfb05f1753c,9.0,17.0,71,82
3,a9738fe5-2df1-4661-b147-de435b7eeeb6,17.0,20.0,80,94
4,529faf72-9186-4652-8fcc-58ba8b46a5cb,22.0,20.0,89,86


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1247.49193
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1813, 1981, 1568, 2121, 1575, 1863, 2028, 1707, 1575, 1886]
Pseudo-random, avergae swaps needed: 1837.51867

Q2
Mean Words: 1546
Algo score, swaps needed: 1213.96858
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1938, 1632, 1899, 1795, 1660, 2098, 1546, 1900, 1834, 1816]
Pseudo-random, avergae swaps needed: 1758.20714

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,ab1fbbba-d6e0-4551-aed0-3fc164fa8570,27.0,86
1,9ce4f501-176f-4180-ad8a-0a44c2c21a21,34.0,85
2,a7931454-2722-4600-ad92-baccd06a38a8,42.0,95
3,467e6ace-0fab-4f54-a773-979058184ac2,25.0,84
4,3307e79a-3565-478c-8749-b095d12c7e56,20.5,82


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 114.0029
Swaps needed for first 10 out of 100000 pseudo-random runs:
[169, 222, 170, 208, 151, 148, 156, 168, 226, 157]
Pseudo-random, avergae swaps needed: 182.47244

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,330e4993-30e4-4192-9e02-07e960fad404,16.5,6.0,89,86
1,bb846754-6c06-4082-a827-eee27975d766,10.0,4.5,80,85
2,f835cc4c-68ff-4a67-a6a8-17c0d3997a2a,16.0,5.0,85,85
3,054dc8e6-dc85-4ebb-92f4-f09e5bc559c9,10.0,3.0,91,59
4,af7980d7-0f7d-4600-82c5-14f34f16f53a,5.5,3.0,77,83


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 908.98728
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1473, 1182, 1417, 1249, 1410, 1401, 1076, 1334, 1079, 1336]
Pseudo-random, avergae swaps needed: 1324.01157

Q2
Mean Words: 570
Algo score, swaps needed: 907.39679
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1154, 1370, 1052, 1319, 1287, 1226, 1149, 1108, 1222, 1072]
Pseudo-random, avergae swaps needed: 1219.49946

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,496aeba0-ea0b-4003-bd5b-7e4ac8bd9073,47,35,20,83,74,81
1,49fdf82b-9794-42a5-9a7b-8fb177111688,28,25,17,80,82,80
2,ed6cf722-94b9-4474-8bf7-318cb15713b8,67,25,28,85,89,87
3,c1b34a08-528e-4824-85ac-1d2043c2bb32,15,5,2,80,81,78
4,ee5b756f-84b7-4707-ab9a-a9064b65ebe2,43,25,16,87,85,80


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1174.50544
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1373, 1263, 1463, 1603, 1533, 1465, 1509, 1349, 1563, 1521]
Pseudo-random, avergae swaps needed: 1472.93425

Q2
Mean Words: 1050
Algo score, swaps needed: 1041.00029
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1430, 1487, 1436, 1619, 1536, 1363, 1301, 1294, 1438, 1435]
Pseudo-random, avergae swaps needed: 1450.00772

Q3
Mean Words: 839
Algo score, swaps needed: 803.94188
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1520, 1399, 1458, 1504, 1481, 1406, 1533, 1621, 1241, 1440]
Pseudo-random, avergae swaps needed: 1458.14977

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,0a1cbcb8-fb01-45be-bb70-9c97770f647c,18.0,11.0,83,84
1,25b1719f-d684-490a-b15d-9596e54c29ee,14.0,3.5,76,63
2,5b092aa8-350b-4919-8267-f3309dbcb8fc,19.5,19.0,81,81
3,348bfa7e-1c58-4b5b-9b11-af9b507ee3cb,17.0,15.0,79,84
4,858981e3-e43d-4777-ac97-16678c7f6ac0,19.0,20.0,88,92


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1688.05055
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2388, 1906, 2017, 2148, 2162, 2038, 2254, 1994, 2205, 1959]
Pseudo-random, avergae swaps needed: 2045.85193

Q2
Mean Words: 2137
Algo score, swaps needed: 1578.9718
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2294, 1959, 1827, 1962, 2017, 2153, 2094, 1988, 1979, 1939]
Pseudo-random, avergae swaps needed: 1993.83577

Swaps (Machine):
[996.46505, 1054.98323, 959.91972, 1247.49193, 1213.96858, 114.0029, 908.98728, 907.39679, 1174.50544, 1041.00029, 803.94188, 1688.05055, 1578.9718]
N: 13 	Mean:  1053.0527261538462 	Var:  135086.9684930775
Avergae swaps (Pseudo-random):
[1499.14618, 1566.48594, 1560.00533, 1837.51867, 1758.20714, 182.47244, 1324.01157, 1219.49946, 1472.93425, 1450.00772, 1458.14977, 2045.85193, 1993.83577]
N: 13 	Mean:  1489.8558592307693 	Var:  198432.03504219826 

1 ) 1499.14618 > 996.46505 => 40.28521141091355
2 ) 

In [14]:
score_exams(exams,model='nlp1',normv=0,score=1,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,3563df1a-2ee1-4323-a204-fb3a13e13f57,1,1,4,80,70,77
1,70d749d7-dd2e-4b83-be1a-1409b0dcb3e2,9,35,17,83,83,83
2,354b8e2c-ce96-49fc-8a7b-80c7b8f04be3,3,13,2,80,63,80
3,e8c7236a-2cae-40ea-8d17-5405ce4a0e0f,12,17,14,87,80,80
4,07ac7940-e702-498a-a97c-333cb233a847,2,14,4,77,83,77


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 991.10428
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1310, 1608, 1730, 1663, 1360, 1523, 1626, 1454, 1405, 1740]
Pseudo-random, avergae swaps needed: 1500.25208

Q1
Mean Words: 2048
Algo score, swaps needed: 1023.11096
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1431, 1541, 1747, 1377, 1676, 1525, 1692, 1442, 1375, 1642]
Pseudo-random, avergae swaps needed: 1566.50075

Q2
Mean Words: 947
Algo score, swaps needed: 918.7735
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1702, 1558, 1459, 1443, 1767, 1453, 1580, 1413, 1650, 1367]
Pseudo-random, avergae swaps needed: 1560.09365

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,05ab5f67-d7d3-441f-8d3c-e879ece151f9,19.0,22.0,80,80
1,d3d89043-c41a-4cf9-aba0-17b47f4750c9,14.0,12.0,87,87
2,6dfac62a-26c6-4c69-afe0-39aa1b1a3e44,15.0,19.0,83,83
3,d57e3e5c-10f4-4610-9ea6-49852423ede4,18.0,23.0,87,87
4,6c430619-e918-45cc-8164-b092cb27ee8a,11.0,21.0,83,80


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1331.50755
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1666, 1787, 1725, 1750, 1861, 1565, 2026, 1805, 1799, 2077]
Pseudo-random, avergae swaps needed: 1837.78548

Q2
Mean Words: 1546
Algo score, swaps needed: 1371.00386
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1762, 1976, 1553, 1693, 1799, 1760, 1792, 1938, 1873, 1643]
Pseudo-random, avergae swaps needed: 1757.7674

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,a0b1247e-71db-499f-a8e9-5191a179ac6c,28.0,77
1,740ba105-1c40-49be-b74d-e78072fdbf34,25.0,73
2,2872fa41-b643-452a-b6c6-87f4e140f336,43.0,87
3,467e6ace-0fab-4f54-a773-979058184ac2,25.0,80
4,a7931454-2722-4600-ad92-baccd06a38a8,42.0,93


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 122.00948
Swaps needed for first 10 out of 100000 pseudo-random runs:
[197, 212, 230, 219, 190, 212, 178, 175, 158, 144]
Pseudo-random, avergae swaps needed: 182.51655

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,867eac98-6036-4439-ad09-d9c49c511b56,10.5,6.0,80,83
1,5d7cfc35-9967-4ab0-bab2-011f73bd8856,12.0,4.0,87,83
2,4e4a357a-bc9d-46b9-b0c5-01e68cb32ba0,10.0,3.0,70,83
3,eb2a08b7-b5f4-4b84-b4bb-b3ae393af3e7,14.5,5.5,83,83
4,77b75df3-9aa3-4653-861d-e8dd9c5cf920,7.0,4.0,77,83


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 903.94001
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1455, 1467, 1342, 1351, 1419, 1302, 1333, 1366, 1257, 1313]
Pseudo-random, avergae swaps needed: 1324.79922

Q2
Mean Words: 570
Algo score, swaps needed: 1074.27863
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1257, 1232, 1104, 1248, 1359, 1134, 1220, 1170, 1187, 1202]
Pseudo-random, avergae swaps needed: 1218.77426

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,e4873808-277a-47f8-9235-2188343dc4ab,32,33,30,87,87,83
1,496aeba0-ea0b-4003-bd5b-7e4ac8bd9073,47,35,20,83,77,80
2,737af13d-d5ba-4cf8-beaa-63bdc403d760,38,2,17,83,77,80
3,1dd97d65-c89a-4744-bfc1-c14311cc97b7,38,36,20,80,83,80
4,7e5c12f6-95ca-485d-91ec-71a4031f22d5,20,5,20,80,77,83


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1189.94081
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1534, 1358, 1507, 1549, 1670, 1694, 1405, 1519, 1365, 1353]
Pseudo-random, avergae swaps needed: 1472.50809

Q2
Mean Words: 1050
Algo score, swaps needed: 1004.40795
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1523, 1338, 1593, 1306, 1328, 1397, 1350, 1304, 1455, 1503]
Pseudo-random, avergae swaps needed: 1450.15038

Q3
Mean Words: 839
Algo score, swaps needed: 773.30653
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1637, 1310, 1430, 1499, 1580, 1559, 1270, 1730, 1328, 1565]
Pseudo-random, avergae swaps needed: 1458.04045

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,95eca05c-7d2d-4e74-80e7-f4dd862e71b1,24.5,24.0,87,87
1,c268db25-0c25-4dcb-a5dd-13e0b0b3b4e5,26.5,21.0,83,87
2,d5556a5d-e8c0-4613-853e-e6448f546a32,20.5,18.0,87,87
3,d7cbd32b-0f8a-4ee7-a4a8-5679eab81ae5,11.5,13.0,80,83
4,c2f90f17-5912-4155-a6d4-60da6b5733b4,17.0,18.0,80,83


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1767.51708
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1877, 2270, 1927, 2161, 2495, 1985, 2122, 2196, 1972, 1813]
Pseudo-random, avergae swaps needed: 2045.88172

Q2
Mean Words: 2137
Algo score, swaps needed: 1633.86401
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2064, 1826, 1848, 1903, 1798, 2116, 1994, 1746, 1985, 2002]
Pseudo-random, avergae swaps needed: 1993.96205

Swaps (Machine):
[991.10428, 1023.11096, 918.7735, 1331.50755, 1371.00386, 122.00948, 903.94001, 1074.27863, 1189.94081, 1004.40795, 773.30653, 1767.51708, 1633.86401]
N: 13 	Mean:  1084.981896153846 	Var:  155756.32549119272
Avergae swaps (Pseudo-random):
[1500.25208, 1566.50075, 1560.09365, 1837.78548, 1757.7674, 182.51655, 1324.79922, 1218.77426, 1472.50809, 1450.15038, 1458.04045, 2045.88172, 1993.96205]
N: 13 	Mean:  1489.925544615385 	Var:  198445.37818547385 

1 ) 1500.25208 > 991.10428 => 40.87314108688972
2 )

In [15]:
score_exams(exams,model='nlp1',normv=1,score=2,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,68975dd8-703c-43ba-8247-1e6cde4fe067,0.120927,-0.147495,0.798654,0.503427,0.869034,0.537341
1,72162df3-a93a-447b-9155-24b29b7c4a05,0.401122,0.367653,0.354091,0.530529,0.201332,0.047264
2,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,-0.999856,-1.48688,-0.535036,0.636763,0.584401,0.496988
3,b54398fb-14ce-4ad0-9a86-9813160d627f,2.362492,2.119157,1.391404,0.516124,0.26203,0.379471
4,02694d63-a2bc-45b7-98fd-77c89066e667,-1.280052,-1.074762,-0.535036,0.007574,-1.10145,0.458851


Number of entries: 81

SHORT_ANS
Mean Words: 436


Unnamed: 0,ID,judge,score
0,68975dd8-703c-43ba-8247-1e6cde4fe067,Human,0.120927
1,72162df3-a93a-447b-9155-24b29b7c4a05,Human,0.401122
2,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,Human,-0.999856
3,b54398fb-14ce-4ad0-9a86-9813160d627f,Human,2.362492
4,02694d63-a2bc-45b7-98fd-77c89066e667,Human,-1.280052


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.426259,2.485892,80,81,3e-05,"[0.23, 0.59]"
ICC2,Single random raters,0.42424,2.455869,80,80,4.1e-05,"[0.23, 0.59]"
ICC3,Single fixed raters,0.421274,2.455869,80,80,4.1e-05,"[0.22, 0.58]"
ICC1k,Average raters absolute,0.59773,2.485892,80,81,3e-05,"[0.38, 0.74]"
ICC2k,Average random raters,0.595742,2.455869,80,80,4.1e-05,"[0.37, 0.74]"
ICC3k,Average fixed raters,0.592812,2.455869,80,80,4.1e-05,"[0.37, 0.74]"


Cohen's kappa for rounded(z scores x 10): 0.446784149383303
Algo score, swaps needed: 998.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1501, 1258, 1382, 1416, 1286, 1546, 1583, 1355, 1486, 1469]
Pseudo-random, avergae swaps needed: 1499.90568

Q1
Mean Words: 2048


Unnamed: 0,ID,judge,score
0,68975dd8-703c-43ba-8247-1e6cde4fe067,Human,-0.147495
1,72162df3-a93a-447b-9155-24b29b7c4a05,Human,0.367653
2,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,Human,-1.48688
3,b54398fb-14ce-4ad0-9a86-9813160d627f,Human,2.119157
4,02694d63-a2bc-45b7-98fd-77c89066e667,Human,-1.074762


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.506941,3.056311,80,81,5.281393e-07,"[0.33, 0.65]"
ICC2,Single random raters,0.505419,3.018623,80,80,7.604948e-07,"[0.32, 0.65]"
ICC3,Single fixed raters,0.502317,3.018623,80,80,7.604948e-07,"[0.32, 0.65]"
ICC1k,Average raters absolute,0.672808,3.056311,80,81,5.281393e-07,"[0.49, 0.79]"
ICC2k,Average random raters,0.671466,3.018623,80,80,7.604948e-07,"[0.49, 0.79]"
ICC3k,Average fixed raters,0.668723,3.018623,80,80,7.604948e-07,"[0.48, 0.79]"


Cohen's kappa for rounded(z scores x 10): 0.48698613796947476
Algo score, swaps needed: 1054.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1514, 1862, 1390, 1671, 1496, 1632, 1676, 1651, 1435, 1786]
Pseudo-random, avergae swaps needed: 1566.23236

Q2
Mean Words: 947


Unnamed: 0,ID,judge,score
0,68975dd8-703c-43ba-8247-1e6cde4fe067,Human,0.798654
1,72162df3-a93a-447b-9155-24b29b7c4a05,Human,0.354091
2,ddee1d35-ea9d-4e90-b6dd-4a9c5ff90aa0,Human,-0.535036
3,b54398fb-14ce-4ad0-9a86-9813160d627f,Human,1.391404
4,02694d63-a2bc-45b7-98fd-77c89066e667,Human,-0.535036


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.418784,2.441062,80,81,4.2e-05,"[0.22, 0.58]"
ICC2,Single random raters,0.41699,2.415478,80,80,5.4e-05,"[0.22, 0.58]"
ICC3,Single fixed raters,0.41443,2.415478,80,80,5.4e-05,"[0.22, 0.58]"
ICC1k,Average raters absolute,0.590342,2.441062,80,81,4.2e-05,"[0.36, 0.74]"
ICC2k,Average random raters,0.588557,2.415478,80,80,5.4e-05,"[0.36, 0.74]"
ICC3k,Average fixed raters,0.586003,2.415478,80,80,5.4e-05,"[0.36, 0.73]"


Cohen's kappa for rounded(z scores x 10): 0.457167253080513
Algo score, swaps needed: 930.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1701, 1560, 1672, 1504, 1827, 1572, 1689, 1567, 1405, 1634]
Pseudo-random, avergae swaps needed: 1560.29005

../data/property_instructor_B

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,d2a0868b-f1ef-43d6-a277-0b4b33bd78d0,-1.2205,0.65718,-0.181182,-0.910573
1,d16fe8d2-bea5-4a56-bbd8-8f8bd80bfe97,-0.039371,0.957513,1.074028,0.50268
2,1a67bcd2-4be4-4221-ab02-0cea41c74809,0.196855,-0.243819,-0.390988,0.322597
3,fad9d7b4-6245-4e76-ae50-76d00c933b26,-1.811065,-1.144818,-0.706111,-0.55072
4,140fa70f-38c7-45ce-86f4-f66ecc52218d,-0.39371,0.807347,-1.009999,-1.275871


Number of entries: 88

Q1
Mean Words: 1431


Unnamed: 0,ID,judge,score
0,d2a0868b-f1ef-43d6-a277-0b4b33bd78d0,Human,-1.2205
1,d16fe8d2-bea5-4a56-bbd8-8f8bd80bfe97,Human,-0.039371
2,1a67bcd2-4be4-4221-ab02-0cea41c74809,Human,0.196855
3,fad9d7b4-6245-4e76-ae50-76d00c933b26,Human,-1.811065
4,140fa70f-38c7-45ce-86f4-f66ecc52218d,Human,-0.39371


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.435186,2.540989,87,88,9e-06,"[0.25, 0.59]"
ICC2,Single random raters,0.43344,2.513555,87,87,1.3e-05,"[0.25, 0.59]"
ICC3,Single fixed raters,0.430776,2.513555,87,87,1.3e-05,"[0.24, 0.59]"
ICC1k,Average raters absolute,0.606452,2.540989,87,88,9e-06,"[0.4, 0.74]"
ICC2k,Average random raters,0.604755,2.513555,87,87,1.3e-05,"[0.4, 0.74]"
ICC3k,Average fixed raters,0.602157,2.513555,87,87,1.3e-05,"[0.39, 0.74]"


Cohen's kappa for rounded(z scores x 10): 0.4602594296957241
Algo score, swaps needed: 1275.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1785, 2042, 1790, 1734, 1833, 1907, 1810, 1817, 1613, 1692]
Pseudo-random, avergae swaps needed: 1837.85846

Q2
Mean Words: 1546


Unnamed: 0,ID,judge,score
0,d2a0868b-f1ef-43d6-a277-0b4b33bd78d0,Human,0.65718
1,d16fe8d2-bea5-4a56-bbd8-8f8bd80bfe97,Human,0.957513
2,1a67bcd2-4be4-4221-ab02-0cea41c74809,Human,-0.243819
3,fad9d7b4-6245-4e76-ae50-76d00c933b26,Human,-1.144818
4,140fa70f-38c7-45ce-86f4-f66ecc52218d,Human,0.807347


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.403045,2.350336,87,88,4.2e-05,"[0.21, 0.56]"
ICC2,Single random raters,0.401305,2.327666,87,87,5.4e-05,"[0.21, 0.56]"
ICC3,Single fixed raters,0.398978,2.327666,87,87,5.4e-05,"[0.21, 0.56]"
ICC1k,Average raters absolute,0.574529,2.350336,87,88,4.2e-05,"[0.35, 0.72]"
ICC2k,Average random raters,0.572759,2.327666,87,87,5.4e-05,"[0.35, 0.72]"
ICC3k,Average fixed raters,0.570385,2.327666,87,87,5.4e-05,"[0.34, 0.72]"


Cohen's kappa for rounded(z scores x 10): 0.4202472486054576
Algo score, swaps needed: 1224.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1954, 2106, 1798, 1916, 1883, 1613, 1413, 1467, 1651, 1745]
Pseudo-random, avergae swaps needed: 1757.2092

../data/environ_instructor_B

['ID', 'Q2']
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q2_x,Q2_y
0,c6f9b6cb-c5b8-4622-bdad-51aabad768cd,-0.540495,0.96956
1,dda68a48-2475-4338-9956-67fc7efbf74b,1.010796,1.324034
2,f870a18d-a560-4a6a-90b8-e1660d8877d1,0.294815,-0.354441
3,2872fa41-b643-452a-b6c6-87f4e140f336,1.726776,1.232272
4,eabdc089-b66a-4412-a8ca-7241ef5e59a3,0.294815,-1.465595


Number of entries: 28

Q2
Mean Words: 2094


Unnamed: 0,ID,judge,score
0,c6f9b6cb-c5b8-4622-bdad-51aabad768cd,Human,-0.540495
1,dda68a48-2475-4338-9956-67fc7efbf74b,Human,1.010796
2,f870a18d-a560-4a6a-90b8-e1660d8877d1,Human,0.294815
3,2872fa41-b643-452a-b6c6-87f4e140f336,Human,1.726776
4,3c3d7b28-148d-4465-a14c-14429181c310,Human,-0.063175


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.568949,3.639821,27,28,0.000549,"[0.26, 0.77]"
ICC2,Single random raters,0.566593,3.550264,27,27,0.000783,"[0.25, 0.77]"
ICC3,Single fixed raters,0.560465,3.550264,27,27,0.000783,"[0.24, 0.77]"
ICC1k,Average raters absolute,0.725261,3.639821,27,28,0.000549,"[0.41, 0.87]"
ICC2k,Average random raters,0.723344,3.550264,27,27,0.000783,"[0.4, 0.87]"
ICC3k,Average fixed raters,0.718331,3.550264,27,27,0.000783,"[0.39, 0.87]"


Cohen's kappa for rounded(z scores x 10): 0.5166799574694312
Algo score, swaps needed: 117.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[157, 209, 170, 190, 193, 193, 150, 215, 181, 198]
Pseudo-random, avergae swaps needed: 182.56658

../data/PR_instructor_C

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,7e891b5b-c0ab-461d-993c-86650e3db067,-0.136447,-0.957701,-0.670447,0.205259
1,1c8aeeb4-a1f3-492a-ae5c-87af4b3ad8e2,0.196644,-0.038751,-0.988681,0.217333
2,7e96b523-06ee-45f4-9c2e-6247c261f646,0.030099,-0.038751,0.761941,0.387743
3,5c11e5b2-953a-43ee-a21b-79f47664d5b2,0.529736,0.420724,-0.791942,0.066243
4,3c3821f0-5e8a-45a9-b9bb-06f850649a0a,0.196644,0.880199,0.186709,0.26366


Number of entries: 75

Q1
Mean Words: 936


Unnamed: 0,ID,judge,score
0,7e891b5b-c0ab-461d-993c-86650e3db067,Human,-0.136447
1,1c8aeeb4-a1f3-492a-ae5c-87af4b3ad8e2,Human,0.196644
2,7e96b523-06ee-45f4-9c2e-6247c261f646,Human,0.030099
3,5c11e5b2-953a-43ee-a21b-79f47664d5b2,Human,0.529736
4,3c3821f0-5e8a-45a9-b9bb-06f850649a0a,Human,0.196644


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.540244,3.350137,74,75,2.086765e-07,"[0.36, 0.68]"
ICC2,Single random raters,0.538824,3.305833,74,74,3.106074e-07,"[0.36, 0.68]"
ICC3,Single fixed raters,0.535514,3.305833,74,74,3.106074e-07,"[0.35, 0.68]"
ICC1k,Average raters absolute,0.701505,3.350137,74,75,2.086765e-07,"[0.53, 0.81]"
ICC2k,Average random raters,0.700306,3.305833,74,74,3.106074e-07,"[0.52, 0.81]"
ICC3k,Average fixed raters,0.697504,3.305833,74,74,3.106074e-07,"[0.52, 0.81]"


Cohen's kappa for rounded(z scores x 10): 0.5102469877573461
Algo score, swaps needed: 903.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1540, 1252, 1347, 1305, 1219, 1314, 1375, 1252, 1314, 1198]
Pseudo-random, avergae swaps needed: 1323.97221

Q2
Mean Words: 570


Unnamed: 0,ID,judge,score
0,7e891b5b-c0ab-461d-993c-86650e3db067,Human,-0.957701
1,1c8aeeb4-a1f3-492a-ae5c-87af4b3ad8e2,Human,-0.038751
2,7e96b523-06ee-45f4-9c2e-6247c261f646,Human,-0.038751
3,5c11e5b2-953a-43ee-a21b-79f47664d5b2,Human,0.420724
4,3c3821f0-5e8a-45a9-b9bb-06f850649a0a,Human,0.880199


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.177572,1.431825,74,75,0.061668,"[-0.05, 0.39]"
ICC2,Single random raters,0.173036,1.412976,74,74,0.069672,"[-0.06, 0.39]"
ICC3,Single fixed raters,0.171148,1.412976,74,74,0.069672,"[-0.06, 0.38]"
ICC1k,Average raters absolute,0.301591,1.431825,74,75,0.061668,"[-0.1, 0.56]"
ICC2k,Average random raters,0.295023,1.412976,74,74,0.069672,"[-0.12, 0.56]"
ICC3k,Average fixed raters,0.292274,1.412976,74,74,0.069672,"[-0.12, 0.55]"


Cohen's kappa for rounded(z scores x 10): 0.23952975753122707
Algo score, swaps needed: 831.00127
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1164, 1204, 1431, 1154, 1247, 1357, 1284, 1287, 1036, 1192]
Pseudo-random, avergae swaps needed: 1219.08912

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,39ed6bf0-e1fe-45eb-88ec-fdda59f87141,1.238243,1.374969,1.970159,1.333419,0.823378,0.792395
1,1460cb8a-dfcb-4279-97df-77be7213064a,-0.82155,0.349049,2.067547,-0.219469,0.768768,0.642212
2,4b2b17a2-b7f3-4386-b297-74b1f14afa6b,1.650202,0.451641,-0.367166,0.899784,-0.38605,0.069744
3,ed6cf722-94b9-4474-8bf7-318cb15713b8,1.375563,-0.163911,0.411942,0.227733,0.862354,0.521911
4,1dd97d65-c89a-4744-bfc1-c14311cc97b7,-0.61557,0.964601,-0.367166,-0.361493,-0.021564,-0.413383


Number of entries: 78

Q1
Mean Words: 1823


Unnamed: 0,ID,judge,score
0,39ed6bf0-e1fe-45eb-88ec-fdda59f87141,Human,1.238243
1,1460cb8a-dfcb-4279-97df-77be7213064a,Human,-0.82155
2,4b2b17a2-b7f3-4386-b297-74b1f14afa6b,Human,1.650202
3,ed6cf722-94b9-4474-8bf7-318cb15713b8,Human,1.375563
4,1dd97d65-c89a-4744-bfc1-c14311cc97b7,Human,-0.61557


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.277498,1.768157,77,78,0.00646,"[0.06, 0.47]"
ICC2,Single random raters,0.274216,1.7463,77,77,0.007704,"[0.05, 0.47]"
ICC3,Single fixed raters,0.271748,1.7463,77,77,0.007704,"[0.05, 0.47]"
ICC1k,Average raters absolute,0.434439,1.768157,77,78,0.00646,"[0.11, 0.64]"
ICC2k,Average random raters,0.430408,1.7463,77,77,0.007704,"[0.1, 0.64]"
ICC3k,Average fixed raters,0.427361,1.7463,77,77,0.007704,"[0.1, 0.63]"


Cohen's kappa for rounded(z scores x 10): 0.2949785232758061
Algo score, swaps needed: 1176.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1425, 1358, 1286, 1390, 1540, 1488, 1385, 1448, 1321, 1709]
Pseudo-random, avergae swaps needed: 1472.53206

Q2
Mean Words: 1050


Unnamed: 0,ID,judge,score
0,39ed6bf0-e1fe-45eb-88ec-fdda59f87141,Human,1.374969
1,1460cb8a-dfcb-4279-97df-77be7213064a,Human,0.349049
2,4b2b17a2-b7f3-4386-b297-74b1f14afa6b,Human,0.451641
3,ed6cf722-94b9-4474-8bf7-318cb15713b8,Human,-0.163911
4,1dd97d65-c89a-4744-bfc1-c14311cc97b7,Human,0.964601


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.319052,1.937081,77,78,0.002003,"[0.11, 0.5]"
ICC2,Single random raters,0.316352,1.914873,77,77,0.002425,"[0.1, 0.5]"
ICC3,Single fixed raters,0.313864,1.914873,77,77,0.002425,"[0.1, 0.5]"
ICC1k,Average raters absolute,0.483759,1.937081,77,78,0.002003,"[0.19, 0.67]"
ICC2k,Average random raters,0.48065,1.914873,77,77,0.002425,"[0.18, 0.67]"
ICC3k,Average fixed raters,0.477772,1.914873,77,77,0.002425,"[0.18, 0.67]"


Cohen's kappa for rounded(z scores x 10): 0.3509009258745911
Algo score, swaps needed: 1043.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1781, 1446, 1468, 1600, 1529, 1356, 1529, 1433, 1557, 1438]
Pseudo-random, avergae swaps needed: 1449.32702

Q3
Mean Words: 839


Unnamed: 0,ID,judge,score
0,39ed6bf0-e1fe-45eb-88ec-fdda59f87141,Human,1.970159
1,1460cb8a-dfcb-4279-97df-77be7213064a,Human,2.067547
2,4b2b17a2-b7f3-4386-b297-74b1f14afa6b,Human,-0.367166
3,ed6cf722-94b9-4474-8bf7-318cb15713b8,Human,0.411942
4,1dd97d65-c89a-4744-bfc1-c14311cc97b7,Human,-0.367166


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.557776,3.522595,77,78,4.057886e-08,"[0.38, 0.69]"
ICC2,Single random raters,0.556504,3.477485,77,77,6.165675e-08,"[0.38, 0.69]"
ICC3,Single fixed raters,0.553321,3.477485,77,77,6.165675e-08,"[0.38, 0.69]"
ICC1k,Average raters absolute,0.716118,3.522595,77,78,4.057886e-08,"[0.56, 0.82]"
ICC2k,Average random raters,0.715069,3.477485,77,77,6.165675e-08,"[0.55, 0.82]"
ICC3k,Average fixed raters,0.712436,3.477485,77,77,6.165675e-08,"[0.55, 0.82]"


Cohen's kappa for rounded(z scores x 10): 0.6055882812907261
Algo score, swaps needed: 783.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1311, 1515, 1372, 1294, 1557, 1657, 1481, 1288, 1405, 1291]
Pseudo-random, avergae swaps needed: 1457.98898

../data/crim_instructor_E

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,1345e6e8-c15f-4a31-a12a-3642a8d2eb92,-0.047909,0.836545,1.295143,1.300717
1,b79bcda6-c021-453b-96e9-fd6d58c20469,-0.945323,-1.284442,-0.273292,-0.944506
2,85ca12a0-f4e5-4045-8350-676e0e159221,0.849505,0.258094,-0.398175,-0.10568
3,da2f8241-c0df-4081-b6bc-347b35a3d470,0.625152,1.029362,0.263096,-0.056564
4,c2f90f17-5912-4155-a6d4-60da6b5733b4,-0.272262,0.450911,-0.509601,0.290321


Number of entries: 92

Q1
Mean Words: 3353


Unnamed: 0,ID,judge,score
0,1345e6e8-c15f-4a31-a12a-3642a8d2eb92,Human,-0.047909
1,b79bcda6-c021-453b-96e9-fd6d58c20469,Human,-0.945323
2,85ca12a0-f4e5-4045-8350-676e0e159221,Human,0.849505
3,da2f8241-c0df-4081-b6bc-347b35a3d470,Human,0.625152
4,c2f90f17-5912-4155-a6d4-60da6b5733b4,Human,-0.272262


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.090017,1.197843,91,92,0.194626,"[-0.12, 0.29]"
ICC2,Single random raters,0.085672,1.185463,91,91,0.209323,"[-0.12, 0.29]"
ICC3,Single fixed raters,0.084862,1.185463,91,91,0.209323,"[-0.12, 0.28]"
ICC1k,Average raters absolute,0.165166,1.197843,91,92,0.194626,"[-0.26, 0.45]"
ICC2k,Average random raters,0.157823,1.185463,91,91,0.209323,"[-0.28, 0.44]"
ICC3k,Average fixed raters,0.156447,1.185463,91,91,0.209323,"[-0.28, 0.44]"


Cohen's kappa for rounded(z scores x 10): 0.15220449016453974
Algo score, swaps needed: 1705.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2081, 2187, 2193, 2140, 2030, 1831, 2220, 1940, 2074, 1901]
Pseudo-random, avergae swaps needed: 2045.00169

Q2
Mean Words: 2137


Unnamed: 0,ID,judge,score
0,1345e6e8-c15f-4a31-a12a-3642a8d2eb92,Human,0.836545
1,b79bcda6-c021-453b-96e9-fd6d58c20469,Human,-1.284442
2,85ca12a0-f4e5-4045-8350-676e0e159221,Human,0.258094
3,da2f8241-c0df-4081-b6bc-347b35a3d470,Human,1.029362
4,c2f90f17-5912-4155-a6d4-60da6b5733b4,Human,0.450911


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.26712,1.728959,91,92,0.0047,"[0.07, 0.45]"
ICC2,Single random raters,0.264758,1.713932,91,91,0.005423,"[0.06, 0.45]"
ICC3,Single fixed raters,0.263062,1.713932,91,91,0.005423,"[0.06, 0.44]"
ICC1k,Average raters absolute,0.421617,1.728959,91,92,0.0047,"[0.13, 0.62]"
ICC2k,Average random raters,0.418669,1.713932,91,91,0.005423,"[0.12, 0.62]"
ICC3k,Average fixed raters,0.416546,1.713932,91,91,0.005423,"[0.12, 0.61]"


Cohen's kappa for rounded(z scores x 10): 0.25302929011688136
Algo score, swaps needed: 1589.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2053, 1928, 1861, 2327, 2038, 1984, 1850, 1868, 1715, 2068]
Pseudo-random, avergae swaps needed: 1994.42136

Swaps (Machine):
[998.0, 1054.0, 930.0, 1275.0, 1224.0, 117.0, 903.0, 831.00127, 1176.0, 1043.0, 783.0, 1705.0, 1589.0]
N: 13 	Mean:  1048.30779 	Var:  141907.86286697842
Avergae swaps (Pseudo-random):
[1499.90568, 1566.23236, 1560.29005, 1837.85846, 1757.2092, 182.56658, 1323.97221, 1219.08912, 1472.53206, 1449.32702, 1457.98898, 2045.00169, 1994.42136]
N: 13 	Mean:  1489.722674615385 	Var:  198388.26468606733 

1 ) 1499.90568 > 998.0 => 40.186119437464114
2 ) 1566.23236 > 1054.0 => 39.09823936377917
3 ) 1560.29005 > 930.0 => 50.61981033092913
4 ) 1837.85846 > 1275.0 => 36.16344702033127
5 ) 1757.2092 > 1224.0 => 35.77133734861679
6 ) 182.56658 > 117.0 => 43.77429551721022
7 ) 1323.97221 > 903.0 => 37.806687313803515
8 ) 121

In [16]:
score_exams(exams,model='nlp1',normv=1,score=1,goal="medoid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,499fc744-a3e7-4927-afd5-b4df47ca7874,0,25,10,73,87,80
1,ebd5efa9-f2f3-47f8-9959-1918046163c8,4,13,8,87,83,83
2,341a81d8-db2d-4017-95ca-f11d39c62331,5,10,10,87,83,83
3,7b622388-4835-41b0-916e-bb4ca373d6c4,13,26,25,83,83,80
4,6e472a30-2be4-4670-bff5-0a78a5fab822,9,33,21,70,87,83


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 1441.96402
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1436, 1483, 1738, 1483, 1603, 1516, 1292, 1531, 1522, 1636]
Pseudo-random, avergae swaps needed: 1499.1106

Q1
Mean Words: 2048
Algo score, swaps needed: 1165.3767
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1689, 1696, 1601, 1684, 1452, 1535, 1388, 1478, 1512, 1470]
Pseudo-random, avergae swaps needed: 1566.27993

Q2
Mean Words: 947
Algo score, swaps needed: 1284.64089
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1535, 1616, 1611, 1268, 1311, 1636, 1582, 1632, 1591, 1562]
Pseudo-random, avergae swaps needed: 1560.16865

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,8d5c28dd-7532-4b93-b654-7c254bd47fce,17.0,20.0,87,90
1,da455e55-d2ac-4054-aae9-e62723a9dc6c,17.0,17.0,73,80
2,ebf4deeb-44b2-4c4d-992e-b4e44cda8cc9,17.5,20.0,90,83
3,af76b772-5886-4e53-a62d-e87b17f052c9,13.0,15.0,70,77
4,faca6a15-3d4c-4771-ad55-f8fc6f55b71a,10.0,13.0,73,80


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1217.07259
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1564, 1857, 1897, 2003, 1973, 1738, 1768, 1815, 1870, 1777]
Pseudo-random, avergae swaps needed: 1837.17634

Q2
Mean Words: 1546
Algo score, swaps needed: 1703.51235
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1900, 1846, 1666, 1523, 1645, 1857, 1991, 1795, 1687, 1664]
Pseudo-random, avergae swaps needed: 1757.39543

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,dbd130ac-e5e2-4938-9aad-d4a30c0406f0,45.0,83
1,47e96791-9e80-4638-b63a-a00903ce8508,30.0,80
2,a6db6331-93ae-48a4-ab5d-04cd34cf3989,25.0,77
3,3307e79a-3565-478c-8749-b095d12c7e56,20.5,80
4,467e6ace-0fab-4f54-a773-979058184ac2,25.0,87


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 131.48975
Swaps needed for first 10 out of 100000 pseudo-random runs:
[170, 163, 183, 202, 168, 147, 195, 182, 159, 202]
Pseudo-random, avergae swaps needed: 182.60821

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,1ebf5189-e22f-4dfd-9f72-5ab06b37ca76,9.0,6.5,80,83
1,eecc8b68-5ac3-4146-8346-f775be3eceef,14.0,4.0,90,83
2,cbcfa92d-9d08-4487-84f4-9431b21fc8fa,13.5,4.5,83,83
3,25f7b700-3561-4102-ba38-4753c54f9eb0,11.0,5.0,83,83
4,5ff78e7f-5dbb-4fce-a97d-f2dc6c5a6fbe,12.0,6.0,83,83


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 1024.01478
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1384, 1308, 1303, 1302, 1298, 1362, 1207, 1375, 1230, 1514]
Pseudo-random, avergae swaps needed: 1324.45789

Q2
Mean Words: 570
Algo score, swaps needed: 1129.23608
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1216, 1280, 1276, 1289, 1081, 1243, 1201, 1239, 1133, 1292]
Pseudo-random, avergae swaps needed: 1219.24489

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,b21e248a-45c5-43ae-b06a-10261395fd1c,66,28,33,87,77,80
1,1b273341-a255-40ef-9a90-17fbdf05af4b,48,39,31,80,83,87
2,a7bf4849-6187-42da-8439-600dce189d0f,37,28,19,73,83,83
3,30234d01-ef3f-41bf-ae4a-1efaba3522d8,72,41,23,83,87,87
4,fbc35066-d18b-49cc-9540-9ec417e0e160,38,25,31,87,80,83


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1279.13438
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1659, 1575, 1460, 1316, 1404, 1372, 1392, 1448, 1530, 1249]
Pseudo-random, avergae swaps needed: 1472.41306

Q2
Mean Words: 1050
Algo score, swaps needed: 1111.52218
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1551, 1425, 1617, 1418, 1379, 1218, 1516, 1339, 1546, 1557]
Pseudo-random, avergae swaps needed: 1449.40899

Q3
Mean Words: 839
Algo score, swaps needed: 947.0725
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1365, 1473, 1398, 1606, 1517, 1234, 1233, 1528, 1500, 1492]
Pseudo-random, avergae swaps needed: 1457.26523

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,f234936b-f804-47e5-b956-9f3c9feef83c,20.5,22.0,83,87
1,83c20b41-645b-4931-97bf-870d498e43de,14.0,8.0,80,83
2,0091731f-bae9-4326-a66a-a744137584a2,13.5,8.0,83,77
3,858981e3-e43d-4777-ac97-16678c7f6ac0,19.0,20.0,90,90
4,85ca12a0-f4e5-4045-8350-676e0e159221,22.0,17.0,80,83


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 2023.38838
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2152, 2110, 1868, 2089, 2156, 2139, 1961, 1961, 2235, 1950]
Pseudo-random, avergae swaps needed: 2045.98698

Q2
Mean Words: 2137
Algo score, swaps needed: 1759.92724
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2193, 1875, 1769, 2061, 1820, 1979, 1942, 2009, 1967, 1817]
Pseudo-random, avergae swaps needed: 1993.77345

Swaps (Machine):
[1441.96402, 1165.3767, 1284.64089, 1217.07259, 1703.51235, 131.48975, 1024.01478, 1129.23608, 1279.13438, 1111.52218, 947.0725, 2023.38838, 1759.92724]
N: 13 	Mean:  1247.5655261538461 	Var:  195273.5144029216
Avergae swaps (Pseudo-random):
[1499.1106, 1566.27993, 1560.16865, 1837.17634, 1757.39543, 182.60821, 1324.45789, 1219.24489, 1472.41306, 1449.40899, 1457.26523, 2045.98698, 1993.77345]
N: 13 	Mean:  1489.6376653846155 	Var:  198367.60072464152 

1 ) 1499.1106 > 1441.96402 => 3.886102012603821


In [17]:
score_exams(exams,model='nlp1',normv=1,score=2,goal="medoid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,02694d63-a2bc-45b7-98fd-77c89066e667,-1.280052,-1.074762,-0.535036,1.229655,-1.120584,0.033619
1,8b2e671a-8164-4467-9f19-a6dc7cbf313c,-0.719661,0.367653,0.205903,0.555071,0.557392,0.432039
2,07ac7940-e702-498a-a97c-333cb233a847,-1.280052,-0.971732,-1.572349,-0.494572,0.849569,-0.376901
3,499fc744-a3e7-4927-afd5-b4df47ca7874,-1.840443,0.161594,-0.683223,-1.526789,1.02449,0.133701
4,0adab6a0-9008-4fdb-9fb7-a0be1edcda37,0.401122,-0.765673,-0.979599,-1.051691,0.474231,0.012859


Number of entries: 81

SHORT_ANS
Mean Words: 436


Unnamed: 0,ID,judge,score
0,02694d63-a2bc-45b7-98fd-77c89066e667,Human,-1.280052
1,8b2e671a-8164-4467-9f19-a6dc7cbf313c,Human,-0.719661
2,7ab0ea89-c22f-4ca6-bd74-c3aec57452f8,Human,-0.999856
3,07ac7940-e702-498a-a97c-333cb233a847,Human,-1.280052
4,499fc744-a3e7-4927-afd5-b4df47ca7874,Human,-1.840443


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.137978,1.320126,80,81,0.107329,"[-0.08, 0.34]"
ICC2,Single random raters,0.133377,1.304064,80,80,0.118631,"[-0.09, 0.34]"
ICC3,Single fixed raters,0.131969,1.304064,80,80,0.118631,"[-0.09, 0.34]"
ICC1k,Average raters absolute,0.242496,1.320126,80,81,0.107329,"[-0.18, 0.51]"
ICC2k,Average random raters,0.235362,1.304064,80,80,0.118631,"[-0.2, 0.51]"
ICC3k,Average fixed raters,0.233167,1.304064,80,80,0.118631,"[-0.19, 0.51]"


Cohen's kappa for rounded(z scores x 10): 0.16994789109397024
Algo score, swaps needed: 1320.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1474, 1561, 1577, 1650, 1573, 1428, 1492, 1199, 1508, 1441]
Pseudo-random, avergae swaps needed: 1498.90842

Q1
Mean Words: 2048


Unnamed: 0,ID,judge,score
0,02694d63-a2bc-45b7-98fd-77c89066e667,Human,-1.074762
1,8b2e671a-8164-4467-9f19-a6dc7cbf313c,Human,0.367653
2,7ab0ea89-c22f-4ca6-bd74-c3aec57452f8,Human,-0.765673
3,07ac7940-e702-498a-a97c-333cb233a847,Human,-0.971732
4,499fc744-a3e7-4927-afd5-b4df47ca7874,Human,0.161594


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.34895,2.07196,80,81,0.000628,"[0.14, 0.53]"
ICC2,Single random raters,0.346292,2.046403,80,80,0.000791,"[0.14, 0.52]"
ICC3,Single fixed raters,0.343488,2.046403,80,80,0.000791,"[0.14, 0.52]"
ICC1k,Average raters absolute,0.517365,2.07196,80,81,0.000628,"[0.25, 0.69]"
ICC2k,Average random raters,0.514438,2.046403,80,80,0.000791,"[0.24, 0.69]"
ICC3k,Average fixed raters,0.511338,2.046403,80,80,0.000791,"[0.24, 0.69]"


Cohen's kappa for rounded(z scores x 10): 0.3545040820172465
Algo score, swaps needed: 1231.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1566, 1671, 1587, 1454, 1224, 1567, 1473, 1583, 1492, 1568]
Pseudo-random, avergae swaps needed: 1566.7739

Q2
Mean Words: 947


Unnamed: 0,ID,judge,score
0,02694d63-a2bc-45b7-98fd-77c89066e667,Human,-0.535036
1,8b2e671a-8164-4467-9f19-a6dc7cbf313c,Human,0.205903
2,7ab0ea89-c22f-4ca6-bd74-c3aec57452f8,Human,-0.386848
3,07ac7940-e702-498a-a97c-333cb233a847,Human,-1.572349
4,499fc744-a3e7-4927-afd5-b4df47ca7874,Human,-0.683223


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.42947,2.505512,80,81,2.6e-05,"[0.23, 0.59]"
ICC2,Single random raters,0.427747,2.479341,80,80,3.4e-05,"[0.23, 0.59]"
ICC3,Single fixed raters,0.425178,2.479341,80,80,3.4e-05,"[0.23, 0.59]"
ICC1k,Average raters absolute,0.60088,2.505512,80,81,2.6e-05,"[0.38, 0.74]"
ICC2k,Average random raters,0.599191,2.479341,80,80,3.4e-05,"[0.38, 0.74]"
ICC3k,Average fixed raters,0.596667,2.479341,80,80,3.4e-05,"[0.37, 0.74]"


Cohen's kappa for rounded(z scores x 10): 0.4491160108350567
Algo score, swaps needed: 976.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1571, 1506, 1777, 1670, 1781, 1536, 1645, 1401, 1514, 1447]
Pseudo-random, avergae swaps needed: 1559.69847

../data/property_instructor_B

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,51373c65-23cf-4027-bf63-ab9c1ce537b4,-0.866162,0.65718,0.467738,0.504613
1,83af18c1-132e-4728-b685-77436f376b67,0.78742,0.957513,0.460779,1.178211
2,0b4a8297-98e1-4153-bddb-d6562cd5041e,-0.748049,0.957513,-0.008358,-0.030989
3,05baba87-2dab-4b20-b95e-1150bd449ba4,0.078742,-0.544152,0.056882,0.289218
4,f1a855a4-dfef-48b6-99bc-b5e2324ce2fc,1.496097,-0.243819,0.308445,-1.200206


Number of entries: 88

Q1
Mean Words: 1431


Unnamed: 0,ID,judge,score
0,51373c65-23cf-4027-bf63-ab9c1ce537b4,Human,-0.866162
1,83af18c1-132e-4728-b685-77436f376b67,Human,0.78742
2,0b4a8297-98e1-4153-bddb-d6562cd5041e,Human,-0.748049
3,05baba87-2dab-4b20-b95e-1150bd449ba4,Human,0.078742
4,f1a855a4-dfef-48b6-99bc-b5e2324ce2fc,Human,1.496097


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.407178,2.373694,87,88,3.5e-05,"[0.22, 0.57]"
ICC2,Single random raters,0.405249,2.348002,87,87,4.6e-05,"[0.21, 0.57]"
ICC3,Single fixed raters,0.402629,2.348002,87,87,4.6e-05,"[0.21, 0.56]"
ICC1k,Average raters absolute,0.578716,2.373694,87,88,3.5e-05,"[0.36, 0.72]"
ICC2k,Average random raters,0.576765,2.348002,87,87,4.6e-05,"[0.35, 0.72]"
ICC3k,Average fixed raters,0.574106,2.348002,87,87,4.6e-05,"[0.35, 0.72]"


Cohen's kappa for rounded(z scores x 10): 0.47507119479746673
Algo score, swaps needed: 1205.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1643, 1502, 1963, 1926, 2102, 1804, 2026, 2149, 1937, 1971]
Pseudo-random, avergae swaps needed: 1838.54332

Q2
Mean Words: 1546


Unnamed: 0,ID,judge,score
0,51373c65-23cf-4027-bf63-ab9c1ce537b4,Human,0.65718
1,83af18c1-132e-4728-b685-77436f376b67,Human,0.957513
2,0b4a8297-98e1-4153-bddb-d6562cd5041e,Human,0.957513
3,05baba87-2dab-4b20-b95e-1150bd449ba4,Human,-0.544152
4,f1a855a4-dfef-48b6-99bc-b5e2324ce2fc,Human,-0.243819


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.285825,1.800433,87,88,0.003216,"[0.08, 0.47]"
ICC2,Single random raters,0.283259,1.78256,87,87,0.003795,"[0.08, 0.46]"
ICC3,Single fixed raters,0.281237,1.78256,87,87,0.003795,"[0.08, 0.46]"
ICC1k,Average raters absolute,0.444578,1.800433,87,88,0.003216,"[0.15, 0.64]"
ICC2k,Average random raters,0.441468,1.78256,87,87,0.003795,"[0.14, 0.63]"
ICC3k,Average fixed raters,0.439009,1.78256,87,87,0.003795,"[0.14, 0.63]"


Cohen's kappa for rounded(z scores x 10): 0.3682931944811738
Algo score, swaps needed: 1235.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1635, 1698, 1366, 1886, 1916, 1924, 1624, 1526, 1797, 1838]
Pseudo-random, avergae swaps needed: 1756.92053

../data/environ_instructor_B

['ID', 'Q2']
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q2_x,Q2_y
0,799aa117-453f-459f-8d42-dfc84e23f308,-0.182505,0.202074
1,f1a75c07-eda1-4552-907a-c976d9ae7c0c,1.607446,-0.942145
2,3307e79a-3565-478c-8749-b095d12c7e56,-0.95815,-0.540016
3,ab1fbbba-d6e0-4551-aed0-3fc164fa8570,-0.182505,-0.305673
4,c1dcfb9e-4f1e-4c8b-9ab6-7290cd1c035b,0.294815,0.12668


Number of entries: 28

Q2
Mean Words: 2094


Unnamed: 0,ID,judge,score
0,799aa117-453f-459f-8d42-dfc84e23f308,Human,-0.182505
1,f1a75c07-eda1-4552-907a-c976d9ae7c0c,Human,1.607446
2,4852efe1-c79c-4b78-94b2-de3958cdf5aa,Human,-0.779155
3,3307e79a-3565-478c-8749-b095d12c7e56,Human,-0.95815
4,ab1fbbba-d6e0-4551-aed0-3fc164fa8570,Human,-0.182505


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,-0.10323,0.812859,27,28,0.703723,"[-0.45, 0.27]"
ICC2,Single random raters,-0.123284,0.787375,27,27,0.730599,"[-0.49, 0.27]"
ICC3,Single fixed raters,-0.118959,0.787375,27,27,0.730599,"[-0.47, 0.26]"
ICC1k,Average raters absolute,-0.230226,0.812859,27,28,0.703723,"[-1.63, 0.43]"
ICC2k,Average random raters,-0.281241,0.787375,27,27,0.730599,"[-1.91, 0.42]"
ICC3k,Average fixed raters,-0.270043,0.787375,27,27,0.730599,"[-1.74, 0.41]"


Cohen's kappa for rounded(z scores x 10): -0.0984864410342654
Algo score, swaps needed: 191.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[177, 154, 157, 214, 116, 208, 193, 162, 137, 185]
Pseudo-random, avergae swaps needed: 182.4611

../data/PR_instructor_C

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,f7422fbb-df4a-4788-8d19-03ca5ab472f4,-2.134996,-0.498226,-1.00946,0.428823
1,f835cc4c-68ff-4a67-a6a8-17c0d3997a2a,1.52901,0.420724,-0.427363,0.366663
2,330e4993-30e4-4192-9e02-07e960fad404,1.695556,1.339674,-0.151769,0.18163
3,22e8b506-0aa5-400a-aca0-ad1457dc1f19,0.529736,-1.417176,-0.107893,0.056022
4,473b0dda-6847-467e-8589-32a882dcfde1,0.196644,0.420724,-1.03341,-0.244192


Number of entries: 75

Q1
Mean Words: 936


Unnamed: 0,ID,judge,score
0,f7422fbb-df4a-4788-8d19-03ca5ab472f4,Human,-2.134996
1,f835cc4c-68ff-4a67-a6a8-17c0d3997a2a,Human,1.52901
2,330e4993-30e4-4192-9e02-07e960fad404,Human,1.695556
3,457c5af7-4834-422d-927e-fe3c52bcc0f6,Human,0.196644
4,22e8b506-0aa5-400a-aca0-ad1457dc1f19,Human,0.529736


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.30487,1.877158,74,75,0.003581,"[0.09, 0.5]"
ICC2,Single random raters,0.301607,1.852265,74,74,0.004372,"[0.08, 0.49]"
ICC3,Single fixed raters,0.298803,1.852265,74,74,0.004372,"[0.08, 0.49]"
ICC1k,Average raters absolute,0.46728,1.877158,74,75,0.003581,"[0.16, 0.66]"
ICC2k,Average random raters,0.463438,1.852265,74,74,0.004372,"[0.15, 0.66]"
ICC3k,Average fixed raters,0.46012,1.852265,74,74,0.004372,"[0.15, 0.66]"


Cohen's kappa for rounded(z scores x 10): 0.25088830064062484
Algo score, swaps needed: 1107.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1293, 1303, 1224, 1403, 1239, 1260, 1380, 1331, 1408, 1393]
Pseudo-random, avergae swaps needed: 1323.96503

Q2
Mean Words: 570


Unnamed: 0,ID,judge,score
0,f7422fbb-df4a-4788-8d19-03ca5ab472f4,Human,-0.498226
1,f835cc4c-68ff-4a67-a6a8-17c0d3997a2a,Human,0.420724
2,330e4993-30e4-4192-9e02-07e960fad404,Human,1.339674
3,457c5af7-4834-422d-927e-fe3c52bcc0f6,Human,1.339674
4,22e8b506-0aa5-400a-aca0-ad1457dc1f19,Human,-1.417176


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.161149,1.384214,74,75,0.081296,"[-0.07, 0.37]"
ICC2,Single random raters,0.156428,1.365987,74,74,0.091065,"[-0.07, 0.37]"
ICC3,Single fixed raters,0.154687,1.365987,74,74,0.091065,"[-0.07, 0.37]"
ICC1k,Average raters absolute,0.277568,1.384214,74,75,0.081296,"[-0.14, 0.54]"
ICC2k,Average random raters,0.270536,1.365987,74,74,0.091065,"[-0.16, 0.54]"
ICC3k,Average fixed raters,0.267929,1.365987,74,74,0.091065,"[-0.16, 0.54]"


Cohen's kappa for rounded(z scores x 10): 0.13987914505579846
Algo score, swaps needed: 1026.9973
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1227, 1247, 1206, 1162, 1225, 1022, 1158, 1218, 1310, 1174]
Pseudo-random, avergae swaps needed: 1219.80319

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,aa684706-1bbf-47a8-8828-62f8fde0f3c7,0.414326,0.041273,-0.075,-0.841008,-1.60608,-0.522871
1,bb065bd5-206a-4c0f-ba0f-f193651f5ac6,1.032264,-0.369096,1.580605,-0.876663,-0.508243,0.291306
2,e039c490-f389-4a42-a88e-15fc431b7865,0.414326,-0.266504,-0.464554,0.368989,-0.91404,0.246203
3,55530149-172a-4627-ba6a-99a4f2ae48c9,-0.958869,0.041273,0.509331,0.265634,0.15163,-0.018894
4,0193ec10-5235-470a-8037-73db98e4f036,0.345666,-0.266504,0.119777,0.255167,-0.76298,-0.607294


Number of entries: 78

Q1
Mean Words: 1823


Unnamed: 0,ID,judge,score
0,aa684706-1bbf-47a8-8828-62f8fde0f3c7,Human,0.414326
1,cf60d4fc-75ab-45fb-97ab-775ce4333299,Human,-1.164848
2,bb065bd5-206a-4c0f-ba0f-f193651f5ac6,Human,1.032264
3,e039c490-f389-4a42-a88e-15fc431b7865,Human,0.414326
4,55530149-172a-4627-ba6a-99a4f2ae48c9,Human,-0.958869


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.071258,1.153451,77,78,0.26544,"[-0.15, 0.29]"
ICC2,Single random raters,0.065783,1.139076,77,77,0.284585,"[-0.16, 0.28]"
ICC3,Single fixed raters,0.065017,1.139076,77,77,0.284585,"[-0.16, 0.28]"
ICC1k,Average raters absolute,0.133037,1.153451,77,78,0.26544,"[-0.36, 0.45]"
ICC2k,Average random raters,0.123446,1.139076,77,77,0.284585,"[-0.38, 0.44]"
ICC3k,Average fixed raters,0.122095,1.139076,77,77,0.284585,"[-0.38, 0.44]"


Cohen's kappa for rounded(z scores x 10): 0.06282814994898334
Algo score, swaps needed: 1395.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1420, 1594, 1488, 1392, 1551, 1455, 1469, 1517, 1471, 1520]
Pseudo-random, avergae swaps needed: 1472.55787

Q2
Mean Words: 1050


Unnamed: 0,ID,judge,score
0,aa684706-1bbf-47a8-8828-62f8fde0f3c7,Human,0.041273
1,cf60d4fc-75ab-45fb-97ab-775ce4333299,Human,-0.369096
2,bb065bd5-206a-4c0f-ba0f-f193651f5ac6,Human,-0.369096
3,e039c490-f389-4a42-a88e-15fc431b7865,Human,-0.266504
4,55530149-172a-4627-ba6a-99a4f2ae48c9,Human,0.041273


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.237117,1.621635,77,78,0.017324,"[0.02, 0.44]"
ICC2,Single random raters,0.233684,1.602808,77,77,0.020002,"[0.01, 0.43]"
ICC3,Single fixed raters,0.231599,1.602808,77,77,0.020002,"[0.01, 0.43]"
ICC1k,Average raters absolute,0.383338,1.621635,77,78,0.017324,"[0.03, 0.61]"
ICC2k,Average random raters,0.378839,1.602808,77,77,0.020002,"[0.02, 0.6]"
ICC3k,Average fixed raters,0.376095,1.602808,77,77,0.020002,"[0.02, 0.6]"


Cohen's kappa for rounded(z scores x 10): 0.2186304510280832
Algo score, swaps needed: 1176.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1557, 1458, 1541, 1435, 1422, 1508, 1454, 1342, 1443, 1460]
Pseudo-random, avergae swaps needed: 1450.1969

Q3
Mean Words: 839


Unnamed: 0,ID,judge,score
0,aa684706-1bbf-47a8-8828-62f8fde0f3c7,Human,-0.075
1,cf60d4fc-75ab-45fb-97ab-775ce4333299,Human,-0.367166
2,bb065bd5-206a-4c0f-ba0f-f193651f5ac6,Human,1.580605
3,e039c490-f389-4a42-a88e-15fc431b7865,Human,-0.464554
4,55530149-172a-4627-ba6a-99a4f2ae48c9,Human,0.509331


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.505779,3.046773,77,78,8.944293e-07,"[0.32, 0.65]"
ICC2,Single random raters,0.50419,3.007752,77,77,1.287034e-06,"[0.32, 0.65]"
ICC3,Single fixed raters,0.500967,3.007752,77,77,1.287034e-06,"[0.31, 0.65]"
ICC1k,Average raters absolute,0.671784,3.046773,77,78,8.944293e-07,"[0.49, 0.79]"
ICC2k,Average random raters,0.67038,3.007752,77,77,1.287034e-06,"[0.48, 0.79]"
ICC3k,Average fixed raters,0.667526,3.007752,77,77,1.287034e-06,"[0.48, 0.79]"


Cohen's kappa for rounded(z scores x 10): 0.49625066373222937
Algo score, swaps needed: 942.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1647, 1209, 1235, 1466, 1296, 1677, 1583, 1400, 1502, 1450]
Pseudo-random, avergae swaps needed: 1458.46293

../data/crim_instructor_E

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,59b5388b-5b2b-430c-a246-0d8b83249966,0.176445,0.836545,0.990707,0.671642
1,e2459d8b-2f3f-469e-b9f9-f5acfe633b0d,1.298213,1.80063,-0.421015,-0.399676
2,0ab76566-bd98-4796-b32d-f23794084401,1.129947,-0.12754,0.669283,0.465763
3,af5f9a64-909b-4e1b-8100-e82f9813b148,1.186036,1.414996,0.801939,1.342865
4,f0166970-fdaf-4f6b-9ae6-e8e25ef8b198,0.00818,-0.513174,-0.564429,-0.736895


Number of entries: 92

Q1
Mean Words: 3353


Unnamed: 0,ID,judge,score
0,59b5388b-5b2b-430c-a246-0d8b83249966,Human,0.176445
1,e2459d8b-2f3f-469e-b9f9-f5acfe633b0d,Human,1.298213
2,0ab76566-bd98-4796-b32d-f23794084401,Human,1.129947
3,af5f9a64-909b-4e1b-8100-e82f9813b148,Human,1.186036
4,f0166970-fdaf-4f6b-9ae6-e8e25ef8b198,Human,0.00818


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.038908,1.080967,91,92,0.355056,"[-0.17, 0.24]"
ICC2,Single random raters,0.034047,1.069763,91,91,0.374203,"[-0.17, 0.24]"
ICC3,Single fixed raters,0.033706,1.069763,91,91,0.374203,"[-0.17, 0.24]"
ICC1k,Average raters absolute,0.074902,1.080967,91,92,0.355056,"[-0.4, 0.39]"
ICC2k,Average random raters,0.065852,1.069763,91,91,0.374203,"[-0.42, 0.38]"
ICC3k,Average fixed raters,0.065214,1.069763,91,91,0.374203,"[-0.41, 0.38]"


Cohen's kappa for rounded(z scores x 10): 0.1003820015488901
Algo score, swaps needed: 1757.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2043, 2059, 1988, 2385, 2113, 2180, 1834, 2101, 1961, 2254]
Pseudo-random, avergae swaps needed: 2045.23351

Q2
Mean Words: 2137


Unnamed: 0,ID,judge,score
0,59b5388b-5b2b-430c-a246-0d8b83249966,Human,0.836545
1,e2459d8b-2f3f-469e-b9f9-f5acfe633b0d,Human,1.80063
2,0ab76566-bd98-4796-b32d-f23794084401,Human,-0.12754
3,af5f9a64-909b-4e1b-8100-e82f9813b148,Human,1.414996
4,f0166970-fdaf-4f6b-9ae6-e8e25ef8b198,Human,-0.513174


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.095113,1.21022,91,92,0.181434,"[-0.11, 0.29]"
ICC2,Single random raters,0.091335,1.199202,91,91,0.193938,"[-0.12, 0.29]"
ICC3,Single fixed raters,0.090579,1.199202,91,91,0.193938,"[-0.12, 0.29]"
ICC1k,Average raters absolute,0.173704,1.21022,91,92,0.181434,"[-0.25, 0.45]"
ICC2k,Average random raters,0.167383,1.199202,91,91,0.193938,"[-0.26, 0.45]"
ICC3k,Average fixed raters,0.166112,1.199202,91,91,0.193938,"[-0.26, 0.45]"


Cohen's kappa for rounded(z scores x 10): 0.14589349325298806
Algo score, swaps needed: 1715.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2004, 1996, 1802, 2049, 2284, 1970, 1738, 2122, 2075, 1794]
Pseudo-random, avergae swaps needed: 1995.11491

Swaps (Machine):
[1320.0, 1231.0, 976.0, 1205.0, 1235.0, 191.0, 1107.0, 1026.9973, 1395.0, 1176.0, 942.0, 1757.0, 1715.0]
N: 13 	Mean:  1175.1536384615383 	Var:  138158.49941116854
Avergae swaps (Pseudo-random):
[1498.90842, 1566.7739, 1559.69847, 1838.54332, 1756.92053, 182.4611, 1323.96503, 1219.80319, 1472.55787, 1450.1969, 1458.46293, 2045.23351, 1995.11491]
N: 13 	Mean:  1489.8953907692307 	Var:  198469.2821672779 

1 ) 1498.90842 > 1320.0 => 12.693453872474509
2 ) 1566.7739 > 1231.0 => 24.002933189132968
3 ) 1559.69847 > 976.0 => 46.03847633350506
4 ) 1838.54332 > 1205.0 => 41.63195679435902
5 ) 1756.92053 > 1235.0 => 34.88866263436483
6 ) 182.4611 > 191.0 => -4.57284573948934
7 ) 1323.96503 > 1107.0 => 17.8501152688321

In [18]:
score_exams(exams,model='nlp2',normv=1,score=1,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,e021f764-a5d2-432a-89f5-00ba2b760f0f,9,31,23,87,87,87
1,4eb794d7-21bd-4bcd-96e7-6807a2197049,5,8,5,77,73,80
2,7a25c1e8-4bd9-438f-ad5b-393ad679a0ab,3,13,11,70,83,80
3,b8efdf86-6377-4fc4-b0b3-64d1bf7df43a,5,27,18,80,80,87
4,351e0d0e-9901-4878-9d31-decad062eb91,2,21,6,87,90,70


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 1002.03083
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1519, 1547, 1536, 1539, 1672, 1510, 1401, 1408, 1372, 1525]
Pseudo-random, avergae swaps needed: 1498.81278

Q1
Mean Words: 2048
Algo score, swaps needed: 1193.03933
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1555, 1660, 1662, 1749, 1595, 1720, 1584, 1280, 1739, 1650]
Pseudo-random, avergae swaps needed: 1566.86461

Q2
Mean Words: 947
Algo score, swaps needed: 970.57382
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1492, 1512, 1656, 1665, 1685, 1666, 1411, 1295, 1426, 1706]
Pseudo-random, avergae swaps needed: 1560.04081

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,68202246-13a0-4c1f-a238-2590685176fa,20.0,21.0,90,87
1,d6db68eb-a97f-49b6-b974-8146d81c7417,13.5,13.0,80,80
2,d14fe6dc-ad38-4146-815b-82d3e66692cf,19.5,23.0,87,83
3,165f436c-ce5e-4cf6-8005-5c96b0801ab8,18.0,21.0,83,83
4,a6f29834-7a4e-4947-a2f0-3daeb1b23284,18.0,20.0,80,83


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1283.03442
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1884, 1924, 1945, 1884, 1731, 1831, 1816, 1727, 1834, 1888]
Pseudo-random, avergae swaps needed: 1837.85914

Q2
Mean Words: 1546
Algo score, swaps needed: 1230.06142
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2108, 1767, 1512, 1821, 1741, 1660, 1653, 1618, 1630, 1764]
Pseudo-random, avergae swaps needed: 1757.28153

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,c6f9b6cb-c5b8-4622-bdad-51aabad768cd,24.0,90
1,981c3959-fe3b-4222-97de-742768dee8e9,23.0,80
2,2872fa41-b643-452a-b6c6-87f4e140f336,43.0,87
3,dda68a48-2475-4338-9956-67fc7efbf74b,37.0,90
4,8fee9d29-68ff-4b2d-b54d-58f8303249ff,29.5,83


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 91.00509
Swaps needed for first 10 out of 100000 pseudo-random runs:
[168, 193, 175, 192, 212, 191, 228, 212, 191, 175]
Pseudo-random, avergae swaps needed: 182.40981

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,a4a3f431-59a4-4fd1-b7fa-41bb5effd731,10.5,3.5,77,83
1,73c7a609-ce12-4e96-b1f4-58059cb60d4a,13.5,5.0,77,73
2,1ebf5189-e22f-4dfd-9f72-5ab06b37ca76,9.0,6.5,80,90
3,64ec3c75-b047-4c85-8b3e-99c63a1953aa,5.5,3.0,70,77
4,098adb0c-c0ab-4a2d-a590-8d917e3604f2,12.5,4.0,83,80


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 919.00724
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1498, 1277, 1260, 1280, 1377, 1230, 1555, 1374, 1264, 1274]
Pseudo-random, avergae swaps needed: 1324.88911

Q2
Mean Words: 570
Algo score, swaps needed: 834.03755
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1213, 1156, 1373, 1219, 1103, 1335, 1077, 1069, 1339, 1345]
Pseudo-random, avergae swaps needed: 1219.81396

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,ba674cb4-50fb-4023-8bb7-5ae68452d116,55,40,41,83,93,87
1,ab397bfe-7c68-4f02-baee-46407830b974,36,28,22,87,87,83
2,11db58c4-af5b-4ffb-a2be-a807e6d3b0f6,34,26,22,83,87,80
3,09e22550-57f9-46d9-b5e2-63a3b0f30890,43,31,23,83,83,83
4,c013d719-f4b0-4788-8955-41e80cd92edf,41,32,41,83,87,87


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1139.95249
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1473, 1248, 1600, 1508, 1557, 1409, 1458, 1547, 1574, 1450]
Pseudo-random, avergae swaps needed: 1471.74111

Q2
Mean Words: 1050
Algo score, swaps needed: 908.52944
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1438, 1543, 1439, 1637, 1741, 1580, 1491, 1531, 1358, 1271]
Pseudo-random, avergae swaps needed: 1450.04051

Q3
Mean Words: 839
Algo score, swaps needed: 857.42491
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1573, 1620, 1499, 1287, 1396, 1525, 1296, 1598, 1455, 1386]
Pseudo-random, avergae swaps needed: 1458.2128

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,5588c2b6-6cba-4955-a41c-b54a085d6751,15.5,15.0,83,80
1,922969bb-2109-4d45-8d53-919f843faa07,23.0,16.0,77,83
2,19504cf2-0e8e-47e7-8cd8-37ccc42cc860,20.5,15.0,83,83
3,59b5388b-5b2b-430c-a246-0d8b83249966,19.0,20.0,83,80
4,f9b1bd52-ca9b-4c23-8758-7dcb7006a1b1,6.25,12.0,77,77


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1745.88055
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2349, 1807, 2094, 2042, 2084, 1777, 1822, 2271, 1952, 2124]
Pseudo-random, avergae swaps needed: 2045.07331

Q2
Mean Words: 2137
Algo score, swaps needed: 1731.0502
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1862, 1824, 1812, 1953, 1979, 1899, 1893, 2309, 1853, 1980]
Pseudo-random, avergae swaps needed: 1993.76677

Swaps (Machine):
[1002.03083, 1193.03933, 970.57382, 1283.03442, 1230.06142, 91.00509, 919.00724, 834.03755, 1139.95249, 908.52944, 857.42491, 1745.88055, 1731.0502]
N: 13 	Mean:  1069.6636376923077 	Var:  162116.16715370378
Avergae swaps (Pseudo-random):
[1498.81278, 1566.86461, 1560.04081, 1837.85914, 1757.28153, 182.40981, 1324.88911, 1219.81396, 1471.74111, 1450.04051, 1458.2128, 2045.07331, 1993.76677]
N: 13 	Mean:  1489.7543269230769 	Var:  198324.48312159677 

1 ) 1498.81278 > 1002.03083 => 39.729149636829945
2 

In [19]:
score_exams(exams,model='nlp2',normv=1,score=2,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,7a4dfc2e-553b-4bc6-bc18-3b0626c17a1e,-0.439465,-0.559614,0.502278,0.807046,1.343111,-0.026743
1,d2054499-4dd2-4ea1-9dae-7751c37f58e5,0.120927,-1.074762,0.205903,1.037645,0.22043,0.220944
2,f09ccb39-f576-42af-9803-e5d3c10fdfba,-0.159269,1.19189,0.798654,-1.40979,0.657383,0.681871
3,341a81d8-db2d-4017-95ca-f11d39c62331,-0.439465,-1.383851,-0.683223,-1.035278,0.005974,0.18316
4,68975dd8-703c-43ba-8247-1e6cde4fe067,0.120927,-0.147495,0.798654,0.28859,0.551271,0.357628


Number of entries: 81

SHORT_ANS
Mean Words: 436


Unnamed: 0,ID,judge,score
0,7a4dfc2e-553b-4bc6-bc18-3b0626c17a1e,Human,-0.439465
1,d2054499-4dd2-4ea1-9dae-7751c37f58e5,Human,0.120927
2,f09ccb39-f576-42af-9803-e5d3c10fdfba,Human,-0.159269
3,341a81d8-db2d-4017-95ca-f11d39c62331,Human,-0.439465
4,68975dd8-703c-43ba-8247-1e6cde4fe067,Human,0.120927


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.469717,2.771568,80,81,4e-06,"[0.28, 0.62]"
ICC2,Single random raters,0.467995,2.738156,80,80,5e-06,"[0.28, 0.62]"
ICC3,Single fixed raters,0.464977,2.738156,80,80,5e-06,"[0.28, 0.62]"
ICC1k,Average raters absolute,0.639193,2.771568,80,81,4e-06,"[0.44, 0.77]"
ICC2k,Average random raters,0.637598,2.738156,80,80,5e-06,"[0.44, 0.77]"
ICC3k,Average fixed raters,0.634791,2.738156,80,80,5e-06,"[0.43, 0.77]"


Cohen's kappa for rounded(z scores x 10): 0.44649725553974373
Algo score, swaps needed: 1004.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1407, 1646, 1709, 1520, 1446, 1424, 1600, 1612, 1681, 1355]
Pseudo-random, avergae swaps needed: 1499.66689

Q1
Mean Words: 2048


Unnamed: 0,ID,judge,score
0,7a4dfc2e-553b-4bc6-bc18-3b0626c17a1e,Human,-0.559614
1,d2054499-4dd2-4ea1-9dae-7751c37f58e5,Human,-1.074762
2,f09ccb39-f576-42af-9803-e5d3c10fdfba,Human,1.19189
3,341a81d8-db2d-4017-95ca-f11d39c62331,Human,-1.383851
4,68975dd8-703c-43ba-8247-1e6cde4fe067,Human,-0.147495


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.457011,2.683315,80,81,7e-06,"[0.27, 0.61]"
ICC2,Single random raters,0.455164,2.650223,80,80,1e-05,"[0.26, 0.61]"
ICC3,Single fixed raters,0.452088,2.650223,80,80,1e-05,"[0.26, 0.61]"
ICC1k,Average raters absolute,0.627327,2.683315,80,81,7e-06,"[0.42, 0.76]"
ICC2k,Average random raters,0.625584,2.650223,80,80,1e-05,"[0.42, 0.76]"
ICC3k,Average fixed raters,0.622673,2.650223,80,80,1e-05,"[0.41, 0.76]"


Cohen's kappa for rounded(z scores x 10): 0.43873165193414976
Algo score, swaps needed: 1176.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1615, 1477, 1725, 1751, 1514, 1830, 1620, 1521, 1481, 1526]
Pseudo-random, avergae swaps needed: 1566.66396

Q2
Mean Words: 947


Unnamed: 0,ID,judge,score
0,7a4dfc2e-553b-4bc6-bc18-3b0626c17a1e,Human,0.502278
1,d2054499-4dd2-4ea1-9dae-7751c37f58e5,Human,0.205903
2,f09ccb39-f576-42af-9803-e5d3c10fdfba,Human,0.798654
3,341a81d8-db2d-4017-95ca-f11d39c62331,Human,-0.683223
4,68975dd8-703c-43ba-8247-1e6cde4fe067,Human,0.798654


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.533812,3.290118,80,81,1.06954e-07,"[0.36, 0.67]"
ICC2,Single random raters,0.53271,3.257148,80,80,1.504265e-07,"[0.36, 0.67]"
ICC3,Single fixed raters,0.530202,3.257148,80,80,1.504265e-07,"[0.35, 0.67]"
ICC1k,Average raters absolute,0.696059,3.290118,80,81,1.06954e-07,"[0.53, 0.8]"
ICC2k,Average random raters,0.695122,3.257148,80,80,1.504265e-07,"[0.53, 0.8]"
ICC3k,Average fixed raters,0.692983,3.257148,80,80,1.504265e-07,"[0.52, 0.8]"


Cohen's kappa for rounded(z scores x 10): 0.5623944153304041
Algo score, swaps needed: 898.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1656, 1621, 1500, 1456, 1641, 1532, 1604, 1468, 1524, 1539]
Pseudo-random, avergae swaps needed: 1560.97148

../data/property_instructor_B

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,8a2c52d8-2f79-4f23-bbca-9dacca67bf27,-1.811065,-1.144818,-2.912223,0.19375
1,af76b772-5886-4e53-a62d-e87b17f052c9,-0.866162,-1.144818,-0.974972,-1.062013
2,381e6532-4d26-4cb3-bd80-79b5628f6ba8,-0.866162,0.356847,-0.010128,0.501987
3,68202246-13a0-4c1f-a238-2590685176fa,0.78742,0.65718,1.060997,0.638339
4,05baba87-2dab-4b20-b95e-1150bd449ba4,0.078742,-0.544152,-0.257584,-0.270717


Number of entries: 88

Q1
Mean Words: 1431


Unnamed: 0,ID,judge,score
0,8a2c52d8-2f79-4f23-bbca-9dacca67bf27,Human,-1.811065
1,af76b772-5886-4e53-a62d-e87b17f052c9,Human,-0.866162
2,381e6532-4d26-4cb3-bd80-79b5628f6ba8,Human,-0.866162
3,68202246-13a0-4c1f-a238-2590685176fa,Human,0.78742
4,05baba87-2dab-4b20-b95e-1150bd449ba4,Human,0.078742


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.443737,2.595423,87,88,6e-06,"[0.26, 0.6]"
ICC2,Single random raters,0.442045,2.567424,87,87,8e-06,"[0.26, 0.6]"
ICC3,Single fixed raters,0.439371,2.567424,87,87,8e-06,"[0.25, 0.59]"
ICC1k,Average raters absolute,0.614706,2.595423,87,88,6e-06,"[0.41, 0.75]"
ICC2k,Average random raters,0.613081,2.567424,87,87,8e-06,"[0.41, 0.75]"
ICC3k,Average fixed raters,0.610505,2.567424,87,87,8e-06,"[0.41, 0.74]"


Cohen's kappa for rounded(z scores x 10): 0.43825737722516567
Algo score, swaps needed: 1271.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1887, 1918, 1749, 1747, 1875, 1862, 1663, 1777, 1825, 1948]
Pseudo-random, avergae swaps needed: 1837.65831

Q2
Mean Words: 1546


Unnamed: 0,ID,judge,score
0,8a2c52d8-2f79-4f23-bbca-9dacca67bf27,Human,-1.144818
1,af76b772-5886-4e53-a62d-e87b17f052c9,Human,-1.144818
2,381e6532-4d26-4cb3-bd80-79b5628f6ba8,Human,0.356847
3,68202246-13a0-4c1f-a238-2590685176fa,Human,0.65718
4,05baba87-2dab-4b20-b95e-1150bd449ba4,Human,-0.544152


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.411264,2.39711,87,88,2.9e-05,"[0.22, 0.57]"
ICC2,Single random raters,0.409576,2.374046,87,87,3.7e-05,"[0.22, 0.57]"
ICC3,Single fixed raters,0.40724,2.374046,87,87,3.7e-05,"[0.22, 0.57]"
ICC1k,Average raters absolute,0.582831,2.39711,87,88,2.9e-05,"[0.36, 0.73]"
ICC2k,Average random raters,0.581133,2.374046,87,87,3.7e-05,"[0.36, 0.73]"
ICC3k,Average fixed raters,0.578778,2.374046,87,87,3.7e-05,"[0.36, 0.72]"


Cohen's kappa for rounded(z scores x 10): 0.4449625465040623
Algo score, swaps needed: 1196.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1957, 1785, 1632, 1665, 1572, 1925, 1867, 1909, 1699, 1764]
Pseudo-random, avergae swaps needed: 1757.52432

../data/environ_instructor_B

['ID', 'Q2']
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q2_x,Q2_y
0,4e406595-0ea8-477c-af86-86048cfcb9b8,-0.60016,0.391077
1,a7931454-2722-4600-ad92-baccd06a38a8,1.607446,0.803181
2,8fee9d29-68ff-4b2d-b54d-58f8303249ff,0.11582,0.064055
3,f1a75c07-eda1-4552-907a-c976d9ae7c0c,1.607446,1.073256
4,a6db6331-93ae-48a4-ab5d-04cd34cf3989,-0.421165,-0.033444


Number of entries: 28

Q2
Mean Words: 2094


Unnamed: 0,ID,judge,score
0,4e406595-0ea8-477c-af86-86048cfcb9b8,Human,-0.60016
1,a7931454-2722-4600-ad92-baccd06a38a8,Human,1.607446
2,8fee9d29-68ff-4b2d-b54d-58f8303249ff,Human,0.11582
3,f1a75c07-eda1-4552-907a-c976d9ae7c0c,Human,1.607446
4,1d16424e-886f-4420-9479-d806d0fd5262,Human,-0.182505


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.613381,4.173052,27,28,0.000167,"[0.32, 0.8]"
ICC2,Single random raters,0.611588,4.07573,27,27,0.000247,"[0.31, 0.8]"
ICC3,Single fixed raters,0.605968,4.07573,27,27,0.000247,"[0.31, 0.8]"
ICC1k,Average raters absolute,0.760367,4.173052,27,28,0.000167,"[0.49, 0.89]"
ICC2k,Average random raters,0.758988,4.07573,27,27,0.000247,"[0.48, 0.89]"
ICC3k,Average fixed raters,0.754645,4.07573,27,27,0.000247,"[0.47, 0.89]"


Cohen's kappa for rounded(z scores x 10): 0.6450260023485992
Algo score, swaps needed: 92.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[179, 237, 214, 164, 199, 154, 172, 223, 166, 194]
Pseudo-random, avergae swaps needed: 182.57292

../data/PR_instructor_C

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,3d29b5ad-e82f-4972-89a4-adb948591eb1,1.195919,0.880199,0.36945,-2.099819
1,6c261e5d-2212-43a3-ab4f-28bd7dd90612,-0.80263,-1.417176,0.676527,0.349529
2,75fe19f4-4652-4851-afe6-affd379502de,-1.468813,-0.498226,-0.256541,-0.019297
3,a868c5d1-d01c-44b6-bbd8-1f9eb5857baa,0.030099,0.880199,-0.298173,-0.002833
4,1ebf5189-e22f-4dfd-9f72-5ab06b37ca76,-0.80263,1.799149,-0.392862,1.111576


Number of entries: 75

Q1
Mean Words: 936


Unnamed: 0,ID,judge,score
0,80abd277-bd34-4f34-b425-30190a9eaa98,Human,0.196644
1,3d29b5ad-e82f-4972-89a4-adb948591eb1,Human,1.195919
2,6c261e5d-2212-43a3-ab4f-28bd7dd90612,Human,-0.80263
3,75fe19f4-4652-4851-afe6-affd379502de,Human,-1.468813
4,a868c5d1-d01c-44b6-bbd8-1f9eb5857baa,Human,0.030099


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.50537,3.043424,74,75,1e-06,"[0.32, 0.66]"
ICC2,Single random raters,0.503724,3.003152,74,74,2e-06,"[0.31, 0.66]"
ICC3,Single fixed raters,0.500394,3.003152,74,74,2e-06,"[0.31, 0.65]"
ICC1k,Average raters absolute,0.671423,3.043424,74,75,1e-06,"[0.48, 0.79]"
ICC2k,Average random raters,0.669969,3.003152,74,74,2e-06,"[0.48, 0.79]"
ICC3k,Average fixed raters,0.667017,3.003152,74,74,2e-06,"[0.47, 0.79]"


Cohen's kappa for rounded(z scores x 10): 0.48119882696588434
Algo score, swaps needed: 919.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1241, 1148, 1364, 1282, 1272, 1410, 1209, 1216, 1255, 1265]
Pseudo-random, avergae swaps needed: 1324.95001

Q2
Mean Words: 570


Unnamed: 0,ID,judge,score
0,80abd277-bd34-4f34-b425-30190a9eaa98,Human,0.420724
1,3d29b5ad-e82f-4972-89a4-adb948591eb1,Human,0.880199
2,6c261e5d-2212-43a3-ab4f-28bd7dd90612,Human,-1.417176
3,75fe19f4-4652-4851-afe6-affd379502de,Human,-0.498226
4,a868c5d1-d01c-44b6-bbd8-1f9eb5857baa,Human,0.880199


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.375496,2.202543,70,71,0.000547,"[0.16, 0.56]"
ICC2,Single random raters,0.373043,2.175286,70,70,0.000687,"[0.15, 0.56]"
ICC3,Single fixed raters,0.370136,2.175286,70,70,0.000687,"[0.15, 0.55]"
ICC1k,Average raters absolute,0.545979,2.202543,70,71,0.000547,"[0.27, 0.72]"
ICC2k,Average random raters,0.543382,2.175286,70,70,0.000687,"[0.27, 0.72]"
ICC3k,Average fixed raters,0.54029,2.175286,70,70,0.000687,"[0.26, 0.71]"


Cohen's kappa for rounded(z scores x 10): 0.3848289799263249
Algo score, swaps needed: 750.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1017, 1013, 1238, 1015, 924, 1062, 1042, 1183, 845, 1043]
Pseudo-random, avergae swaps needed: 1091.64396

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,ba674cb4-50fb-4023-8bb7-5ae68452d116,0.551646,1.374969,1.677993,0.405869,1.715751,0.950219
1,41120ba2-92db-431f-935f-f7904ad647e8,0.414326,0.041273,0.314554,0.017621,0.589728,0.868936
2,fc5a292f-eeb9-413b-82b7-feae67846c21,-0.75289,-0.266504,-0.269777,-1.010606,0.337776,0.176556
3,4c68352f-27a5-49c2-aecd-887e788ab441,0.551646,-1.189832,-0.075,-0.163684,0.090107,0.386832
4,30234d01-ef3f-41bf-ae4a-1efaba3522d8,1.718862,1.477562,-0.075,1.154458,1.391002,0.837809


Number of entries: 78

Q1
Mean Words: 1823


Unnamed: 0,ID,judge,score
0,ba674cb4-50fb-4023-8bb7-5ae68452d116,Human,0.551646
1,41120ba2-92db-431f-935f-f7904ad647e8,Human,0.414326
2,fc5a292f-eeb9-413b-82b7-feae67846c21,Human,-0.75289
3,f65fe2ab-5f3e-4b04-9a45-8eb8a63e0c19,Human,-0.409591
4,417062c9-d62a-4452-8878-4c9c32b7a455,Human,0.345666


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.348297,2.068883,77,78,0.000793,"[0.14, 0.53]"
ICC2,Single random raters,0.345639,2.043412,77,77,0.000991,"[0.13, 0.53]"
ICC3,Single fixed raters,0.342843,2.043412,77,77,0.000991,"[0.13, 0.52]"
ICC1k,Average raters absolute,0.516647,2.068883,77,78,0.000793,"[0.24, 0.69]"
ICC2k,Average random raters,0.513718,2.043412,77,77,0.000991,"[0.23, 0.69]"
ICC3k,Average fixed raters,0.510622,2.043412,77,77,0.000991,"[0.23, 0.69]"


Cohen's kappa for rounded(z scores x 10): 0.3314799343147705
Algo score, swaps needed: 1184.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1463, 1557, 1576, 1463, 1453, 1446, 1514, 1426, 1520, 1489]
Pseudo-random, avergae swaps needed: 1472.83579

Q2
Mean Words: 1050


Unnamed: 0,ID,judge,score
0,ba674cb4-50fb-4023-8bb7-5ae68452d116,Human,1.374969
1,41120ba2-92db-431f-935f-f7904ad647e8,Human,0.041273
2,fc5a292f-eeb9-413b-82b7-feae67846c21,Human,-0.266504
3,f65fe2ab-5f3e-4b04-9a45-8eb8a63e0c19,Human,0.554233
4,417062c9-d62a-4452-8878-4c9c32b7a455,Human,-0.061319


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.449742,2.634659,77,78,1.5e-05,"[0.25, 0.61]"
ICC2,Single random raters,0.448031,2.6053,77,77,2e-05,"[0.25, 0.61]"
ICC3,Single fixed raters,0.445261,2.6053,77,77,2e-05,"[0.25, 0.61]"
ICC1k,Average raters absolute,0.620444,2.634659,77,78,1.5e-05,"[0.41, 0.76]"
ICC2k,Average random raters,0.618814,2.6053,77,77,2e-05,"[0.4, 0.76]"
ICC3k,Average fixed raters,0.616167,2.6053,77,77,2e-05,"[0.4, 0.76]"


Cohen's kappa for rounded(z scores x 10): 0.4694298872478958
Algo score, swaps needed: 905.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1586, 1400, 1382, 1464, 1372, 1383, 1560, 1658, 1394, 1431]
Pseudo-random, avergae swaps needed: 1450.35153

Q3
Mean Words: 839


Unnamed: 0,ID,judge,score
0,ba674cb4-50fb-4023-8bb7-5ae68452d116,Human,1.677993
1,41120ba2-92db-431f-935f-f7904ad647e8,Human,0.314554
2,fc5a292f-eeb9-413b-82b7-feae67846c21,Human,-0.269777
3,f65fe2ab-5f3e-4b04-9a45-8eb8a63e0c19,Human,-0.464554
4,417062c9-d62a-4452-8878-4c9c32b7a455,Human,1.483216


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.561804,3.564167,77,78,3.122207e-08,"[0.39, 0.7]"
ICC2,Single random raters,0.560555,3.518526,77,77,4.765327e-08,"[0.39, 0.7]"
ICC3,Single fixed raters,0.557378,3.518526,77,77,4.765327e-08,"[0.38, 0.69]"
ICC1k,Average raters absolute,0.71943,3.564167,77,78,3.122207e-08,"[0.56, 0.82]"
ICC2k,Average random raters,0.718405,3.518526,77,77,4.765327e-08,"[0.56, 0.82]"
ICC3k,Average fixed raters,0.71579,3.518526,77,77,4.765327e-08,"[0.55, 0.82]"


Cohen's kappa for rounded(z scores x 10): 0.5560003359180361
Algo score, swaps needed: 831.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1474, 1434, 1411, 1304, 1532, 1387, 1481, 1600, 1345, 1469]
Pseudo-random, avergae swaps needed: 1457.60248

../data/crim_instructor_E

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,0091731f-bae9-4326-a66a-a744137584a2,-1.0575,-1.477259,0.367094,0.180184
1,83cb5f25-30c8-4aff-b6e7-1b9e323e657c,-1.169677,0.065277,0.590992,0.701665
2,546ad478-6cd2-4ad5-92c6-2f32ef76d808,-1.169677,-0.705991,-1.265803,-0.338207
3,f9b1bd52-ca9b-4c23-8758-7dcb7006a1b1,-2.684063,-0.705991,-0.672031,-0.991709
4,90104063-f02a-4ab0-b693-12b727a108e5,0.625152,-0.12754,-2.015043,-1.561243


Number of entries: 92

Q1
Mean Words: 3353


Unnamed: 0,ID,judge,score
0,0568651d-0e40-4bf5-8e24-d8f16403f49e,Human,-0.496616
1,0091731f-bae9-4326-a66a-a744137584a2,Human,-1.0575
2,83cb5f25-30c8-4aff-b6e7-1b9e323e657c,Human,-1.169677
3,546ad478-6cd2-4ad5-92c6-2f32ef76d808,Human,-1.169677
4,f9b1bd52-ca9b-4c23-8758-7dcb7006a1b1,Human,-2.684063


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.194317,1.482365,91,92,0.030563,"[-0.01, 0.38]"
ICC2,Single random raters,0.190936,1.467145,91,91,0.034549,"[-0.02, 0.38]"
ICC3,Single fixed raters,0.189346,1.467145,91,91,0.034549,"[-0.02, 0.38]"
ICC1k,Average raters absolute,0.325402,1.482365,91,92,0.030563,"[-0.02, 0.55]"
ICC2k,Average random raters,0.320648,1.467145,91,91,0.034549,"[-0.03, 0.55]"
ICC3k,Average fixed raters,0.318404,1.467145,91,91,0.034549,"[-0.03, 0.55]"


Cohen's kappa for rounded(z scores x 10): 0.1986696787148594
Algo score, swaps needed: 1729.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2098, 2279, 2321, 2313, 1974, 1858, 2197, 2276, 1821, 2032]
Pseudo-random, avergae swaps needed: 2045.02919

Q2
Mean Words: 2137


Unnamed: 0,ID,judge,score
0,0568651d-0e40-4bf5-8e24-d8f16403f49e,Human,-0.12754
1,0091731f-bae9-4326-a66a-a744137584a2,Human,-1.477259
2,83cb5f25-30c8-4aff-b6e7-1b9e323e657c,Human,0.065277
3,546ad478-6cd2-4ad5-92c6-2f32ef76d808,Human,-0.705991
4,f9b1bd52-ca9b-4c23-8758-7dcb7006a1b1,Human,-0.705991


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.196336,1.488603,91,92,0.029217,"[-0.01, 0.38]"
ICC2,Single random raters,0.193431,1.47538,91,91,0.03258,"[-0.01, 0.38]"
ICC3,Single fixed raters,0.192043,1.47538,91,91,0.03258,"[-0.01, 0.38]"
ICC1k,Average raters absolute,0.328229,1.488603,91,92,0.029217,"[-0.01, 0.56]"
ICC2k,Average random raters,0.32416,1.47538,91,91,0.03258,"[-0.02, 0.55]"
ICC3k,Average fixed raters,0.322208,1.47538,91,91,0.03258,"[-0.02, 0.55]"


Cohen's kappa for rounded(z scores x 10): 0.2047859977228783
Algo score, swaps needed: 1753.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1804, 1946, 2214, 2029, 2130, 2049, 2043, 1926, 1742, 2155]
Pseudo-random, avergae swaps needed: 1993.26688

Swaps (Machine):
[1004.0, 1176.0, 898.0, 1271.0, 1196.0, 92.0, 919.0, 750.0, 1184.0, 905.0, 831.0, 1729.0, 1753.0]
N: 13 	Mean:  1054.4615384615386 	Var:  167547.01775147932
Avergae swaps (Pseudo-random):
[1499.66689, 1566.66396, 1560.97148, 1837.65831, 1757.52432, 182.57292, 1324.95001, 1091.64396, 1472.83579, 1450.35153, 1457.60248, 2045.02919, 1993.26688]
N: 13 	Mean:  1480.0567476923075 	Var:  204746.44657664374 

1 ) 1499.66689 > 1004.0 => 39.59527459341845
2 ) 1566.66396 > 1176.0 => 28.487920189828877
3 ) 1560.97148 > 898.0 => 53.92266526002976
4 ) 1837.65831 > 1271.0 => 36.45677675009577
5 ) 1757.52432 > 1196.0 => 38.02401870860505
6 ) 182.57292 > 92.0 => 65.9736728589258
7 ) 1324.95001 > 919.0 => 36.18173383461426
8 ) 

In [20]:
score_exams(exams,model='nlp2',normv=1,score=1,goal="medoid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,e8c7236a-2cae-40ea-8d17-5405ce4a0e0f,12,17,14,90,80,83
1,cfa09378-0303-42fb-9ff1-066c8e2f7e88,4,14,12,70,83,80
2,4fda7a79-36eb-4380-923e-3878eaadddde,7,22,12,70,70,80
3,8b2e671a-8164-4467-9f19-a6dc7cbf313c,4,27,16,90,83,90
4,c6b6baea-3f79-4d14-bc22-83d685e04cfb,9,30,22,83,80,80


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 1063.58667
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1416, 1695, 1575, 1607, 1510, 1478, 1474, 1643, 1455, 1365]
Pseudo-random, avergae swaps needed: 1499.33637

Q1
Mean Words: 2048
Algo score, swaps needed: 1364.57761
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1670, 1440, 1581, 1430, 1473, 1577, 1719, 1602, 1378, 1392]
Pseudo-random, avergae swaps needed: 1566.50795

Q2
Mean Words: 947
Algo score, swaps needed: 1112.97992
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1358, 1589, 1523, 1634, 1553, 1586, 1582, 1622, 1535, 1484]
Pseudo-random, avergae swaps needed: 1559.56629

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,1b15cca5-b292-4c29-8e83-384ed76e7b77,20.0,18.0,77,77
1,da455e55-d2ac-4054-aae9-e62723a9dc6c,17.0,17.0,73,83
2,d43eda57-827c-4f18-af5f-dcd5b9660eec,24.0,21.0,80,83
3,9db4494e-5ddd-4a7d-b3e9-031a33003a02,22.0,25.0,90,87
4,346c7a84-2b02-4aaa-81f4-4ae8633b9b69,24.0,25.0,87,80


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1395.29161
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1914, 1909, 1743, 1821, 1602, 1776, 2085, 1889, 2085, 1876]
Pseudo-random, avergae swaps needed: 1837.29656

Q2
Mean Words: 1546
Algo score, swaps needed: 1554.92878
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1740, 1819, 1511, 1745, 1815, 1743, 1948, 1765, 1956, 1831]
Pseudo-random, avergae swaps needed: 1756.82817

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,ab1fbbba-d6e0-4551-aed0-3fc164fa8570,27.0,80
1,f1a75c07-eda1-4552-907a-c976d9ae7c0c,42.0,97
2,3307e79a-3565-478c-8749-b095d12c7e56,20.5,83
3,daa06ff3-2614-4aa8-8fad-914ddc4b1cb3,28.0,83
4,981c3959-fe3b-4222-97de-742768dee8e9,23.0,80


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 107.99537
Swaps needed for first 10 out of 100000 pseudo-random runs:
[148, 182, 159, 239, 188, 185, 224, 188, 201, 155]
Pseudo-random, avergae swaps needed: 182.46159

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,4e4a357a-bc9d-46b9-b0c5-01e68cb32ba0,10.0,3.0,83,77
1,473b0dda-6847-467e-8589-32a882dcfde1,12.0,5.0,80,73
2,836dd271-a288-4467-b688-9d61bc2b99ff,12.5,5.0,80,80
3,47285a02-f301-408b-b50a-6a25530cadae,7.0,3.5,83,77
4,872764dd-e402-419b-b819-b4e2337263c3,9.5,3.5,77,80


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 959.03298
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1200, 1360, 1517, 1144, 1154, 1326, 1191, 1400, 1227, 1367]
Pseudo-random, avergae swaps needed: 1324.14649

Q2
Mean Words: 570
Algo score, swaps needed: 1046.08702
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1441, 1139, 1282, 1129, 1251, 1229, 1027, 1248, 1286, 1377]
Pseudo-random, avergae swaps needed: 1219.73759

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,39268066-b798-49dd-8159-eda5299389ef,61,16,24,77,80,77
1,cee3d697-38c2-403f-a5e8-2bf351cced67,60,11,18,87,87,83
2,5a6f914f-609c-4043-b2ba-6ad26b7fb36c,41,4,25,87,87,83
3,ab397bfe-7c68-4f02-baee-46407830b974,36,28,22,83,83,83
4,ee5b756f-84b7-4707-ab9a-a9064b65ebe2,43,25,16,77,77,80


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1073.8407
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1327, 1577, 1514, 1633, 1585, 1558, 1714, 1623, 1338, 1434]
Pseudo-random, avergae swaps needed: 1472.20717

Q2
Mean Words: 1050
Algo score, swaps needed: 1225.50466
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1508, 1353, 1572, 1447, 1486, 1367, 1369, 1511, 1357, 1354]
Pseudo-random, avergae swaps needed: 1450.233

Q3
Mean Words: 839
Algo score, swaps needed: 1030.00018
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1336, 1551, 1475, 1520, 1395, 1334, 1650, 1456, 1199, 1303]
Pseudo-random, avergae swaps needed: 1457.32272

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,1f8baf28-0b57-47fe-bddb-0a27b92be278,14.5,19.0,73,80
1,3608d390-675a-4e40-afc1-1f17fdebe8e8,23.5,22.0,83,90
2,fcfd3df1-72e1-4e8a-b798-68fb0162b8b8,18.5,21.0,87,87
3,90104063-f02a-4ab0-b693-12b727a108e5,21.0,15.0,70,73
4,a577fd48-7230-4784-a7eb-6e268b042b9c,13.0,16.0,80,83


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1977.50891
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2138, 1921, 2000, 2165, 2030, 2339, 2295, 2168, 2084, 1891]
Pseudo-random, avergae swaps needed: 2045.7049

Q2
Mean Words: 2137
Algo score, swaps needed: 1852.88044
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1987, 2261, 1935, 1799, 1967, 2180, 1988, 1884, 1907, 2168]
Pseudo-random, avergae swaps needed: 1993.2798

Swaps (Machine):
[1063.58667, 1364.57761, 1112.97992, 1395.29161, 1554.92878, 107.99537, 959.03298, 1046.08702, 1073.8407, 1225.50466, 1030.00018, 1977.50891, 1852.88044]
N: 13 	Mean:  1212.6319115384617 	Var:  197367.21358401564
Avergae swaps (Pseudo-random):
[1499.33637, 1566.50795, 1559.56629, 1837.29656, 1756.82817, 182.46159, 1324.14649, 1219.73759, 1472.20717, 1450.233, 1457.32272, 2045.7049, 1993.2798]
N: 13 	Mean:  1489.5868153846154 	Var:  198296.9387529382 

1 ) 1499.33637 > 1063.58667 => 34.00411898439214
2 )

In [21]:
score_exams(exams,model='nlp3',normv=1,score=1,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,cfa09378-0303-42fb-9ff1-066c8e2f7e88,4,14,12,77,87,87
1,b4b58e7e-2a41-4e45-aca3-7b69511f3866,12,39,24,87,67,80
2,45942e14-a556-414d-b4d7-9287409db104,13,43,19,80,83,83
3,dcef7f09-f784-48cb-b65e-59d0293f4195,10,30,32,87,83,87
4,c74cd4a8-869a-4714-892e-be61183e7835,4,27,24,90,87,83


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 1327.50167
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1528, 1180, 1568, 1544, 1479, 1658, 1465, 1672, 1642, 1495]
Pseudo-random, avergae swaps needed: 1499.65638

Q1
Mean Words: 2048
Algo score, swaps needed: 1283.97738
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1542, 1534, 1714, 1654, 1706, 1472, 1491, 1555, 1663, 1650]
Pseudo-random, avergae swaps needed: 1566.25132

Q2
Mean Words: 947
Algo score, swaps needed: 1217.39444
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1452, 1579, 1374, 1611, 1422, 1657, 1536, 1609, 1729, 1509]
Pseudo-random, avergae swaps needed: 1559.69437

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,ecdebb90-4db9-4967-b6f9-ea3759679fcc,13.0,16.0,83,80
1,0cfd3781-8b12-4e17-8d9d-16c15ce0f246,12.5,18.0,87,87
2,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,9.0,15.0,80,87
3,a9738fe5-2df1-4661-b147-de435b7eeeb6,17.0,20.0,77,87
4,79fc7532-8d8e-4995-adbc-23e3aa4c75af,11.5,11.0,83,77


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1578.45767
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2188, 1949, 1709, 1739, 1654, 1537, 1849, 1894, 1901, 1999]
Pseudo-random, avergae swaps needed: 1838.07482

Q2
Mean Words: 1546
Algo score, swaps needed: 1572.3984
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1861, 1911, 1747, 1763, 1721, 2084, 1601, 1552, 1730, 1722]
Pseudo-random, avergae swaps needed: 1757.55939

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,4e406595-0ea8-477c-af86-86048cfcb9b8,23.5,83
1,c1609ab5-b028-447c-8f26-60c711841a33,20.0,60
2,3ad4aed5-023b-47c7-9ddf-3f2284a403b1,13.0,77
3,467e6ace-0fab-4f54-a773-979058184ac2,25.0,67
4,f870a18d-a560-4a6a-90b8-e1660d8877d1,31.0,87


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 108.49357
Swaps needed for first 10 out of 100000 pseudo-random runs:
[217, 217, 183, 166, 208, 177, 189, 202, 152, 188]
Pseudo-random, avergae swaps needed: 182.49323

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,bb846754-6c06-4082-a827-eee27975d766,10.0,4.5,70,77
1,6c1573ff-01fa-4d9a-afce-2592baa321ef,12.5,6.0,87,90
2,098adb0c-c0ab-4a2d-a590-8d917e3604f2,12.5,4.0,87,80
3,1c8aeeb4-a1f3-492a-ae5c-87af4b3ad8e2,12.0,4.5,83,73
4,25f7b700-3561-4102-ba38-4753c54f9eb0,11.0,5.0,83,80


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 926.46827
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1257, 1476, 1332, 1302, 1213, 1242, 1292, 1587, 1375, 1339]
Pseudo-random, avergae swaps needed: 1324.23719

Q2
Mean Words: 570
Algo score, swaps needed: 1010.98591
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1294, 1186, 1123, 1187, 1364, 1105, 1404, 1091, 1111, 1172]
Pseudo-random, avergae swaps needed: 1219.24378

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,bb065bd5-206a-4c0f-ba0f-f193651f5ac6,62,23,40,77,73,90
1,7fc61220-6c30-428e-9cbe-ba29c1d2e8fb,62,40,32,90,83,80
2,aa684706-1bbf-47a8-8828-62f8fde0f3c7,53,27,23,77,70,80
3,6c715ed9-9520-4c0d-8d4f-b56ceef0320a,53,43,15,83,83,80
4,4556f34c-b15d-4f3e-b818-09cff2ec3400,54,28,6,67,77,63


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1355.55177
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1262, 1592, 1482, 1730, 1432, 1467, 1540, 1663, 1254, 1458]
Pseudo-random, avergae swaps needed: 1472.71635

Q2
Mean Words: 1050
Algo score, swaps needed: 1163.31542
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1685, 1454, 1395, 1511, 1317, 1490, 1587, 1411, 1326, 1382]
Pseudo-random, avergae swaps needed: 1449.21663

Q3
Mean Words: 839
Algo score, swaps needed: 1059.66915
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1593, 1364, 1310, 1344, 1345, 1578, 1490, 1285, 1628, 1401]
Pseudo-random, avergae swaps needed: 1458.5898

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,509aec91-34c1-4d28-b94d-f7845bfe5226,14.0,19.0,67,70
1,a577fd48-7230-4784-a7eb-6e268b042b9c,13.0,16.0,80,80
2,1f8baf28-0b57-47fe-bddb-0a27b92be278,14.5,19.0,63,80
3,83c20b41-645b-4931-97bf-870d498e43de,14.0,8.0,87,80
4,ba2dd55c-5abd-4f7f-8560-18efd2679188,23.0,27.0,80,83


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1718.97797
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1936, 2171, 2226, 2158, 1791, 1995, 2136, 2242, 2061, 2156]
Pseudo-random, avergae swaps needed: 2044.98433

Q2
Mean Words: 2137
Algo score, swaps needed: 1802.09925
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2217, 2013, 1930, 2002, 2257, 2055, 1930, 1918, 1862, 1917]
Pseudo-random, avergae swaps needed: 1993.58644

Swaps (Machine):
[1327.50167, 1283.97738, 1217.39444, 1578.45767, 1572.3984, 108.49357, 926.46827, 1010.98591, 1355.55177, 1163.31542, 1059.66915, 1718.97797, 1802.09925]
N: 13 	Mean:  1240.40699 	Var:  174101.9627388273
Avergae swaps (Pseudo-random):
[1499.65638, 1566.25132, 1559.69437, 1838.07482, 1757.55939, 182.49323, 1324.23719, 1219.24378, 1472.71635, 1449.21663, 1458.5898, 2044.98433, 1993.58644]
N: 13 	Mean:  1489.7156946153846 	Var:  198340.275571395 

1 ) 1499.65638 > 1327.50167 => 12.178640667082606
2 ) 156

In [22]:
score_exams(exams,model='nlp3',normv=1,score=2,goal="centroid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,afaca025-5d82-46e8-bfba-a1ce2167adca,-0.159269,0.470683,0.502278,0.144799,1.420513,-0.126103
1,d2054499-4dd2-4ea1-9dae-7751c37f58e5,0.120927,-1.074762,0.205903,0.994781,-0.083617,0.661264
2,5b58226f-9ae5-4ad2-8261-d659f95da503,0.681318,0.161594,0.354091,-0.413266,-1.325318,-0.985834
3,e021f764-a5d2-432a-89f5-00ba2b760f0f,0.681318,0.779772,1.243217,1.063094,0.701413,0.702417
4,07ac7940-e702-498a-a97c-333cb233a847,-1.280052,-0.971732,-1.572349,-2.286492,-3.295831,-1.693152


Number of entries: 81

SHORT_ANS
Mean Words: 436


Unnamed: 0,ID,judge,score
0,afaca025-5d82-46e8-bfba-a1ce2167adca,Human,-0.159269
1,065476cf-e25b-4039-a5b0-69bfd27e61dd,Human,-0.999856
2,d2054499-4dd2-4ea1-9dae-7751c37f58e5,Human,0.120927
3,5b58226f-9ae5-4ad2-8261-d659f95da503,Human,0.681318
4,e021f764-a5d2-432a-89f5-00ba2b760f0f,Human,0.681318


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.260254,1.703632,80,81,0.008882,"[0.05, 0.45]"
ICC2,Single random raters,0.256877,1.682955,80,80,0.010504,"[0.04, 0.45]"
ICC3,Single fixed raters,0.254553,1.682955,80,80,0.010504,"[0.04, 0.45]"
ICC1k,Average raters absolute,0.413019,1.703632,80,81,0.008882,"[0.09, 0.62]"
ICC2k,Average random raters,0.408755,1.682955,80,80,0.010504,"[0.08, 0.62]"
ICC3k,Average fixed raters,0.405807,1.682955,80,80,0.010504,"[0.08, 0.62]"


Cohen's kappa for rounded(z scores x 10): 0.2481144672391804
Algo score, swaps needed: 1261.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1657, 1563, 1476, 1497, 1484, 1431, 1528, 1640, 1694, 1440]
Pseudo-random, avergae swaps needed: 1499.90796

Q1
Mean Words: 2048


Unnamed: 0,ID,judge,score
0,afaca025-5d82-46e8-bfba-a1ce2167adca,Human,0.470683
1,065476cf-e25b-4039-a5b0-69bfd27e61dd,Human,-0.250525
2,d2054499-4dd2-4ea1-9dae-7751c37f58e5,Human,-1.074762
3,5b58226f-9ae5-4ad2-8261-d659f95da503,Human,0.161594
4,e021f764-a5d2-432a-89f5-00ba2b760f0f,Human,0.779772


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.260084,1.70301,80,81,0.008921,"[0.05, 0.45]"
ICC2,Single random raters,0.256649,1.682002,80,80,0.010573,"[0.04, 0.45]"
ICC3,Single fixed raters,0.254288,1.682002,80,80,0.010573,"[0.04, 0.45]"
ICC1k,Average raters absolute,0.412804,1.70301,80,81,0.008921,"[0.09, 0.62]"
ICC2k,Average random raters,0.408466,1.682002,80,80,0.010573,"[0.08, 0.62]"
ICC3k,Average fixed raters,0.40547,1.682002,80,80,0.010573,"[0.08, 0.62]"


Cohen's kappa for rounded(z scores x 10): 0.24637465479630627
Algo score, swaps needed: 1261.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1386, 1516, 1642, 1454, 1477, 1422, 1454, 1516, 1556, 1599]
Pseudo-random, avergae swaps needed: 1566.48321

Q2
Mean Words: 947


Unnamed: 0,ID,judge,score
0,afaca025-5d82-46e8-bfba-a1ce2167adca,Human,0.502278
1,065476cf-e25b-4039-a5b0-69bfd27e61dd,Human,-0.386848
2,d2054499-4dd2-4ea1-9dae-7751c37f58e5,Human,0.205903
3,5b58226f-9ae5-4ad2-8261-d659f95da503,Human,0.354091
4,e021f764-a5d2-432a-89f5-00ba2b760f0f,Human,1.243217


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.388763,2.272052,80,81,0.000145,"[0.19, 0.56]"
ICC2,Single random raters,0.38676,2.248032,80,80,0.000183,"[0.18, 0.56]"
ICC3,Single fixed raters,0.384243,2.248032,80,80,0.000183,"[0.18, 0.56]"
ICC1k,Average raters absolute,0.559869,2.272052,80,81,0.000145,"[0.32, 0.72]"
ICC2k,Average random raters,0.55779,2.248032,80,80,0.000183,"[0.31, 0.72]"
ICC3k,Average fixed raters,0.555166,2.248032,80,80,0.000183,"[0.31, 0.71]"


Cohen's kappa for rounded(z scores x 10): 0.3588961567372144
Algo score, swaps needed: 1165.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1596, 1573, 1416, 1584, 1633, 1674, 1734, 1627, 1799, 1432]
Pseudo-random, avergae swaps needed: 1560.01438

../data/property_instructor_B

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,ef5e07b6-ab79-4558-880e-90df1e36afd2,0.78742,-0.243819,1.043109,-0.423503
1,8c9de0d1-eff5-4b5c-b1ff-7cecd2bb6c34,-1.102387,-2.045816,-2.175645,0.714271
2,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,-1.811065,-1.144818,-0.268016,0.66154
3,f1a855a4-dfef-48b6-99bc-b5e2324ce2fc,1.496097,-0.243819,0.08586,-0.344353
4,b2a62af1-d571-48c7-bece-f369e4005ea1,1.377984,0.356847,-0.103461,-0.202596


Number of entries: 88

Q1
Mean Words: 1431


Unnamed: 0,ID,judge,score
0,ef5e07b6-ab79-4558-880e-90df1e36afd2,Human,0.78742
1,8c9de0d1-eff5-4b5c-b1ff-7cecd2bb6c34,Human,-1.102387
2,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,Human,-1.811065
3,65949493-9050-46a2-82a6-eaf632b66ce1,Human,-1.102387
4,f1a855a4-dfef-48b6-99bc-b5e2324ce2fc,Human,1.496097


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.236197,1.618475,87,88,0.012669,"[0.03, 0.42]"
ICC2,Single random raters,0.232955,1.600762,87,87,0.014705,"[0.02, 0.42]"
ICC3,Single fixed raters,0.230995,1.600762,87,87,0.014705,"[0.02, 0.42]"
ICC1k,Average raters absolute,0.382134,1.618475,87,88,0.012669,"[0.06, 0.59]"
ICC2k,Average random raters,0.377881,1.600762,87,87,0.014705,"[0.05, 0.59]"
ICC3k,Average fixed raters,0.375298,1.600762,87,87,0.014705,"[0.05, 0.59]"


Cohen's kappa for rounded(z scores x 10): 0.23349407418070156
Algo score, swaps needed: 1571.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1687, 2044, 1821, 1907, 1937, 1958, 1896, 1776, 1888, 1933]
Pseudo-random, avergae swaps needed: 1837.17309

Q2
Mean Words: 1546


Unnamed: 0,ID,judge,score
0,ef5e07b6-ab79-4558-880e-90df1e36afd2,Human,-0.243819
1,8c9de0d1-eff5-4b5c-b1ff-7cecd2bb6c34,Human,-2.045816
2,6726ea7e-8d3e-46f5-bb70-5aff7c5b6402,Human,-1.144818
3,65949493-9050-46a2-82a6-eaf632b66ce1,Human,0.056514
4,f1a855a4-dfef-48b6-99bc-b5e2324ce2fc,Human,-0.243819


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.141915,1.330772,87,88,0.091562,"[-0.07, 0.34]"
ICC2,Single random raters,0.138117,1.317242,87,87,0.10039,"[-0.07, 0.34]"
ICC3,Single fixed raters,0.136905,1.317242,87,87,0.10039,"[-0.07, 0.34]"
ICC1k,Average raters absolute,0.248557,1.330772,87,88,0.091562,"[-0.15, 0.51]"
ICC2k,Average random raters,0.242711,1.317242,87,87,0.10039,"[-0.16, 0.51]"
ICC3k,Average fixed raters,0.240838,1.317242,87,87,0.10039,"[-0.16, 0.5]"


Cohen's kappa for rounded(z scores x 10): 0.14707776126795447
Algo score, swaps needed: 1500.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1797, 1585, 1739, 1934, 1827, 1921, 1911, 1744, 1878, 1636]
Pseudo-random, avergae swaps needed: 1756.61942

../data/environ_instructor_B

['ID', 'Q2']
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q2_x,Q2_y
0,467e6ace-0fab-4f54-a773-979058184ac2,-0.421165,-2.100272
1,c1609ab5-b028-447c-8f26-60c711841a33,-1.017815,-3.33373
2,740ba105-1c40-49be-b74d-e78072fdbf34,-0.421165,-0.653895
3,f870a18d-a560-4a6a-90b8-e1660d8877d1,0.294815,0.648157
4,8fee9d29-68ff-4b2d-b54d-58f8303249ff,0.11582,0.580038


Number of entries: 28

Q2
Mean Words: 2094


Unnamed: 0,ID,judge,score
0,1d16424e-886f-4420-9479-d806d0fd5262,Human,-0.182505
1,467e6ace-0fab-4f54-a773-979058184ac2,Human,-0.421165
2,c1609ab5-b028-447c-8f26-60c711841a33,Human,-1.017815
3,86ec23af-daa0-4987-b123-8b1aa763ea43,Human,1.428451
4,740ba105-1c40-49be-b74d-e78072fdbf34,Human,-0.421165


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.549521,3.439716,27,28,0.000876,"[0.23, 0.76]"
ICC2,Single random raters,0.546895,3.35343,27,27,0.001233,"[0.22, 0.76]"
ICC3,Single fixed raters,0.540592,3.35343,27,27,0.001233,"[0.22, 0.76]"
ICC1k,Average raters absolute,0.709278,3.439716,27,28,0.000876,"[0.38, 0.86]"
ICC2k,Average random raters,0.707087,3.35343,27,27,0.001233,"[0.36, 0.86]"
ICC3k,Average fixed raters,0.701798,3.35343,27,27,0.001233,"[0.36, 0.86]"


Cohen's kappa for rounded(z scores x 10): 0.6192638954277327
Algo score, swaps needed: 101.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[192, 207, 208, 172, 239, 181, 156, 200, 232, 185]
Pseudo-random, avergae swaps needed: 182.54523

../data/PR_instructor_C

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,25f7b700-3561-4102-ba38-4753c54f9eb0,-0.136447,0.420724,0.309788,-0.16812
1,7ab9f35c-8fee-43f8-9dbb-3f0a89dcd43d,1.029373,0.420724,1.065766,1.126083
2,7e96b523-06ee-45f4-9c2e-6247c261f646,0.030099,-0.038751,0.750147,0.107727
3,eecc8b68-5ac3-4146-8346-f775be3eceef,0.862827,-0.498226,0.984727,-1.088773
4,735884fe-4a63-4515-a686-0086c998fa04,0.529736,0.880199,0.556233,


Number of entries: 75

Q1
Mean Words: 936


Unnamed: 0,ID,judge,score
0,457c5af7-4834-422d-927e-fe3c52bcc0f6,Human,0.196644
1,25f7b700-3561-4102-ba38-4753c54f9eb0,Human,-0.136447
2,7ab9f35c-8fee-43f8-9dbb-3f0a89dcd43d,Human,1.029373
3,7e96b523-06ee-45f4-9c2e-6247c261f646,Human,0.030099
4,eecc8b68-5ac3-4146-8346-f775be3eceef,Human,0.862827


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.547111,3.416093,74,75,1.387694e-07,"[0.37, 0.69]"
ICC2,Single random raters,0.545733,3.370922,74,74,2.080702e-07,"[0.36, 0.69]"
ICC3,Single fixed raters,0.542431,3.370922,74,74,2.080702e-07,"[0.36, 0.68]"
ICC1k,Average raters absolute,0.707268,3.416093,74,75,1.387694e-07,"[0.54, 0.81]"
ICC2k,Average random raters,0.706115,3.370922,74,74,2.080702e-07,"[0.53, 0.81]"
ICC3k,Average fixed raters,0.703345,3.370922,74,74,2.080702e-07,"[0.53, 0.81]"


Cohen's kappa for rounded(z scores x 10): 0.5176223234540123
Algo score, swaps needed: 891.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1337, 1390, 1485, 1361, 1347, 1121, 1224, 1365, 1413, 1325]
Pseudo-random, avergae swaps needed: 1324.22571

Q2
Mean Words: 570


Unnamed: 0,ID,judge,score
0,457c5af7-4834-422d-927e-fe3c52bcc0f6,Human,1.339674
1,25f7b700-3561-4102-ba38-4753c54f9eb0,Human,0.420724
2,7ab9f35c-8fee-43f8-9dbb-3f0a89dcd43d,Human,0.420724
3,7e96b523-06ee-45f4-9c2e-6247c261f646,Human,-0.038751
4,eecc8b68-5ac3-4146-8346-f775be3eceef,Human,-0.498226


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.180677,1.44104,70,71,0.063576,"[-0.05, 0.4]"
ICC2,Single random raters,0.176309,1.422623,70,70,0.071378,"[-0.06, 0.39]"
ICC3,Single fixed raters,0.174449,1.422623,70,70,0.071378,"[-0.06, 0.39]"
ICC1k,Average raters absolute,0.306057,1.44104,70,71,0.063576,"[-0.11, 0.57]"
ICC2k,Average random raters,0.299766,1.422623,70,70,0.071378,"[-0.13, 0.56]"
ICC3k,Average fixed raters,0.297073,1.422623,70,70,0.071378,"[-0.13, 0.56]"


Cohen's kappa for rounded(z scores x 10): 0.19914633009511273
Algo score, swaps needed: 927.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1061, 1163, 1159, 1110, 1160, 1128, 1084, 860, 1016, 971]
Pseudo-random, avergae swaps needed: 1091.79753

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']
Leaving z scores in place
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,f5c0bbaa-1a35-4f2f-8c89-2267b0420452,1.444223,-1.189832,2.067547,-0.242691,-0.554376,-0.746189
1,fc5a292f-eeb9-413b-82b7-feae67846c21,-0.75289,-0.266504,-0.269777,-0.500509,0.385384,0.25236
2,1b694dbb-dcfa-4dfe-a12a-7bbc836ca5e3,-0.409591,-0.676872,-1.146274,1.034031,0.424481,0.736469
3,242b03f3-ec32-4e62-8bc1-26283ca38604,-2.538044,-1.08724,-1.43844,-2.210997,-2.543719,-2.074636
4,27169089-71cf-44da-906b-6690fbcd8a43,1.238243,1.272377,-0.075,-0.907106,-1.194529,-0.328051


Number of entries: 78

Q1
Mean Words: 1823


Unnamed: 0,ID,judge,score
0,1016b5f2-7f8a-496a-bb0b-42be41dcdfdc,Human,1.375563
1,f5c0bbaa-1a35-4f2f-8c89-2267b0420452,Human,1.444223
2,fc5a292f-eeb9-413b-82b7-feae67846c21,Human,-0.75289
3,1b694dbb-dcfa-4dfe-a12a-7bbc836ca5e3,Human,-0.409591
4,242b03f3-ec32-4e62-8bc1-26283ca38604,Human,-2.538044


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.181365,1.443091,77,78,0.054147,"[-0.04, 0.39]"
ICC2,Single random raters,0.177131,1.425175,77,77,0.061123,"[-0.05, 0.38]"
ICC3,Single fixed raters,0.175317,1.425175,77,77,0.061123,"[-0.05, 0.38]"
ICC1k,Average raters absolute,0.307043,1.443091,77,78,0.054147,"[-0.08, 0.56]"
ICC2k,Average random raters,0.300954,1.425175,77,77,0.061123,"[-0.1, 0.56]"
ICC3k,Average fixed raters,0.298332,1.425175,77,77,0.061123,"[-0.1, 0.55]"


Cohen's kappa for rounded(z scores x 10): 0.15788305287927895
Algo score, swaps needed: 1337.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1448, 1495, 1623, 1408, 1477, 1524, 1428, 1524, 1650, 1322]
Pseudo-random, avergae swaps needed: 1472.17265

Q2
Mean Words: 1050


Unnamed: 0,ID,judge,score
0,1016b5f2-7f8a-496a-bb0b-42be41dcdfdc,Human,-0.779464
1,f5c0bbaa-1a35-4f2f-8c89-2267b0420452,Human,-1.189832
2,fc5a292f-eeb9-413b-82b7-feae67846c21,Human,-0.266504
3,1b694dbb-dcfa-4dfe-a12a-7bbc836ca5e3,Human,-0.676872
4,242b03f3-ec32-4e62-8bc1-26283ca38604,Human,-1.08724


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.289093,1.813309,77,78,0.004737,"[0.07, 0.48]"
ICC2,Single random raters,0.286136,1.792416,77,77,0.005632,"[0.07, 0.48]"
ICC3,Single fixed raters,0.283774,1.792416,77,77,0.005632,"[0.07, 0.48]"
ICC1k,Average raters absolute,0.448522,1.813309,77,78,0.004737,"[0.14, 0.65]"
ICC2k,Average random raters,0.444954,1.792416,77,77,0.005632,"[0.13, 0.65]"
ICC3k,Average fixed raters,0.442094,1.792416,77,77,0.005632,"[0.12, 0.64]"


Cohen's kappa for rounded(z scores x 10): 0.30327190428767214
Algo score, swaps needed: 1112.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1382, 1456, 1489, 1456, 1502, 1497, 1362, 1474, 1492, 1417]
Pseudo-random, avergae swaps needed: 1449.68919

Q3
Mean Words: 839


Unnamed: 0,ID,judge,score
0,1016b5f2-7f8a-496a-bb0b-42be41dcdfdc,Human,0.217165
1,f5c0bbaa-1a35-4f2f-8c89-2267b0420452,Human,2.067547
2,fc5a292f-eeb9-413b-82b7-feae67846c21,Human,-0.269777
3,1b694dbb-dcfa-4dfe-a12a-7bbc836ca5e3,Human,-1.146274
4,242b03f3-ec32-4e62-8bc1-26283ca38604,Human,-1.43844


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.442772,2.589198,77,78,2e-05,"[0.25, 0.6]"
ICC2,Single random raters,0.440751,2.556033,77,77,2.8e-05,"[0.24, 0.6]"
ICC3,Single fixed raters,0.437576,2.556033,77,77,2.8e-05,"[0.24, 0.6]"
ICC1k,Average raters absolute,0.61378,2.589198,77,78,2e-05,"[0.4, 0.75]"
ICC2k,Average random raters,0.611835,2.556033,77,77,2.8e-05,"[0.39, 0.75]"
ICC3k,Average fixed raters,0.608769,2.556033,77,77,2.8e-05,"[0.39, 0.75]"


Cohen's kappa for rounded(z scores x 10): 0.4348797198997916
Algo score, swaps needed: 1040.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1274, 1442, 1494, 1658, 1769, 1608, 1452, 1387, 1498, 1384]
Pseudo-random, avergae swaps needed: 1457.75468

../data/crim_instructor_E

['ID', 'Q1', 'Q2']
Leaving z scores in place
Leaving z scores in place
Computing z scores for human scoring


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,e1308431-d29b-47a7-8201-12fd04f5cac9,0.849505,0.258094,0.24679,1.207579
1,7b74720b-c2e8-4eb3-9274-ea3fb17441e2,0.400798,-0.12754,-0.62628,0.38898
2,348bfa7e-1c58-4b5b-9b11-af9b507ee3cb,-0.272262,-0.12754,0.278983,0.527461
3,e63e6f2b-24aa-45e2-8cf4-ceba745a1010,-0.889235,-0.898808,-0.844614,-0.351062
4,f0166970-fdaf-4f6b-9ae6-e8e25ef8b198,0.00818,-0.513174,-2.355852,-0.574387


Number of entries: 92

Q1
Mean Words: 3353


Unnamed: 0,ID,judge,score
0,e1308431-d29b-47a7-8201-12fd04f5cac9,Human,0.849505
1,7b74720b-c2e8-4eb3-9274-ea3fb17441e2,Human,0.400798
2,348bfa7e-1c58-4b5b-9b11-af9b507ee3cb,Human,-0.272262
3,e63e6f2b-24aa-45e2-8cf4-ceba745a1010,Human,-0.889235
4,f0166970-fdaf-4f6b-9ae6-e8e25ef8b198,Human,0.00818


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.182515,1.446528,91,92,0.039475,"[-0.02, 0.37]"
ICC2,Single random raters,0.179031,1.431663,91,91,0.044348,"[-0.03, 0.37]"
ICC3,Single fixed raters,0.177518,1.431663,91,91,0.044348,"[-0.03, 0.37]"
ICC1k,Average raters absolute,0.308689,1.446528,91,92,0.039475,"[-0.04, 0.54]"
ICC2k,Average random raters,0.303692,1.431663,91,91,0.044348,"[-0.06, 0.54]"
ICC3k,Average fixed raters,0.301512,1.431663,91,91,0.044348,"[-0.06, 0.54]"


Cohen's kappa for rounded(z scores x 10): 0.17165245801214968
Algo score, swaps needed: 1755.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2298, 2055, 2016, 2089, 2077, 2159, 1799, 1980, 1914, 1988]
Pseudo-random, avergae swaps needed: 2045.44676

Q2
Mean Words: 2137


Unnamed: 0,ID,judge,score
0,e1308431-d29b-47a7-8201-12fd04f5cac9,Human,0.258094
1,7b74720b-c2e8-4eb3-9274-ea3fb17441e2,Human,-0.12754
2,348bfa7e-1c58-4b5b-9b11-af9b507ee3cb,Human,-0.12754
3,e63e6f2b-24aa-45e2-8cf4-ceba745a1010,Human,-0.898808
4,f0166970-fdaf-4f6b-9ae6-e8e25ef8b198,Human,-0.513174


Unnamed: 0_level_0,Description,ICC,F,df1,df2,pval,CI95%
Type,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
ICC1,Single raters absolute,0.217202,1.554939,91,92,0.01794,"[0.01, 0.4]"
ICC2,Single random raters,0.214463,1.541209,91,91,0.0202,"[0.01, 0.4]"
ICC3,Single fixed raters,0.212973,1.541209,91,91,0.0202,"[0.01, 0.4]"
ICC1k,Average raters absolute,0.356888,1.554939,91,92,0.01794,"[0.03, 0.57]"
ICC2k,Average random raters,0.353182,1.541209,91,91,0.0202,"[0.02, 0.57]"
ICC3k,Average fixed raters,0.351159,1.541209,91,91,0.0202,"[0.02, 0.57]"


Cohen's kappa for rounded(z scores x 10): 0.16671429748632016
Algo score, swaps needed: 1796.0
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1891, 1725, 2297, 1936, 1873, 2096, 2122, 2058, 2143, 1717]
Pseudo-random, avergae swaps needed: 1994.1412

Swaps (Machine):
[1261.0, 1261.0, 1165.0, 1571.0, 1500.0, 101.0, 891.0, 927.0, 1337.0, 1112.0, 1040.0, 1755.0, 1796.0]
N: 13 	Mean:  1209.0 	Var:  179109.23076923078
Avergae swaps (Pseudo-random):
[1499.90796, 1566.48321, 1560.01438, 1837.17309, 1756.61942, 182.54523, 1324.22571, 1091.79753, 1472.17265, 1449.68919, 1457.75468, 2045.44676, 1994.1412]
N: 13 	Mean:  1479.843923846154 	Var:  204790.024822612 

1 ) 1499.90796 > 1261.0 => 17.306477684971433
2 ) 1566.48321 > 1261.0 => 21.608136092167992
3 ) 1560.01438 > 1165.0 => 28.991728109706347
4 ) 1837.17309 > 1571.0 => 15.619693188763485
5 ) 1756.61942 > 1500.0 => 15.759865486523445
6 ) 182.54523 > 101.0 => 57.51832256180081
7 ) 1324.22571 > 891.0 => 39.11345991014162
8 ) 1091.

In [23]:
score_exams(exams,model='nlp3',normv=1,score=1,goal="medoid",runs=100000)


../data/property_instructor_A

['ID', 'SHORT_ANS', 'Q1', 'Q2']


Unnamed: 0,ID,SHORT_ANS_x,Q1_x,Q2_x,SHORT_ANS_y,Q1_y,Q2_y
0,c74cd4a8-869a-4714-892e-be61183e7835,4,27,24,87,87,80
1,88b8c7fa-b807-4ab3-b690-602e4b0787aa,6,18,20,83,87,83
2,afaca025-5d82-46e8-bfba-a1ce2167adca,6,28,18,90,87,83
3,de219fed-6a0e-4796-875c-300be31e1347,11,23,8,73,80,77
4,01759358-64dd-4801-b923-5a9914b62f19,6,33,19,87,83,83


Number of entries: 81

SHORT_ANS
Mean Words: 436
Algo score, swaps needed: 1362.98474
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1516, 1345, 1513, 1286, 1490, 1488, 1463, 1562, 1705, 1500]
Pseudo-random, avergae swaps needed: 1499.459

Q1
Mean Words: 2048
Algo score, swaps needed: 1259.61929
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1469, 1559, 1586, 1635, 1725, 1580, 1492, 1626, 1477, 1518]
Pseudo-random, avergae swaps needed: 1566.4181

Q2
Mean Words: 947
Algo score, swaps needed: 1312.33292
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1542, 1433, 1483, 1537, 1357, 1666, 1590, 1522, 1510, 1596]
Pseudo-random, avergae swaps needed: 1560.06736

../data/property_instructor_B

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,4ef2af3a-05d9-4992-9746-a24945431d7b,16.0,23.0,77,60
1,0b4a8297-98e1-4153-bddb-d6562cd5041e,13.5,22.0,80,77
2,5b2a6aa4-fdf6-40fc-880c-09a15c2fe77b,12.0,21.0,83,80
3,732a8471-dca7-4b81-be0c-606cc6e86d0a,22.0,16.0,77,83
4,6dc8ed5f-bc91-4ed5-bc1e-2de723a3aae9,11.0,13.0,83,83


Number of entries: 88

Q1
Mean Words: 1431
Algo score, swaps needed: 1426.94488
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1883, 1719, 1765, 1931, 2222, 1610, 2100, 1847, 2133, 1695]
Pseudo-random, avergae swaps needed: 1837.57631

Q2
Mean Words: 1546
Algo score, swaps needed: 1692.05737
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1819, 1696, 2002, 2010, 1823, 1655, 1400, 1700, 1796, 1394]
Pseudo-random, avergae swaps needed: 1758.10921

../data/environ_instructor_B

['ID', 'Q2']


Unnamed: 0,ID,Q2_x,Q2_y
0,47e96791-9e80-4638-b63a-a00903ce8508,30.0,87
1,a6db6331-93ae-48a4-ab5d-04cd34cf3989,25.0,80
2,8fee9d29-68ff-4b2d-b54d-58f8303249ff,29.5,80
3,467e6ace-0fab-4f54-a773-979058184ac2,25.0,77
4,dda68a48-2475-4338-9956-67fc7efbf74b,37.0,87


Number of entries: 28

Q2
Mean Words: 2094
Algo score, swaps needed: 134.47636
Swaps needed for first 10 out of 100000 pseudo-random runs:
[192, 172, 172, 162, 202, 198, 219, 165, 156, 158]
Pseudo-random, avergae swaps needed: 182.52418

../data/PR_instructor_C

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,6c261e5d-2212-43a3-ab4f-28bd7dd90612,9.0,3.0,77,83
1,3d29b5ad-e82f-4972-89a4-adb948591eb1,15.0,5.5,83,77
2,12190cda-1a67-461b-8ffd-f080188457c5,15.5,5.0,80,83
3,47285a02-f301-408b-b50a-6a25530cadae,7.0,3.5,83,83
4,7525acd7-87ff-42d3-b5f0-700d3234cb05,7.0,4.0,59,59


Number of entries: 75

Q1
Mean Words: 936
Algo score, swaps needed: 1075.06208
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1296, 1434, 1373, 1328, 1232, 1426, 1332, 1390, 1110, 1508]
Pseudo-random, avergae swaps needed: 1324.45835

Q2
Mean Words: 570
Algo score, swaps needed: 1107.89282
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1227, 1052, 1203, 1293, 1183, 1170, 1071, 1094, 1185, 1243]
Pseudo-random, avergae swaps needed: 1219.53493

../data/contracts_instructor_D

['ID', 'Q1', 'Q2', 'Q3']


Unnamed: 0,ID,Q1_x,Q2_x,Q3_x,Q1_y,Q2_y,Q3_y
0,1b273341-a255-40ef-9a90-17fbdf05af4b,48,39,31,83,83,87
1,ed6cf722-94b9-4474-8bf7-318cb15713b8,67,25,28,70,83,87
2,4c68352f-27a5-49c2-aecd-887e788ab441,55,15,23,77,77,83
3,6266e45d-2172-43e6-98f5-f9ed6d82b414,62,31,29,87,90,87
4,c04e80c3-8a55-408c-afd7-c6d5197b5fd2,50,40,18,83,77,83


Number of entries: 78

Q1
Mean Words: 1823
Algo score, swaps needed: 1334.62252
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1595, 1470, 1469, 1542, 1520, 1501, 1542, 1452, 1287, 1395]
Pseudo-random, avergae swaps needed: 1471.70444

Q2
Mean Words: 1050
Algo score, swaps needed: 1083.63437
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1337, 1567, 1517, 1591, 1493, 1389, 1308, 1467, 1237, 1405]
Pseudo-random, avergae swaps needed: 1449.61455

Q3
Mean Words: 839
Algo score, swaps needed: 1002.36012
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1443, 1478, 1428, 1579, 1330, 1318, 1407, 1477, 1443, 1369]
Pseudo-random, avergae swaps needed: 1458.0963

../data/crim_instructor_E

['ID', 'Q1', 'Q2']


Unnamed: 0,ID,Q1_x,Q2_x,Q1_y,Q2_y
0,19504cf2-0e8e-47e7-8cd8-37ccc42cc860,20.5,15.0,73,80
1,e1308431-d29b-47a7-8201-12fd04f5cac9,22.0,17.0,77,83
2,9db89209-0865-45ae-a102-45967994336f,21.0,14.0,90,83
3,85ca12a0-f4e5-4045-8350-676e0e159221,22.0,17.0,97,73
4,0a1cbcb8-fb01-45be-bb70-9c97770f647c,18.0,11.0,87,87


Number of entries: 92

Q1
Mean Words: 3353
Algo score, swaps needed: 1803.05968
Swaps needed for first 10 out of 100000 pseudo-random runs:
[1878, 2178, 1916, 2063, 1978, 2051, 2153, 2108, 2209, 2108]
Pseudo-random, avergae swaps needed: 2045.09657

Q2
Mean Words: 2137
Algo score, swaps needed: 1807.38171
Swaps needed for first 10 out of 100000 pseudo-random runs:
[2117, 2052, 1817, 2172, 2064, 2062, 2176, 1942, 2130, 2155]
Pseudo-random, avergae swaps needed: 1993.39235

Swaps (Machine):
[1362.98474, 1259.61929, 1312.33292, 1426.94488, 1692.05737, 134.47636, 1075.06208, 1107.89282, 1334.62252, 1083.63437, 1002.36012, 1803.05968, 1807.38171]
N: 13 	Mean:  1261.725296923077 	Var:  173044.9624971154
Avergae swaps (Pseudo-random):
[1499.459, 1566.4181, 1560.06736, 1837.57631, 1758.10921, 182.52418, 1324.45835, 1219.53493, 1471.70444, 1449.61455, 1458.0963, 2045.09657, 1993.39235]
N: 13 	Mean:  1489.696280769231 	Var:  198315.17725239138 

1 ) 1499.459 > 1362.98474 => 9.535506888250664
2 )

[back to contents](#Contents)