# Libraries

In [1]:
# __future__ imports must come before any other statement in the cell
from __future__ import print_function

import pickle
from sklearn.metrics import log_loss

# Get Data & Models For Testing

In [2]:
import os

# Directory holding the Python 2.7 pickles for the test data and fitted models
PICKLE_DIR = "/Users/davidziganto/Repositories/Synthetic_Dataset_Generation/pickle_files/py27"

def load_pickle(filename):
    # Load one pickled object from PICKLE_DIR
    with open(os.path.join(PICKLE_DIR, filename), 'rb') as picklefile:
        return pickle.load(picklefile)

X_test = load_pickle("X_test_py27.pkl")
y_test = load_pickle("y_test_py27.pkl")

knn_needs_improvement = load_pickle("knn_needs_improvement_py27.pkl")
rf_satisfactory = load_pickle("rf_satisfactory_py27.pkl")
gbc_proficient = load_pickle("gbc_proficient_py27.pkl")

# Calculate Log Loss

In [3]:
log_loss(y_test, knn_needs_improvement.predict_proba(X_test))

0.77172596402024607

In [4]:
log_loss(y_test, rf_satisfactory.predict_proba(X_test))

0.57484754370751456

In [5]:
log_loss(y_test, gbc_proficient.predict_proba(X_test))

0.54224558841354176
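
For readers who want to see what the metric measures, here is a minimal pure-Python sketch of multiclass log loss. It is an illustration only, not scikit-learn's implementation: the function name `manual_log_loss`, the clipping constant `eps`, and the assumption that labels are integer class indices matching the column order of the probability rows are all choices made for this example.

```python
import math

def manual_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log of the probability assigned to the true class.

    y_true: list of integer class indices (assumed to match column order).
    y_prob: list of per-class probability rows, one row per sample.
    eps:    hypothetical clipping constant that keeps log() finite.
    """
    total = 0.0
    for label, row in zip(y_true, y_prob):
        p = min(max(row[label], eps), 1.0 - eps)  # clip away exact 0 or 1
        total += -math.log(p)
    return total / len(y_true)

# A model that always predicts 50/50 on two classes scores ln(2).
coin_flip = manual_log_loss([0, 1], [[0.5, 0.5], [0.5, 0.5]])

# A fully confident wrong prediction is penalized heavily.
confident_wrong = manual_log_loss([1], [[1.0, 0.0]])
```

Lower is better. Because confident mistakes are punished so strongly, a model that emits hard 0/1 probabilities can post a very large log loss even when its accuracy looks reasonable.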

# Create Pickle_Dict

In [6]:
pickle_dict = {'knn':knn_needs_improvement, 'rf_':rf_satisfactory, 'gbc':gbc_proficient}

# Auto_Score()

In [7]:
def auto_score(pickle_dict):
    '''
    Input:
        pickle_dict: dictionary where key is username or ID and value is a fitted model
    Output:
        new dictionary of username : [log loss value, classification];
        pickle_dict itself is left unchanged
    Note:
        relies on the X_test and y_test loaded above
    '''
    results = {}
    for k, v in pickle_dict.items():
        # Round to 3 decimals before comparing against the thresholds
        score = round(log_loss(y_test, v.predict_proba(X_test)), 3)
        if score < 0.57:
            results[k] = [score, "Proficient"]
        elif score <= 0.60:
            results[k] = [score, "Satisfactory"]
        else:
            results[k] = [score, "Needs Improvement"]

    return results

In [8]:
output = auto_score(pickle_dict)
output

{'gbc': [0.542, 'Proficient'],
 'knn': [0.772, 'Needs Improvement'],
 'rf_': [0.575, 'Satisfactory']}

In [9]:
for k, v in output.items():
    print(k, v[0], v[1])

knn 0.772 Needs Improvement
rf_ 0.575 Satisfactory
gbc 0.542 Proficient


# Rationale For Scoring

Default settings for KNN, Decision Trees, Logistic Regression, and Random Forest yield log loss values on the test set of 2.348, 13.276, 0.640, and 0.608, respectively.

The goal here is to determine each individual's skill level in achieving performant modeling results. As such, it made sense to me to set the threshold for *satisfactory* (0.60) just below the lowest value yielded by default model settings, which turned out to be 0.608 for random forest. A tuned random forest will produce a log loss less than 0.58 on the test set.

Furthermore, in an attempt to separate the high-achieving students, a category called *proficient* is included. To achieve this status, a student must use modeling techniques that were either not covered or covered in very little detail. Therefore, the threshold (0.570) was set just below the log loss value of a tuned random forest (0.575). For instance, a tuned gradient boosted classifier can achieve a log loss of 0.542.
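
The thresholds described above can be sketched as a small standalone function, which makes the boundary behavior easy to check. The constant names and `classify_score` are hypothetical names introduced for this example; the cutoff values and the three sample scores are the ones reported in this notebook.

```python
# Cutoffs from the rationale above; the constant names are hypothetical.
PROFICIENT_BELOW = 0.57      # strictly below this is "Proficient"
SATISFACTORY_AT_MOST = 0.60  # up to and including this is "Satisfactory"

def classify_score(score):
    """Map a rounded log loss value to a skill category."""
    if score < PROFICIENT_BELOW:
        return "Proficient"
    elif score <= SATISFACTORY_AT_MOST:
        return "Satisfactory"
    return "Needs Improvement"

# Rounded scores reported earlier for the three example models.
example_scores = {"knn": 0.772, "rf_": 0.575, "gbc": 0.542}
labels = {name: classify_score(s) for name, s in example_scores.items()}
```

Note that a score of exactly 0.57 lands in *satisfactory*, while a score of exactly 0.60 is still *satisfactory* rather than *needs improvement*.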

**Note: There will be some variability with inherently non-deterministic algorithms like random forest. The techniques will be the same, but the results may vary slightly depending on how the model was seeded.**
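
To illustrate the seeding effect, the sketch below fits the same random forest configuration with the same and with different `random_state` values. The data comes from `make_classification` as a synthetic stand-in, not the dataset used in this notebook, and the helper `fit_and_predict`, the seeds, and the hyperparameters are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; NOT the test set used in this notebook.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

def fit_and_predict(seed):
    """Fit a small random forest with the given seed and return class probabilities."""
    model = RandomForestClassifier(n_estimators=20, random_state=seed)
    model.fit(X, y)
    return model.predict_proba(X)

same_seed_a = fit_and_predict(42)
same_seed_b = fit_and_predict(42)
other_seed = fit_and_predict(7)

# Same seed: the two fits should produce matching probabilities.
identical = np.allclose(same_seed_a, same_seed_b)

# Different seed: probabilities (and therefore log loss) may shift slightly.
max_gap = np.abs(same_seed_a - other_seed).max()
```

Fixing `random_state` makes a submission reproducible, but two students using the same technique with different seeds can still land on slightly different log loss values.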

**Note: This is a first attempt. Score thresholds can be adjusted as we collect data in our pilot program. In other words, this is a WIP.**