# SETI Test Set Classification Accuracy

This notebook provides the code needed to calculate the performance of your signal classification models using the PREVIEW test set (see the [Step 1. Get Data notebook](https://github.com/setiQuest/ML4SETI/blob/master/tutorials/Step_1_Get_Data.ipynb)).

In [1]:
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
import numpy as np
import sklearn
import csv
import operator

In [2]:
class_list = ['brightpixel', 'narrowband', 'narrowbanddrd', 'noise', 'squarepulsednarrowband', 'squiggle', 'squigglesquarepulsednarrowband']

In [3]:
fieldnames = ['uuid'] + class_list

In [7]:
#Helper functions for parsing the data and using sklearn to print scoring metrics

def classChooser(listOfDictionaryScores):
    results = []
    for row in listOfDictionaryScores:
        rowscores = dict((k, float(row[k])) for k in class_list)
        maxclass = max(rowscores.items(), key=operator.itemgetter(1))[0]
        results.append({'UUID':row['uuid'], 'SIGNAL_CLASSIFICATION':maxclass})
        
    return results

def printsklearnScores(y_true, y_pred, y_prob):
    
    print(sklearn.metrics.classification_report(y_true, y_pred, digits=5))
    print(sklearn.metrics.confusion_matrix(y_true, y_pred))
    print("Classification accuracy: %0.6f" % sklearn.metrics.accuracy_score(y_true,y_pred) )
    print("Log Loss: %0.6f" % sklearn.metrics.log_loss(y_true,y_prob) )
    
def score(resultsFile):

    testSetFile = 'private_list_primary_v3_testset_preview_uuid_class_29june_2017.csv'
    
    # Answer key: has a header row with UUID and SIGNAL_CLASSIFICATION columns
    with open(testSetFile) as f:
        actual_uuid_list = list(csv.DictReader(f))
    actual_uuid_list_sorted = sorted(actual_uuid_list, key=lambda k: k['UUID'])

    # Scorecard: no header row, so the column names come from fieldnames
    with open(resultsFile) as f:
        classifier_results_list = list(csv.DictReader(f, fieldnames=fieldnames))
    classifier_results_list_sorted = sorted(classifier_results_list, key=lambda k: k['uuid'])

    #yc = classChooser(classifier_results_list_sorted)
    #print yc[:5]
    
    y_true = [x['SIGNAL_CLASSIFICATION'] for x in actual_uuid_list_sorted]
    y_pred = [x['SIGNAL_CLASSIFICATION'] for x in classChooser(classifier_results_list_sorted)]
    y_prob = [[float(row[cl]) for cl in class_list] for row in classifier_results_list_sorted] 
    
    printsklearnScores(y_true, y_pred, y_prob)
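
Note that `score` sorts the answer key and the scorecard independently and then compares them row by row, so it implicitly assumes both files contain exactly the same UUIDs. A minimal sketch of a pre-check is shown below. `find_uuid_mismatches` is a hypothetical helper, not part of the original notebook; it takes the two row lists in the form `score` builds them (dictionaries keyed by `'UUID'` for the key and `'uuid'` for the scorecard), and the UUIDs in the demo are made up.

```python
def find_uuid_mismatches(key_rows, scorecard_rows):
    """Return (missing, extra, duplicated) UUIDs of a scorecard relative to the key.

    Hypothetical helper: key_rows are dicts with a 'UUID' field and
    scorecard_rows are dicts with a 'uuid' field, as in score() above.
    """
    key_ids = set(row['UUID'] for row in key_rows)
    seen = set()
    duplicated = set()
    for row in scorecard_rows:
        if row['uuid'] in seen:
            duplicated.add(row['uuid'])
        seen.add(row['uuid'])
    missing = key_ids - seen   # in the key but not in the scorecard
    extra = seen - key_ids     # in the scorecard but not in the key
    return missing, extra, duplicated


# Toy rows with made-up UUIDs, for illustration only
key_rows = [{'UUID': 'a1'}, {'UUID': 'b2'}, {'UUID': 'c3'}]
scorecard_rows = [{'uuid': 'a1'}, {'uuid': 'c3'}, {'uuid': 'c3'}, {'uuid': 'zz'}]

missing, extra, duplicated = find_uuid_mismatches(key_rows, scorecard_rows)
print(sorted(missing), sorted(extra), sorted(duplicated))
```

If all three sets come back empty, the row-by-row comparison in `score` lines up as intended.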
   


## Scoring a Scorecard

The Preview test data set can be obtained in the [Step 1. Get Data notebook](https://github.com/setiQuest/ML4SETI/blob/master/tutorials/Step_1_Get_Data.ipynb). Using your model, you can generate a scorecard for this preview test data set. Your scorecard must be a CSV file with 8 columns: the first column contains the UUID and the next 7 contain the probability estimates your model produced for each class. See the [Judging Information notebook](https://github.com/setiQuest/ML4SETI/blob/master/Judging_Criteria.ipynb) for more information.
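
As a rough sketch of that layout, the snippet below writes one row per sample: the UUID followed by seven probabilities. `write_scorecard` is a hypothetical helper and the UUIDs are made up. It assumes the probabilities are listed in the same order as `class_list` and that the file has no header row, which is how the `score` function above reads a scorecard.

```python
import csv
import io

class_list = ['brightpixel', 'narrowband', 'narrowbanddrd', 'noise',
              'squarepulsednarrowband', 'squiggle', 'squigglesquarepulsednarrowband']


def write_scorecard(out, uuids, probabilities):
    """Write rows of: uuid, p(class_0), ..., p(class_6) to an open text file.

    Hypothetical helper. probabilities[i] must hold one value per entry of
    class_list, in that order. No header row is written.
    """
    writer = csv.writer(out, lineterminator='\n')
    for uuid, probs in zip(uuids, probabilities):
        if len(probs) != len(class_list):
            raise ValueError('expected %d probabilities for %s' % (len(class_list), uuid))
        writer.writerow([uuid] + ['%.6f' % p for p in probs])


# Demo with made-up UUIDs, written to an in-memory buffer instead of a file
uuids = ['example-uuid-1', 'example-uuid-2']
probabilities = [
    [0.70, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.40, 0.10, 0.10, 0.10],
]
buffer = io.StringIO()
write_scorecard(buffer, uuids, probabilities)
scorecard_text = buffer.getvalue()
print(scorecard_text)
```

To write a real file, pass an open file handle instead of the buffer, for example `open('my_scorecard.csv', 'w', newline='')` in Python 3 (the file name is a placeholder).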

Now you can score the scorecard you just created using this code. [We now provide the Preview test data set key! It contains the exact answers, so don't cheat.]

<br>
### Using the Example Scorecard.

In the [Judging Information notebook](https://github.com/setiQuest/ML4SETI/blob/master/Judging_Criteria.ipynb) there is a link to download an example scorecard. Your scorecard should follow the same format.

In [8]:
score('example_scorecard_codechallenge_v3_testset_preview.csv')

                                precision    recall  f1-score   support

                   brightpixel    0.12349   0.12812   0.12577       320
                    narrowband    0.16467   0.14474   0.15406       380
                 narrowbanddrd    0.19274   0.19885   0.19574       347
                         noise    0.12392   0.12798   0.12592       336
        squarepulsednarrowband    0.16298   0.17302   0.16785       341
                      squiggle    0.18106   0.17711   0.17906       367
squigglesquarepulsednarrowband    0.13043   0.13003   0.13023       323

                   avg / total    0.15525   0.15493   0.15495      2414

[[41 51 50 38 35 60 45]
 [47 55 61 65 48 56 48]
 [42 46 69 52 44 42 52]
 [47 47 56 43 53 41 49]
 [42 42 43 47 59 56 52]
 [60 42 45 64 57 65 34]
 [53 51 34 38 66 39 42]]
Classification accuracy: 0.154930
Log Loss: 2.230604
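
To put the log loss value in context: multi-class log loss is the average of the negative logarithm of the probability assigned to each sample's true class. The sketch below computes it by hand in plain Python. `manual_log_loss` is a hypothetical name, the class indices are made up, and the clipping constant `eps` is an assumption that guards against `log(0)`. A model that outputs a uniform 1/7 for every class scores ln(7), roughly 1.946, so the example scorecard's 2.23 is worse than uniform guessing: random, uneven probabilities often put very little mass on the true class, and log loss penalizes that heavily.

```python
import math


def manual_log_loss(true_indices, y_prob, eps=1e-15):
    """Mean negative log-probability of the true class (hypothetical helper)."""
    total = 0.0
    for true_idx, probs in zip(true_indices, y_prob):
        p = min(max(probs[true_idx], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_indices)


n_classes = 7
true_indices = [0, 3, 5, 6]   # made-up true class indices

# Uniform guess: every class gets probability 1/7
uniform = [[1.0 / n_classes] * n_classes for _ in true_indices]
uniform_loss = manual_log_loss(true_indices, uniform)

# Confident and always right: most of the mass on the true class
confident = [[0.94 if k == t else 0.01 for k in range(n_classes)] for t in true_indices]
confident_loss = manual_log_loss(true_indices, confident)

print(uniform_loss)     # equals ln(7)
print(confident_loss)   # equals -ln(0.94)
```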


<br>
### Using the Preview Test Set Key.

If I score the preview test set key itself, I should get a perfect score.

In [9]:
#Test with the scoreboard key. This should get 100% accuracy
score('private_list_primary_v3_testset_preview_scoreboard_key_29june_2017.csv')

                                precision    recall  f1-score   support

                   brightpixel    1.00000   1.00000   1.00000       320
                    narrowband    1.00000   1.00000   1.00000       380
                 narrowbanddrd    1.00000   1.00000   1.00000       347
                         noise    1.00000   1.00000   1.00000       336
        squarepulsednarrowband    1.00000   1.00000   1.00000       341
                      squiggle    1.00000   1.00000   1.00000       367
squigglesquarepulsednarrowband    1.00000   1.00000   1.00000       323

                   avg / total    1.00000   1.00000   1.00000      2414

[[320   0   0   0   0   0   0]
 [  0 380   0   0   0   0   0]
 [  0   0 347   0   0   0   0]
 [  0   0   0 336   0   0   0]
 [  0   0   0   0 341   0   0]
 [  0   0   0   0   0 367   0]
 [  0   0   0   0   0   0 323]]
Classification accuracy: 1.000000
Log Loss: 0.000000


In [10]:
score("results_EffSubZee_21july.csv")

                                precision    recall  f1-score   support

                   brightpixel    0.98148   0.82812   0.89831       320
                    narrowband    0.98870   0.92105   0.95368       380
                 narrowbanddrd    0.97101   0.96542   0.96821       347
                         noise    0.77622   0.99107   0.87059       336
        squarepulsednarrowband    0.93175   0.92082   0.92625       341
                      squiggle    0.99169   0.97548   0.98352       367
squigglesquarepulsednarrowband    0.97170   0.95666   0.96412       323

                   avg / total    0.94576   0.93786   0.93892      2414

[[265   0   0  53   1   0   1]
 [  1 350   8  10  10   0   1]
 [  0   1 335   7   4   0   0]
 [  1   0   0 333   2   0   0]
 [  1   3   1  22 314   0   0]
 [  0   0   1   1   0 358   7]
 [  2   0   0   3   6   3 309]]
Classification accuracy: 0.937862
Log Loss: 0.220428
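
The per-class numbers in the report can be recovered from the confusion matrix. In scikit-learn's `confusion_matrix`, rows are the true classes and columns are the predicted classes, here in the same order as `class_list`. The following sketch recomputes precision, recall, and accuracy from the matrix printed above using plain Python; the variable names are illustrative.

```python
class_list = ['brightpixel', 'narrowband', 'narrowbanddrd', 'noise',
              'squarepulsednarrowband', 'squiggle', 'squigglesquarepulsednarrowband']

# Confusion matrix copied from the output above (rows: true, columns: predicted)
cm = [
    [265,   0,   0,  53,   1,   0,   1],
    [  1, 350,   8,  10,  10,   0,   1],
    [  0,   1, 335,   7,   4,   0,   0],
    [  1,   0,   0, 333,   2,   0,   0],
    [  1,   3,   1,  22, 314,   0,   0],
    [  0,   0,   1,   1,   0, 358,   7],
    [  2,   0,   0,   3,   6,   3, 309],
]

n = len(class_list)
row_sums = [sum(cm[i]) for i in range(n)]                       # samples per true class
col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # samples per predicted class

# Recall: correct / all samples of that true class (row)
recall = [cm[k][k] / float(row_sums[k]) for k in range(n)]
# Precision: correct / all samples predicted as that class (column)
precision = [cm[k][k] / float(col_sums[k]) for k in range(n)]
# Accuracy: diagonal total / all samples
accuracy = sum(cm[k][k] for k in range(n)) / float(sum(row_sums))

for name, p, r in zip(class_list, precision, recall):
    print('%-32s precision=%.5f recall=%.5f' % (name, p, r))
print('accuracy=%.6f' % accuracy)
```

The `noise` column shows where most of the errors go: 53 `brightpixel` samples were predicted as `noise`, which accounts for both the low `brightpixel` recall and the low `noise` precision.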


## Example of how to score a CV set

You can also use the scoring functions above with a cross-validation (CV) set held out from the training data to measure your model's performance. That way you don't always have to use the preview test set.

The following code will show you how to split up your training data into a training set and CV set. Then we'll use some fake models to create some predicted values for our cross-validation set. Using those predicted values, we'll use the `printsklearnScores` function above.

<br>

### 1.
First, let's split our data up into a training data set and a cross-validation set. We start with the primary small index file.

In [11]:
indexfile = 'public_list_primary_v3_small_21june_2017.csv'
with open(indexfile) as f:
    indexfile_uuid_list = list(csv.DictReader(f))
indexfile_uuid_list = sorted(indexfile_uuid_list, key=lambda k: k['UUID'])

X = [x['UUID'] for x in indexfile_uuid_list]
y = [class_list.index(x['SIGNAL_CLASSIFICATION']) for x in indexfile_uuid_list] #also convert from class name to a number

X_train, X_cv, y_train, y_cv = train_test_split(X, y, test_size=0.10, random_state=42)
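
A plain random split does not guarantee that each class is equally represented in the CV set. If that matters for your comparison, one option is a stratified split. scikit-learn's `train_test_split` accepts a `stratify` argument for this; the sketch below shows the same idea in plain Python so the mechanics are visible. `stratified_split` and the toy data are hypothetical, not part of the original notebook.

```python
import random
from collections import Counter, defaultdict


def stratified_split(X, y, test_size=0.10, seed=42):
    """Split X, y so each class contributes about test_size of its samples to the CV set."""
    # Group sample positions by class label
    indices_by_class = defaultdict(list)
    for idx, label in enumerate(y):
        indices_by_class[label].append(idx)

    # Shuffle within each class and take the first test_size fraction for CV
    rng = random.Random(seed)
    cv_indices = set()
    for label in sorted(indices_by_class):
        indices = indices_by_class[label]
        rng.shuffle(indices)
        n_cv = int(round(len(indices) * test_size))
        cv_indices.update(indices[:n_cv])

    X_train = [X[i] for i in range(len(X)) if i not in cv_indices]
    y_train = [y[i] for i in range(len(y)) if i not in cv_indices]
    X_cv = [X[i] for i in range(len(X)) if i in cv_indices]
    y_cv = [y[i] for i in range(len(y)) if i in cv_indices]
    return X_train, X_cv, y_train, y_cv


# Toy data: 7 classes with 100 made-up UUIDs each
y_toy = [label for label in range(7) for _ in range(100)]
X_toy = ['uuid-%04d' % i for i in range(len(y_toy))]

X_tr, X_val, y_tr, y_val = stratified_split(X_toy, y_toy)
cv_counts = Counter(y_val)
print(cv_counts)
```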

<br>
### 2.
In normal operation, you'd then use the `X_train` set of UUIDs to grab the `<UUID>.dat`  data files and produce spectrograms and features. You'd then pass your features, along with `y_train`, which contains
your labels, to your model for training. 

<br>
### 3.
Next, you'd take the `X_cv` set of UUIDs, extract the necessary spectrograms and features, and pass them to your model to predict their classes. In the code below, there are two fake models: `perfectModel` and `randomModel`. Each of them returns a 2D array, M x K, where M is the number of samples and K is the number of classes. The values in each row are the class probability predictions. The `randomModel` produces random probabilities. The `perfectModel` uses the known values in `y_cv`, so it will produce a perfect score. Obviously, your model should produce scores that result in a 


In [12]:
def randomModel(X_cv):
    y_prob = np.random.rand(len(X_cv), len(class_list))
    return (y_prob.T / y_prob.sum(axis=1)).T

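def perfectModel(X_cv):
    # Ignores X_cv and one-hot encodes the known labels in the global y_cv.
    # Note: recent scikit-learn releases rename `sparse` to `sparse_output`.
    encoder = OneHotEncoder(sparse=False)
    ycv = np.array(y_cv).reshape(-1, 1)  # column vector, one label per row
    ycv_onehot = encoder.fit_transform(ycv)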
    return ycv_onehot
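
Whatever model you use, its output should have the same shape as these fake models: one row per sample, one column per class, with each row summing to 1. A small sanity check along these lines can catch formatting mistakes before scoring. `check_probabilities` is a hypothetical helper and the tolerance is an arbitrary choice; it works on nested lists and should also work on a 2D NumPy array, since it only iterates over rows.

```python
def check_probabilities(y_prob, n_classes, tol=1e-6):
    """Raise ValueError if y_prob is not a valid M x n_classes probability table."""
    for i, row in enumerate(y_prob):
        row = list(row)
        if len(row) != n_classes:
            raise ValueError('row %d has %d values, expected %d' % (i, len(row), n_classes))
        if any(p < 0.0 or p > 1.0 for p in row):
            raise ValueError('row %d has a value outside [0, 1]' % i)
        if abs(sum(row) - 1.0) > tol:
            raise ValueError('row %d sums to %r, expected 1' % (i, sum(row)))
    return True


good = [[0.5, 0.3, 0.2], [0.0, 0.0, 1.0]]
bad = [[0.5, 0.3, 0.3]]   # sums to 1.1

print(check_probabilities(good, n_classes=3))
try:
    check_probabilities(bad, n_classes=3)
except ValueError as err:
    print('rejected: %s' % err)
```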
    

In [14]:
y_prob = randomModel(X_cv)
y_true = [class_list[i] for i in y_cv]
y_pred = [class_list[probarray.argmax()] for probarray in y_prob]

print('The randomModel class produces random probability estimates')
print(y_prob[:5])
print('')

printsklearnScores(y_true, y_pred, y_prob)

The randomModel class produces random probability estimates
[[ 0.21622444  0.19336502  0.09548523  0.20304027  0.0748022   0.02166179
   0.19542104]
 [ 0.13087778  0.20344906  0.06237559  0.04153681  0.284838    0.19789609
   0.07902665]
 [ 0.12452463  0.09598028  0.14679418  0.2344678   0.20976057  0.083537
   0.10493554]
 [ 0.18092884  0.22081595  0.1201021   0.17731216  0.1036566   0.11998787
   0.07719647]
 [ 0.14025169  0.02794412  0.30522047  0.27883455  0.09998314  0.10675682
   0.04100922]]

                                precision    recall  f1-score   support

                   brightpixel    0.09184   0.08738   0.08955       103
                    narrowband    0.14151   0.15000   0.14563       100
                 narrowbanddrd    0.14286   0.11765   0.12903       102
                         noise    0.14000   0.16092   0.14973        87
        squarepulsednarrowband    0.16514   0.17822   0.17143       101
                      squiggle    0.19811   0.20588   0.20192 

In [13]:
y_prob = perfectModel(X_cv)
y_true = [class_list[i] for i in y_cv]
y_pred = [class_list[probarray.argmax()] for probarray in y_prob]

print(y_prob[:5])

printsklearnScores(y_true, y_pred, y_prob)

[[ 0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  0.  0.  0.  1.]
 [ 0.  0.  0.  1.  0.  0.  0.]
 [ 0.  0.  0.  0.  1.  0.  0.]
 [ 0.  0.  0.  1.  0.  0.  0.]]
                                precision    recall  f1-score   support

                   brightpixel    1.00000   1.00000   1.00000       103
                    narrowband    1.00000   1.00000   1.00000       100
                 narrowbanddrd    1.00000   1.00000   1.00000       102
                         noise    1.00000   1.00000   1.00000        87
        squarepulsednarrowband    1.00000   1.00000   1.00000       101
                      squiggle    1.00000   1.00000   1.00000       102
squigglesquarepulsednarrowband    1.00000   1.00000   1.00000       105

                   avg / total    1.00000   1.00000   1.00000       700

[[103   0   0   0   0   0   0]
 [  0 100   0   0   0   0   0]
 [  0   0 102   0   0   0   0]
 [  0   0   0  87   0   0   0]
 [  0   0   0   0 101   0   0]
 [  0   0   0   0   0 102   0]
 [  0  