# Judging Criteria

This document shows where to get the test data set and explains how entries for the "Best Classification" system will be judged.

Read the [Step 1: Get Data tutorial](tutorials/Step_1_Get_Data.ipynb#Test-Data-Set) for information on where to get the training and test data sets. 


## Generate CSV File with Scores

Your job is to create a CSV file with one line for each simulated file in the test data set. Each line contains the signal's UUID, followed by a score for each signal class. (The signal class with the highest score is your model's class estimate. Typically these scores are the probability estimates for each class.)

Extract the contents of the test data set file and use the `ibmseti` Python package to read each file, just as you would the training data. The UUID for each file is found in its header. See [Step_2_reading_SETI_code_challenge_data.ipynb](tutorials/Step_2_reading_SETI_code_challenge_data.ipynb) for examples of reading the data.

For each data file in the test set, generate the appropriate spectrogram and pass it to your signal classifier to calculate the scores for each class. Then write the results to an output CSV file.

For example, each line in the CSV file should look like:

  * `abdefg-adbc12-23234-123123-cvaf, 0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083`

**THE COLUMN ORDER OF THE NUMBERS IS CRITICAL! THEY MUST BE THE SCORE FOR EACH CLASS IN THIS ORDER:**

  * `uuid, brightpixel, narrowband, narrowbanddrd, noise, squarepulsednarrowband, squiggle, squigglesquarepulsednarrowband`

In the example above, the `abdefg-adbc12-23234-123123-cvaf` would be labeled as a `narrowbanddrd` because it has the highest score. 
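
As a minimal sketch of how a line is interpreted, the snippet below parses one CSV line and returns the class with the highest score, using the column order given above. The names `CLASS_ORDER` and `predicted_class` are illustrative and not part of `ibmseti` or any contest tooling.

```python
# Class names in the column order required for the CSV file.
CLASS_ORDER = [
    'brightpixel', 'narrowband', 'narrowbanddrd', 'noise',
    'squarepulsednarrowband', 'squiggle', 'squigglesquarepulsednarrowband',
]


def predicted_class(csv_line):
    """Return (uuid, label) for one line of the results file."""
    fields = [f.strip() for f in csv_line.split(',')]
    uuid = fields[0]
    scores = [float(s) for s in fields[1:]]
    if len(scores) != len(CLASS_ORDER):
        raise ValueError('expected %d scores, got %d'
                         % (len(CLASS_ORDER), len(scores)))
    # Index of the largest score selects the class name.
    best = max(range(len(scores)), key=scores.__getitem__)
    return uuid, CLASS_ORDER[best]


example = ('abdefg-adbc12-23234-123123-cvaf, '
           '0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083')
print(predicted_class(example))
```

Keeping the class order in a single list like this also makes it harder to write the columns in the wrong order by accident.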

#### Example PseudoCode

In [None]:
import csv
import zipfile

import ibmseti
# Also import the libraries your classifier needs (tensorflow, Watson, etc.)

# Assumed to be defined by you before this point:
#   my_model          your trained signal classification model
#   mydatafolder      the folder that holds the test set zip file
#   draw_spectrogram  whatever signal processing code you need
my_output_results = mydatafolder + '/signal_class_results.csv'
zz = zipfile.ZipFile(mydatafolder + '/' + 'primary_testset.zip')

# Open the output file once; 'w' starts a fresh file on every run.
with open(my_output_results, 'w') as csvfile:
    fwriter = csv.writer(csvfile, delimiter=',')
    for fn in zz.namelist():
        data = zz.open(fn).read()
        aca = ibmseti.compamp.SimCompamp(data)
        uuid = aca.header()['uuid']
        spectrogram = draw_spectrogram(aca)

        # cr = class results. In this example, it's a dictionary. But in your
        # case it could be something else, like a simple list.
        cr = my_model.classify(spectrogram)

        # The column order must match the required class order.
        fwriter.writerow([uuid, cr['brightpixel'], cr['narrowband'], cr['narrowbanddrd'],
                          cr['noise'], cr['squarepulsednarrowband'], cr['squiggle'],
                          cr['squigglesquarepulsednarrowband']])
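
Before submitting, it may help to sanity-check the results file. The sketch below is only an illustration: `check_results_rows` and `check_results_file` are hypothetical helper names, and the rule that each UUID appears only once is an assumption based on the one-line-per-file format described above, not a documented Scoreboard requirement.

```python
import csv

EXPECTED_COLUMNS = 8  # the uuid plus one score for each of the 7 classes


def check_results_rows(rows):
    """Validate parsed CSV rows and return the number of valid rows."""
    seen = set()
    for n, row in enumerate(rows, start=1):
        if len(row) != EXPECTED_COLUMNS:
            raise ValueError('row %d: expected %d columns, got %d'
                             % (n, EXPECTED_COLUMNS, len(row)))
        uuid = row[0].strip()
        if uuid in seen:  # assumption: one line per test file
            raise ValueError('row %d: duplicate uuid %s' % (n, uuid))
        seen.add(uuid)
        for value in row[1:]:
            float(value)  # raises ValueError if a score is not a number
    return len(seen)


def check_results_file(path):
    """Run the row checks on a CSV file on disk."""
    with open(path) as csvfile:
        return check_results_rows(csv.reader(csvfile))
```

Calling `check_results_file(my_output_results)` after the loop above returns the number of rows written, which you can compare against the number of files in the test set.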

### Measured by Logistic Loss

In this contest, because we are using all the scores, we'll use the [LogLoss function](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html) as a measure of your model. The details will be given in an upcoming notebook. There will be a website where you submit the CSV file containing the scores for your team. This website will also contain a scoreboard. It will be implemented **after** the hackathon on June 11th -- more details to come.
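
To estimate this measure on your own held-out data, a minimal sketch of multiclass log loss in plain Python follows. It is not the official scoring code: the function name `log_loss`, the clipping constant `eps`, and the normalization of each row to sum to 1 are assumptions, and the Scoreboard's exact procedure may differ.

```python
import math

# Class names in the required column order.
CLASS_ORDER = [
    'brightpixel', 'narrowband', 'narrowbanddrd', 'noise',
    'squarepulsednarrowband', 'squiggle', 'squigglesquarepulsednarrowband',
]


def log_loss(true_labels, score_rows, eps=1e-15):
    """Mean multiclass log loss.

    true_labels: list of true class names, one per signal
    score_rows:  list of 7-score lists, each in CLASS_ORDER
    """
    total = 0.0
    for label, scores in zip(true_labels, score_rows):
        # Assumption: rescale each row so the scores sum to 1.
        p = scores[CLASS_ORDER.index(label)] / sum(scores)
        # Clip to avoid log(0) for a confident wrong answer.
        p = min(max(p, eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)


scores = [[0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083]]
print(log_loss(['narrowbanddrd'], scores))
```

Lower values are better. A model that assigns equal scores to all seven classes gets a loss of `log(7)`, which is a useful baseline to beat.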

## Submit Your Results

### Get your CSV File

If you are running this on IBM DSX (IBM Apache Spark), you'll need to get your `.csv` file to your local working directory in order to submit your results to the Scoreboard website.

The best way to get your data is to move your `.csv` file to your Object Storage account that is provided in DSX. Then, move the `.csv` file from your Object Storage to your local directory.
[This tutorial shows you the basic steps to move data to and from your Object Storage instance.](tutorials/General_move_data_to_from_Object_Storage.ipynb)  

### Submit your CSV File

We are currently constructing a Team Scoreboard for participants. It will be used first at the hackathon. After the hackathon, we will make the Scoreboard available to the code challenge participants. We will notify you when the Scoreboard is ready! We expect it to go live on June 15th!