# Judging Information

This document explains how the judging will be done for the "Best Classification" system. 

**Read the [Step 1: Get Data tutorial](tutorials/Step_1_Get_Data.ipynb#Test-Data-Set) for information on where to get the training and test data sets.**

<br>


## Generate CSV File with Scores

Your job is to create a CSV file that holds the scores for each of the signals found in the test data set. 

**Each line of your CSV file (the "Scorecard") will contain the signal's UUID, followed by the scores for each signal class. The order of the scores is absolutely critical.**

*The signal class with the highest score is your model's predicted class. Typically, these scores are the probability estimates for each class.*
 
For each data file in the test set, generate the appropriate spectrogram, and then pass that to your signal classifier (machine-learning model) to calculate the scores for each class.  

For example, each line of your CSV scorecard file should look something like:

  * `abdefgadbc1223234123123cvaf, 0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083`

### THE COLUMN ORDER IS ABSOLUTELY CRITICAL! 

THE ORDER OF THE SCORES IN EACH ROW OF YOUR CSV FILE MUST BE:

  * `brightpixel, narrowband, narrowbanddrd, noise, squarepulsednarrowband, squiggle, squigglesquarepulsednarrowband`

This is in alphabetical order.
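
One way to avoid ordering mistakes is to build every row from a single list of class names. The sketch below assumes your model's scores are available as a dictionary keyed by class name; `CLASS_ORDER` and `scorecard_row` are hypothetical names used only for illustration.

```python
# Class names in the required (alphabetical) column order.
CLASS_ORDER = [
    "brightpixel",
    "narrowband",
    "narrowbanddrd",
    "noise",
    "squarepulsednarrowband",
    "squiggle",
    "squigglesquarepulsednarrowband",
]


def scorecard_row(uuid, scores):
    """Return [uuid, score, ...] with the scores in CLASS_ORDER.

    `scores` is assumed to be a dict that maps class name to score.
    """
    missing = [name for name in CLASS_ORDER if name not in scores]
    if missing:
        raise ValueError("missing scores for: " + ", ".join(missing))
    return [uuid] + [scores[name] for name in CLASS_ORDER]


# Scores supplied in an arbitrary order still come out in the required order.
example_scores = {
    "squiggle": 0.07,
    "noise": 0.232,
    "brightpixel": 0.1,
    "squigglesquarepulsednarrowband": 0.0083,
    "narrowband": 0.023,
    "squarepulsednarrowband": 0.001,
    "narrowbanddrd": 0.451,
}
row = scorecard_row("abdefgadbc1223234123123cvaf", example_scores)
print(row)
```

The resulting list can be passed directly to `csv.writer(...).writerow(row)`.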


#### Example PseudoCode

    import csv
    import zipfile

    import ibmseti
    # Also import the libraries your classifier needs (tensorflow, Watson, etc.)

    # Placeholders that you must define yourself:
    #   my_model         - your trained signal classification model
    #   mydatafolder     - the folder that holds the test-set zip file
    #   draw_spectrogram - whatever signal processing code you need
    my_output_results = mydatafolder + '/signal_class_results.csv'
    zz = zipfile.ZipFile(mydatafolder + '/' + 'primary_testset_preview_v3.zip')

    for fn in zz.namelist():
        data = zz.open(fn).read()
        aca = ibmseti.compamp.SimCompamp(data)
        uuid = aca.header()['uuid']
        spectrogram = draw_spectrogram(aca)

        # cr = class results. In this example, it's a dictionary, but in your
        # code it could be something else, like a simple list.
        cr = my_model.classify(spectrogram)

        # Append one row per signal, with the scores in the required order.
        with open(my_output_results, 'a') as csvfile:
            fwriter = csv.writer(csvfile, delimiter=',')
            fwriter.writerow([uuid, cr['brightpixel'], cr['narrowband'], cr['narrowbanddrd'],
                              cr['noise'], cr['squarepulsednarrowband'], cr['squiggle'],
                              cr['squigglesquarepulsednarrowband']
                             ])
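
Before submitting, it may be worth checking that every row of the scorecard has the expected shape: one UUID followed by seven numeric scores. The following is a minimal sketch of such a check. `check_scorecard` is a hypothetical helper, not part of the Scoreboard system, and it verifies only the column count and that the scores parse as numbers.

```python
import csv
import io

EXPECTED_COLUMNS = 8  # 1 UUID + 7 class scores


def check_scorecard(lines):
    """Return a list of problems found in an iterable of CSV lines."""
    problems = []
    for line_number, row in enumerate(csv.reader(lines), start=1):
        if len(row) != EXPECTED_COLUMNS:
            problems.append("line %d: expected %d columns, found %d"
                            % (line_number, EXPECTED_COLUMNS, len(row)))
            continue
        try:
            # float() tolerates the spaces that follow each comma.
            [float(value) for value in row[1:]]
        except ValueError:
            problems.append("line %d: non-numeric score" % line_number)
    return problems


# In-memory sample: the first row is well formed, the second is too short.
sample = io.StringIO(
    "abdefgadbc1223234123123cvaf, 0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083\n"
    "0123456789abcdef, 0.5, 0.5\n"
)
problems = check_scorecard(sample)
print(problems)
```

To check a real file, pass an open file object instead of the in-memory sample, for example `check_scorecard(open(my_output_results))`.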

# Team Signup to Scoreboard

The Scoreboard for the [Preview Test is here.](https://compete.cognitiveclass.ai/event/5956ce1013880a001fd89bd6)

The Scoreboard for the [Final Test is here.](https://compete.cognitiveclass.ai/event/5957f1ec13880a001fd89bd7)

[Please read this walkthrough](tutorials/competitions-walkthrough.pdf) to sign up for the Scoreboard system, form your team, and submit an example result.  An example scorecard is found below.




### Example Scorecard

The scores in this example file are random values between 0 and 1. Typically, your scores will be your classification model's probability estimates for each class, and as such they would sum to 1.0. To be sure, however, the Log-loss calculator in our Scoreboard will normalize your scores so that the values in each row sum to 1.0.

[Example Scorecard](https://ibm.box.com/v/ml4setiExPreviewTestScorecard)

With this scorecard, you should get the exact same values as "TeamRandom" that is currently on the Scoreboard. 
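
The exact normalization performed by the Scoreboard is not documented here. A minimal sketch, assuming it simply divides each score by the sum of its row, looks like this (`normalize` is a hypothetical helper):

```python
def normalize(scores):
    """Scale a row of non-negative scores so that they sum to 1.0.

    Assumption: normalization means dividing each score by the row sum.
    """
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must have a positive sum")
    return [score / total for score in scores]


# The example row from above sums to less than 1.0 before normalization.
scores = [0.1, 0.023, 0.451, 0.232, 0.001, 0.07, 0.0083]
normalized = normalize(scores)
print(sum(normalized))
```

Under this assumption, normalization rescales the scores but does not change which class has the highest score.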


### Measured by Logistic Loss

In this contest we are using the [Log-Loss function](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html) as a measure of your model's performance.  
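
To build intuition for this metric, the sketch below computes a multiclass log-loss by hand: the average of the negative natural log of the (normalized) score given to the true class. It illustrates the general formula only. The per-row normalization and the clipping constant `eps` are assumptions, and the Scoreboard's own calculator may differ in such details.

```python
import math


def log_loss(true_indices, score_rows, eps=1e-15):
    """Average negative log of the normalized score of the true class.

    true_indices: column index of the correct class for each signal
    score_rows:   one list of class scores per signal
    eps:          clipping constant that avoids log(0) (an assumption)
    """
    total = 0.0
    for true_index, scores in zip(true_indices, score_rows):
        p = scores[true_index] / sum(scores)  # normalize the row
        p = min(max(p, eps), 1 - eps)         # keep p strictly inside (0, 1)
        total -= math.log(p)
    return total / len(true_indices)


# Equal scores for all 7 classes give a loss of ln(7), whatever the truth is.
uniform_loss = log_loss([0, 3], [[1.0] * 7, [1.0] * 7])

# Putting all of the weight on the correct class gives a loss close to 0.
confident_loss = log_loss([2], [[0, 0, 1, 0, 0, 0, 0]])
print(uniform_loss, confident_loss)
```

Lower values are better. A confident wrong answer is penalized far more heavily than an uncertain one.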
<br>


#### Retrieving your CSV File from IBM DSX

If you are running your analysis on IBM DSX (IBM Apache Spark), you'll need to get your `.csv` file to a local machine in order to submit your results to the Scoreboard.

One way is to move your `.csv` file to your Object Storage account that is provided in DSX.
[This tutorial shows you the basic steps to move data to and from your Object Storage instance.](tutorials/General_move_data_to_from_Object_Storage.ipynb)  

Then, from DSX (or Bluemix), navigate to your Object Storage container and you can download the file to your local machine with a click. 

Another good option is to use [Pixiedust](https://github.com/ibm-watson-data-lab/pixiedust). Among the many features of Pixiedust, you can load a `.csv` file into a Pandas or Spark DataFrame and Pixiedust will display that data in your Jupyter notebook. From the display, there is an icon that lets you download the data directly. This is probably the easier option, though you will need to "`pip install --user pixiedust`" and restart your kernel.


<br>

# Set of Rules

### No Fixed Rules

The SETI Institute reserves the right to change the rules or add more rules at any time. This list will be updated as needed.


### Deadline

The code challenge ends at 00:00 August 1, 2017 US West Coast time. All submissions to the Scoreboard must occur before then. 


### Results Verification

All results are subject to verification by the SETI Institute. The results of the top 3 teams will undergo further scrutiny to ensure classification accuracy. In the event of a tie, more test data will be provided. Another tie-breaking measure will be the time it takes to perform classifications: the faster classifier will be chosen.


### Open Source

Your code must be open-sourced and licensed with the Apache License 2.0. This is required because the plan is to install the winning team's code in the SETI Institute real-time data analysis pipeline. 