# Testing Script
In this notebook, we create the testing script for a trained model. This script is stored alone in a `scripts` directory both for ease of reference and because the Azure ML SDK limits the contents of this directory to at most 300 MB.

Each notebook cell is appended in turn to the testing script, so it is essential that you run the notebook's cells _in order_ for the script to run correctly. If you edit this notebook's cells, be sure to preserve the blank lines at the start and end of the cells, as they prevent the contents of consecutive cells from being improperly concatenated.
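
To see why the blank lines matter, here is a minimal sketch that mimics appending cell contents to a file using plain Python file writes. It assumes a cell saved without a trailing newline is appended exactly as written. The file name `demo_script.py` and the helpers `append_cell` and `build_script` are hypothetical and not part of this notebook.

```python
import os
import tempfile


def append_cell(path, cell_text):
    """Append one cell's text to the script exactly as given (hypothetical helper)."""
    with open(path, "a") as f:
        f.write(cell_text)


def build_script(cells):
    """Append the cells in order to a temporary file and return its contents."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_script.py")
        for cell in cells:
            append_cell(path, cell)
        with open(path) as f:
            return f.read()


# Without surrounding blank lines, the end of one cell fuses with the start of the next.
fused = build_script(["import os", "import argparse"])

# With a blank line at the start and end of each cell, the statements stay separate.
separated = build_script(["\nimport os\n", "\nimport argparse\n"])

print(repr(fused))
print(repr(separated))
```

In the first case the file contains the single invalid line `import osimport argparse`; in the second, each statement sits on its own line.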

The script sections are
- [import libraries](#import),
- [define utility functions and classes](#utility),
- [define the script input parameters](#parameters),
- [load and prepare the testing data](#data),
- [load the trained pipeline](#pipeline),
- [score the test data](#score), and
- [compute the trained pipeline's performance](#performance).

Start by creating the `scripts` directory, if it does not already exist.

In [None]:
!mkdir -p scripts
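
The `!mkdir -p` command relies on a Unix-style shell. Where one is not available, the same step can probably be done from Python instead. This is an optional alternative, not something the original notebook uses.

```python
import os

# Create the directory if needed; exist_ok=True makes repeated runs harmless.
os.makedirs("scripts", exist_ok=True)
```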

## Load libraries <a id='import'></a>

In [None]:
%%writefile scripts/TestClassifier.py

from __future__ import print_function
import os
import argparse
import pandas as pd
from itertools import groupby
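try:
    import joblib
except ImportError:
    # Older scikit-learn releases bundled joblib; the bundled copy was later removed.
    from sklearn.externals import joblib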
from azureml.core import Run
import azureml.core


## Define utility functions and classes <a id='utility'></a>

In [None]:
%%writefile --append scripts/TestClassifier.py

def cumulative_gain(y_true, y_pred, groups, max_gain=1.0, score_at=1):
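    """
    Compute the normalized cumulative gain at rank `score_at`.

    Within each group, keep the `score_at` rows with the highest
    predicted scores, sum their true labels over all groups, and
    divide by `max_gain`.
    This function assumes the data are sorted by groups.
    """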
    gain = sum([sum([v
                     for _, _, v in sorted(g,
                                           key=lambda x: x[1],
                                           reverse=True)[:score_at]])
                for _, g in groupby(zip(groups, y_pred, y_true),
                                    key=lambda x: x[0])])
    eval_result = gain / max_gain
    return eval_result


## Define the input parameters <a id='parameters'></a>

In [None]:
%%writefile --append scripts/TestClassifier.py

if __name__ == '__main__':
    print('azureml.core.VERSION={}'.format(azureml.core.VERSION))
    
    parser = argparse.ArgumentParser(description='Test a model.')
    parser.add_argument('--data-folder', help='the path to the data',
                        dest='data_folder', default='.')
    parser.add_argument('--inputs', help='the inputs directory',
                        default='data')
    parser.add_argument('--test', help='the test dataset name',
                        default='balanced_pairs_test.tsv')
    parser.add_argument('--outputs', help='the outputs directory',
                        default='outputs')
    parser.add_argument('--model', help='the model file base name',
                        default='model')
    parser.add_argument("--rank", help="the maximum rank of a correct match",
                        type=int, default=3)
    args = parser.parse_args()
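
The sketch below illustrates how these arguments behave, using a trimmed copy of the parser and an explicit argument list in place of the real command line. The folder name `mnt` is a made-up value. It also shows how the data folder, inputs directory, and test file name combine into a single path, which the script does in its next section.

```python
import argparse
import os

# A trimmed copy of the script's parser, with only the options used below.
parser = argparse.ArgumentParser(description='Test a model.')
parser.add_argument('--data-folder', help='the path to the data',
                    dest='data_folder', default='.')
parser.add_argument('--inputs', help='the inputs directory',
                    default='data')
parser.add_argument('--test', help='the test dataset name',
                    default='balanced_pairs_test.tsv')
parser.add_argument('--rank', help='the maximum rank of a correct match',
                    type=int, default=3)

# Pass an explicit list instead of reading sys.argv; 'mnt' is a hypothetical folder.
args = parser.parse_args(['--data-folder', 'mnt', '--rank', '5'])

# --data-folder is stored as args.data_folder because of dest='data_folder'.
test_path = os.path.join(args.data_folder, args.inputs, args.test)

print(args.rank)
print(test_path)
```

Options that are not supplied keep their defaults, so `--inputs` and `--test` still resolve to `data` and `balanced_pairs_test.tsv` here.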
    

## Load and prepare the testing data <a id='data'></a>

In [None]:
%%writefile --append scripts/TestClassifier.py

    # Get a run logger.
    run = Run.get_context()

    # What to name the metric logged
    metric_name = "accuracy"

    print('Prepare the testing data.')
    
    # Paths to the input data.
    data_path = args.data_folder
    inputs_path = os.path.join(data_path, args.inputs)
    test_path = os.path.join(inputs_path, args.test)

    # Define the input data columns.
    feature_columns = ['Text_x', 'Text_y']
    label_column = 'Label'
    group_column = 'Id_x'
    dupes_answerid_column = 'AnswerId_x'
    questions_answerid_column = 'AnswerId_y'
    score_column = 'score'

    # Load the testing data.
    print('Reading {}'.format(test_path))
    test = pd.read_csv(test_path, sep='\t', encoding='latin1')
    
    # Sort the data by groups
    test.sort_values(group_column, inplace=True)

    # Report on the dataset.
    print('test: {:,} rows with {:.2%} matches'
          .format(test.shape[0], test[label_column].mean()))
    
    # Select and format the testing data.
    test_X = test[feature_columns]
    test_y = test[label_column]
    

## Load the trained model <a id='pipeline'></a>

In [None]:
%%writefile --append scripts/TestClassifier.py

    print('Load the model pipeline.')

    # Paths for the model data.
    outputs_path = args.outputs
    model_path = os.path.join(outputs_path, '{}.pkl'.format(args.model))

    print('Loading the model from {}'.format(model_path))
    model = joblib.load(model_path)


## Score the test data using the model <a id='score'></a>

In [None]:
%%writefile --append scripts/TestClassifier.py

    # Collect the model predictions.
    print('Scoring the test data.')
    test[score_column] = model.predict_proba(test_X)[:, 1]


## Report the model's performance statistics on the test data <a id='performance'></a>

In [None]:
%%writefile --append scripts/TestClassifier.py

    print("Evaluating the model's performance on the test data.")
    
    metric_name = "gain"
    max_gain = test[label_column].sum()
    for i in range(1, args.rank+1):
        gain = cumulative_gain(y_true=test[label_column].values,
                               y_pred=test[score_column].values,
                               groups=test[group_column].values,
                               max_gain=max_gain,
                               score_at=i)
        print('{}@{} = {:.2%}'.format(metric_name, i, gain))
    
    # Log only the gain at the maximum rank: after the loop, i and gain
    # hold the values from the final iteration.
    run.log("{}@{}".format(metric_name, i), gain)
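
Because the logging call sits after the loop rather than inside it, only the gain at the highest rank is logged. The sketch below illustrates this with a hypothetical stand-in for the run logger (`FakeRun`) and made-up gain values; it does not use the Azure ML SDK.

```python
class FakeRun:
    """Hypothetical stand-in for the run logger; it only records what is logged."""

    def __init__(self):
        self.metrics = {}

    def log(self, name, value):
        self.metrics[name] = value


run = FakeRun()
metric_name = "gain"
rank = 3

# Made-up gains per rank, for illustration only.
gains_by_rank = {1: 0.50, 2: 0.75, 3: 0.90}

for i in range(1, rank + 1):
    gain = gains_by_rank[i]
    print('{}@{} = {:.2%}'.format(metric_name, i, gain))

# After the loop, i and gain keep the values from the final iteration.
run.log("{}@{}".format(metric_name, i), gain)
print(run.metrics)
```

Every rank is printed, but the recorded metrics contain a single entry, `gain@3`. If `rank` were less than 1, the loop body would never run and the final call would fail because `i` and `gain` would be undefined.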


## Run the script to see that it works <a id='run'></a>
This will take about a minute.

In [None]:
%run -t scripts/TestClassifier.py --rank 5

In [the next notebook](03_Run_Locally.ipynb), we set up and use the AML SDK to run the training script.