# Validator

## Description

This script validates the system output files for the manipulation and splice detection and localization tasks against the index file, as specified in the NC2017 Evaluation Plan. It also checks the basic format of the mask files: each mask must be single-channel grayscale and have the same dimensions as the original image, as recorded in the index file.

The system output file must have the same name as the directory that contains it, with '.csv' appended. That directory should also contain a <b>mask</b> directory holding the system output masks.

All csv files passed to the Validator must contain headers and must have their fields separated by pipe characters ('|'). Where possible, fields and values in the csv should <i>not</i> be enclosed in quotes ( ' or " ). For example, the entries 'foo', an empty field, and 'bar', in that order, should appear in the csv as foo||bar. Quoting is not checked by the validator.
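As an illustration of the file format described above, the following is a minimal sketch that reads a pipe-delimited csv with a header row using Python's standard `csv` module. The function name `read_pipe_csv` and the column names `ColA`, `ColB`, `ColC` are hypothetical; this is not the validator's own parsing code.

```python
import csv
import io


def read_pipe_csv(handle):
    """Return (header, rows) from a pipe-delimited csv with a header row."""
    # QUOTE_NONE treats quote characters as ordinary text, which matches
    # the advice that fields should not be enclosed in quotes.
    reader = csv.reader(handle, delimiter="|", quoting=csv.QUOTE_NONE)
    header = next(reader)
    rows = [dict(zip(header, row)) for row in reader]
    return header, rows


# Hypothetical sample: a header line followed by the foo||bar row.
sample = io.StringIO("ColA|ColB|ColC\nfoo||bar\n")
header, rows = read_pipe_csv(sample)
print(header)
print(rows[0])
```

The empty middle field is read back as an empty string rather than being dropped, so the row keeps the same number of fields as the header.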

Both the index and system output files must have their columns in the order described under Input Options, with no other columns. The index and system output files must have the same number of rows; further, the system output must not have duplicate rows.
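The structural rules above can be sketched roughly as follows. This is an illustrative check under stated assumptions, not the validator's implementation: the function name `check_structure` is hypothetical, rows are plain lists of strings, and the sample IDs and values are made up for the example.

```python
def check_structure(index_header, index_rows, sys_header, sys_rows,
                    expected_index, expected_sys):
    """Return a list of error messages; an empty list means the checks passed."""
    errors = []
    # Columns must match the expected names exactly, in order, with no extras.
    if index_header != expected_index:
        errors.append("index columns differ from the expected order")
    if sys_header != expected_sys:
        errors.append("system output columns differ from the expected order")
    # Both files must have the same number of data rows.
    if len(index_rows) != len(sys_rows):
        errors.append("index has %d rows but system output has %d"
                      % (len(index_rows), len(sys_rows)))
    # The system output must not repeat a row.
    seen = set()
    for row in sys_rows:
        key = tuple(row)
        if key in seen:
            errors.append("duplicate system output row: %s" % "|".join(row))
        seen.add(key)
    return errors


# Column names are taken from the Input Options section; values are hypothetical.
INDEX_COLUMNS = ["TaskID", "ProbeFileID", "ProbeFileName", "ProbeWidth", "ProbeHeight"]
SYS_COLUMNS = ["ProbeFileID", "ConfidenceScore", "IsOptOut"]

index_rows = [["manipulation", "p1", "probe/p1.jpg", "640", "480"],
              ["manipulation", "p2", "probe/p2.jpg", "640", "480"]]
sys_rows = [["p1", "0.9", "N"], ["p1", "0.9", "N"]]

problems = check_structure(INDEX_COLUMNS, index_rows, SYS_COLUMNS, sys_rows,
                           INDEX_COLUMNS, SYS_COLUMNS)
print(problems)
```

Collecting every problem in a list, rather than stopping at the first one, lets a submitter fix all structural issues in one pass.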

All masks are checked to confirm that they are PNG files and that they match the dimensions specified in the index file. However, rows with blank mask fields are skipped without the validator throwing an error (for the DSD validator, a blank Probe or Donor mask file name causes that row to be skipped).
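To make the mask checks concrete, here is a minimal sketch that inspects a PNG's IHDR header using only the standard library. It is a swapped-in technique for illustration: the validator itself reads masks with OpenCV or ImageMagick, as described under Processing Options. The function names `read_png_header` and `check_mask` are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(path):
    """Return (width, height, color_type) from the IHDR chunk, or None if not a PNG."""
    with open(path, "rb") as f:
        head = f.read(26)
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width (4), height (4), bit depth (1), color type (1).
    if len(head) < 26 or head[:8] != PNG_SIGNATURE or head[12:16] != b"IHDR":
        return None
    width, height, _bit_depth, color_type = struct.unpack(">IIBB", head[16:26])
    return width, height, color_type


def check_mask(path, expected_width, expected_height):
    """Return a list of error messages for one mask; blank paths are skipped."""
    if not path:
        return []  # blank mask entry: skip without raising an error
    header = read_png_header(path)
    if header is None:
        return ["%s is not a PNG file" % path]
    width, height, color_type = header
    errors = []
    if (width, height) != (expected_width, expected_height):
        errors.append("%s is %dx%d, expected %dx%d"
                      % (path, width, height, expected_width, expected_height))
    if color_type != 0:  # PNG color type 0 is single-channel grayscale
        errors.append("%s is not single-channel grayscale" % path)
    return errors
```

A call such as `check_mask("mask/example.png", 640, 480)` (a hypothetical path and size) returns an empty list when the mask conforms.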

## Command-line Options

Example:

    python2 validator.py -x ../../data/test_suite/maskScorerTests/indexes/MFC18-manipulation-image-index.csv\
     -s ../../data/test_suite/maskScorerTests/B_MFC18_Unittest_Manipulation_ImgOnly_p-me_1/B_MFC18_Unittest_Manipulation_ImgOnly_p-me_1.csv\
     -nc --ncid MFC18 -vt SSD

Running this command validates B_MFC18_Unittest_Manipulation_ImgOnly_p-me_1.csv against MFC18-manipulation-image-index.csv, each under its appropriate directory, using the Single-Source Detection (SSD) validator with the name check enabled and the NCID set to MFC18. The sample inputs shown here should pass validation.

The command-line options for the validator can be categorized as follows:

### Validation Modes

-vt --valtype

  * Specify the validation type for the relevant task: Single-Source Detection (i.e. 'SSD') or Double-Source Detection (i.e. 'DSD'). Validation for video is specified by 'SSD-video'.

--output_revised_system

 * Set probe status for images that fail dimensionality validation to 'FailedValidation' and output the new CSV to a specified file [e.g. 'my_revised_system.csv']. Submissions that only have 'FailedValidation' will be skipped in image localization scoring. [default=None]

-nc --nameCheck

  * Whether or not to check the naming format of the file according to the EBNF (Extended Backus-Naur Form) `<TEAM>_NC17_<DATA>_<TASK>_<CONDITION>_<SYS>_<VERSION>`. Selecting the option will run the name checker. Further information on the meaning of the EBNF is available in the evaluation plan.
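
As a rough illustration of the naming pattern above, the sketch below splits a system output name on underscores and checks the field count and the NCID. It is deliberately looser than the real grammar, which is defined in the evaluation plan; the function name `check_name` is hypothetical, and the assumption that no field contains an underscore is mine.

```python
FIELD_NAMES = ["TEAM", "NCID", "DATA", "TASK", "CONDITION", "SYS", "VERSION"]


def check_name(system_file_stem, ncid="NC17"):
    """Return a list of problems with the name; an empty list means it looks plausible."""
    # Assumption: fields are separated by underscores and contain none themselves.
    parts = system_file_stem.split("_")
    if len(parts) != len(FIELD_NAMES):
        return ["expected %d underscore-separated fields, found %d"
                % (len(FIELD_NAMES), len(parts))]
    problems = []
    for name, value in zip(FIELD_NAMES, parts):
        if not value:
            problems.append("field %s is empty" % name)
    if parts[1] != ncid:
        problems.append("NCID is %r, expected %r" % (parts[1], ncid))
    return problems


# The sample system output name from the example command, checked against --ncid MFC18.
print(check_name("B_MFC18_Unittest_Manipulation_ImgOnly_p-me_1", ncid="MFC18"))
```

The file name is passed without its '.csv' extension, since the extension is not part of the pattern.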

-nm --neglectMask

 * Whether or not to neglect the mask dimension validation.

### Input Options

-x --inIndex

  * Define the index csv file. The index file contains the TaskID, ProbeFileID, ProbeFileName, ProbeWidth, and ProbeHeight fields, and if validating the splice task, the DonorFileID, DonorFileName, DonorWidth, and DonorHeight fields as well. No additional fields are permitted for the index file.

-s --inSys

  * Specify the CSV file of the system performance results formatted according to NC2016 specification. The file must contain the ProbeFileID, ConfidenceScore, and IsOptOut fields, in that order. If validating ImgOnly and ImgMeta tasks, the OutputProbeMaskFileName field is also required. If validating the splice task, the ProbeFileID, DonorFileID, ConfidenceScore, OutputProbeMaskFileName, OutputDonorMaskFileName, and IsOptOut fields are required, in that order. The OutputProbeMaskFileNames and OutputDonorMaskFileNames (where relevant) should be file paths relative to the location of the system performance CSV.
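
For reference, a splice-task system output with the columns listed above could be produced along the following lines. This is a sketch only: the function name `write_system_output`, the file IDs, the mask paths, and the 'N' value for IsOptOut are hypothetical placeholders, and the permitted values for each field are defined in the evaluation plan.

```python
import csv
import io

# Column order for the splice task, as listed for the -s option.
DSD_COLUMNS = ["ProbeFileID", "DonorFileID", "ConfidenceScore",
               "OutputProbeMaskFileName", "OutputDonorMaskFileName", "IsOptOut"]


def write_system_output(handle, columns, records):
    """Write records (dicts keyed by column name) as a pipe-delimited csv with a header."""
    writer = csv.DictWriter(handle, fieldnames=columns, delimiter="|",
                            quoting=csv.QUOTE_NONE, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


# Hypothetical record; mask paths are relative to the location of the csv.
records = [{
    "ProbeFileID": "probe_0001",
    "DonorFileID": "donor_0001",
    "ConfidenceScore": "0.75",
    "OutputProbeMaskFileName": "mask/probe_0001.png",
    "OutputDonorMaskFileName": "mask/donor_0001.png",
    "IsOptOut": "N",
}]

buffer = io.StringIO()
write_system_output(buffer, DSD_COLUMNS, records)
print(buffer.getvalue())
```

Passing the column list as `fieldnames` fixes the column order in the output regardless of the key order in each record.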

-r --inRef

 * Define the reference csv file to filter mask dimensionality validation to only the target masks (i.e. IsTarget == 'Y'). This is especially useful when trying to validate the splice output.
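
The filtering described above can be pictured with the following sketch, which keeps only the system output rows whose probe is marked as a target in the reference file. It assumes the reference rows carry a ProbeFileID column to join on, which this document does not state, and the function names are hypothetical. For the splice task the join would presumably also involve DonorFileID.

```python
def target_probe_ids(ref_rows):
    """Collect the ProbeFileIDs of reference rows with IsTarget == 'Y'."""
    return {row["ProbeFileID"] for row in ref_rows if row.get("IsTarget") == "Y"}


def rows_needing_mask_check(sys_rows, ref_rows):
    """Return the system output rows whose masks should be dimension-checked."""
    targets = target_probe_ids(ref_rows)
    return [row for row in sys_rows if row["ProbeFileID"] in targets]


# Hypothetical rows, already parsed into dicts keyed by column name.
ref_rows = [{"ProbeFileID": "p1", "IsTarget": "Y"},
            {"ProbeFileID": "p2", "IsTarget": "N"}]
sys_rows = [{"ProbeFileID": "p1", "OutputProbeMaskFileName": "mask/p1.png"},
            {"ProbeFileID": "p2", "OutputProbeMaskFileName": "mask/p2.png"}]

to_check = rows_needing_mask_check(sys_rows, ref_rows)
print([row["ProbeFileID"] for row in to_check])
```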

--ncid

 * Specify the NCID for the evaluation. This should be the NCID specified in the evaluation plan. Default: 'NC17'

--optOut

 * Deprecated as of 4/19/2018. Presently will attempt to validate rows regardless of whether the option is set.

### Processing Options

-id --identify

 * Use ImageMagick's `identify` command to get dimensions of the masks for accelerated image processing. OpenCV reading is used by default.
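
One way to obtain mask dimensions from `identify` is sketched below, assuming ImageMagick is installed and on the PATH. The `-format "%w %h"` option asks `identify` to print the width and height; the helper names are hypothetical, and this is not necessarily how the validator invokes the command.

```python
import subprocess


def identify_command(path):
    """Build the argument list for printing an image's width and height."""
    return ["identify", "-format", "%w %h", path]


def parse_identify_output(text):
    """Parse 'WIDTH HEIGHT' text into a tuple of integers."""
    width, height = text.split()[:2]
    return int(width), int(height)


def mask_dimensions(path):
    """Return (width, height) of a single-frame image by running ImageMagick's identify."""
    output = subprocess.check_output(identify_command(path))
    return parse_identify_output(output.decode("ascii"))
```

Reading only the image header this way avoids decoding the full pixel data, which is where the speed-up over loading each mask comes from.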

-p --processors

 * The number of processors to use for validation. If this exceeds the number of rows, the program falls back to using the number of rows. This option takes effect for SSD and SSD-video validation only; it has no effect for DSD. (default = 1)
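
The capping behaviour described above amounts to something like the following sketch. The function names are hypothetical, and the round-robin split is just one simple way to share rows among workers; the validator may partition rows differently.

```python
def effective_processors(requested, n_rows):
    """Cap the processor count at the number of rows, with a floor of one."""
    return max(1, min(requested, n_rows))


def split_rows(rows, processors):
    """Distribute rows round-robin into one chunk per effective processor."""
    n = effective_processors(processors, len(rows))
    return [rows[i::n] for i in range(n)]


# Asking for 8 processors on 3 rows yields 3 chunks of one row each.
print(effective_processors(8, 3))
print(split_rows(["row1", "row2", "row3"], 8))
```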

### Print Options

-v --verbose

  * Control print output. Select 1 to print all non-error related output and 0 to suppress all print output (except argument-parsing errors).

## Disclaimer

This software was developed at the National Institute of Standards
and Technology (NIST) by employees of the Federal Government in the
course of their official duties. Pursuant to Title 17 Section 105
of the United States Code, this software is not subject to copyright
protection and is in the public domain. NIST assumes no responsibility
whatsoever for use by other parties of its source code or open source
server, and makes no guarantees, expressed or implied, about its quality,
reliability, or any other characteristic.