# Pass Rate
This notebook creates a pass rate report for test results. It ties into the **Test Monitor Service** for retrieving filtered test results, the **Notebook Execution Service** for running outside of JupyterHub, and the **Test Monitor Reports page** at #testmonitor/reports for displaying results.

The parameters and output use a schema recognized by the Test Monitor Reports page, which can be implemented by various report types. The Pass Rate notebook produces data that is best shown in a bar graph.

### Imports
Import Python modules for executing the notebook. Pandas is used for building and handling dataframes, and Papermill is used for running notebooks and recording data for the Notebook Execution Service. The SystemLink Test Monitor Client provides access to test result data for processing.

In [None]:
import copy
import datetime
import dateutil.parser
import pandas as pd
import papermill as pm
from dateutil import tz

from systemlink.testmonclient import TestMonitorClient, testmon_messages

### Parameters
- `group_by`: The dimension along which to reduce; what each bar in the output graph represents  
  Options: Day, System, Test Program, Operator, Product  
  Default: Day
- `query_by`: Filter for test results from the Test Monitor Service  
  Options: Any valid Test Monitor filter  
  Default: `{ 'startedWithin': { 'unit': 'DAYS', 'value': 30 } }`

Parameters are also listed in the metadata for the parameters cell, along with their default values. The Notebook Execution Service uses that metadata to pass parameters from the Test Monitor Reports page to this notebook. Available `group_by` and `query_by` options are listed in the metadata as well; the Test Monitor Reports page uses these to validate inputs sent to the notebook.

To see the metadata, select the code cell and click the wrench icon in the far left panel.

In [None]:
group_by = 'Day'
query_by = {
    'startedWithin': {
        'unit': 'DAYS',
        'value': 30
    }
}

### Mapping from grouping options to Test Monitor terminology
Translate the grouping options shown in the Test Monitor Reports page to keywords recognized by the Test Monitor API.

In [None]:
groups_map = {
    'Day': 'startedAt',
    'System': 'systemId',
    'Test Program': 'programName',
    'Operator': 'operator',
    'Product': 'product'
}
grouping = groups_map[group_by]
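
An unsupported `group_by` value makes the lookup above fail with a bare `KeyError`. The following is a minimal sketch of a lookup that reports the valid options instead. `resolve_grouping` and `GROUPS_MAP` are hypothetical names used for illustration; they are not part of the notebook or of the Test Monitor API.

```python
# Hypothetical helper: map a report grouping option to its Test Monitor field.
GROUPS_MAP = {
    'Day': 'startedAt',
    'System': 'systemId',
    'Test Program': 'programName',
    'Operator': 'operator',
    'Product': 'product',
}


def resolve_grouping(group_by, groups_map=GROUPS_MAP):
    """Return the field name for group_by, or raise a descriptive ValueError."""
    try:
        return groups_map[group_by]
    except KeyError:
        options = ', '.join(sorted(groups_map))
        raise ValueError(
            'Unsupported group_by {!r}; expected one of: {}'.format(group_by, options)
        ) from None


print(resolve_grouping('Test Program'))  # programName
```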

### Create Test Monitor client
Establish a connection to SystemLink over AMQP.

In [None]:
testmonclient = TestMonitorClient(service_name='TestMonitorClient')

### Query for results
Query the Test Monitor Service for results matching the `query_by` parameter.

In [None]:
query_by.update({
    'sortBy': [{
        'field': 'STARTED_AT',
        'orderByDescending': False
    }]
})
query = testmon_messages.ResultQuery.from_dict(query_by)
results, _ = testmonclient.query_results(query)

results_list = [result.to_dict() for result in results]

### Get group names
Collect the group name for each result based on the `group_by` parameter.

In [None]:
# Use None for results that lack the grouping field so that group_names
# stays the same length as the id and status columns built below.
group_names = [result.get(grouping) for result in results_list]

### Create pandas dataframe
Put the data into a dataframe whose columns are test result id, status, and group name.

In [None]:
formatted_results = {
    'id': [result['id'] for result in results_list],
    'status': [result['status']['statusType'] for result in results_list],
    grouping: group_names
}

df_results = pd.DataFrame.from_dict(formatted_results)

### Handle grouping by day
If the grouping is by day, the group name is the date and time when the test started in UTC. To group all test results from a single day together, convert to server time and remove time information from the group name.

In [None]:
df_results_copy = copy.copy(df_results)
df_results_copy.fillna(value='', inplace=True)

# Compare strings with ==; `is` tests object identity, not equality.
if grouping == 'startedAt':
    truncated_times = []
    for val in df_results_copy[grouping]:
        utc = dateutil.parser.parse(val)
        local_time = utc.astimezone(tz.tzlocal())
        truncated_times.append(str(local_time.date()))
    df_results_copy[grouping] = truncated_times
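
Converting to local time before truncating matters because a test that starts shortly after midnight UTC may still belong to the previous day on the server. The sketch below illustrates this with the standard library only. The `local_day` helper and the fixed UTC-06:00 offset are assumptions made for the example; the notebook itself uses `dateutil` and the server's actual time zone. The timestamps carry an explicit `+00:00` offset because `datetime.fromisoformat` accepts a trailing `Z` only from Python 3.11 onward.

```python
import datetime


def local_day(utc_timestamp, local_tz):
    """Convert an ISO-8601 timestamp to a YYYY-MM-DD string in local_tz."""
    utc = datetime.datetime.fromisoformat(utc_timestamp)
    return str(utc.astimezone(local_tz).date())


# Assumed server time zone for illustration: UTC-06:00.
server_tz = datetime.timezone(datetime.timedelta(hours=-6))

# 03:30 UTC on November 18 is still November 17 on a UTC-06:00 server.
print(local_day('2018-11-18T03:30:00+00:00', server_tz))  # 2018-11-17
print(local_day('2018-11-18T15:00:00+00:00', server_tz))  # 2018-11-18
```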

### Aggregate results into groups
Aggregate the data for each unique group and status.

*See documentation for [size](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.size.html) and [unstack](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.unstack.html) here.*

In [None]:
df_grouped = df_results_copy.groupby([grouping, 'status']).size().unstack(fill_value=0)
if 'PASSED' not in df_grouped:
    df_grouped['PASSED'] = 0
if 'FAILED' not in df_grouped:
    df_grouped['FAILED'] = 0
if 'ERRORED' not in df_grouped:
    df_grouped['ERRORED'] = 0
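
To make the shape of `df_grouped` concrete, here is a rough plain-Python equivalent of the aggregation, run on made-up data. It is a sketch of what the pandas expression computes, not the notebook's implementation, and it assumes only the three statuses used in the pass rate calculation.

```python
from collections import Counter

# Made-up (group, status) pairs standing in for test results.
rows = [
    ('2018-11-17', 'PASSED'),
    ('2018-11-17', 'PASSED'),
    ('2018-11-17', 'FAILED'),
    ('2018-11-18', 'PASSED'),
    ('2018-11-18', 'ERRORED'),
]
statuses = ['PASSED', 'FAILED', 'ERRORED']

# Comparable to groupby([...]).size(): count each (group, status) pair.
pair_counts = Counter(rows)

# Comparable to unstack(fill_value=0) plus the missing-column checks:
# one row per group, one entry per status, zero where nothing was counted.
groups = sorted({g for g, _ in rows})
table = {
    g: {status: pair_counts[(g, status)] for status in statuses}
    for g in groups
}
print(table)
```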

### Pass Rate calculation
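Divide the number of passed tests by the combined number of passed, failed, and errored tests, and express the result as a percentage. Results with any other status do not count toward the total.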

In [None]:
df_pass_rate = pd.DataFrame(100 * df_grouped['PASSED'] / (df_grouped['FAILED'] + df_grouped['ERRORED'] + df_grouped['PASSED']))
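
As a worked example of the same arithmetic for a single group, the sketch below uses a hypothetical `pass_rate` helper. Returning `None` for a group with no counted results is an assumption made here; in the pandas expression above, a zero total yields `NaN` instead.

```python
def pass_rate(passed, failed, errored):
    """Return the pass rate as a percentage, or None if nothing was counted."""
    total = passed + failed + errored
    if total == 0:
        return None
    return 100 * passed / total


# 47 passed out of 50 counted results.
print(pass_rate(47, 2, 1))  # 94.0
```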

### Convert the dataframe to the SystemLink reports output format
For a pass rate bar graph grouped by day, the output is an array containing one dataframe. The dataframe contains ISO-8601 date strings as x values and the pass rate as y values. Because the data includes x values, the data format is XY.
```
[{'plot_style': 'BAR',
  'data_format': 'XY',
  'data_frame': {
     'data': [
         ['2018-11-17T00:00:00', '2018-11-18T00:00:00', ...],
         [94.0, 89.9, ...]
     ]
  }
}]
```

For a pass rate bar graph grouped by any other grouping option, the output is an array of *n* dataframes, where *n* is the number of data points. Each dataframe contains a single x value and a single y value, representing one bar in the graph. Because the data includes x values, the data format is XY.
```
[{'plot_style': 'BAR',
  'data_format': 'XY',
  'data_frame': {'data': [[0], [95.3]]}},
  ...
]
```

In [None]:
df_pass_rate = df_pass_rate.transpose()
df_pass_rate_dict = df_pass_rate.to_dict('split')

result = []
# Compare strings with ==; `is` tests object identity, not equality.
if grouping == 'startedAt':
    date_values = []
    for date in df_pass_rate_dict['columns']:
        converted = datetime.datetime.strptime(date, '%Y-%m-%d').isoformat()
        date_values.append(converted)
    result.append({
        'plot_style': 'BAR',
        'data_format': 'XY',
        'data_frame': {'data': [date_values, df_pass_rate_dict['data'][0]]}
    })
else:
    for i, data_member in enumerate(df_pass_rate_dict['data'][0]):
        result.append({
            'plot_style': 'BAR',
            'data_format': 'XY',
            'data_frame': {'data': [[i], [data_member]]}
        })
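
The cell above depends on data from SystemLink, so the two output shapes can be hard to inspect in isolation. The following sketch builds both shapes from plain lists of made-up values. `build_bar_output` is a hypothetical helper written for this illustration and mirrors the logic of the cell rather than replacing it.

```python
import datetime


def build_bar_output(group_names, pass_rates, by_day):
    """Build the list of plot dictionaries in the reports output format."""
    if by_day:
        # One dataframe: ISO-8601 dates as x values, pass rates as y values.
        x_values = [
            datetime.datetime.strptime(name, '%Y-%m-%d').isoformat()
            for name in group_names
        ]
        return [{
            'plot_style': 'BAR',
            'data_format': 'XY',
            'data_frame': {'data': [x_values, list(pass_rates)]},
        }]
    # Otherwise one dataframe per bar, with the bar index as its x value.
    return [
        {
            'plot_style': 'BAR',
            'data_format': 'XY',
            'data_frame': {'data': [[i], [rate]]},
        }
        for i, rate in enumerate(pass_rates)
    ]


print(build_bar_output(['2018-11-17', '2018-11-18'], [94.0, 89.9], by_day=True))
print(build_bar_output(['Alice', 'Bob'], [95.3, 80.0], by_day=False))
```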

### Get tick labels from dataframe column names
Providing x-axis tick labels is optional. If none are provided, the Test Monitor Reports page will use the x values in the dataframe if they are provided, or an index starting at 0 if no x values are provided. If tick labels are provided, they must be formatted as `[{'x': 0, 'label': 'label_name'}, {'x': 1, 'label': 'label_name'}, ...]`, and there must be one tick for every x value.

For the pass rate bar graph, tick labels are generated if the result is not grouped by day. Results grouped by day return time x values, which the Test Monitor Reports page translates into a time axis.

In [None]:
ticks = []
# Compare strings with !=; `is not` tests object identity, not equality.
if grouping != 'startedAt':
    for i, label in enumerate(df_pass_rate_dict['columns']):
        ticks.append({
            'x': i,
            'label': label or 'No ' + group_by
        })
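
Because there must be one tick for every x value, it can help to check that rule before recording the results. The sketch below builds ticks from made-up labels, including an empty one, and compares them against a sample result. `ticks_cover_x_values` is a hypothetical helper, not something the Test Monitor Reports page provides.

```python
def ticks_cover_x_values(result, ticks):
    """Return True when there is exactly one tick for every x value in result."""
    x_values = []
    for plot in result:
        x_values.extend(plot['data_frame']['data'][0])
    tick_positions = [tick['x'] for tick in ticks]
    return sorted(tick_positions) == sorted(x_values)


# Sample output for two bars grouped by operator (made-up values).
result = [
    {'plot_style': 'BAR', 'data_format': 'XY', 'data_frame': {'data': [[0], [95.3]]}},
    {'plot_style': 'BAR', 'data_format': 'XY', 'data_frame': {'data': [[1], [80.0]]}},
]

# An empty group name falls back to a 'No <group_by>' label.
labels = ['Alice', '']
ticks = [
    {'x': i, 'label': label or 'No Operator'}
    for i, label in enumerate(labels)
]

print(ticks)
print(ticks_cover_x_values(result, ticks))  # True
```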

### Record results with Papermill
For optimal parsing by the Test Monitor Reports page, results should include:
- `title`: The title of the result graph
- `axis_labels`: The x-axis label and y-axis label
- `tick_labels`: Labels for the ticks along the x-axis
- `orientation`: 'HORIZONTAL' or 'VERTICAL'
- `result_type`: The type of data returned by the notebook ('DEFAULT' is the only option supported by the Test Monitor Reports page; it represents graph data)
- `result`: The calculated pass rate data

In [None]:
pm.record('title', 'Pass Rate by {}'.format(group_by))
pm.record('axis_labels', [group_by, 'Pass Rate'])
pm.record('tick_labels', ticks)
pm.record('orientation', 'VERTICAL')
pm.record('result_type', 'DEFAULT')
pm.record('result', result)