# Conformance Test Suite results

This notebook analyzes and compares the results of several Conformance Test Suite (CTS) runs, primarily to investigate the typical (mean) response times of the methods that the repository implements through the MetadataCollection interfaces.

## Mean execution times

The following cells calculate the mean execution time of each method across three runs of the repository connector at the same scale factor.

In [None]:
results_locations = []
results_locations.append("2.7-SNAPSHOT_21.02-1.15.0-beta/2/a")
results_locations.append("2.7-SNAPSHOT_21.02-1.15.0-beta/2/b")
results_locations.append("2.7-SNAPSHOT_21.02-1.15.0-beta/2/c")

In [None]:
# Check that each results location contains a "profile-details" directory and print the outcome
import os

def validateProfileResultsLocation(location):
    profile_details_location = location + os.path.sep + "profile-details"
    if os.path.isdir(profile_details_location):
        print("Location exists:", profile_details_location)
    else:
        print("ERROR: Location does NOT exist:", profile_details_location)

for location in results_locations:
    validateProfileResultsLocation(location)

With the results locations defined, the following functions parse the profile details of each CTS result and build up a data frame containing the results of interest (method names, elapsed times, and so on).

In [None]:
import json
import pandas as pd

# Given one profileResult.requirementResults entry, collect each positiveTestEvidence
# item that has both a method name and an elapsed time, and append it to the data frame
def parseEvidence(df, repositoryName, requirementResults):
    if (requirementResults is not None and 'positiveTestEvidence' in requirementResults):
        print("Parsing evidence for:", requirementResults['name'], "(" + repositoryName + ")")
        data_array = []
        for evidence in requirementResults['positiveTestEvidence']:
            if ('methodName' in evidence and 'elapsedTime' in evidence):
                data = {
                    'repo': repositoryName,
                    'method_name': evidence['methodName'],
                    'elapsed_time': evidence['elapsedTime'],
                    # .get() avoids a KeyError when an evidence entry lacks these IDs
                    'test_case_id': evidence.get('testCaseId'),
                    'assertion_id': evidence.get('assertionId')
                }
                data_array.append(data)
        # DataFrame.append was removed in pandas 2.0, so concatenate instead
        if data_array:
            df = pd.concat([df, pd.DataFrame(data_array)], ignore_index=True)
    return df

# Given a profile detail JSON file, retrieve all of its profileResult.requirementResults[] objects
def parseRequirementResults(profileFile):
    with open(profileFile) as f:
        profile = json.load(f)
    # This first case covers files retrieved via API
    if ('profileResult' in profile and 'requirementResults' in profile['profileResult']):
        return profile['profileResult']['requirementResults']
    # This second case covers files created by the CLI client
    elif ('requirementResults' in profile):
        return profile['requirementResults']
    else:
        return None

# Retrieve a listing of all of the profile detail JSON files
def getAllProfiles(profileLocation):
    detailsLocation = profileLocation + os.path.sep + "profile-details"
    _, _, filenames = next(os.walk(detailsLocation))
    full_filenames = []
    for filename in filenames:
        full_filenames.append(detailsLocation + os.path.sep + filename)
    return full_filenames

# Parse all of the provided profile file's details into the provided dataframe
def parseProfileDetailsIntoDF(df, profileFile, qualifier):
    profileResults = parseRequirementResults(profileFile)
    if profileResults is not None:
        for result in profileResults:
            df = parseEvidence(df, qualifier, result)
    return df
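
The two file layouts handled by `parseRequirementResults`, and the filtering done by `parseEvidence`, can be illustrated with in-memory dictionaries. This is only a sketch: the key names are taken from the code above, but every value (requirement name, method name, timings, IDs) is invented, and `extract_requirement_results` and `timed_evidence` are hypothetical helpers that mirror the logic without reading any files.

```python
# Hypothetical sample data: key names follow the parsing code above,
# all values are made up for illustration.
sample_requirement = {
    "name": "example-requirement",
    "positiveTestEvidence": [
        {"methodName": "exampleMethod", "elapsedTime": 12,
         "testCaseId": "tc-1", "assertionId": "a-1"},
        # No methodName / elapsedTime, so this entry carries no timing data
        {"testCaseId": "tc-2", "assertionId": "a-2"},
    ],
}

# Layout 1: file retrieved via the API (results nested under 'profileResult')
api_profile = {"profileResult": {"requirementResults": [sample_requirement]}}
# Layout 2: file created by the CLI client (results at the top level)
cli_profile = {"requirementResults": [sample_requirement]}


def extract_requirement_results(profile):
    """Return the requirementResults list from either layout, or None."""
    if "profileResult" in profile and "requirementResults" in profile["profileResult"]:
        return profile["profileResult"]["requirementResults"]
    return profile.get("requirementResults")


def timed_evidence(requirement):
    """Keep only evidence entries that have both a method name and a time."""
    return [e for e in requirement.get("positiveTestEvidence", [])
            if "methodName" in e and "elapsedTime" in e]


for profile in (api_profile, cli_profile):
    for requirement in extract_requirement_results(profile):
        print(requirement["name"], len(timed_evidence(requirement)))
```

Both layouts yield the same requirement list, and only the evidence entry that has a method name and an elapsed time would contribute a row to the data frame.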

In [None]:
dfAll = pd.DataFrame({'repo': [], 'method_name': [], 'elapsed_time': [], 'test_case_id': [], 'assertion_id': []})
for location in results_locations:
    profile_files = getAllProfiles(location)
    for profile_file in profile_files:
        dfAll = parseProfileDetailsIntoDF(dfAll, profile_file, 'Crux')

Now that all of the details are in a data frame, we can use pandas to calculate statistics on a method-by-method basis:

In [None]:
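pd.set_option('display.max_rows', None)
# Mean and median elapsed time per method; only elapsed_time is numeric, so aggregate that column alone
dfAll.astype({'elapsed_time': 'float64'}).groupby('method_name')['elapsed_time'].agg(['mean', 'median'])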
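
As a rough sketch of how such per-method statistics behave, the example below uses a small made-up data frame with the same column names as `dfAll`. The `run` column and the method names `methodA` and `methodB` are assumptions added for illustration; the notebook itself does not record which run a row came from. It shows how a single slow call pulls the mean far from the median, and how a pivot table could place the runs side by side.

```python
import pandas as pd

# Made-up timings: same column names as dfAll, plus a hypothetical 'run' column
toy = pd.DataFrame({
    "run": ["a", "a", "a", "b", "b", "b"],
    "method_name": ["methodA", "methodA", "methodB", "methodA", "methodB", "methodB"],
    "elapsed_time": [10, 14, 5, 12, 7, 300],
})

# Several statistics per method in a single pass
stats = toy.groupby("method_name")["elapsed_time"].agg(["count", "mean", "median", "max"])
print(stats)

# Mean elapsed time per method, with one column per run
per_run = toy.pivot_table(index="method_name", columns="run",
                          values="elapsed_time", aggfunc="mean")
print(per_run)
```

In this toy data the single 300 ms call gives `methodB` a mean of 104 but a median of 7, which is why it can help to look at both figures before comparing runs.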