# Data Parser and Logger

Given a list of test cases in JSON format, the task is to parse the data and compute the average collision coordinate and the average speed for each test case. The output is a dictionary that maps each test case name to the averages for that test case, plus a combined average over all test cases. In the code below, the combined average is computed separately for successes and failures and stored under the `total_average_error` key. A test case can record multiple collisions, each with its own speed, so the average coordinate of a single test case generally differs from the average coordinate over the entire list of test cases.
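
The difference between a per-test average and a combined average can be illustrated with a minimal sketch. The toy data and the variable names (`collisions_by_test`, `per_test`, `combined`) are made up for illustration and are not part of the parser; the combined value is assumed to pool every collision before averaging.

```python
import numpy as np

# Hypothetical toy data: two test cases with different numbers of collisions.
collisions_by_test = {
    "test0": [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]],
    "test1": [[10.0, 10.0, 10.0]],
}

# Per-test average: mean over the collisions of a single test case.
per_test = {
    name: np.mean(points, axis=0).tolist()
    for name, points in collisions_by_test.items()
}

# Combined average: pool every collision from every test case, then take one mean.
pooled = [point for points in collisions_by_test.values() for point in points]
combined = np.mean(pooled, axis=0).tolist()

print(per_test)  # {'test0': [1.0, 1.0, 1.0], 'test1': [10.0, 10.0, 10.0]}
print(combined)  # [4.0, 4.0, 4.0]
```

Because `test0` contributes two collisions and `test1` only one, the pooled mean (4.0 per axis) is not the same as the mean of the two per-test averages (5.5 per axis).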

## Input

The input is a list of test cases in JSON format. Each test case is a dictionary with the following keys:

- `name`: The name of the test case.
- `result`: The result of the test case. It can be either `success` or `failure`.
- `success`: Contains the metadata for a successful test case. It is a dictionary with the following keys:
  - `collisions`: A list of collision coordinates. Each coordinate is a list of three floating-point numbers.
  - `speeds`: A list of speeds. Each speed is a floating-point number.
- `failure`: Contains the metadata for a failed test case. It is a dictionary with the following keys:
  - `collisions`: A list of collision coordinates. Each coordinate is a list of three floating-point numbers.
  - `speeds`: A list of speeds. Each speed is a floating-point number.
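
As a rough illustration of the format above, the following sketch loads a single hypothetical test case and looks up its metadata. The values in `raw` are invented; the only point is that the value of `result` doubles as the key under which the metadata is stored.

```python
import json

# Hypothetical input with one test case; the values are made up for illustration.
raw = """
[
  {
    "name": "test0",
    "result": "failure",
    "failure": {
      "collisions": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
      "speeds": [2.5, 3.5]
    }
  }
]
"""

test_cases = json.loads(raw)
for case in test_cases:
    # The value of "result" is also the key that holds the metadata dictionary.
    metadata = case.get(case["result"], {})
    print(case["name"], case["result"])
    print("collisions:", metadata.get("collisions", []))
    print("speeds:", metadata.get("speeds", []))
```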

The `speeds` and `collisions` lists may be empty. For example, if a test case has no collisions, the `collisions` key will be an empty list. Similarly, if a test case has no speeds, the `speeds` key will be an empty list. In such a case the average coordinate and speed for that test case will be `None`. Values may also arrive as numeric strings (the sample generator below emits them that way); the parser converts them with `float`.
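
One way to handle the empty case is sketched below. The helper names `mean_coordinate` and `mean_speed` are hypothetical and do not appear in the parser class; they only show the intended `None` behavior for empty lists.

```python
from typing import List, Optional


def mean_coordinate(collisions: List[List[float]]) -> Optional[List[float]]:
    """Hypothetical helper: component-wise mean, or None for an empty list."""
    if not collisions:
        return None
    n = len(collisions)
    # zip(*collisions) groups all x values, all y values, and all z values.
    return [sum(axis) / n for axis in zip(*collisions)]


def mean_speed(speeds: List[float]) -> Optional[float]:
    """Hypothetical helper: arithmetic mean, or None for an empty list."""
    if not speeds:
        return None
    return sum(speeds) / len(speeds)


print(mean_coordinate([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]))  # [2.0, 3.0, 4.0]
print(mean_coordinate([]))                                  # None
print(mean_speed([2.0, 4.0]))                               # 3.0
print(mean_speed([]))                                       # None
```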

In [360]:
from collections import defaultdict
import numpy as np
import random 
import json
from json import JSONDecodeError
from typing import Dict, List

def generate_sample_data(ntests:int=1,num_entries_per_test:int=3,minval:float=-100,maxval:float=100)->List[Dict]:
    """Generates sample data for testing the json data parser. The format is as follows:
    [{
        "name": test_type,
        "result": success or failure
        "success or failure": {
            "collisions": [[x1,y1,z1],[x2,y2,z2],[x3,y3,z3]],
            "speeds": [speed1,speed2,speed3]
        }
    },
    ...
    ]

    Args:
        ntests (int, optional): Number of test cases to generate. Defaults to 1.
        num_entries_per_test (int, optional): Number of collisions (and speeds) per test. Defaults to 3.
        minval (float, optional): Minimum value for any coordinate. Defaults to -100.
        maxval (float, optional): Maximum value for any coordinate. Defaults to 100.
            Speeds are always drawn from the range [0, 10], independent of minval and maxval.

    Returns:
        List[Dict]: list of dictionaries containing the sample data.
    """
    test_types = [f"test{i}" for i in range(ntests)]
    results = ["success", "failure"]

    data = []
    for test_type in test_types:
        result = random.choice(results)
        # Values are serialized as strings; JsonDataParser converts them back to floats
        metadata = {
            "collisions": [[str(random.uniform(minval, maxval)) for _ in range(3)] for _ in range(num_entries_per_test)],
            "speeds": [str(random.uniform(0, 10)) for _ in range(num_entries_per_test)]
        }
        # The metadata is stored under the key matching the result ("success" or "failure")
        data.append({"name": test_type, "result": result, result: metadata})
    return data

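def jsonify_dictionary(dictionary: Dict) -> str:
    """Converts a dictionary (or any JSON-serializable object) to an indented JSON string.

    Args:
        dictionary (Dict): Object to serialize, e.g. the list returned by generate_sample_data.

    Returns:
        str: The JSON string, or None if the object cannot be serialized.
    """
    try:
        return json.dumps(dictionary,indent=4)
    except (TypeError, ValueError) as e:
        # json.dumps raises TypeError for unsupported types and ValueError for circular references
        print(e)
        return None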
        
class JsonDataParser:
    """Class for parsing json data

    Args:
        data (List[Dict]): Sample data in the format described in the docstring of generate_sample_data.
    """
    def __init__(self,data) -> None:
        # Plain dicts: each maps a test name to its metadata dictionary
        self.fails = {}
        self.success = {}
        self.data = data
        self.test_names = [test.get("name") for test in self.data]

    def compute_metrics(self):
        """Computes all possible metrics for each test case within the data and summarizes the failure and success cases
        """
        for data in self.data:
            name = data.get("name",None)
            result = data.get("result",None)
            metadata = data.get(result,None)
            if metadata:
                collisions = [[float(coordinate) for coordinate in collision] for collision in metadata.get("collisions",[])]
                speeds = [float(speed) for speed in metadata.get("speeds",[])]
            else:
                collisions = []
                speeds = []
            if name and result == "failure":
                self.fails[name] = {"collisions":collisions,"speeds":speeds}
            if name and result == "success":
                self.success[name] = {"collisions":collisions,"speeds":speeds}
                
        self.compute_averages()
        self.avg_error_combined(failures=True)
        self.avg_error_combined(failures=False)
        
        
        if self.fails:
            print("Fails:")
            for i in self.fails.keys():
                print(i)
                print("\tdata:\t",self.fails.get(i))
        
        if self.success:
            print("Successes:")
            for i in self.success.keys():
                print(i)
                print("\tdata:\t",self.success.get(i))
                
        self.print_average_for_each_test()
        
    def dump_data(self):
        """Store failures and successes in json files
        """
        with open("data_success.json","w") as f:
            json.dump(self.success,f)
        with open("data_fail.json","w") as f:
            json.dump(self.fails,f)
        
    def print_average_for_each_test(self):
        """Prints the average coordinate and speed for each test case
        """
        if self.fails:
            for test in self.fails:
                if (test in self.test_names):
                    print(f"Averages for {test} failure:")
                    print(f"Average coordinate for failure: {self.fails[test].get('average_coordinate',None)}")
                    print(f"Average speed for failure: {self.fails[test].get('average_speed',None)}")
            
        if self.success:
            for test in self.success:
                if (test in self.test_names):
                    print(f"Averages for {test} success:")
                    print(f"Average coordinate for success: {self.success[test].get('average_coordinate',None)}")
                    print(f"Average speed for success: {self.success[test].get('average_speed',None)}")

    def get_fails(self)->Dict:
        """Fetch the failure cases

        Returns:
            Dict: Failure cases consolidated in a dictionary
        """
        print("Number of fails:",len(self.fails))
        return self.fails

    def get_successes(self)->Dict:
        """Fetch the success cases

        Returns:
            Dict: Success cases consolidated in a dictionary
        """
        print(f"Number of successes: {len(self.success)}")
        return self.success
    
    def compute_averages(self):
        """Computes the average coordinate and speed for each test case
        """
        self._avg_error_successes()
        self._avg_error_failures()
        
    def _avg_error_successes(self):
        """Computes the average coordinate and speed for success cases

        Returns:
            Dict: The success cases with their averages added, or None if there are no success cases.
        """
        if self.success:
            for test in self.success:
                self.avg_error_success(test_name=test)
        return self.success if self.success else None
            
    def avg_error_success(self,test_name:str):
        """Computes the average coordinate and speed for a success case

        Args:
            test_name (str): Test case name or ID
        """
        assert test_name in self.success, f"'{test_name}' not in successes"
        collisions = self.success[test_name].get("collisions",[])
        speeds = self.success[test_name].get("speeds",[])
        # Empty lists yield None, as described in the Input section (avoids NaN and a TypeError from list())
        self.success[test_name]["average_coordinate"] = np.array(collisions).mean(axis=0).tolist() if collisions else None
        self.success[test_name]["average_speed"] = sum(speeds)/len(speeds) if speeds else None

    def _avg_error_failures(self):
        """Computes the average coordinate and speed for failure cases

        Returns:
            Dict: The failure cases with their averages added, or None if there are no failure cases.
        """
        if self.fails:
            for test in self.fails:
                self.avg_error_fail(test_name=test)
        return self.fails if self.fails else None
            
    def avg_error_fail(self,test_name:str):
        """Computes the average coordinate and speed for a failure case

        Args:
            test_name (str): Test case name or ID
        """
        assert test_name in self.fails,f"'{test_name}' not in failures"
        collisions = self.fails[test_name].get("collisions",[])
        speeds = self.fails[test_name].get("speeds",[])
        # Empty lists yield None, as described in the Input section (avoids NaN and a TypeError from list())
        self.fails[test_name]["average_coordinate"] = np.array(collisions).mean(axis=0).tolist() if collisions else None
        self.fails[test_name]["average_speed"] = sum(speeds)/len(speeds) if speeds else None
        return self.fails[test_name].get("collisions",[]), self.fails[test_name].get("speeds",[])
    
    def avg_error_combined(self,failures:bool=True):
        """Computes all the averages for all failure and success cases if there are any.

        Args:
            failures (bool, optional): Computes the averages for failure cases if true else computes for success cases. Defaults to True.

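        Returns:
            List: [[mean_x, mean_y, mean_z], mean_speed], or a list holding an empty coordinate list when there is no data to average.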
        """
        all_collisions = []
        all_speeds = []
        if failures:
            for test_type in self.fails:
                all_collisions.extend(self.fails[test_type]["collisions"])  # Pool the [x, y, z] points; extend keeps the list rectangular when tests differ in collision count
                all_speeds.extend(self.fails[test_type]["speeds"])
        else:
            for test_type in self.success:
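                all_collisions.extend(self.success[test_type]["collisions"])  # Pool the [x, y, z] points; extend keeps the list rectangular when tests differ in collision count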
                all_speeds.extend(self.success[test_type]["speeds"])
        if all_collisions and all_speeds:
            all_collisions = np.array(all_collisions)
            all_speeds = np.array(all_speeds)
            reshaped_collisions = all_collisions.reshape(-1,3)
            # Compute the mean for each coordinate
            mean_x = np.mean(reshaped_collisions[:, 0])
            mean_y = np.mean(reshaped_collisions[:, 1])
            mean_z = np.mean(reshaped_collisions[:, 2])
            mean_speed = np.mean(all_speeds)
            if failures:
                self.fails["total_average_error"] = {"collisions":(mean_x,mean_y,mean_z),"speeds":mean_speed}
            else:
                self.success["total_average_error"] = {"collisions":(mean_x,mean_y,mean_z),"speeds":mean_speed}
            return [[mean_x,mean_y,mean_z],mean_speed]
        else:
            return [[], None]  # Same [coordinates, speed] shape as the non-empty case
            




In [361]:
data = generate_sample_data(ntests=3,num_entries_per_test=1)
print(jsonify_dictionary(data))



[
    {
        "name": "test0",
        "result": "success",
        "success": {
            "collisions": [
                [
                    "-51.78802679270456",
                    "-95.71099948945793",
                    "-65.36007319805775"
                ]
            ],
            "speeds": [
                "9.94498658698442"
            ]
        }
    },
    {
        "name": "test1",
        "result": "failure",
        "failure": {
            "collisions": [
                [
                    "57.896817577733714",
                    "30.06145280546383",
                    "34.317298853901036"
                ]
            ],
            "speeds": [
                "8.662075536734555"
            ]
        }
    },
    {
        "name": "test2",
        "result": "failure",
        "failure": {
            "collisions": [
                [
                    "-75.86158109728531",
                    "11.947862861406747",
                    "-48.293150398973

In [362]:
json_parser = JsonDataParser(data)
json_parser.compute_metrics()
json_parser.dump_data()

Fails:
test1
	data:	 {'collisions': [[57.896817577733714, 30.06145280546383, 34.317298853901036]], 'speeds': [8.662075536734555], 'average_coordinate': [57.896817577733714, 30.06145280546383, 34.317298853901036], 'average_speed': 8.662075536734555}
test2
	data:	 {'collisions': [[-75.86158109728531, 11.947862861406747, -48.29315039897326]], 'speeds': [4.687790665665944], 'average_coordinate': [-75.86158109728531, 11.947862861406747, -48.29315039897326], 'average_speed': 4.687790665665944}
total_average_error
	data:	 {'collisions': (-8.982381759775798, 21.00465783343529, -6.987925772536112), 'speeds': 6.674933101200249}
Successes:
test0
	data:	 {'collisions': [[-51.78802679270456, -95.71099948945793, -65.36007319805775]], 'speeds': [9.94498658698442], 'average_coordinate': [-51.78802679270456, -95.71099948945793, -65.36007319805775], 'average_speed': 9.94498658698442}
total_average_error
	data:	 {'collisions': (-51.78802679270456, -95.71099948945793, -65.36007319805775), 'speeds': 9.9449