# Descriptive Statistics
Next, we calculate summary statistics of the QARTOD flags for each test applied to each parameter in the dataset. The function `qartod_results_summary` below counts the number of each flag (e.g., 1, 3, 4) and its percentage of the total for each test (gross range, climatology, etc.) applied to each parameter.
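
As a minimal sketch of the counting idea, the cell below tallies flags in a small, made-up array of string flags and reports each flag's count and percentage. The array `flags` and the helper `count_flags` are hypothetical names used only for illustration; they are not part of the dataset or of the function defined later in this notebook.

```python
import numpy as np

# Hypothetical flag values for one test on one parameter (not real data).
# Flags are kept as strings here, matching how the summary function compares them.
flags = np.array(["1", "1", "3", "1", "4", "1", "3", "1"])


def count_flags(flags):
    """Return {flag: (count, percent)} for each distinct flag value."""
    values, counts = np.unique(flags, return_counts=True)
    total = counts.sum()
    return {
        str(value): (int(count), round(float(count) / total * 100, 2))
        for value, count in zip(values, counts)
    }


summary = count_flags(flags)
print(summary)
```

Using `np.unique` with `return_counts=True` counts every flag value present, so unexpected flags (for example 2 or 9) would show up in the summary rather than being silently ignored.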

### Import modules used in this notebook

In [None]:
# Import libraries
import os
import re
import gc
import io
import ast
import pandas as pd
import numpy as np
import xarray as xr
import warnings
warnings.filterwarnings("ignore")
import sys

In [None]:
# Import dask tools and ProgressBar
import dask
from dask.diagnostics import ProgressBar

In [None]:
def qartod_results_summary(ds, params, tests):
    """
    Calculate the statistics for parameter qartod flags.
    
    This function takes in a list of the parameters and
    the associated QARTOD tests to calculate the number
    of each flag and the percent of the flag.
    
    Parameters
    ----------
    ds: xarray.Dataset
        An xarray dataset which contains the data
    params: list[strings]
        A list of the variables/parameters in the given
        dataset that have been tested with QARTOD
    tests: list[strings]
        A list of the QARTOD test names which to parse
        for the given parameters.
        
    Returns
    -------
    results: dict
        A dictionary which contains the number of each
        QARTOD flag and the percent of the total flags
        for each test applied to each parameter in the
        given dataset.
        
        results = {'parameter':
                        {'test_name':
                            {'total': int,
                             'good': (int, %),
                             'suspect': (int, %),
                             'fail': (int, %)}
                            },
                        }
    """
    # Check that the inputs are lists; wrap a single value in a list
    if not isinstance(params, list):
        params = [params]
        
    if not isinstance(tests, list):
        tests = [tests]
    
    # Initialize the result dictionary and iterate 
    # through the parameters for each test
    results = {}
    for param in params:
        
        # Now iterate through each test
        test_results = {}
        for test in tests:
            
            # First, check that the test was applied
            test_name = f"{param}_qartod_{test}_test"
            if test_name not in ds.variables:
                continue
                
            # Count the total number of non-null flag values
            n = ds[test_name].count().compute().values
            
            # Count the number of good values (flag "1");
            # flags are compared as strings
            good = np.where(ds[test_name] == "1")[0]

            # Count the number of suspect/interesting values (flag "3")
            suspect = np.where(ds[test_name] == "3")[0]
    
            # Count the number of fails (flag "4")
            bad = np.where(ds[test_name] == "4")[0]
    
            test_results.update({test :{
                     "total": int(n),
                     "good": (len(good), np.round(len(good)/n*100, 2)),
                     "suspect": (len(suspect), np.round(len(suspect)/n*100, 2)),
                     "fail": (len(bad), np.round(len(bad)/n*100, 2))
                    }
                }
            )
        
        # Save the test results for each parameter
        results.update({
            param: test_results
        })
    
    return results

In [None]:
qartod_results = qartod_results_summary(ds, parameters, ["gross_range", "climatology"])
qartod_results
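
The nested dictionary can be hard to scan once several parameters and tests are involved. One possible way to view it, sketched below, is to flatten it into a pandas DataFrame with one row per parameter/test pair. The helper `results_to_frame` and the values in `example_results` are hypothetical and only mimic the shape returned by `qartod_results_summary`; they are not real results.

```python
import pandas as pd

# Hypothetical results in the same shape as the output of
# qartod_results_summary (made-up numbers, for illustration only)
example_results = {
    "temperature": {
        "gross_range": {"total": 100, "good": (95, 95.0),
                        "suspect": (4, 4.0), "fail": (1, 1.0)},
        "climatology": {"total": 100, "good": (90, 90.0),
                        "suspect": (10, 10.0), "fail": (0, 0.0)},
    },
}


def results_to_frame(results):
    """Flatten {param: {test: stats}} into one row per (param, test)."""
    columns = ["parameter", "test", "total"]
    for flag in ("good", "suspect", "fail"):
        columns += [f"{flag}_count", f"{flag}_percent"]

    rows = []
    for param, tests in results.items():
        for test, stats in tests.items():
            row = {"parameter": param, "test": test, "total": stats["total"]}
            for flag in ("good", "suspect", "fail"):
                # Each flag entry is a (count, percent) tuple
                count, percent = stats[flag]
                row[f"{flag}_count"] = count
                row[f"{flag}_percent"] = percent
            rows.append(row)

    # Passing columns explicitly keeps the layout stable even with no rows
    return pd.DataFrame(rows, columns=columns).set_index(["parameter", "test"])


summary_table = results_to_frame(example_results)
print(summary_table)
```

With real output, the same helper could be called as `results_to_frame(qartod_results)`, assuming the dictionary keys match those used in the function above.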