# DataSpace Analysis Example

This notebook will analyze the parametric data in a dataspace to calculate statistics.

### Imports

Import the Python modules needed to run the notebook. `ni_data_space_analyzer` performs the analysis. Pandas builds and handles the DataFrames. Scrapbook records data from the notebook run for the SystemLink Notebook Execution Service.

In [1]:
from typing import List, Dict, Union

import pandas as pd
import scrapbook as sb

from ni_data_space_analyzer import DataSpaceAnalyzer
from ni_data_space_analyzer import DatasetLoader
from ni_data_space_analyzer.exception import DataSpaceAnalyzerError

### Parameters

#### Channels

Channels are a collection of input data traces, serialized as a JSON string.
Each channel is an object with the following keys.

- `name`: Name of the channel.
- `data`: Data of the channel, an object with the following keys.
  - `x`: Array of x-axis values of the data trace.
  - `y`: Array of numeric y-axis values, one for each entry in `x`.
  - `low_limits`: Array of numeric low limit values of the data trace.
  - `high_limits`: Array of numeric high limit values of the data trace.

Example:

    ```python
    channels = """[{
        "name": "Input Voltage",
        "data":
        {
            "x": [1,1,2,2,3],
            "y": [6.93,6.9,6.1,6.2,9],
            "high_limits": [11,11,11,11,11],
            "low_limits": [1,1,1,1,1]
        }
    },
    {
        "name": "Input Current",
        "data":
        {
            "x": [1,1,2,2,3],
            "y": [6.93,6.9,6.1,6.2,9],
            "high_limits": [11,11,11,11,11],
            "low_limits": [1,1,1,1,1]
        }
    }]"""
    ```
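
Before handing a payload like this to the notebook, it can help to confirm that it parses and that every array in a channel has the same length. The sketch below uses only the standard library. The helper name `check_channels` is hypothetical, and the equal-length rule is an assumption drawn from the example above, not a documented requirement of `ni_data_space_analyzer`.

```python
import json

REQUIRED_KEYS = ("x", "y", "high_limits", "low_limits")


def check_channels(serialized: str) -> dict:
    """Parse the serialized channels and return {channel name: number of points}."""
    point_counts = {}
    for channel in json.loads(serialized):
        data = channel["data"]
        missing = [key for key in REQUIRED_KEYS if key not in data]
        if missing:
            raise ValueError(f"{channel['name']}: missing keys {missing}")
        # Assumption: every array describes the same points, so lengths must match.
        lengths = {len(data[key]) for key in REQUIRED_KEYS}
        if len(lengths) != 1:
            raise ValueError(f"{channel['name']}: arrays differ in length")
        point_counts[channel["name"]] = lengths.pop()
    return point_counts


channels = """[{
    "name": "Input Voltage",
    "data": {
        "x": [1, 1, 2, 2, 3],
        "y": [6.93, 6.9, 6.1, 6.2, 9],
        "high_limits": [11, 11, 11, 11, 11],
        "low_limits": [1, 1, 1, 1, 1]
    }
}]"""

print(check_channels(channels))  # {'Input Voltage': 5}
```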

#### Analysis Options

`analysis_options` is a list of the analyses that the notebook should perform. The supported values are `min`, `max`, `mean`, `2std`, `-2std`, `moving_mean`, `cp`, and `cpk`.
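
Example:

The value below is only an illustration; the ids picked are arbitrary. The trimming and lowercasing step is shown as one possible way to tolerate stray whitespace or capitalization in the input, and the name `normalized_options` is hypothetical.

```python
# Illustrative input with inconsistent spacing and capitalization.
analysis_options = [" Min", "max", "CPK "]

# Trim whitespace and lowercase each id so it compares cleanly against the supported ids.
normalized_options = [option.strip().lower() for option in analysis_options]

print(normalized_options)  # ['min', 'max', 'cpk']
```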

### Metadata

These are the parameters that the notebook expects SystemLink to pass in. A notebook designed to perform analysis inside a dataspace must tag the parameters cell with `parameters` and, at minimum, specify the following in the cell metadata using the JupyterLab Property Inspector (double gear icon):

```json
{
  "papermill": {
    "parameters": {
      "analysis_options": [],
      "channels": ""
    }
  },
  "systemlink": {
    "interfaces": [],
    "outputs": [
      {
        "display_name": "Min",
        "id": "min",
        "type": "scalar"
      },
      {
        "display_name": "Max",
        "id": "max",
        "type": "scalar"
      },
      {
        "display_name": "Mean",
        "id": "mean",
        "type": "scalar"
      },
      {
        "display_name": "2 STD",
        "id": "2std",
        "type": "scalar"
      },
      {
        "display_name": "-2 STD",
        "id": "-2std",
        "type": "scalar"
      },
      {
        "display_name": "Moving Mean",
        "id": "moving_mean",
        "type": "vector"
      },
      {
        "display_name": "CP",
        "id": "cp",
        "type": "vector"
      },
      {
        "display_name": "CPK",
        "id": "cpk",
        "type": "vector"
      }
    ],
    "parameters": [
      {
        "display_name": "Channels",
        "id": "channels",
        "type": "string"
      },
      {
        "display_name": "Analysis Options",
        "id": "analysis_options",
        "type": "string[]"
      }
    ],
    "version": 2
  },
  "tags": ["parameters"]
}
```

For more information on how parameterization works, review the [papermill documentation](https://papermill.readthedocs.io/en/latest/usage-parameterize.html#how-parameters-work).
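
To try the parameterization outside SystemLink, the notebook could be executed locally with papermill. The sketch below is a minimal illustration, not how SystemLink itself invokes the notebook. The helper names `build_parameters` and `run_notebook` and the notebook paths are hypothetical. Only `papermill.execute_notebook` is a real papermill call, and it is imported lazily so the rest of the sketch runs without papermill installed.

```python
import json


def build_parameters(channels: list, analysis_options: list) -> dict:
    """Build the papermill parameters for this notebook.

    `channels` is serialized to a JSON string because the metadata declares it
    as type "string"; `analysis_options` stays a list ("string[]").
    """
    return {
        "channels": json.dumps(channels),
        "analysis_options": list(analysis_options),
    }


def run_notebook(input_path: str, output_path: str, parameters: dict) -> None:
    """Execute the notebook with papermill (requires papermill to be installed)."""
    import papermill as pm

    pm.execute_notebook(input_path, output_path, parameters=parameters)


parameters = build_parameters(
    channels=[
        {
            "name": "Input Voltage",
            "data": {
                "x": [1, 2],
                "y": [6.9, 6.1],
                "high_limits": [11, 11],
                "low_limits": [1, 1],
            },
        }
    ],
    analysis_options=["min", "max"],
)

# Hypothetical paths; uncomment to run against a local copy of the notebook.
# run_notebook("dataspace_analysis.ipynb", "dataspace_analysis_output.ipynb", parameters)
```

Papermill injects these values in a new cell placed after the cell tagged `parameters`, so they override the defaults defined there.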


In [2]:
channels = ""
analysis_options = []

### Supported Input analysis options and their output types

In [3]:
supported_analysis = [
    {"id": "min", "type": "scalar"},
    {"id": "max", "type": "scalar"},
    {"id": "mean", "type": "scalar"},
    {"id": "2std", "type": "scalar"},
    {"id": "-2std", "type": "scalar"},
    {"id": "moving_mean", "type": "vector"},
    {"id": "cp", "type": "vector"},
    {"id": "cpk", "type": "vector"},
]

supported_analysis_options = [analysis["id"] for analysis in supported_analysis]
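
For readers unfamiliar with these statistics, the sketch below computes them for a single set of measurements using the textbook definitions. It is not the `ni_data_space_analyzer` implementation, which may differ. In particular, reading `2std` and `-2std` as the mean plus and minus two standard deviations, using the sample standard deviation, and returning `cp` and `cpk` as single numbers rather than vectors are all assumptions. `moving_mean` is left out because its window size is not stated here. The function name `summarize` is hypothetical.

```python
import statistics


def summarize(y, low_limit, high_limit):
    """Return textbook summary statistics plus Cp and Cpk for one set of measurements."""
    mean = statistics.mean(y)
    std = statistics.stdev(y)  # sample standard deviation (n - 1); an assumption
    return {
        "min": min(y),
        "max": max(y),
        "mean": mean,
        "2std": mean + 2 * std,   # assumed meaning of the "2 STD" output
        "-2std": mean - 2 * std,  # assumed meaning of the "-2 STD" output
        # Cp compares the width of the limits with the spread of the process.
        "cp": (high_limit - low_limit) / (6 * std),
        # Cpk also penalizes a mean that sits off-center between the limits.
        "cpk": min(high_limit - mean, mean - low_limit) / (3 * std),
    }


stats = summarize([6.93, 6.9, 6.1, 6.2, 9], low_limit=1, high_limit=11)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions `cpk` can never exceed `cp`, and the two are equal only when the mean lies exactly midway between the limits.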

### Utility Functions

#### Validating Analysis Options

In [4]:
def validate_analysis_options(analysis_options: List[str]) -> None:
    """Raise DataSpaceAnalyzerError if any requested option is not supported."""
    requested_options = {option.strip() for option in analysis_options}

    # Sort so the error message lists the invalid options in a stable order.
    invalid_options = sorted(requested_options - set(supported_analysis_options))

    if invalid_options:
        raise DataSpaceAnalyzerError(
            "The analysis failed because the following options are not supported: {0}.".format(
                ", ".join(invalid_options)
            )
        )

#### Loading Channel Data

In [5]:
def load_channels() -> List[Dict[str, Union[str, pd.DataFrame]]]:
    """Deserialize the `channels` parameter into a list of channels with a name and a dataframe."""
    return DatasetLoader().load_dataset(channels)

#### Analyzing Channel Data

In [6]:
def analyze_channel_data(channel_data: pd.DataFrame) -> pd.DataFrame:
    """Run each requested analysis on one channel and return the combined output."""
    data_space_analyzer = DataSpaceAnalyzer(dataframe=channel_data)

    for option in analysis_options:
        if option == "min":
            data_space_analyzer.compute_min()
        elif option == "max":
            data_space_analyzer.compute_max()
        elif option == "mean":
            data_space_analyzer.compute_mean()
        elif option == "2std":
            data_space_analyzer.compute_2std()
        elif option == "-2std":
            data_space_analyzer.compute_negative_2std()
        elif option == "moving_mean":
            data_space_analyzer.compute_moving_mean()
        elif option == "cp":
            data_space_analyzer.compute_cp()
        elif option == "cpk":
            data_space_analyzer.compute_cpk()

    return data_space_analyzer.generate_analysis_output(
        analysis_options=analysis_options, supported_analysis=supported_analysis
    )
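
The `if`/`elif` chain above maps each option id to one `compute_*` method. The same mapping could be written as a lookup table, which keeps the ids and method names in one place. The sketch below is an alternative illustration, not the notebook's code. `StubAnalyzer` is a hypothetical stand-in that only records which methods were called, so the example runs without `ni_data_space_analyzer`, and `run_analyses` is a hypothetical helper name. The method names in the table are the ones used in the function above.

```python
class StubAnalyzer:
    """Hypothetical stand-in for DataSpaceAnalyzer that records which methods ran."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any missing compute_* attribute resolves to a function that logs its own name.
        if name.startswith("compute_"):
            return lambda: self.calls.append(name)
        raise AttributeError(name)


# Option id -> method name, mirroring the if/elif chain.
METHOD_BY_OPTION = {
    "min": "compute_min",
    "max": "compute_max",
    "mean": "compute_mean",
    "2std": "compute_2std",
    "-2std": "compute_negative_2std",
    "moving_mean": "compute_moving_mean",
    "cp": "compute_cp",
    "cpk": "compute_cpk",
}


def run_analyses(analyzer, analysis_options):
    """Call the compute method for each requested option, skipping unknown ids."""
    for option in analysis_options:
        method_name = METHOD_BY_OPTION.get(option)
        if method_name is not None:
            getattr(analyzer, method_name)()


analyzer = StubAnalyzer()
run_analyses(analyzer, ["min", "cpk", "unknown"])
print(analyzer.calls)  # ['compute_min', 'compute_cpk']
```

Like the original chain, this version silently ignores ids it does not recognize, so validating the options first is still necessary.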

### Validating and Analyzing Channels

In [7]:
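# Normalize once so that validation and the analysis itself see the same option ids.
analysis_options = [option.strip().lower() for option in analysis_options]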
final_result = []

try:
    validate_analysis_options(analysis_options)
    channels = load_channels()

    for channel in channels:
        channel_name = channel["name"]
        channel_data = channel["data"]

        analysis_results = analyze_channel_data(channel_data)
        
        final_result.append({"plot_label": channel_name, "data": analysis_results})

except DataSpaceAnalyzerError as e:
    raise Exception(e) from None

### Store the result information so that SystemLink can access it

SystemLink uses scrapbook to store result information from each notebook execution to display to the user in the Execution Details slide-out.
   

In [None]:
sb.glue("result", final_result)

# Next Steps

1. Publish this notebook to SystemLink: right-click it in the JupyterLab File Browser and publish it with the DataSpace Analysis interface.
1. Analyze the parametric data inside the dataspace manually by clicking the Analyze button.
   