# Create an edge configuration package 

The goal of this notebook is to create a pipeline that contains everything needed to run the model on an Industrial Edge device.  
To put the elements together, this example collects the following files:

from [10-CreateClusteringModel](10-CreateClusteringModel.ipynb) notebook:  
- **clustering-model.joblib**: the model itself, created in that notebook.

from [20-CreateInferenceWrapper](20-CreateInferenceWrapper.ipynb) notebook:  
- **entrypoint.py**: the script that is called by the runtime to execute the model on the Edge side
- **src** folder: all the other Python scripts necessary to execute the model

### Imports  

In [None]:
import yaml
from simaticai import deployment

import sys
from pathlib import Path
sys.path.insert(0, str(Path('../src').resolve()))

#### Create an inference component

In the [10-CreateClusteringModel](10-CreateClusteringModel.ipynb) notebook we created and trained our model with the proper parameters. In this step, we create a `PythonComponent` from the saved model.  
We need to
- include the model in the package as a resource file
- define the input variables
- define an output variable

In [None]:
COMPONENT_DESCRIPTION = """\
This component receives data rows of measured energy consumption data (ph1, ph2, ph3) from SIMATIC S7 Connector and predicts a cluster for every 'step_size' number of data rows.
"""

# Create a PythonComponent to use the saved model.
component = deployment.PythonComponent(name='inference', desc=COMPONENT_DESCRIPTION, version='1.0.0', python_version='3.11')

#### Add entrypoint

The entry point acts as an interface between your code and the AI Inference Server. 
This Python code unwraps the incoming data into a dictionary for your code and wraps your result in a formatted response for the runtime.  
The AI Inference Server expects the entry point to be located in the component root directory.  
You need to add this file as a resource and set the component's entry point to its file name.

In [None]:
component.add_resources("..", "entrypoint.py")
component.add_resources("..", "models/clustering-model.joblib")
component.set_entrypoint('entrypoint.py')  # Defining the entry point which will be triggered through its `run(..)` method
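
To make the unwrap and wrap role of the entry point more concrete, here is a minimal sketch of a `run(..)` function. It is not the `entrypoint.py` created in the [20-CreateInferenceWrapper](20-CreateInferenceWrapper.ipynb) notebook: the `data` parameter, the returned dictionary shape, the handling of incomplete rows and the `predict_cluster` placeholder are all assumptions made for illustration.

```python
# Illustrative stand-in only: the real entrypoint.py comes from notebook 20.
from typing import Optional


def predict_cluster(ph1: float, ph2: float, ph3: float) -> int:
    """Hypothetical placeholder for the model; always reports cluster 0."""
    return 0


def run(data: dict) -> Optional[dict]:
    """Unwrap the incoming dictionary, call the model code, wrap the answer."""
    try:
        ph1, ph2, ph3 = (float(data[name]) for name in ("ph1", "ph2", "ph3"))
    except (KeyError, TypeError, ValueError):
        # Assumption: an incomplete or malformed row produces no output.
        return None
    return {"prediction": predict_cluster(ph1, ph2, ph3)}
```

Calling `run({"ph1": 1.0, "ph2": 2.0, "ph3": 3.0})` in this sketch returns a dictionary with the single key `prediction`, which mirrors the variable names used for this component.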

#### Add Inputs and Output

Define the input and output variables of the component.

In [None]:
INPUT_DESCRIPTION1 = """\
Measured energy consumption on phase 1
"""
INPUT_DESCRIPTION2 = """\
Measured energy consumption on phase 2
"""
INPUT_DESCRIPTION3 = """\
Measured energy consumption on phase 3
"""
OUTPUT_DESCRIPTION = """\
Predicted cluster of the datapoint (0, 1 or 2)
"""

component.add_input("ph1", "Double", INPUT_DESCRIPTION1)
component.add_input("ph2", "Double", INPUT_DESCRIPTION2)
component.add_input("ph3", "Double", INPUT_DESCRIPTION3)

component.add_output("prediction", "Integer", OUTPUT_DESCRIPTION)

#### Add metrics
It can be useful to monitor the pipeline, e.g. watch how some features change. In this example we expose the minimum, maximum and mean values. The metric name must contain an underscore (`_`), because the part before the underscore is used to group custom metrics on the dashboard.

**&#9888; Remember!**
You have to use the same names here and in the inference wrapper script.

In [None]:
component.add_metric("model_input_min")
component.add_metric("model_input_max")
component.add_metric("model_input_mean")
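
Since the metric names have to match on both sides, it can help to keep them in one place and check them before use. The following sketch only uses the Python standard library. The helper names `check_metric_names` and `input_metrics` are hypothetical, and how the values are actually reported to the runtime is decided in the inference wrapper script, not here.

```python
from statistics import mean

# The same three names that are registered on the component.
METRIC_NAMES = ("model_input_min", "model_input_max", "model_input_mean")


def check_metric_names(names):
    """Raise ValueError for any metric name without an underscore."""
    bad = [name for name in names if "_" not in name]
    if bad:
        raise ValueError(f"metric names without an underscore: {bad}")


def input_metrics(values):
    """Compute the three example metrics for a non-empty sequence of numbers."""
    return {
        "model_input_min": min(values),
        "model_input_max": max(values),
        "model_input_mean": mean(values),
    }


check_metric_names(METRIC_NAMES)
```

Keeping the names in a single tuple like `METRIC_NAMES` is one way to avoid the packaging code and the wrapper script drifting apart.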

#### Add other resources

Your code might require additional resources, such as config files, models (as seen above), further Python sources or reference data files.   
You can simply add them using `add_resources(base_dir, resources)`.  
As base_dir, you have to pass the local directory that you want to correspond to the component root on the runtime. As resources, you can pass a list of paths relative to base_dir. The referred files and directories will be packaged. On the runtime, the resources will be available under the same paths relative to the component root directory.

In [None]:
component.add_resources("..", ["src/si"])
component
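
The path rule described above can be sketched without the AI SDK. The helper `packaged_paths` is hypothetical and only illustrates the mapping: each resource is looked up under `base_dir` locally and is expected under the same relative path below the component root on the runtime.

```python
from pathlib import PurePosixPath


def packaged_paths(base_dir, resources):
    """Map each resource (relative to base_dir) to (local path, path under component root)."""
    if isinstance(resources, str):
        resources = [resources]  # a single path is treated like a one-element list
    base = PurePosixPath(base_dir)
    return [(str(base / rel), str(PurePosixPath(rel))) for rel in resources]


# The two forms used in this notebook: a single file and a list with a folder.
print(packaged_paths("..", "entrypoint.py"))
print(packaged_paths("..", ["src/si"]))
```

For `base_dir` set to `".."`, the folder `src/si` is read from `../src/si` locally and ends up as `src/si` relative to the component root.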

#### Create a pipeline with the inference component

Now you can use the `component` to create an edge configuration package.  
During its initialization, it will build the connections between the model and its environment, in particular the wiring of the pipeline inputs and outputs.  
The given name and version will form the name of the pipeline and the folder where the necessary configuration files and folder structure will be created.  
Also, set the inter-signal alignment periodicity if necessary.

In [None]:
PIPELINE_DESCRIPTION = """\
This pipeline runs a Clustering Model on an Industrial Edge device.
The aim of the model is to distinguish 3 operation states of a machine based on its energy consumption.

This model was trained by K-Means clustering on energy consumption data measured on 3 phases of electrical current (ph1, ph2, ph3).

The pipeline receives data from SIMATIC S7 Connector using AI Inference Server's Inter Signal Alignment feature.
The intersignal alignment must be set to the same sampling rate of 250ms that the model was trained on.

Data goes through a scikit-learn pipeline consisting of 2 stages, a preprocessing and a clustering.
(Please note that even though the scikit pipeline has 2 stages it will be executed as a single component on AI Inference Server)

The preprocessing step of the scikit-learn pipeline groups input data rows into data windows, 300 data rows each.
This window is advanced according to the 'step_size' parameter.
If 'step_size' is set to the window size (300 in this case) the windows will be adjacent.
If 'step_size' is set to be smaller than the window size, the windows will overlap.
The preprocessing of the scikit-learn pipeline calculates a series of basic statistical features for each window (e.g.: Min, Max, Mean).

The model itself is fed with these statistical features, producing a predicted class for each window.
As a result the pipeline produces a single output for every 'step_size' number of data rows.
The first output is produced after consuming a complete window (300 data rows).
"""

# To ensure compatibility with older versions of AI SDK (<v1.5.0), you must define the `version` parameter in the `from_components()` method
pipeline = deployment.Pipeline.from_components([component], name='State Identifier', desc=PIPELINE_DESCRIPTION)

pipeline.add_parameter("step_size", 300, "Integer")
pipeline.set_timeshifting_periodicity(250)
pipeline
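
The interplay of window size and `step_size` described in the pipeline description can be sketched in plain Python. This is not the preprocessing code of the scikit-learn pipeline. The generator `window_outputs` is a hypothetical helper that only reproduces the timing: the first output after one complete window, then one output for every `step_size` rows.

```python
from collections import deque


def window_outputs(rows, window_size=300, step_size=300):
    """Yield (row_count, window) each time a prediction would be produced."""
    buffer = deque(maxlen=window_size)  # keeps only the latest window_size rows
    for count, row in enumerate(rows, start=1):
        buffer.append(row)
        # First output after one complete window, then one per step_size rows.
        if count >= window_size and (count - window_size) % step_size == 0:
            yield count, list(buffer)


# Adjacent windows: outputs after 300, 600 and 900 rows.
print([count for count, _ in window_outputs(range(900))])
# Overlapping windows: outputs after 300, 400 and 500 rows.
print([count for count, _ in window_outputs(range(500), step_size=100)])
```

With `step_size=100`, the second window covers rows 100 to 399, so it shares 200 rows with the first one.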

To make sure everything went as expected, examine the resulting metadata structure.

In [None]:
print(yaml.dump(pipeline.get_datalink_metadata()))

#### Define necessary Python modules

To execute the model with `entrypoint.py`, the Python runtime environment must contain all the required packages.
To do so, the AI SDK's `convert_package` method downloads the dependencies into the edge configuration package, so here you need to define all the dependencies of your pipeline and model.  
Because the installation does not execute any post-installation scripts, only modules that are available in binary wheel format can be used.

In [None]:
pipeline.add_dependencies([
    ('tsfresh','0.17.0'), # Runtime needs it to unpickle the model.
    ('scikit-learn','1.3.2'),
    ('scipy','1.10.1'),
    ('statsmodels','0.14.0'),
])
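
Because the runtime has to unpickle the model, it is probably a good idea to pin the versions that were used when the model was created. As a hedged sketch, the following standard-library helper compares a list of pins with the packages installed in the current environment. The function name `compare_with_pins` is hypothetical and the check is not part of the AI SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def compare_with_pins(pins):
    """Return {name: status} where status is 'ok', 'missing' or 'installed <version>'."""
    report = {}
    for name, pinned in pins:
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if installed == pinned else f"installed {installed}"
    return report


# The same pins that are passed to add_dependencies in this notebook.
print(compare_with_pins([
    ("tsfresh", "0.17.0"),
    ("scikit-learn", "1.3.2"),
    ("scipy", "1.10.1"),
    ("statsmodels", "0.14.0"),
]))
```

Any entry that is not reported as `ok` points to a difference between the local environment and the versions declared for the edge package.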

### Build the edge package

This step creates the proper content in the defined target folder and creates the edge configuration package as a zip file.  
The `export()` method first validates the pipeline and raises an error if it finds any problems. Manual validation is also possible with the `pipeline.validate()` method.

Edge packages are identified by their `package id` and `version` attributes, and will be grouped in AI Inference Server and in other Edge applications by `package id`.
When saving a pipeline - in the `export()` method - you can specify a `package id` in a UUID 4 compliant format, or an automatically generated one will be assigned. 
If no `package id` is defined in the `export()` method, and AI SDK finds an already assigned `package id` in a previously generated, and similarly named package, the `package id` found in the latest package will be used.

AI SDK will automatically assign and increase the version number of a pipeline every time a package is saved, unless a new package id is assigned in the `export()` method or an explicit version number is defined without package id either in the `export()` method or in the pipeline constructor.

Restrictions: 
- You cannot overwrite a previously saved package with the same `package id` if the `package id` is explicitly assigned in the `export()` method
- Packages generated with older versions of AI SDK (without `package id`) will be overwritten
- If a new `package id` is assigned to an existing version of the package, it will overwrite the old one
- If no previous version of a package (with a generated or explicitly assigned `package id`) is found, AI SDK will assign the version `1` to the created package
- Version defined in the `export()` method takes precedence over the version assigned on constructor level

Now you are ready to bring your model to the shopfloor by building the edge configuration package. You can achieve this by the following step.

In [None]:
import uuid

edge_package_path = pipeline.export('../packages')
# edge_package_path = pipeline.export(destination='../packages', version="1")
# edge_package_path = pipeline.export(destination='../packages', package_id=uuid.uuid4())
# edge_package_path = pipeline.export(destination='../packages', package_id=uuid.uuid4(), version="1")
edge_package_path
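
If you store a `package id` yourself, for example in a config file, you may want to check that it is UUID 4 compliant before passing it to `export()`. The sketch below uses only the standard `uuid` module. The helper `is_uuid4` is hypothetical, and the AI SDK may apply its own validation.

```python
import uuid


def is_uuid4(value) -> bool:
    """Return True if value parses as an RFC 4122 version-4 UUID."""
    try:
        parsed = value if isinstance(value, uuid.UUID) else uuid.UUID(str(value))
    except ValueError:
        return False
    return parsed.variant == uuid.RFC_4122 and parsed.version == 4


print(is_uuid4(uuid.uuid4()))   # a freshly generated id passes
print(is_uuid4("not-a-uuid"))   # an arbitrary string does not
```

The check accepts both `uuid.UUID` objects and their string form, so it works for ids generated in the notebook as well as ids read from a file.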

### Test the edge configuration package locally

We suggest you test your edge configuration package on your computer before you deploy it to an Edge device. It is possible to do so using notebook [40-TestPipelineLocally](40-TestPipelineLocally.ipynb).

The package is now ready to be imported on AI Inference Server.