# Create an edge configuration package 

The goal of this notebook is to create an edge configuration package that can be executed by AI Inference Server running on an Industrial Edge device.

The package is assembled from the following elements:
- `models/bsi-model.joblib`: the model itself, created in the [10-CreateModel](10-CreateModel.ipynb) notebook
- `entrypoint.py`: the script that the runtime calls to execute the model on the Edge side
- `src` folder: all other Python scripts needed to execute the model

## Create a component

AI Inference Server pipelines consist of components. A `PythonComponent` packages the Python wrapper script and the model for execution on an Industrial Edge device.

The entry point of a component acts as the interface between the Python code and the AI Python Runtime. The AI Python Runtime expects the entry point file to be located in the component root directory.

Your code might require additional resources, such as config files, the model or models, further Python sources or reference data files. On the runtime, the resources will be available under the same paths relative to the component root directory.

In [None]:
from simaticai.deployment import PythonComponent

component_desc = """
Processes batches of incoming energy consumption measurements (ph1, ph2, ph3) and predicts which cluster each batch belongs to.
"""
component = PythonComponent(name='inference', version='1.0.0', desc=component_desc, python_version='3.11')
component.add_resources("..", ["entrypoint.py", "models/bsi-model.joblib", "src/si"])
component.set_entrypoint("entrypoint.py")

component.add_input("json_data", "String", desc="A JSON string holding a batch of 300 data points.")
component.add_output("prediction", "String", desc="The predicted cluster of the batch.")
component
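
The component above declares one `String` input named `json_data` and one `String` output named `prediction`. As a rough illustration of how an entry point could map between the two, here is a minimal sketch. The function name `process_input`, the dictionary-in/dictionary-out convention, and the JSON layout of the batch are assumptions made for this example, and `_StubModel` is a hypothetical stand-in for the real model. The actual `entrypoint.py` of this project may look different.

```python
import json


class _StubModel:
    """Hypothetical stand-in for the real scikit-learn pipeline; always predicts cluster 0."""

    def predict(self, batches):
        return [0 for _ in batches]


# In a real entry point the model would be loaded once at import time,
# for example with joblib from "models/bsi-model.joblib" relative to the
# component root directory. A stub keeps this sketch self-contained.
model = _StubModel()


def process_input(data: dict) -> dict:
    """Assumed runtime hook: receives inputs by name and returns outputs by name."""
    batch = json.loads(data["json_data"])    # the "json_data" String input
    cluster = model.predict([batch])[0]      # one prediction per batch
    return {"prediction": json.dumps(int(cluster))}  # the "prediction" String output
```

Keeping the model at module level means it is loaded once rather than on every call, which is the usual choice when each call handles a single batch.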

## Package the pipeline

This step creates the proper content in the defined target folder and creates the edge configuration package as a zip file.  
The `export()` method first validates the pipeline and raises an error if it finds any problems. You can also validate the pipeline manually with the `pipeline.validate()` method.

Edge packages are identified by their `package id` and `version` attributes, and are grouped by `package id` in AI Inference Server and in other Edge applications.
When saving a pipeline with the `export()` method, you can specify a `package id` in UUID version 4 format; otherwise, an automatically generated one is assigned.
If no `package id` is defined in the `export()` method and AI SDK finds an already assigned `package id` in a previously generated, similarly named package, it reuses the `package id` of the latest such package.

AI SDK automatically assigns and increments the version number of a pipeline every time a package is saved. The exceptions are when a new `package id` is assigned in the `export()` method, or when an explicit version number is defined without a `package id`, either in the `export()` method or in the pipeline constructor.

Restrictions:
- You cannot overwrite a previously saved package with the same `package id` if the `package id` is explicitly assigned in the `export()` method
- Packages generated with older versions of AI SDK (without a `package id`) will be overwritten
- If a new `package id` is assigned to an existing version of the package, it will overwrite the old one
- If no previous version of a package (with a generated or explicitly assigned `package id`) is found, AI SDK assigns version `1` to the created package
- The version defined in the `export()` method takes precedence over the version assigned in the constructor
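
If you want to assign an explicit `package id`, it has to be a version 4 UUID. The following sketch uses only the Python standard library to generate such an identifier and to check whether an existing string qualifies. The helper name `is_uuid4` is hypothetical, and how the value is passed to `export()` is deliberately not shown here; check the AI SDK documentation for the exact parameter.

```python
import uuid


def is_uuid4(value: str) -> bool:
    """Return True if `value` parses as a UUID whose version field is 4."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        # Not parseable as a UUID at all.
        return False
    return parsed.version == 4


# Generate a fresh random (version 4) identifier for a new package.
new_package_id = str(uuid.uuid4())
```

Storing the generated identifier alongside your project lets later exports reuse it, so that new versions are grouped with the earlier ones.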

In [None]:
from simaticai.deployment import Pipeline

pipeline_desc = """\
This pipeline runs a Batch State Identifier model on an Industrial Edge device.
The aim of the model is to distinguish 3 operation states of a machine based on its energy consumption.

This model was trained by K-Means clustering on energy consumption data measured on 3 phases of electrical current (ph1, ph2, ph3).

Data goes through a scikit-learn pipeline consisting of 2 stages, a preprocessing and a clustering.
(Please note that even though the scikit-learn pipeline has 2 stages, it will be executed as a single component on AI Inference Server.)

Each batch of the incoming data contains 300 data rows.
The preprocessing stage of the scikit-learn pipeline calculates a series of basic statistical features for each window (e.g. Min, Max, Mean).
These features are also exported as metrics.

The model itself is fed with these statistical features, producing a predicted class for each batch.
"""
# To ensure compatibility with older versions of AI SDK (<v1.5.0), you must define the `version` parameter in the `from_components()` method
pipeline = Pipeline.from_components([component], name='Batch State Identifier', desc=pipeline_desc)
pipeline

#### Define necessary python modules

To execute the model with `entrypoint.py`, the Python runtime environment must contain all the required packages.
List every dependency of your pipeline and model here so that the AI Model Deployer can download them into the edge configuration package.  
Because the installation does not run any post-installation scripts, only modules that are available in binary wheel format can be used.

In [None]:
pipeline.add_dependencies([
    ('tsfresh','0.17.0'), # The runtime needs it to unpickle the model.
    ('scikit-learn','1.3.2'),
    ('scipy','1.10.1'),
    ('statsmodels','0.14.0'),
])
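
The pinned versions generally need to match the environment in which the model was trained and pickled. As a possible aid, the sketch below reads the installed versions from the current environment with `importlib.metadata` and returns them as `(name, version)` tuples, the same shape used in the cell above. The helper `pin_installed` is hypothetical and not part of AI SDK.

```python
from importlib import metadata


def pin_installed(names, lookup=metadata.version):
    """Split `names` into (name, version) pins and a list of packages not installed."""
    pinned, missing = [], []
    for name in names:
        try:
            pinned.append((name, lookup(name)))
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            missing.append(name)
    return pinned, missing


# Possible usage in this notebook (not executed here):
# pins, missing = pin_installed(["tsfresh", "scikit-learn", "scipy", "statsmodels"])
# pipeline.add_dependencies(pins)
```

Reporting missing packages separately, instead of raising, makes it easier to see all gaps in one pass before building the package.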

#### Save the runnable package

Calling `export()` writes the package content to the given destination folder, creates the edge configuration package as a zip file, and returns the path of that file.

In [None]:
edge_package_path = pipeline.export(destination='../packages')

print("Edge runtime package:", edge_package_path)
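
Since the edge configuration package is a zip file, you can check its contents before uploading it. The sketch below lists the archive entries using only the standard library. The helper `list_package_files` is hypothetical, and no assumption is made about the internal layout of the package.

```python
import zipfile
from pathlib import Path


def list_package_files(package_path):
    """Return the sorted names of all entries in a zip archive."""
    package_path = Path(package_path)
    if not zipfile.is_zipfile(package_path):
        raise ValueError(f"{package_path} is not a zip archive")
    with zipfile.ZipFile(package_path) as archive:
        return sorted(archive.namelist())


# Possible usage in this notebook (not executed here):
# for name in list_package_files(edge_package_path):
#     print(name)
```

A quick look at the listing can confirm that `entrypoint.py`, the model file, and the `src` resources were picked up as expected.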

## Ready to go

Now you are ready to upload the edge runtime package to the AI Inference Server.