# Part 3: Execute an ML use case with inputs and outputs

In Part 2, we built and ran our first ML pipeline, which retrieves data from the data store, trains a model, and stores the model back on the data store.

What if we want to use our pipeline to make predictions with this trained model on new data? Currently, we cannot pass new data to our model. ⇒ We need to add an Input to our pipeline.

Moreover, what if we want to provide these predictions to a final user? ⇒ We need to add an Output to our pipeline.

## Import libraries

In [None]:
from craft_ai_sdk import CraftAiSdk, Input, Output
import dotenv
import os

dotenv.load_dotenv()

## Load environment variables

In [None]:
CRAFT_AI_SDK_TOKEN = os.environ.get("CRAFT_AI_SDK_TOKEN")
CRAFT_AI_ENVIRONMENT_URL = os.environ.get("CRAFT_AI_ENVIRONMENT_URL")

## SDK instantiation

In [None]:
sdk = CraftAiSdk(sdk_token=CRAFT_AI_SDK_TOKEN, environment_url=CRAFT_AI_ENVIRONMENT_URL)

## Pipeline creation with the SDK

Now, let’s create our pipeline on the platform. Here, since we have inputs and outputs, our pipeline is the combination of three elements: inputs, outputs and the Python function. We will first declare the input and the output. Then, we will use the function sdk.create_pipeline() as in Part 2 to create the whole pipeline.

### Declare Input and Output

To manage inputs and outputs of a pipeline, the platform requires you to declare them using the ``Input`` and ``Output`` classes from the SDK.

For our Iris application, the input and output declarations are shown in the cells that follow.

Both objects have two main attributes:

- The name
- The data_type describing the type of data it can accept. It can be one of: ``string``, ``number``, ``boolean``, ``json``, ``array``.
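
To get a feel for which ``data_type`` fits a given value, here is a minimal sketch. The helper `guess_data_type` is hypothetical, and the mapping from Python types to the five type names is an assumption made for illustration; it is not taken from the SDK.

```python
def guess_data_type(value):
    """Hypothetical helper: suggest a data_type name for a Python value.

    The mapping below is an assumption, not the SDK's own rule.
    """
    # bool must be tested before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, (list, tuple)):
        return "array"
    if isinstance(value, dict):
        return "json"
    raise TypeError(f"No obvious data_type for {type(value).__name__}")


print(guess_data_type("get_started/models/iris_knn_model.joblib"))  # string
print(guess_data_type({"0": {"sepal length (cm)": 5.1}}))  # json
```

Under this assumption, a model path would be declared as ``string`` and a dictionary of rows as ``json``, which matches the declarations used in this part.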





#### Input

For an input, the name must match the name of an argument of your pipeline's function. In our case, the names are "input_data" and "input_model_path", as in the first line of the function.

In [None]:
prediction_input = Input(
    name="input_data", 
    data_type="json"
)

model_input = Input(
    name="input_model_path", 
    data_type="string"
)

#### Output

For an output, the name must be a key in the dictionary returned by your pipeline's function. In our case, name="predictions", as in the last line of the function:

In [None]:
prediction_output = Output(
    name="predictions",
    data_type="json"
)
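
Because the declared names must line up with the function's arguments and with the keys of the returned dictionary, it can help to check them locally before creating the pipeline. The sketch below is only an illustration: `predict_iris_stub` is a stand-in with the same argument names as described above (not the real function), and `check_io_names` is a hypothetical helper that is not part of the SDK.

```python
import inspect


def predict_iris_stub(input_data: dict, input_model_path: str):
    # Stand-in body: the real function would load the model and predict.
    return {"predictions": []}


def check_io_names(function, input_names, returned, output_names):
    """Hypothetical helper: return the declared names that do not match."""
    arguments = set(inspect.signature(function).parameters)
    # Input names that are not arguments of the function.
    bad_inputs = sorted(set(input_names) - arguments)
    # Output names that are not keys of the returned dictionary.
    bad_outputs = sorted(set(output_names) - set(returned))
    return bad_inputs, bad_outputs


result = predict_iris_stub(input_data={}, input_model_path="unused")
print(check_io_names(
    predict_iris_stub,
    ["input_data", "input_model_path"],
    result,
    ["predictions"],
))
# ([], [])
```

Two empty lists mean that every declared name has a counterpart; a misspelled name would show up in the first list (inputs) or the second (outputs).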

### Create a pipeline

Now we have everything we need to create, as before, the pipeline corresponding to our new ``predictIris()`` function.
This is exactly like in Part 2, except for two parameters:

- inputs containing the list of Input objects we declared above (here, prediction_input and model_input).
- outputs containing the list of Output objects we declared above (here, prediction_output).


In [None]:
sdk.create_pipeline(
    pipeline_name="part-3-irisio",
    function_path="src/part-3-iris-predict.py",
    function_name="predictIris",
    description="This function retrieves the trained model and classifies the input data by returning the prediction.",
    container_config={
        "local_folder": "../../get_started",
        "requirements_path": "requirements.txt",
    },
    inputs=[prediction_input, model_input],
    outputs=[prediction_output],
)

### List the pipelines

In [None]:
pipeline_list = sdk.list_pipelines()
pipeline_list

### Get pipeline information

In [None]:
pipeline_info = sdk.get_pipeline("part-3-irisio")
pipeline_info

## Execute the pipeline (RUN)

### Prepare input data

Now, to execute the pipeline, we need input data formatted as described above.

Let’s prepare it by simply choosing some rows of the Iris dataset that we did not use when training our model:

In [None]:
import numpy as np
from sklearn import datasets

np.random.seed(0)
indices = np.random.permutation(150)
iris_X, iris_y = datasets.load_iris(return_X_y=True, as_frame=True)
iris_X_test = iris_X.loc[indices[90:120],:]

new_data = iris_X_test.to_dict(orient="index")
new_data
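
One detail worth keeping in mind about this structure: ``to_dict(orient="index")`` produces a dictionary keyed by integer row labels, and standard JSON only allows string keys. The sketch below uses a small hand-written stand-in for `new_data` (the rows and values are illustrative) and assumes that a ``json`` input goes through a standard JSON round trip on its way to the pipeline, which is an assumption rather than something stated in this tutorial.

```python
import json

# Hand-written stand-in for `new_data`: {row label: {column name: value}}.
sample = {
    0: {"sepal length (cm)": 5.1, "sepal width (cm)": 3.5},
    1: {"sepal length (cm)": 4.9, "sepal width (cm)": 3.0},
}

# Serializing and parsing again turns the integer row labels into strings.
round_tripped = json.loads(json.dumps(sample))
print(list(sample))         # [0, 1]
print(list(round_tripped))  # ['0', '1']

# If the integer labels matter downstream, they can be restored explicitly.
restored = {int(key): row for key, row in round_tripped.items()}
print(restored == sample)   # True
```

In practice this means the code receiving the data should not rely on the row labels still being integers.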

Finally, we need to encapsulate this dictionary in another dictionary whose keys are "input_data" and "input_model_path" (the names of the inputs of our pipeline, i.e. the names of the arguments of our pipeline's function):

In [None]:
inputs = {
    "input_data": new_data,
    "input_model_path": "get_started/models/iris_knn_model.joblib",
}

### Run the pipeline with inputs

Finally, we can test our pipeline execution with the data we’ve just prepared by calling the sdk.run_pipeline() function, almost as in Part 2, except that this time we pass our inputs dictionary as the inputs argument:

In [None]:
output_predictions = sdk.run_pipeline(pipeline_name="part-3-irisio", inputs=inputs)

We can retrieve the value returned by our function by getting the item stored under the 'predictions' key of the 'outputs' entry in the returned dictionary.

In [None]:
output_predictions['outputs']['predictions']
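
If the predictions come back as integer class labels, they can be mapped to species names for readability. This is a minimal sketch under assumptions: `mock_result` is a hand-written stand-in for `output_predictions`, the predictions are assumed to be a list of labels 0, 1 and 2 (the actual format depends on what `predictIris` returns), and `to_species` is a hypothetical helper.

```python
# Class names in the order used by scikit-learn's Iris dataset (labels 0, 1, 2).
IRIS_SPECIES = ["setosa", "versicolor", "virginica"]


def to_species(predictions):
    """Hypothetical helper: map integer class labels to species names."""
    return [IRIS_SPECIES[int(label)] for label in predictions]


# Hand-written stand-in for `output_predictions`; values are illustrative.
mock_result = {"outputs": {"predictions": [0, 2, 1]}}
print(to_species(mock_result["outputs"]["predictions"]))
# ['setosa', 'virginica', 'versicolor']
```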

## Execution verification

### Display logs

Moreover, we can check the logs of this execution directly on the platform interface or as follows, as in the previous parts:

In [None]:
pipeline_executions = sdk.list_pipeline_executions(pipeline_name="part-3-irisio")

In [None]:
logs = sdk.get_pipeline_execution_logs(execution_id=pipeline_executions[-1]['execution_id'])

print('\n'.join(log["message"] for log in logs))