#### Workflow
1. Initialize Workspace & Create Workspace Handle
2. Initialize
    - Compute Cluster
    - Environment
3. Create .py Scripts for Data Processing & Model Training
4. Create Components
5. Build Pipeline using Components
6. Get Data Path
7. Initiate Pipeline

##### Step 1: Initialize Workspace and Create Workspace handle

In [None]:
from azureml.core import Workspace
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Initialize the workspace from a local config.json (SDK v1)
ws = Workspace.from_config()

# Get a handle to the workspace (SDK v2), reusing the details read above
credential = DefaultAzureCredential()  # authenticate
ml_client = MLClient(
    credential=credential,
    subscription_id=ws.subscription_id,
    resource_group_name=ws.resource_group,
    workspace_name=ws.name,
)
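
`Workspace.from_config()` reads the workspace details from a `config.json` file, which it looks for in the current directory, a `.azureml` subfolder, or a parent directory. As a rough sketch of what that file contains, the snippet below writes one with the three expected keys. The helper name `write_workspace_config` and the angle-bracket values are placeholders for illustration, not real identifiers.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(directory: str, subscription_id: str,
                           resource_group: str, workspace_name: str) -> Path:
    """Write a config.json holding the three keys Workspace.from_config() reads."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    path = Path(directory) / "config.json"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Placeholder values only; substitute your own workspace details.
demo_dir = tempfile.mkdtemp()
config_path = write_workspace_config(
    demo_dir, "<subscription-id>", "<resource-group>", "<workspace-name>"
)

# Read the file back to confirm its structure.
loaded_config = json.loads(config_path.read_text(encoding="utf-8"))
print(sorted(loaded_config))
```

In practice you would usually download this file from the workspace page in the Azure portal rather than write it by hand.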


##### Step 2: Initialize Compute Cluster & Environment

In [None]:
from azure.ai.ml.entities import AmlCompute

# Name assigned to the compute cluster
compute = "ML-Pipeline-Cluster"

try:
    cpu_cluster = ml_client.compute.get(compute)
    print(f"You already have a cluster named {compute}, we'll reuse it as is.")

except Exception:
    print("Creating a new cpu compute target...")
    cpu_cluster = AmlCompute(
        name=compute,
        type="amlcompute",
        size="STANDARD_DS3_V2",
        min_instances=0,
        max_instances=4,
        idle_time_before_scale_down=300,
        tier="Dedicated",
    )
    
    print(f"AMLCompute with name {cpu_cluster.name} will be created, with compute size {cpu_cluster.size}")
    # Pass the object to MLClient's begin_create_or_update method. It returns a poller,
    # so call .result() to wait for provisioning and get the created compute back.
    cpu_cluster = ml_client.compute.begin_create_or_update(cpu_cluster).result()

##### Environment

In [None]:
import os
from azure.ai.ml.entities import Environment

custom_env_name  = "ENV-SDKv2"
# dependencies_dir = '../dependencies'
# env = Environment( name=custom_env_name,
#                    description="Environment for Python SDKv2 execution",
#                    conda_file=os.path.join(dependencies_dir, "conda.yaml"),
#                    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
#                  )
# env = ml_client.environments.create_or_update(env)

# Get the registered environment.
# Pass the 'label' parameter to fetch the latest version, for example label='latest'.
# Pass the 'version' parameter to fetch a specific version, for example version=2.
env = ml_client.environments.get(name=custom_env_name, label='latest')

print(f"Environment with name {env.name} is registered to workspace, the environment version is {env.version}")
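
Later cells refer to this environment by a string rather than by the object. As a small sketch, an environment reference can pin a version with `name:version` or follow the newest one with `name@latest`. The helper `environment_reference` is a hypothetical name used only for illustration.

```python
from typing import Optional


def environment_reference(name: str, version: Optional[str] = None) -> str:
    """Build an environment reference string for a job or component.

    With a version, the reference is pinned ("name:version").
    Without one, it follows the latest registered version ("name@latest").
    """
    if version is None:
        return f"{name}@latest"
    return f"{name}:{version}"


print(environment_reference("ENV-SDKv2"))       # ENV-SDKv2@latest
print(environment_reference("ENV-SDKv2", "2"))  # ENV-SDKv2:2
```

Pinning a version keeps pipeline runs reproducible, while `@latest` picks up environment updates automatically.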

##### Step 3: Create Components to Build Pipeline

Now that you have all assets required to run your pipeline, it's time to build the pipeline itself.

Azure Machine Learning pipelines are reusable ML workflows that usually consist of several components. The typical life of a component is:

- Write the YAML specification of the component, or create it programmatically using `ComponentMethod`.
- Optionally, register the component with a name and version in your workspace, to make it reusable and shareable.
- Load that component from the pipeline code.
- Implement the pipeline using the component's inputs, outputs and parameters.
- Submit the pipeline.

There are two ways to create a component: a programmatic definition and a YAML definition. This section walks you through creating components using the programmatic definition.
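
In a programmatic definition, the `command` string refers to inputs and outputs through `${{inputs.<name>}}` and `${{outputs.<name>}}` placeholders, which Azure Machine Learning resolves to concrete paths or values when the job runs. The sketch below only mimics that substitution with a regular expression to make the mechanism visible. It is not how the SDK implements it, and `render_command` and the `/mnt/...` paths are hypothetical.

```python
import re

# Matches ${{inputs.<name>}} and ${{outputs.<name>}} placeholders.
PLACEHOLDER = re.compile(r"\$\{\{\s*(inputs|outputs)\.(\w+)\s*\}\}")


def render_command(template: str, inputs: dict, outputs: dict) -> str:
    """Replace each placeholder with its resolved value (illustration only)."""
    values = {"inputs": inputs, "outputs": outputs}

    def replace(match: re.Match) -> str:
        kind, name = match.group(1), match.group(2)
        # A KeyError here means the command references an undeclared input or output.
        return str(values[kind][name])

    rendered = PLACEHOLDER.sub(replace, template)
    # Collapse the indentation left over from the multi-line template.
    return " ".join(rendered.split())


template = """python pima_inference_dataProcessing_SDKv2.py \
    --data ${{inputs.data}} \
    --processed_data ${{outputs.processed_data}} \
    """

rendered = render_command(
    template,
    inputs={"data": "/mnt/inputs/data"},                        # hypothetical path
    outputs={"processed_data": "/mnt/outputs/processed_data"},  # hypothetical path
)
print(rendered)
```

Every placeholder name in the command must match a key declared in the component's `inputs` or `outputs`.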

> [!NOTE]
> For simplicity, this tutorial uses the same compute for all components. However, you can set a different compute for each component, for example by adding a line like `train_step.compute = "cpu-cluster"`. To view an example of building a pipeline with different computes for each component, see the [Basic pipeline job section in the cifar-10 pipeline tutorial](https://github.com/Azure/azureml-examples/blob/main/sdk/python/jobs/pipelines/2b_train_cifar_10_with_pytorch/train_cifar_10_with_pytorch.ipynb).

In [None]:
from azure.ai.ml import command
from azure.ai.ml import Input, Output

scripts_dir = "../src"
data_prep_component = command(
    # Spaces replaced with underscores: component names are expected to use only
    # lowercase letters, digits, and underscores.
    name="data_prep_pima_diabetes_detection",
    display_name="Data preparation for inference",
    description="Reads the input data and preprocesses it",
    inputs={"data": Input(type="uri_folder")},
    outputs={"processed_data": Output(type="uri_folder", mode="rw_mount")},
    code=scripts_dir,  # The source folder of the component
    command="""python pima_inference_dataProcessing_SDKv2.py \
            --data ${{inputs.data}} \
            --processed_data ${{outputs.processed_data}} \
            """,
    environment=f"{env.name}:{env.version}",
)

# Despite its variable name, this component runs inference with a registered model.
train_component = command(
    # Spaces replaced with underscores: component names are expected to use only
    # lowercase letters, digits, and underscores.
    name="pima_diabetes_model_inference",
    display_name="Model inference",
    inputs={
        "processed_data": Input(type="uri_folder"),
        "registered_model_name": Input(type="string"),
    },
    outputs={"model": Output(type="uri_folder", mode="rw_mount")},
    code=scripts_dir,  # The source folder of the component
    command="""python pima_model_inference_SDKv2.py \
            --input_data ${{inputs.processed_data}} \
            --registered_model_name ${{inputs.registered_model_name}} \
            --model ${{outputs.model}} \
            """,
    environment=f"{env.name}:{env.version}",
)
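
Each component command passes its inputs and outputs to the script as command-line flags, so the script needs a matching argument parser. Below is a minimal sketch of how `pima_inference_dataProcessing_SDKv2.py` might read `--data` and `--processed_data`, assuming it uses `argparse`. The script itself is not shown here, and the `/tmp/...` paths are placeholders.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the flags that the data preparation component passes to the script."""
    parser = argparse.ArgumentParser(description="Data preparation for inference")
    parser.add_argument("--data", type=str, required=True,
                        help="Folder containing the raw input data")
    parser.add_argument("--processed_data", type=str, required=True,
                        help="Folder where the processed data is written")
    return parser.parse_args(argv)


# Simulate the arguments Azure Machine Learning would supply at run time.
args = parse_args(["--data", "/tmp/in", "--processed_data", "/tmp/out"])
print(args.data, args.processed_data)
```

The flag names must match those used in the component's `command` string, otherwise the script exits with an argument error before doing any work.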