# Intro

This notebook shows how Vertex AI Pipelines and Kubeflow Pipelines (KFP) work. We'll create two Python functions: the first performs a mathematical operation and stores the result in a dictionary and an artifact, and the second reads both back.

This notebook accompanies a Medium post located [here](https://medium.com/p/7ae08d47cb9a/edit).

![Components](./images/getting_started_1.png)

## Install and Update KFP

In [1]:
!pip install kfp --upgrade --user

## Set Variables

In [2]:
PIPELINE_ROOT_PATH='gs://vtx-root-path'
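
The pipeline root above is a Cloud Storage URI, so a typo in the scheme or bucket name only surfaces once the job is submitted. As a minimal sketch, the value could be checked up front with the standard library; `split_gcs_uri` is a hypothetical helper introduced here for illustration, not part of KFP or the Vertex AI SDK.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_gcs_uri(uri: str) -> Tuple[str, str]:
    """Split a gs:// URI into (bucket, prefix); hypothetical helper, not a KFP API."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"Expected a gs://bucket[/prefix] URI, got {uri!r}")
    # The bucket is the network location; the prefix is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


# Same value as PIPELINE_ROOT_PATH in the cell above.
bucket, prefix = split_gcs_uri("gs://vtx-root-path")
```

This only checks the shape of the string; it does not verify that the bucket exists or that the job's service account can write to it.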

## Import Libraries

In [3]:
import warnings
from typing import Dict
from kfp.v2 import compiler
from kfp.v2.dsl import (pipeline, component, Input, Output, OutputPath, Artifact,)
from google.cloud import aiplatform

warnings.filterwarnings('ignore')

## Create Pipeline Components (Functions)

In [4]:
@component(
    base_image='python',             # Use any container
    packages_to_install=[
        'pandas',                    # Packages required
        'gcsfs'
    ])
def function_1(a: int, b: int, output_dict_param_path: OutputPath(Dict[str, int]), dataset: Output[Artifact]):
    
    import pandas as pd              # Import libraries
    import json
    
    sum_dict = {'result': [a+b]}       # Dictionary result
    
    with open(output_dict_param_path, 'w') as file:   # Write Dict
        file.write(json.dumps(sum_dict))
     
    dataframe = pd.DataFrame(sum_dict) # Create Dataframe
    dataframe.to_csv(dataset.path, index=False)     # Store Dataframe
    
@component
def function_2(input_dict: Dict[str, int], dataset: Input[Artifact]) -> int:
    
    import csv
    
    print(input_dict)                        # Print the dictionary written by function_1
    with open(dataset.path, 'r') as file:    # Read the CSV artifact written by function_1
        csvreader = csv.reader(file)
        for row in csvreader:
            print(row)
            
    return int(input_dict['result'][0]) # Return Integer
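
To see what the two components exchange without submitting a pipeline, their logic can be approximated in plain Python. The sketch below is not how KFP executes components: `function_1_local` and `function_2_local` are hypothetical stand-ins, temporary files replace the paths that the pipeline runtime would provide, and the `csv` module replaces pandas so that the example needs only the standard library.

```python
import csv
import json
import tempfile
from pathlib import Path


def function_1_local(a: int, b: int, dict_path: str, dataset_path: str) -> None:
    """Stand-in for function_1: write the sum as a JSON dictionary and as a CSV file."""
    sum_dict = {"result": [a + b]}
    Path(dict_path).write_text(json.dumps(sum_dict))
    with open(dataset_path, "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerow(["result"])          # Header row (column name)
        writer.writerow(sum_dict["result"])  # Single data row


def function_2_local(dict_path: str, dataset_path: str) -> int:
    """Stand-in for function_2: read both outputs back and return the integer."""
    input_dict = json.loads(Path(dict_path).read_text())
    print(input_dict)
    with open(dataset_path, newline="") as file:
        for row in csv.reader(file):
            print(row)
    return int(input_dict["result"][0])


# Assumption: temporary files play the role of the output parameter and the artifact.
with tempfile.TemporaryDirectory() as tmp:
    dict_path = str(Path(tmp) / "output_dict.json")
    dataset_path = str(Path(tmp) / "dataset.csv")
    function_1_local(324, 573, dict_path, dataset_path)
    local_result = function_2_local(dict_path, dataset_path)
```

In the real pipeline the first output travels as a parameter and the second as an artifact stored under the pipeline root; here both are simply files in a temporary directory.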

## Create the Pipeline

In [5]:
@pipeline(name='my-first-pipe')
def pipeline():
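    # Pass arguments by keyword so each value is tied to a named component input
    function_1_task = function_1(a=324, b=573)
    function_2_task = function_2(
        input_dict=function_1_task.outputs['output_dict_param_path'],
        dataset=function_1_task.outputs['dataset'])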

## Compile the Pipeline

In [6]:
compiler.Compiler().compile(pipeline_func=pipeline,
        package_path='first_pipe.json')

## Run the Pipeline Job

In [7]:
job = aiplatform.PipelineJob(
    display_name="first_pipe",
    template_path="first_pipe.json",
    pipeline_root=PIPELINE_ROOT_PATH,
    parameter_values={}
)

job.submit()

Creating PipelineJob
PipelineJob created. Resource name: projects/985084735463/locations/us-central1/pipelineJobs/my-first-pipe-20220707134414
To use this PipelineJob in another session:
pipeline_job = aiplatform.PipelineJob.get('projects/985084735463/locations/us-central1/pipelineJobs/my-first-pipe-20220707134414')
View Pipeline Job:
https://console.cloud.google.com/vertex-ai/locations/us-central1/pipelines/runs/my-first-pipe-20220707134414?project=985084735463
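
The resource name printed above has the form `projects/<project>/locations/<location>/pipelineJobs/<job id>`. As a small sketch, its parts can be pulled out with a regular expression, for example to log the job ID or rebuild the console link; `parse_pipeline_job_name` is a hypothetical helper written for this illustration and is not part of the Vertex AI SDK.

```python
import re

# Resource name copied from the output above.
RESOURCE_NAME = (
    "projects/985084735463/locations/us-central1/"
    "pipelineJobs/my-first-pipe-20220707134414"
)

PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/pipelineJobs/(?P<job_id>[^/]+)$"
)


def parse_pipeline_job_name(resource_name: str) -> dict:
    """Split a PipelineJob resource name into project, location and job ID."""
    match = PATTERN.match(resource_name)
    if match is None:
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    return match.groupdict()


parts = parse_pipeline_job_name(RESOURCE_NAME)
```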


## Result

![pipe_result](./images/pipe_result.png)

## Skeleton for an Easier Way

In [None]:
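# Imports (repeated so the skeleton runs on its own)
from kfp.v2 import compiler
from kfp.v2.dsl import pipeline, component
from google.cloud import aiplatform

# Create the component
@component
def func_1(x: int):
    print(x)

# Create the pipe (named `pipe` so it does not shadow the `pipeline` decorator)
@pipeline(name='pipe')
def pipe():
    func_1_task = func_1(x=8)

# Compile it
compiler.Compiler().compile(pipeline_func=pipe,
        package_path='first_pipe.json')

# Run it (PIPELINE_ROOT_PATH is defined in "Set Variables")
job = aiplatform.PipelineJob(
    display_name="first_pipe",
    template_path="first_pipe.json",
    pipeline_root=PIPELINE_ROOT_PATH,
    parameter_values={}
)

job.submit()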