[sdk] Not able to pass Custom Data Types to V2 Pipeline (works with v1) #6390

Closed
alexcpn opened this issue Aug 19, 2021 · 1 comment

alexcpn commented Aug 19, 2021

Environment

  • KFP version:
kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user-pns | grep 1.5.1
2021/08/19 11:19:46 nil value at `valueFrom.configMapKeyRef.name` ignored in mutation attempt
2021/08/19 11:19:46 nil value at `valueFrom.secretKeyRef.name` ignored in mutation attempt
2021/08/19 11:19:46 well-defined vars that were never replaced: kfp-app-name,kfp-app-version
  appVersion: 1.5.1
        image: gcr.io/ml-pipeline/cache-deployer:1.5.1
        image: gcr.io/ml-pipeline/cache-server:1.5.1
      - image: gcr.io/ml-pipeline/metadata-envoy:1.5.1
        image: gcr.io/ml-pipeline/metadata-writer:1.5.1
        image: gcr.io/ml-pipeline/api-server:1.5.1
        image: gcr.io/ml-pipeline/persistenceagent:1.5.1
        image: gcr.io/ml-pipeline/scheduledworkflow:1.5.1
        image: gcr.io/ml-pipeline/frontend:1.5.1
        image: gcr.io/ml-pipeline/viewer-crd-controller:1.5.1
      - image: gcr.io/ml-pipeline/visualization-server:1.5.1
  • KFP SDK version:
build version dev_local
  • All dependencies version:
kfp                      1.6.3
kfp-pipeline-spec        0.1.8
kfp-server-api           1.6.0

Steps to reproduce

For a V1 pipeline, the following works:

from typing import TypeVar
from kfp.components import OutputPath, create_component_from_func

# Custom type name, defined only in the notebook, used to annotate the output.
PandasDataFrame = TypeVar('pandas.core.frame.DataFrame')

def readdata(url: str, out: OutputPath(PandasDataFrame)):
    import pandas as pd
    df = pd.read_csv(url)
    print("No of records", df.index)
    # Write the DataFrame to the output path as a Parquet file.
    df.to_parquet(out)
-------------------------------------    
read_data = create_component_from_func(readdata,base_image='tensorflow/tensorflow:2.6.0', packages_to_install=['pandas==0.24','sklearn','numpy','pyarrow'])

---------------------------------------
import kfp.dsl as dsl
@dsl.pipeline(
  name='Get and Process Training Data',
  description='Get and Process Training data'
)
def getdata_and_process_pipeline(
  a:str="https://raw.githubusercontent.com/alexcpn/neuralnetwork_learn/main/data/heart-attack-prediction/heart.csv"
):
  
  model_path = create_nn_model().output
  pd_as_parquet = read_data(url=a).output
  process_task = process_data(pd_as_parquet)
-----------------
client.create_run_from_pipeline_func(getdata_and_process_pipeline, arguments={})
-----------------------

However, for V2 the same function fails with an execution error:

read_data = create_component_from_func_v2(readdata,base_image='tensorflow/tensorflow:2.6.0', packages_to_install=['pandas==0.24','sklearn','numpy','pyarrow'])

-----
import kfp.dsl as dsl
@dsl.pipeline(
  name='Get and Process Training Data',
  description='Get and Process Training data'
)
def getdata_and_process_pipeline(
  a:str="https://raw.githubusercontent.com/alexcpn/neuralnetwork_learn/main/data/heart-attack-prediction/heart.csv"
):
  
  model_path = create_nn_model().output
  pd_as_parquet = read_data(url=a).output
----------------
client.create_run_from_pipeline_func(getdata_and_process_pipeline, mode=dsl.PipelineExecutionMode.V2_COMPATIBLE, arguments={})
----------
Logs for readdata component

NameError: name 'PandasDataFrame' is not defined
F0818 13:29:15.365216      37 main.go:56] Failed to execute component: exit status 1
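
A plausible reading of this log, not confirmed anywhere in this issue: the V2 path ships the function source to the container with its type annotations intact, so `OutputPath(PandasDataFrame)` is evaluated where `PandasDataFrame` was never defined. The sketch below reproduces only that Python-level behaviour, without kfp. The `OutputPath` class here is a hypothetical stand-in, and `define_and_inspect` is an illustrative helper, not part of the SDK. The third variant shows that a string type name needs no name lookup; whether V2 accepts it is an assumption I have not verified.

```python
import textwrap


class OutputPath:
    """Hypothetical stand-in for kfp.components.OutputPath, so this runs without kfp."""

    def __init__(self, type=None):
        self.type = type


# Source as it would look if shipped with annotations intact.
# PandasDataFrame is NOT defined anywhere in this source.
WITH_CUSTOM_NAME = textwrap.dedent('''
    def readdata(url: str, out: OutputPath(PandasDataFrame)):
        pass
''')

# Same function with the annotations removed.
STRIPPED = textwrap.dedent('''
    def readdata(url, out):
        pass
''')

# Same function with the custom type given as a string literal.
STRING_TYPE = textwrap.dedent('''
    def readdata(url: str, out: OutputPath('PandasDataFrame')):
        pass
''')


def define_and_inspect(source):
    """Define the function in a fresh namespace, then read its annotations."""
    namespace = {"OutputPath": OutputPath}
    try:
        exec(source, namespace)
        # Reading the annotations forces their evaluation on interpreters
        # that defer it until first access.
        _ = namespace["readdata"].__annotations__
    except NameError as err:
        return f"NameError: {err}"
    return "ok"


result_custom = define_and_inspect(WITH_CUSTOM_NAME)
result_stripped = define_and_inspect(STRIPPED)
result_string = define_and_inspect(STRING_TYPE)
print(result_custom)
print(result_stripped)
print(result_string)
```

Only the first variant fails, and it fails with the same kind of `NameError` as the component log above. That matches the idea that V1 works because the annotation never has to be resolved inside the container.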

Expected result

Custom data types should work in v2 as they do in v1.

Materials and Reference

Full v1 code -

https://colab.research.google.com/drive/1f_p4EVKReT57J4Maz4vRfhccJ_qVv03W?usp=sharing


