[frontend] artifact preview & visualization breaks with argo v3.1+ #5930
/assign @zijianjoy
Pasting an example full argo workflow:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
    pipelines.kubeflow.org/pipeline_compilation_time: 2021-06-29T11:08:44.423700
    pipelines.kubeflow.org/pipeline_spec: '{"inputs": [{"default": "", "name": "pipeline-output-directory"},
      {"default": "two_step_pipeline", "name": "pipeline-name"}], "name": "two_step_pipeline"}'
    pipelines.kubeflow.org/run_name: two_step_pipeline 2021-06-29 11-08-44
    pipelines.kubeflow.org/v2_pipeline: "true"
  creationTimestamp: "2021-06-29T11:08:44Z"
  generateName: two-step-pipeline-
  generation: 7
  labels:
    pipeline/persistedFinalState: "true"
    pipeline/runid: ceb88c92-21a9-4797-a68b-bb27c6183d59
    pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
    pipelines.kubeflow.org/v2_pipeline: "true"
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/phase: Succeeded
  managedFields:
  - manager: workflow-controller
    operation: Update
    time: "2021-06-29T11:10:01Z"
  name: two-step-pipeline-94mfz
  namespace: kubeflow
  resourceVersion: "30684140"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/kubeflow/workflows/two-step-pipeline-94mfz
  uid: cef9e9db-28ae-4e16-8e4e-fcf4d960e967
spec:
  arguments:
    parameters:
    - name: pipeline-output-directory
      value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
    - name: pipeline-name
      value: two_step_pipeline
  entrypoint: two-step-pipeline
  serviceAccountName: pipeline-runner
  templates:
  - container:
      args:
      - sh
      - -ec
      - |
        program_path=$(mktemp)
        printf "%s" "$0" > "$program_path"
        python3 -u "$program_path" "$@"
      - |
        def _make_parent_dirs_and_return_path(file_path: str):
            import os
            os.makedirs(os.path.dirname(file_path), exist_ok=True)
            return file_path

        def preprocess(
            uri, some_int, output_parameter_one,
            output_dataset_one
        ):
            '''Dummy Preprocess Step.'''
            with open(output_dataset_one, 'w') as f:
                f.write('Output dataset')
            with open(output_parameter_one, 'w') as f:
                f.write("{}".format(1234))

        import argparse
        _parser = argparse.ArgumentParser(prog='Preprocess', description='Dummy Preprocess Step.')
        _parser.add_argument("--uri", dest="uri", type=str, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--some-int", dest="some_int", type=int, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--output-parameter-one", dest="output_parameter_one", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--output-dataset-one", dest="output_dataset_one", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
        _parsed_args = vars(_parser.parse_args())
        _outputs = preprocess(**_parsed_args)
      - --uri
      - '{{$.inputs.parameters[''uri'']}}'
      - --some-int
      - '{{$.inputs.parameters[''some_int'']}}'
      - --output-parameter-one
      - '{{$.outputs.parameters[''output_parameter_one''].output_file}}'
      - --output-dataset-one
      - '{{$.outputs.artifacts[''output_dataset_one''].path}}'
      command:
      - /kfp-launcher/launch
      - --mlmd_server_address
      - $(METADATA_GRPC_SERVICE_HOST)
      - --mlmd_server_port
      - $(METADATA_GRPC_SERVICE_PORT)
      - --runtime_info_json
      - $(KFP_V2_RUNTIME_INFO)
      - --container_image
      - $(KFP_V2_IMAGE)
      - --task_name
      - preprocess
      - --pipeline_name
      - '{{inputs.parameters.pipeline-name}}'
      - --pipeline_run_id
      - $(WORKFLOW_ID)
      - --pipeline_task_id
      - $(KFP_POD_NAME)
      - --pipeline_root
      - '{{inputs.parameters.pipeline-output-directory}}'
      env:
      - name: KFP_POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: KFP_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
      - name: WORKFLOW_ID
        valueFrom:
          fieldRef:
            fieldPath: metadata.labels['workflows.argoproj.io/workflow']
      - name: KFP_V2_IMAGE
        value: python:3.9
      - name: KFP_V2_RUNTIME_INFO
        value: '{"inputParameters": {"some_int": {"type": "INT", "value": "BEGIN-KFP-PARAM[12]END-KFP-PARAM"},
          "uri": {"type": "STRING", "value": "BEGIN-KFP-PARAM[uri-to-import]END-KFP-PARAM"}},
          "inputArtifacts": {}, "outputParameters": {"output_parameter_one": {"type":
          "INT", "path": "/tmp/outputs/output_parameter_one/data"}}, "outputArtifacts":
          {"output_dataset_one": {"schemaTitle": "system.Dataset", "instanceSchema":
          "", "metadataPath": "/tmp/outputs/output_dataset_one/data"}}}'
      envFrom:
      - configMapRef:
          name: metadata-grpc-configmap
          optional: true
      image: python:3.9
      name: ""
      resources: {}
      volumeMounts:
      - mountPath: /kfp-launcher
        name: kfp-launcher
    initContainers:
    - command:
      - /bin/mount_launcher.sh
      image: gcr.io/gongyuan-dev/v2-sample-test/kfp-launcher@sha256:55d2af7c8f37515f745dea578ffa76af749e99474af29157474ea88ce0249d17
      mirrorVolumeMounts: true
      name: kfp-launcher
      resources: {}
    inputs:
      parameters:
      - name: pipeline-name
      - name: pipeline-output-directory
    metadata:
      annotations:
        pipelines.kubeflow.org/arguments.parameters: '{"some_int": "12", "uri": "uri-to-import"}'
        pipelines.kubeflow.org/component_ref: '{}'
        pipelines.kubeflow.org/v2_component: "true"
        sidecar.istio.io/inject: "false"
      labels:
        pipelines.kubeflow.org/cache_enabled: "true"
        pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/v2_component: "true"
    name: preprocess
    outputs:
      artifacts:
      - name: preprocess-output_dataset_one
        path: /tmp/outputs/output_dataset_one/data
      - name: preprocess-output_parameter_one
        path: /tmp/outputs/output_parameter_one/data
      parameters:
      - name: preprocess-output_parameter_one
        valueFrom:
          path: /tmp/outputs/output_parameter_one/data
    volumes:
    - name: kfp-launcher
  - container:
      args:
      - sh
      - -ec
      - |
        program_path=$(mktemp)
        printf "%s" "$0" > "$program_path"
        python3 -u "$program_path" "$@"
      - |
        def _make_parent_dirs_and_return_path(file_path: str):
            import os
            os.makedirs(os.path.dirname(file_path), exist_ok=True)
            return file_path

        def train_op(
            dataset,
            model,
            num_steps = 100
        ):
            '''Dummy Training Step.'''
            with open(dataset, 'r') as input_file:
                input_string = input_file.read()
            with open(model, 'w') as output_file:
                for i in range(num_steps):
                    output_file.write(
                        "Step {}\n{}\n=====\n".format(i, input_string)
                    )

        import argparse
        _parser = argparse.ArgumentParser(prog='Train op', description='Dummy Training Step.')
        _parser.add_argument("--dataset", dest="dataset", type=str, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--num-steps", dest="num_steps", type=int, required=False, default=argparse.SUPPRESS)
        _parser.add_argument("--model", dest="model", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
        _parsed_args = vars(_parser.parse_args())
        _outputs = train_op(**_parsed_args)
      - --dataset
      - '{{$.inputs.artifacts[''dataset''].path}}'
      - --num-steps
      - '{{$.inputs.parameters[''num_steps'']}}'
      - --model
      - '{{$.outputs.artifacts[''model''].path}}'
      command:
      - /kfp-launcher/launch
      - --mlmd_server_address
      - $(METADATA_GRPC_SERVICE_HOST)
      - --mlmd_server_port
      - $(METADATA_GRPC_SERVICE_PORT)
      - --runtime_info_json
      - $(KFP_V2_RUNTIME_INFO)
      - --container_image
      - $(KFP_V2_IMAGE)
      - --task_name
      - train-op
      - --pipeline_name
      - '{{inputs.parameters.pipeline-name}}'
      - --pipeline_run_id
      - $(WORKFLOW_ID)
      - --pipeline_task_id
      - $(KFP_POD_NAME)
      - --pipeline_root
      - '{{inputs.parameters.pipeline-output-directory}}'
      env:
      - name: KFP_POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: KFP_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
      - name: WORKFLOW_ID
        valueFrom:
          fieldRef:
            fieldPath: metadata.labels['workflows.argoproj.io/workflow']
      - name: KFP_V2_IMAGE
        value: python:3.7
      - name: KFP_V2_RUNTIME_INFO
        value: '{"inputParameters": {"num_steps": {"type": "INT", "value": "BEGIN-KFP-PARAM[{{inputs.parameters.preprocess-output_parameter_one}}]END-KFP-PARAM"}},
          "inputArtifacts": {"dataset": {"metadataPath": "/tmp/inputs/dataset/data",
          "schemaTitle": "system.Dataset", "instanceSchema": ""}}, "outputParameters":
          {}, "outputArtifacts": {"model": {"schemaTitle": "system.Model", "instanceSchema":
          "", "metadataPath": "/tmp/outputs/model/data"}}}'
      envFrom:
      - configMapRef:
          name: metadata-grpc-configmap
          optional: true
      image: python:3.7
      name: ""
      resources: {}
      volumeMounts:
      - mountPath: /kfp-launcher
        name: kfp-launcher
    initContainers:
    - command:
      - /bin/mount_launcher.sh
      image: gcr.io/gongyuan-dev/v2-sample-test/kfp-launcher@sha256:55d2af7c8f37515f745dea578ffa76af749e99474af29157474ea88ce0249d17
      mirrorVolumeMounts: true
      name: kfp-launcher
      resources: {}
    inputs:
      artifacts:
      - name: preprocess-output_dataset_one
        path: /tmp/inputs/dataset/data
      parameters:
      - name: pipeline-name
      - name: pipeline-output-directory
      - name: preprocess-output_parameter_one
    metadata:
      annotations:
        pipelines.kubeflow.org/arguments.parameters: '{"num_steps": "{{inputs.parameters.preprocess-output_parameter_one}}"}'
        pipelines.kubeflow.org/component_ref: '{}'
        pipelines.kubeflow.org/v2_component: "true"
        sidecar.istio.io/inject: "false"
      labels:
        pipelines.kubeflow.org/cache_enabled: "true"
        pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/v2_component: "true"
    name: train-op
    outputs:
      artifacts:
      - name: train-op-model
        path: /tmp/outputs/model/data
    volumes:
    - name: kfp-launcher
  - dag:
      tasks:
      - arguments:
          parameters:
          - name: pipeline-name
            value: '{{inputs.parameters.pipeline-name}}'
          - name: pipeline-output-directory
            value: '{{inputs.parameters.pipeline-output-directory}}'
        name: preprocess
        template: preprocess
      - arguments:
          artifacts:
          - from: '{{tasks.preprocess.outputs.artifacts.preprocess-output_dataset_one}}'
            name: preprocess-output_dataset_one
          parameters:
          - name: pipeline-name
            value: '{{inputs.parameters.pipeline-name}}'
          - name: pipeline-output-directory
            value: '{{inputs.parameters.pipeline-output-directory}}'
          - name: preprocess-output_parameter_one
            value: '{{tasks.preprocess.outputs.parameters.preprocess-output_parameter_one}}'
        dependencies:
        - preprocess
        name: train-op
        template: train-op
    inputs:
      parameters:
      - name: pipeline-name
      - name: pipeline-output-directory
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
      labels:
        pipelines.kubeflow.org/cache_enabled: "true"
    name: two-step-pipeline
    outputs: {}
status:
  artifactRepositoryRef:
    default: true
  conditions:
  - status: "False"
    type: PodRunning
  - status: "True"
    type: Completed
  finishedAt: "2021-06-29T11:10:01Z"
  nodes:
    two-step-pipeline-94mfz:
      children:
      - two-step-pipeline-94mfz-2926751466
      displayName: two-step-pipeline-94mfz
      finishedAt: "2021-06-29T11:10:01Z"
      id: two-step-pipeline-94mfz
      inputs:
        parameters:
        - name: pipeline-name
          value: two_step_pipeline
        - name: pipeline-output-directory
          value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
      name: two-step-pipeline-94mfz
      outboundNodes:
      - two-step-pipeline-94mfz-1801497614
      phase: Succeeded
      progress: 2/2
      resourcesDuration:
        cpu: 37
        memory: 18
      startedAt: "2021-06-29T11:08:44Z"
      templateName: two-step-pipeline
      templateScope: local/two-step-pipeline-94mfz
      type: DAG
    two-step-pipeline-94mfz-1801497614:
      boundaryID: two-step-pipeline-94mfz
      displayName: train-op
      finishedAt: "2021-06-29T11:09:58Z"
      hostNodeName: gke-kfp-std-default-pool-1c1207aa-2eyx
      id: two-step-pipeline-94mfz-1801497614
      inputs:
        artifacts:
        - name: preprocess-output_dataset_one
          path: /tmp/inputs/dataset/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one.tgz
        parameters:
        - name: pipeline-name
          value: two_step_pipeline
        - name: pipeline-output-directory
          value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
        - name: preprocess-output_parameter_one
          value: "1234"
      name: two-step-pipeline-94mfz.train-op
      outputs:
        artifacts:
        - name: train-op-model
          path: /tmp/outputs/model/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-1801497614/train-op-model.tgz
        - name: main-logs
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-1801497614/main.log
        exitCode: "0"
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 16
        memory: 7
      startedAt: "2021-06-29T11:09:25Z"
      templateName: train-op
      templateScope: local/two-step-pipeline-94mfz
      type: Pod
    two-step-pipeline-94mfz-2926751466:
      boundaryID: two-step-pipeline-94mfz
      children:
      - two-step-pipeline-94mfz-1801497614
      displayName: preprocess
      finishedAt: "2021-06-29T11:09:14Z"
      hostNodeName: gke-kfp-std-default-pool-1c1207aa-2eyx
      id: two-step-pipeline-94mfz-2926751466
      inputs:
        parameters:
        - name: pipeline-name
          value: two_step_pipeline
        - name: pipeline-output-directory
          value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
      name: two-step-pipeline-94mfz.preprocess
      outputs:
        artifacts:
        - name: preprocess-output_dataset_one
          path: /tmp/outputs/output_dataset_one/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one.tgz
        - name: preprocess-output_parameter_one
          path: /tmp/outputs/output_parameter_one/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_parameter_one.tgz
        - name: main-logs
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/main.log
        exitCode: "0"
        parameters:
        - name: preprocess-output_parameter_one
          value: "1234"
          valueFrom:
            path: /tmp/outputs/output_parameter_one/data
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 21
        memory: 11
      startedAt: "2021-06-29T11:08:44Z"
      templateName: preprocess
      templateScope: local/two-step-pipeline-94mfz
      type: Pod
  phase: Succeeded
  progress: 2/2
  resourcesDuration:
    cpu: 37
    memory: 18
  startedAt: "2021-06-29T11:08:44Z"
```
The only information related to the artifact repository in the whole status seems to be:
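(quoted verbatim from the status above)

```yaml
status:
  artifactRepositoryRef:
    default: true
```

That is only a pointer to the default repository; none of the bucket, endpoint, or credential details the UI previously read are present.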
The PR which removed this info:
How argo retrieves a file from an artifact: https://github.com/argoproj/argo-workflows/blob/43212590d4579c821280fd482b960934139eac2f/ui/src/app/workflows/components/workflow-node-info/workflow-node-info.tsx#L365
How the backend server reads from the provided info: https://github.com/argoproj/argo-workflows/blob/0e94283aea641c6c927c9165900165a72022124f/server/artifacts/artifact_server.go#L143
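For orientation, a minimal TypeScript sketch of the identifier-only download path the argo UI builds (the route shape follows the linked artifact_server.go handler; the function name is mine, not argo's):

```typescript
// Hedged sketch: the argo UI passes only identifiers. The argo artifact
// server looks up the repository (bucket, endpoint, credentials) itself,
// e.g. via status.artifactRepositoryRef, so none of that needs to live in
// the per-artifact status entry.
function artifactDownloadPath(
  namespace: string,
  workflowName: string,
  nodeId: string,
  artifactName: string,
): string {
  return `artifacts/${namespace}/${workflowName}/${nodeId}/${artifactName}`;
}

// Example for the workflow above:
// artifacts/kubeflow/two-step-pipeline-94mfz/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one
```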
Looks like that's what we are missing.
I think the "easiest" workaround, without waiting for argo upstream changes, is to adjust the UI server artifacts API endpoint: pipelines/frontend/server/handlers/artifacts.ts, lines 65 to 66 at 647bed7.
Changes:
What do you think?
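For concreteness, a hypothetical sketch of that fallback (the interfaces mirror the argo v3.1 status shape; the function and type names are illustrative, not the actual KFP change):

```typescript
// Hypothetical sketch of the proposed UI-server fallback: when an artifact
// entry in the workflow status only carries `key`, resolve bucket/endpoint
// from status.artifactRepositoryRef instead.
interface S3Repository {
  bucket: string;
  endpoint: string;
}

interface WorkflowStatusLike {
  artifactRepositoryRef?: {
    artifactRepository?: { s3?: S3Repository };
  };
}

function resolveArtifactLocation(status: WorkflowStatusLike, key: string) {
  const s3 = status.artifactRepositoryRef?.artifactRepository?.s3;
  if (!s3) {
    throw new Error('workflow status does not embed the artifact repository');
  }
  // These are the source/bucket/key values the artifacts endpoint already
  // accepts as query parameters.
  return { source: 's3', endpoint: s3.endpoint, bucket: s3.bucket, key };
}
```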
Thank you @Bobgy for the suggestion! I will look into it. After the update of argoproj/argo-workflows#6255, I can retrieve the artifact repository information from the argo workflow status. However, the KFP API response didn't have this info yet.
@zijianjoy yes, it's expected that we need to update the kfp api server and kfp persistence agent to make the new fields show up in the response to the KFP UI. Try:
@Bobgy Thank you Yuan! I replaced both images on a cluster.
I might need to replace the workflow-controller image as well.
Yes, you need to replace workflow-controller too, then edit the images manually for the api server and persistence agent. I didn't have time to test yesterday; I will have a try now.
I just tested this and confirmed that artifactRepositoryRef now contains the full spec in responses to the UI. See the example I got (edited version):
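(The pasted example did not survive this copy. The following is a hedged reconstruction of its shape, with field names per argoproj/argo-workflows#6255 and values assuming KFP's default minio artifact repository.)

```yaml
# Hedged reconstruction, not the verbatim output from the thread:
status:
  artifactRepositoryRef:
    configMap: workflow-controller-configmap
    key: artifactRepository
    namespace: kubeflow
    artifactRepository:
      archiveLogs: true
      s3:
        accessKeySecret:
          key: accesskey
          name: mlpipeline-minio-artifact
        bucket: mlpipeline
        endpoint: minio-service.kubeflow:9000
        insecure: true
        keyFormat: artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}
        secretKeySecret:
          key: secretkey
          name: mlpipeline-minio-artifact
```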
Environment
kfp standalone
Steps to reproduce
Expected result
Artifacts should show a preview, and visualizations should show up.
Materials and Reference
Root cause: argo removed some information from the workflow status.
Previously, each artifact entry contained an object with full location information; now artifacts only contain their keys.
See the workflows.argoproj.io/outputs annotation.
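To make the change concrete, here is the same output artifact entry in both shapes. The pre-3.1 form is a hedged reconstruction assuming KFP's default minio repository; the 3.1+ form is copied from the workflow status pasted above.

```yaml
# argo < 3.1: each artifact entry embeds the full repository details
# (hedged reconstruction, assuming the default KFP minio setup)
- name: preprocess-output_dataset_one
  path: /tmp/outputs/output_dataset_one/data
  s3:
    accessKeySecret:
      key: accesskey
      name: mlpipeline-minio-artifact
    bucket: mlpipeline
    endpoint: minio-service.kubeflow:9000
    insecure: true
    key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one.tgz
    secretKeySecret:
      key: secretkey
      name: mlpipeline-minio-artifact

# argo >= 3.1: only the key remains; bucket, endpoint, and credentials must
# be resolved via status.artifactRepositoryRef
- name: preprocess-output_dataset_one
  path: /tmp/outputs/output_dataset_one/data
  s3:
    key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one.tgz
```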
Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.