[frontend] artifact preview & visualization breaks with argo v3.1+ #5930

Closed
Tracked by #5718
Bobgy opened this issue Jun 28, 2021 · 13 comments · Fixed by #6039
Comments

@Bobgy
Contributor

Bobgy commented Jun 28, 2021

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?

kfp standalone

  • KFP version: 1.7.0-alpha.1

Steps to reproduce

[screenshot]

Expected result

Artifacts should show a preview, and visualizations should show up.

Materials and Reference

Root cause: Argo removed some information from the workflow status.
Previously, each artifact entry contained a full object of repository information, but now artifacts only contain their keys.
See the workflows.argoproj.io/outputs annotation:

workflows.argoproj.io/outputs: >-
{"artifacts":[{"name":"main-logs","s3":{"key":"artifacts/file-passing-pipelines-xz8xs/2021/06/28/file-passing-pipelines-xz8xs-3422213888/main.log"}}]}
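For contrast, here is a sketch of how the same annotation looked before this change, with the repository details inlined per artifact. This is not a verbatim capture; the field names are assumed from the s3 repository configuration discussed later in this thread:

```json
{
  "artifacts": [
    {
      "name": "main-logs",
      "s3": {
        "endpoint": "minio-service.kubeflow:9000",
        "bucket": "mlpipeline",
        "insecure": true,
        "accessKeySecret": { "name": "mlpipeline-minio-artifact", "key": "accesskey" },
        "secretKeySecret": { "name": "mlpipeline-minio-artifact", "key": "secretkey" },
        "key": "artifacts/file-passing-pipelines-xz8xs/2021/06/28/file-passing-pipelines-xz8xs-3422213888/main.log"
      }
    }
  ]
}
```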


Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@Bobgy
Contributor Author

Bobgy commented Jun 28, 2021

/assign @zijianjoy

@Bobgy Bobgy added this to Pending Triage in KFP v2 compatible mode via automation Jun 28, 2021
@Bobgy Bobgy moved this from Pending Triage to P0 in KFP v2 compatible mode Jun 28, 2021
@Bobgy Bobgy moved this from P0 to CP 7.2 in KFP v2 compatible mode Jun 29, 2021
@Bobgy
Contributor Author

Bobgy commented Jun 29, 2021

Pasting an example full argo workflow:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  annotations:
    pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
    pipelines.kubeflow.org/pipeline_compilation_time: 2021-06-29T11:08:44.423700
    pipelines.kubeflow.org/pipeline_spec: '{"inputs": [{"default": "", "name": "pipeline-output-directory"},
      {"default": "two_step_pipeline", "name": "pipeline-name"}], "name": "two_step_pipeline"}'
    pipelines.kubeflow.org/run_name: two_step_pipeline 2021-06-29 11-08-44
    pipelines.kubeflow.org/v2_pipeline: "true"
  creationTimestamp: "2021-06-29T11:08:44Z"
  generateName: two-step-pipeline-
  generation: 7
  labels:
    pipeline/persistedFinalState: "true"
    pipeline/runid: ceb88c92-21a9-4797-a68b-bb27c6183d59
    pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
    pipelines.kubeflow.org/v2_pipeline: "true"
    workflows.argoproj.io/completed: "true"
    workflows.argoproj.io/phase: Succeeded
    manager: workflow-controller
    operation: Update
    time: "2021-06-29T11:10:01Z"
  name: two-step-pipeline-94mfz
  namespace: kubeflow
  resourceVersion: "30684140"
  selfLink: /apis/argoproj.io/v1alpha1/namespaces/kubeflow/workflows/two-step-pipeline-94mfz
  uid: cef9e9db-28ae-4e16-8e4e-fcf4d960e967
spec:
  arguments:
    parameters:
    - name: pipeline-output-directory
      value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
    - name: pipeline-name
      value: two_step_pipeline
  entrypoint: two-step-pipeline
  serviceAccountName: pipeline-runner
  templates:
  - container:
      args:
      - sh
      - -ec
      - |
        program_path=$(mktemp)
        printf "%s" "$0" > "$program_path"
        python3 -u "$program_path" "$@"
      - |
        def _make_parent_dirs_and_return_path(file_path: str):
            import os
            os.makedirs(os.path.dirname(file_path), exist_ok=True)
            return file_path

        def preprocess(
            uri, some_int, output_parameter_one,
            output_dataset_one
        ):
            '''Dummy Preprocess Step.'''
            with open(output_dataset_one, 'w') as f:
                f.write('Output dataset')
            with open(output_parameter_one, 'w') as f:
                f.write("{}".format(1234))

        import argparse
        _parser = argparse.ArgumentParser(prog='Preprocess', description='Dummy Preprocess Step.')
        _parser.add_argument("--uri", dest="uri", type=str, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--some-int", dest="some_int", type=int, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--output-parameter-one", dest="output_parameter_one", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--output-dataset-one", dest="output_dataset_one", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
        _parsed_args = vars(_parser.parse_args())

        _outputs = preprocess(**_parsed_args)
      - --uri
      - '{{$.inputs.parameters[''uri'']}}'
      - --some-int
      - '{{$.inputs.parameters[''some_int'']}}'
      - --output-parameter-one
      - '{{$.outputs.parameters[''output_parameter_one''].output_file}}'
      - --output-dataset-one
      - '{{$.outputs.artifacts[''output_dataset_one''].path}}'
      command:
      - /kfp-launcher/launch
      - --mlmd_server_address
      - $(METADATA_GRPC_SERVICE_HOST)
      - --mlmd_server_port
      - $(METADATA_GRPC_SERVICE_PORT)
      - --runtime_info_json
      - $(KFP_V2_RUNTIME_INFO)
      - --container_image
      - $(KFP_V2_IMAGE)
      - --task_name
      - preprocess
      - --pipeline_name
      - '{{inputs.parameters.pipeline-name}}'
      - --pipeline_run_id
      - $(WORKFLOW_ID)
      - --pipeline_task_id
      - $(KFP_POD_NAME)
      - --pipeline_root
      - '{{inputs.parameters.pipeline-output-directory}}'
      env:
      - name: KFP_POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: KFP_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
      - name: WORKFLOW_ID
        valueFrom:
          fieldRef:
            fieldPath: metadata.labels['workflows.argoproj.io/workflow']
      - name: KFP_V2_IMAGE
        value: python:3.9
      - name: KFP_V2_RUNTIME_INFO
        value: '{"inputParameters": {"some_int": {"type": "INT", "value": "BEGIN-KFP-PARAM[12]END-KFP-PARAM"},
          "uri": {"type": "STRING", "value": "BEGIN-KFP-PARAM[uri-to-import]END-KFP-PARAM"}},
          "inputArtifacts": {}, "outputParameters": {"output_parameter_one": {"type":
          "INT", "path": "/tmp/outputs/output_parameter_one/data"}}, "outputArtifacts":
          {"output_dataset_one": {"schemaTitle": "system.Dataset", "instanceSchema":
          "", "metadataPath": "/tmp/outputs/output_dataset_one/data"}}}'
      envFrom:
      - configMapRef:
          name: metadata-grpc-configmap
          optional: true
      image: python:3.9
      name: ""
      resources: {}
      volumeMounts:
      - mountPath: /kfp-launcher
        name: kfp-launcher
    initContainers:
    - command:
      - /bin/mount_launcher.sh
      image: gcr.io/gongyuan-dev/v2-sample-test/kfp-launcher@sha256:55d2af7c8f37515f745dea578ffa76af749e99474af29157474ea88ce0249d17
      mirrorVolumeMounts: true
      name: kfp-launcher
      resources: {}
    inputs:
      parameters:
      - name: pipeline-name
      - name: pipeline-output-directory
    metadata:
      annotations:
        pipelines.kubeflow.org/arguments.parameters: '{"some_int": "12", "uri": "uri-to-import"}'
        pipelines.kubeflow.org/component_ref: '{}'
        pipelines.kubeflow.org/v2_component: "true"
        sidecar.istio.io/inject: "false"
      labels:
        pipelines.kubeflow.org/cache_enabled: "true"
        pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/v2_component: "true"
    name: preprocess
    outputs:
      artifacts:
      - name: preprocess-output_dataset_one
        path: /tmp/outputs/output_dataset_one/data
      - name: preprocess-output_parameter_one
        path: /tmp/outputs/output_parameter_one/data
      parameters:
      - name: preprocess-output_parameter_one
        valueFrom:
          path: /tmp/outputs/output_parameter_one/data
    volumes:
    - name: kfp-launcher
  - container:
      args:
      - sh
      - -ec
      - |
        program_path=$(mktemp)
        printf "%s" "$0" > "$program_path"
        python3 -u "$program_path" "$@"
      - |
        def _make_parent_dirs_and_return_path(file_path: str):
            import os
            os.makedirs(os.path.dirname(file_path), exist_ok=True)
            return file_path

        def train_op(
            dataset,
            model,
            num_steps = 100
        ):
            '''Dummy Training Step.'''

            with open(dataset, 'r') as input_file:
                input_string = input_file.read()
                with open(model, 'w') as output_file:
                    for i in range(num_steps):
                        output_file.write(
                            "Step {}\n{}\n=====\n".format(i, input_string)
                        )

        import argparse
        _parser = argparse.ArgumentParser(prog='Train op', description='Dummy Training Step.')
        _parser.add_argument("--dataset", dest="dataset", type=str, required=True, default=argparse.SUPPRESS)
        _parser.add_argument("--num-steps", dest="num_steps", type=int, required=False, default=argparse.SUPPRESS)
        _parser.add_argument("--model", dest="model", type=_make_parent_dirs_and_return_path, required=True, default=argparse.SUPPRESS)
        _parsed_args = vars(_parser.parse_args())

        _outputs = train_op(**_parsed_args)
      - --dataset
      - '{{$.inputs.artifacts[''dataset''].path}}'
      - --num-steps
      - '{{$.inputs.parameters[''num_steps'']}}'
      - --model
      - '{{$.outputs.artifacts[''model''].path}}'
      command:
      - /kfp-launcher/launch
      - --mlmd_server_address
      - $(METADATA_GRPC_SERVICE_HOST)
      - --mlmd_server_port
      - $(METADATA_GRPC_SERVICE_PORT)
      - --runtime_info_json
      - $(KFP_V2_RUNTIME_INFO)
      - --container_image
      - $(KFP_V2_IMAGE)
      - --task_name
      - train-op
      - --pipeline_name
      - '{{inputs.parameters.pipeline-name}}'
      - --pipeline_run_id
      - $(WORKFLOW_ID)
      - --pipeline_task_id
      - $(KFP_POD_NAME)
      - --pipeline_root
      - '{{inputs.parameters.pipeline-output-directory}}'
      env:
      - name: KFP_POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: KFP_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace
      - name: WORKFLOW_ID
        valueFrom:
          fieldRef:
            fieldPath: metadata.labels['workflows.argoproj.io/workflow']
      - name: KFP_V2_IMAGE
        value: python:3.7
      - name: KFP_V2_RUNTIME_INFO
        value: '{"inputParameters": {"num_steps": {"type": "INT", "value": "BEGIN-KFP-PARAM[{{inputs.parameters.preprocess-output_parameter_one}}]END-KFP-PARAM"}},
          "inputArtifacts": {"dataset": {"metadataPath": "/tmp/inputs/dataset/data",
          "schemaTitle": "system.Dataset", "instanceSchema": ""}}, "outputParameters":
          {}, "outputArtifacts": {"model": {"schemaTitle": "system.Model", "instanceSchema":
          "", "metadataPath": "/tmp/outputs/model/data"}}}'
      envFrom:
      - configMapRef:
          name: metadata-grpc-configmap
          optional: true
      image: python:3.7
      name: ""
      resources: {}
      volumeMounts:
      - mountPath: /kfp-launcher
        name: kfp-launcher
    initContainers:
    - command:
      - /bin/mount_launcher.sh
      image: gcr.io/gongyuan-dev/v2-sample-test/kfp-launcher@sha256:55d2af7c8f37515f745dea578ffa76af749e99474af29157474ea88ce0249d17
      mirrorVolumeMounts: true
      name: kfp-launcher
      resources: {}
    inputs:
      artifacts:
      - name: preprocess-output_dataset_one
        path: /tmp/inputs/dataset/data
      parameters:
      - name: pipeline-name
      - name: pipeline-output-directory
      - name: preprocess-output_parameter_one
    metadata:
      annotations:
        pipelines.kubeflow.org/arguments.parameters: '{"num_steps": "{{inputs.parameters.preprocess-output_parameter_one}}"}'
        pipelines.kubeflow.org/component_ref: '{}'
        pipelines.kubeflow.org/v2_component: "true"
        sidecar.istio.io/inject: "false"
      labels:
        pipelines.kubeflow.org/cache_enabled: "true"
        pipelines.kubeflow.org/kfp_sdk_version: 1.6.4
        pipelines.kubeflow.org/pipeline-sdk-type: kfp
        pipelines.kubeflow.org/v2_component: "true"
    name: train-op
    outputs:
      artifacts:
      - name: train-op-model
        path: /tmp/outputs/model/data
    volumes:
    - name: kfp-launcher
  - dag:
      tasks:
      - arguments:
          parameters:
          - name: pipeline-name
            value: '{{inputs.parameters.pipeline-name}}'
          - name: pipeline-output-directory
            value: '{{inputs.parameters.pipeline-output-directory}}'
        name: preprocess
        template: preprocess
      - arguments:
          artifacts:
          - from: '{{tasks.preprocess.outputs.artifacts.preprocess-output_dataset_one}}'
            name: preprocess-output_dataset_one
          parameters:
          - name: pipeline-name
            value: '{{inputs.parameters.pipeline-name}}'
          - name: pipeline-output-directory
            value: '{{inputs.parameters.pipeline-output-directory}}'
          - name: preprocess-output_parameter_one
            value: '{{tasks.preprocess.outputs.parameters.preprocess-output_parameter_one}}'
        dependencies:
        - preprocess
        name: train-op
        template: train-op
    inputs:
      parameters:
      - name: pipeline-name
      - name: pipeline-output-directory
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
      labels:
        pipelines.kubeflow.org/cache_enabled: "true"
    name: two-step-pipeline
    outputs: {}
status:
  artifactRepositoryRef:
    default: true
  conditions:
  - status: "False"
    type: PodRunning
  - status: "True"
    type: Completed
  finishedAt: "2021-06-29T11:10:01Z"
  nodes:
    two-step-pipeline-94mfz:
      children:
      - two-step-pipeline-94mfz-2926751466
      displayName: two-step-pipeline-94mfz
      finishedAt: "2021-06-29T11:10:01Z"
      id: two-step-pipeline-94mfz
      inputs:
        parameters:
        - name: pipeline-name
          value: two_step_pipeline
        - name: pipeline-output-directory
          value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
      name: two-step-pipeline-94mfz
      outboundNodes:
      - two-step-pipeline-94mfz-1801497614
      phase: Succeeded
      progress: 2/2
      resourcesDuration:
        cpu: 37
        memory: 18
      startedAt: "2021-06-29T11:08:44Z"
      templateName: two-step-pipeline
      templateScope: local/two-step-pipeline-94mfz
      type: DAG
    two-step-pipeline-94mfz-1801497614:
      boundaryID: two-step-pipeline-94mfz
      displayName: train-op
      finishedAt: "2021-06-29T11:09:58Z"
      hostNodeName: gke-kfp-std-default-pool-1c1207aa-2eyx
      id: two-step-pipeline-94mfz-1801497614
      inputs:
        artifacts:
        - name: preprocess-output_dataset_one
          path: /tmp/inputs/dataset/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one.tgz
        parameters:
        - name: pipeline-name
          value: two_step_pipeline
        - name: pipeline-output-directory
          value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
        - name: preprocess-output_parameter_one
          value: "1234"
      name: two-step-pipeline-94mfz.train-op
      outputs:
        artifacts:
        - name: train-op-model
          path: /tmp/outputs/model/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-1801497614/train-op-model.tgz
        - name: main-logs
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-1801497614/main.log
        exitCode: "0"
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 16
        memory: 7
      startedAt: "2021-06-29T11:09:25Z"
      templateName: train-op
      templateScope: local/two-step-pipeline-94mfz
      type: Pod
    two-step-pipeline-94mfz-2926751466:
      boundaryID: two-step-pipeline-94mfz
      children:
      - two-step-pipeline-94mfz-1801497614
      displayName: preprocess
      finishedAt: "2021-06-29T11:09:14Z"
      hostNodeName: gke-kfp-std-default-pool-1c1207aa-2eyx
      id: two-step-pipeline-94mfz-2926751466
      inputs:
        parameters:
        - name: pipeline-name
          value: two_step_pipeline
        - name: pipeline-output-directory
          value: gs://gongyuan-dev/v2-sample-test/data/samples_config-loop-item
      name: two-step-pipeline-94mfz.preprocess
      outputs:
        artifacts:
        - name: preprocess-output_dataset_one
          path: /tmp/outputs/output_dataset_one/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_dataset_one.tgz
        - name: preprocess-output_parameter_one
          path: /tmp/outputs/output_parameter_one/data
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/preprocess-output_parameter_one.tgz
        - name: main-logs
          s3:
            key: artifacts/two-step-pipeline-94mfz/2021/06/29/two-step-pipeline-94mfz-2926751466/main.log
        exitCode: "0"
        parameters:
        - name: preprocess-output_parameter_one
          value: "1234"
          valueFrom:
            path: /tmp/outputs/output_parameter_one/data
      phase: Succeeded
      progress: 1/1
      resourcesDuration:
        cpu: 21
        memory: 11
      startedAt: "2021-06-29T11:08:44Z"
      templateName: preprocess
      templateScope: local/two-step-pipeline-94mfz
      type: Pod
  phase: Succeeded
  progress: 2/2
  resourcesDuration:
    cpu: 37
    memory: 18
  startedAt: "2021-06-29T11:08:44Z"

@Bobgy
Contributor Author

Bobgy commented Jun 29, 2021

The only information related to artifact repository seems to be:

  artifactRepositoryRef:
    default: true

@zijianjoy
Collaborator

The PR that removed s3 from the workflow template: argoproj/argo-workflows#3377

@zijianjoy
Collaborator

It looks like what we are missing is the bucket and the endpoint: the endpoint tells us which storage platform to use, and the bucket tells us the top-level directory in which to look up an artifact by key.
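A minimal sketch (hypothetical helper, not actual KFP UI code) of how those two pieces combine with an artifact key to address an object:

```typescript
// Hypothetical helper: shows how endpoint, bucket, and key combine to
// address an artifact object. Not the actual KFP UI server API.
interface S3ArtifactRef {
  endpoint: string; // which storage service to contact, e.g. a MinIO host
  bucket: string;   // top-level container within that service
  key: string;      // object path within the bucket
}

function artifactUrl(ref: S3ArtifactRef, insecure = true): string {
  const scheme = insecure ? "http" : "https";
  return `${scheme}://${ref.endpoint}/${ref.bucket}/${ref.key}`;
}

console.log(artifactUrl({
  endpoint: "minio-service.kubeflow:9000",
  bucket: "mlpipeline",
  key: "artifacts/two-step-pipeline-94mfz/main.log",
}));
// → http://minio-service.kubeflow:9000/mlpipeline/artifacts/two-step-pipeline-94mfz/main.log
```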

@Bobgy
Contributor Author

Bobgy commented Jul 3, 2021

I think the "easiest" workaround, without waiting for upstream Argo changes, is to adjust the UI server's artifacts API endpoint:

const source = useParameter ? req.params.source : req.query.source;
const bucket = useParameter ? req.params.bucket : req.query.bucket;

Changes:

  • Make the source & bucket arguments optional; they default to the cluster's default artifact repository.
  • Adjust manifests so that the UI server knows what the default artifact repository is.
  • The UI code no longer knows the source / bucket, but it can use just the object key to fetch data from the UI server.
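A rough sketch of that fallback (names and default values are illustrative, not the real UI server implementation):

```typescript
// Sketch of the proposed fallback: source and bucket become optional and
// default to a configured default artifact repository. DEFAULT_SOURCE and
// DEFAULT_BUCKET stand in for values the UI server would read from config.
interface ArtifactQuery {
  source?: string;
  bucket?: string;
  key: string;
}

const DEFAULT_SOURCE = "minio";      // assumed: from default repository config
const DEFAULT_BUCKET = "mlpipeline"; // assumed: from default repository config

function resolveArtifact(q: ArtifactQuery): Required<ArtifactQuery> {
  return {
    source: q.source ?? DEFAULT_SOURCE,
    bucket: q.bucket ?? DEFAULT_BUCKET,
    key: q.key,
  };
}
```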

What do you think?

@Bobgy Bobgy moved this from P0 to Release! 7.9 in KFP v2 compatible mode Jul 6, 2021
@Bobgy Bobgy changed the title [frontend] artifact preview & visualization breaks with argo v3.1.0 [frontend] artifact preview & visualization breaks with argo v3.1+ Jul 6, 2021
@zijianjoy
Collaborator

Thank you @Bobgy for the suggestion! I will look into it.

After the update in argoproj/argo-workflows#6255, I can retrieve the s3 artifact details in the workflow status. For example, for the KFP tutorial [Tutorial] Data passing in python components (ac491):

status:
  artifactRepositoryRef:
    artifactRepository:
      archiveLogs: true
      s3:
        accessKeySecret:
          key: accesskey
          name: mlpipeline-minio-artifact
        bucket: mlpipeline
        endpoint: minio-service.kubeflow:9000
        insecure: true
        keyFormat: artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}
        secretKeySecret:
          key: secretkey
          name: mlpipeline-minio-artifact
    default: true

However, the KFP API response doesn't include this info under the pipeline_runtime -> workflow_manifest field. We might need to make some adjustments to expose this information, and update the KFP UI to read the new artifact bucket and endpoint accordingly. internal link: go/paste/4581363291258880
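The keyFormat in the status above is a template over workflow variables. A client holding the repository spec could, in principle, expand it into a concrete object key like this (hypothetical sketch; Argo's real templating is more involved than a plain substitution):

```typescript
// Hypothetical sketch: expand a {{var}}-style keyFormat template by simple
// substitution from a variable map. Not Argo's actual template engine.
function expandKeyFormat(
  keyFormat: string,
  vars: Record<string, string>,
): string {
  return keyFormat.replace(
    /\{\{([^}]+)\}\}/g,
    (_m: string, name: string) => vars[name.trim()] ?? "",
  );
}

const key = expandKeyFormat(
  "artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{pod.name}}",
  {
    "workflow.name": "two-step-pipeline-94mfz",
    "workflow.creationTimestamp.Y": "2021",
    "pod.name": "pod-123",
  },
);
console.log(key);
// → artifacts/two-step-pipeline-94mfz/2021/pod-123
```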

@Bobgy
Contributor Author

Bobgy commented Jul 13, 2021

@zijianjoy yes, it's expected that we need to update the KFP API server and the KFP persistence agent to make the new fields show up in the response to the KFP UI.
I've included the changes in #6027. You can test first by editing the images for the ml-pipeline and ml-pipeline-persistence-agent deployments.

Try:

  • gcr.io/ml-pipeline-test/d4e24cc2de6be0448c67508198f3163fb261c9a0/api-server
  • gcr.io/ml-pipeline-test/d4e24cc2de6be0448c67508198f3163fb261c9a0/persistenceagent

@zijianjoy
Collaborator

@Bobgy Thank you Yuan! I replaced both images on a 1.7.0-alpha.2 cluster and ran the v1 pipeline [Tutorial] Data passing in python component, but the workflow output is still the same as before. Am I missing some other step when applying this change?

@zijianjoy
Collaborator

@Bobgy
Copy link
Contributor Author

Bobgy commented Jul 14, 2021

Yes, you need to replace the workflow-controller too.
I'd recommend installing by checking out my PR locally and running kubectl apply -k manifests/kustomize/env/dev; it includes all the latest images, including the Argo workflow controller.

Then edit images manually for api server and persistence agent.

I didn't have time to test yesterday; I'll give it a try now.

@Bobgy
Contributor Author

Bobgy commented Jul 14, 2021

I just tested this and confirmed that artifactRepositoryRef now contains the full spec in responses to the UI.

See the example I got (edited version):

{"status":{"artifactRepositoryRef":{"default":true,"artifactRepository":{"archiveLogs":true,"s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline","insecure":true,"accessKeySecret":{"name":"mlpipeline-minio-artifact","key":"accesskey"},"secretKeySecret":{"name":"mlpipeline-minio-artifact","key":"secretkey"},"keyFormat":"artifacts/{{workflow.name}}/{{workflow.creationTimestamp.Y}}/{{workflow.creationTimestamp.m}}/{{workflow.creationTimestamp.d}}/{{pod.name}}"}}}}}
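Once the field is present, extracting the default repository's s3 settings on the client side is straightforward. A minimal sketch, assuming the status object follows the JSON shape above:

```typescript
// Sketch: pull the default repository's s3 endpoint and bucket out of a
// workflow status object shaped like the example JSON above.
function getS3Config(status: any): { endpoint: string; bucket: string } | undefined {
  const s3 = status?.artifactRepositoryRef?.artifactRepository?.s3;
  return s3 ? { endpoint: s3.endpoint, bucket: s3.bucket } : undefined;
}

const example = JSON.parse(
  '{"artifactRepositoryRef":{"default":true,"artifactRepository":' +
  '{"s3":{"endpoint":"minio-service.kubeflow:9000","bucket":"mlpipeline"}}}}'
);
console.log(getS3Config(example));
// → { endpoint: 'minio-service.kubeflow:9000', bucket: 'mlpipeline' }
```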

Bobgy added a commit that referenced this issue Jul 14, 2021
* feat: use argo v3.1.1-patch

* chore: also upgrade argo go modules to patch

* add comment

* fix download

* fix licenses

* go mod tidy
Bobgy added a commit that referenced this issue Jul 15, 2021
google-oss-robot pushed a commit that referenced this issue Jul 15, 2021