
Expose step id and step name #1191

Merged: merged 8 commits into kubeflow:master on Apr 25, 2019

Conversation

@cheyang (Contributor) commented Apr 19, 2019

We need to expose the step ID and step name when running pipelines.


@cheyang (Contributor, Author) commented Apr 19, 2019

/assign @Ark-kun

@cheyang changed the title from "Expose step uid and step name" to "Expose step id and step name" on Apr 19, 2019
@Ark-kun (Contributor) commented Apr 19, 2019

Can you please explain the purpose of this change? Why do you need to pass the step ID and name?
I see that you're passing the workflow ID and name, which is not the same as the task/pod ID/name.
I also see that you're using Argo placeholders in the code. I'd like to limit their usage, since it would prevent us from switching orchestrators if we decide to do so.

@cheyang (Contributor, Author) commented Apr 19, 2019

Yes, the requirement comes from our customer. They want to persist the logs and some outputs to distributed storage, so they need the unique name of the step. And I think we can change this if the orchestrator changes.

What's your suggestion for how the user should get the unique ID and name of the step?

@animeshsingh (Contributor) commented

Hi @cheyang, why do they want to persist at the step level? Argo has a setting where you can configure logs to be stored on S3 for archival for a pipeline run. What would then be needed is a UI change to display the logs from S3 if they're stored there. cc @vicaire

@cheyang (Contributor, Author) commented Apr 20, 2019

> Hi @cheyang, why do they want to persist at the step level? Argo has a setting where you can configure logs to be stored on S3 for archival for a pipeline run. What would then be needed is a UI change to display the logs from S3 if they're stored there. cc @vicaire

Thanks for the response.

In fact, the purpose is to persist and organize the logs and some outputs (for example, the results of feature extraction) in a specific format in distributed storage, so they can easily be used to trace the experiment even long after the run (even after the step container is cleaned up). A sample format is {experiment_name}/{RUN_ID}/{STEP_ID}, so the user can easily relate each step of a run to that step's results.

Also, for on-prem users and users in China, S3 is not an option.
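For illustration, here is a minimal sketch of what we would like pipeline authors to be able to express (the image, mount path, and experiment name are hypothetical; {{workflow.uid}} and {{pod.name}} are Argo placeholders resolved at run time):

```python
import kfp.dsl as dsl

@dsl.pipeline(name='log-archival-sample')
def sample_pipeline():
    # Hypothetical per-step output prefix built from Argo's runtime
    # placeholders, so each step writes to a unique location in the
    # shape {experiment_name}/{RUN_ID}/{STEP_ID}.
    step_prefix = '/mnt/dfs/my-experiment/{{workflow.uid}}/{{pod.name}}'
    dsl.ContainerOp(
        name='extract-features',
        image='my-registry/feature-extractor:latest',  # hypothetical image
        command=['sh', '-c'],
        arguments=['extract_features --output-dir %s' % step_prefix],
    )
```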

@Ark-kun (Contributor) commented Apr 22, 2019

@cheyang Is it possible to move these to the pipeline level? The higher up it is, the easier it will be to change later.
The pipeline step may receive a URI or a unique ID (as opposed to explicitly receiving a "step name" and "step id"). See:

output_template = str(output) + '/{{workflow.uid}}/{{pod.name}}/data'

Our current way of passing data (until artifact passing is supported) is to give a step a URI template that gets resolved at run time; the step then outputs that URI to pass it forward. So a train step outputs the model URI, not a step name or ID. This is also easier for the consuming step, since you pass it a URI instead of a step name and ID that it would need to combine to build that URI.
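
A rough sketch of that pattern (the images and bucket are hypothetical; Argo resolves the template at run time, and the producer echoes the resolved URI so the consumer receives a plain URI):

```python
import kfp.dsl as dsl

@dsl.pipeline(name='uri-template-sample')
def train_and_deploy(output='gs://my-bucket'):  # hypothetical bucket
    # The train step receives a URI template; Argo resolves the
    # placeholders, and the step writes its model there and emits
    # the resolved URI as a file-based output.
    model_uri_template = str(output) + '/{{workflow.uid}}/{{pod.name}}/data'
    train = dsl.ContainerOp(
        name='train',
        image='my-registry/trainer:latest',  # hypothetical image
        command=['sh', '-c'],
        arguments=['train --model-dir "$0" && echo "$0" > /tmp/model_uri',
                   model_uri_template],
        file_outputs={'model_uri': '/tmp/model_uri'},
    )
    # The consumer gets a concrete URI; it never sees step names or IDs.
    dsl.ContainerOp(
        name='deploy',
        image='my-registry/deployer:latest',  # hypothetical image
        command=['sh', '-c'],
        arguments=['deploy --model-uri %s' % train.outputs['model_uri']],
    )
```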

@cheyang (Contributor, Author) commented Apr 22, 2019

@Ark-kun our users will build a lot of pipelines, so it is duplicated work for them to set output_template = str(output) + '/{{workflow.uid}}/{{pod.name}}/data', and it exposes more details to them. The concept of a step exists in Kubeflow, so it's fine for users to understand. But workflow and pod are implementation-layer concepts, and I'd prefer to keep them transparent to pipeline authors.

I think the benefit of doing this in an abstraction layer (the arena op) is that only one place has to change if we stop using Argo. End users should not be aware of that change; they would just update the SDK.
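
To make this concrete, a rough sketch of such an abstraction (the helper and op names are hypothetical): the Argo placeholders live in one SDK-level helper, so pipeline authors never see them, and only this helper changes if the orchestrator does.

```python
import kfp.dsl as dsl

def unique_step_prefix(base_uri):
    # Hypothetical helper: the only place Argo placeholders appear.
    # Swapping orchestrators would mean changing just this function.
    return str(base_uri) + '/{{workflow.uid}}/{{pod.name}}'

def arena_op(name, image, output_base):
    # Hypothetical arena-style wrapper: authors pass a base output
    # location and get a per-step unique path without ever touching
    # workflow- or pod-level concepts.
    return dsl.ContainerOp(
        name=name,
        image=image,
        command=['sh', '-c'],
        arguments=['run_job --output-dir %s' % unique_step_prefix(output_base)],
    )
```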

WDYT?

@vicaire (Contributor) commented Apr 24, 2019

@Ark-kun, do you have additional comments on this one?

@animeshsingh (Contributor) left a review comment

@vicaire @Ark-kun @cheyang I am still struggling to see the overall value here. We can archive logs from a pipeline; Argo supports that:
https://github.com/argoproj/argo/blob/master/docs/workflow-controller-configmap.yaml#L40-L76

Now, wouldn't it be easier to look into Argo and see whether it supports a backend other than S3?

@vicaire (Contributor) commented Apr 25, 2019

@Ark-kun, any further comments on this one?

@Ark-kun (Contributor) commented Apr 25, 2019

> @Ark-kun, do you have additional comments on this one?

I'm handling this PR and I'm also chatting with @cheyang on Hangouts.

@Ark-kun (Contributor) commented Apr 25, 2019

@cheyang

> A sample format is {experiment_name}/{RUN_ID}/{STEP_ID}

Are you sure your code is doing what you want? I don't see it using the step/pod ID; it only uses workflow.uid, which is the same for all steps. Did you mean pod.name?

@Ark-kun (Contributor) commented Apr 25, 2019

I see you've fixed the pod.name issue.

/lgtm
/approve

@k8s-ci-robot commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Ark-kun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@vicaire (Contributor) commented Apr 25, 2019

@animeshsingh, I saw your comment only after entering mine. I agree with the idea of using Argo to back up the logs to an object store. (Exposing the step ID and step name might be useful regardless, for instance to call the future ML Metadata service and store metadata about an artifact, including which pipeline and which pipeline step computed it.)

@k8s-ci-robot merged commit 8c8e505 into kubeflow:master on Apr 25, 2019
@cheyang (Contributor, Author) commented Apr 25, 2019

> I see you've fixed the pod.name issue.
>
> /lgtm
> /approve

Thank you!
