
Pipeline with volume does not work with "k8sapi" argo executor #4326

Closed
nakfour opened this issue Aug 5, 2020 · 17 comments
Labels
lifecycle/stale The issue / pull request is stale, any activities remove this label.

@nakfour
Member

nakfour commented Aug 5, 2020

What steps did you take:

As part of Open Data Hub we are trying to get KFP 1.0 working on OpenShift 4.x. We must use the "k8sapi" executor since Docker is not present. The provided KFP Python examples work as long as no parameters are passed between steps. We are trying to convert the KFP examples that pass parameters to use volumes so that they work on OCP. We tried one of the simple volume-creation examples, as shown here: https://github.com/kubeflow/pipelines/blob/master/samples/core/volume_ops/volume_ops.py. This caused an error stating that we need to use volumes or emptyDir. Looking at the generated YAML file, we noticed these lines:

outputs:
      parameters:
      - name: create-pvc-manifest
        valueFrom: {jsonPath: '{}'}
      - name: create-pvc-name
        valueFrom: {jsonPath: '{.metadata.name}'}
      - name: create-pvc-size
        valueFrom: {jsonPath: '{.status.capacity.storage}'}

According to the error, these outputs need to be backed by volumes or an emptyDir. We modified the generated YAML file to exclude the outputs, and that worked.

Is there a way we can use volumes or emptyDir in KFP Python without generating these output parameters?

Environment:

KFP 1.0 on OCP 4.4

@Bobgy
Contributor

Bobgy commented Aug 6, 2020

/assign @Ark-kun @neuromage @chensun @numerology

@Ark-kun
Contributor

Ark-kun commented Aug 11, 2020

I think this is a known upstream issue: argoproj/argo-workflows#2679

We must use the "k8sapi" executor since docker is not present.

The PNS executor might be the more future-proof option.

As a workaround: all VolumeOp does is pass the volume YAML to Argo, and Argo then calls `kubectl create` with that YAML. So you could create a custom component that runs `kubectl create` to create the volume. That way you control the outputs.
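For illustration, a hand-written sketch of such a component in KFP's component-specification YAML format. This is not an official component; the name, image, and shell invocation are all assumptions. It deliberately declares no outputs, so the compiled workflow contains no `valueFrom`/`jsonPath` output parameters:

```yaml
# Hypothetical component: creates a Kubernetes object from a manifest with
# kubectl, declaring no outputs.
name: Create resource with kubectl
inputs:
- {name: Manifest, type: String}
implementation:
  container:
    image: bitnami/kubectl:1.18    # illustrative image
    command:
    - sh
    - -ec
    - printf '%s' "$0" | kubectl create -f -
    - {inputValue: Manifest}
```

Note that, as with VolumeOp, the pipeline's service account would still need RBAC permission to create PVCs.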

@nakfour
Member Author

nakfour commented Aug 12, 2020

@Ark-kun we documented why we are not using the PNS executor here: https://github.com/opendatahub-io/odh-manifests/issues/117
The issue is with the DSL generated volume yaml that includes these lines as mentioned before

outputs:
      parameters:
      - name: create-pvc-manifest
        valueFrom: {jsonPath: '{}'}
      - name: create-pvc-name
        valueFrom: {jsonPath: '{.metadata.name}'}
      - name: create-pvc-size
        valueFrom: {jsonPath: '{.status.capacity.storage}'}

These lines are not generated by Argo, they are generated by the python DSL SDK when creating a volume. Is there a way to create a volume in python without any outputs that require EmptyDir?

@Ark-kun
Contributor

Ark-kun commented Aug 19, 2020

You might want to make sure your Kubernetes environment supports Argo. I was under the impression that several of our users have used OpenShift semi-successfully.

The provided KFP Python examples work as long as no parameters are passed.

I think this is a standard Argo feature. Can you please check whether any of Argo's parameter- or artifact-passing examples work in your setup? Data passing is a core feature of Argo and KFP; without it working, the product might not be usable.

The issue is with the DSL generated volume yaml that includes these lines as mentioned before

I think this is a standard feature of Argo resource templates. You might want to make sure your Kubernetes environment supports Argo.

These need to be placed in emptyDir or volumes.

Can you explain what needs to be placed in the volumes?

We must use the "k8sapi" executor since docker is not present.

I think several of our users have used KFP in non-Docker environments. The Argo developers recommend the PNS executor as the best option going forward.

We tried one of the simple volume creation examples

I'd consider the VolumeOp to be an experimental community-driven feature. Modifying global state (such as creating Kubernetes objects) is somewhat frowned upon compared to pure components that only use their inputs.

Is there a way to create a volume

You can use `kubectl create ...`. You can create a container component that runs `kubectl create ...` — in fact, that's what VolumeOp and Argo do.

without any outputs that require EmptyDir

I'm not sure I understand how emptyDir comes into play here. Can you please help me understand?

@Ark-kun
Contributor

Ark-kun commented Aug 19, 2020

I think the best course of action is to first try to make the underlying orchestrator (Argo) work for you.

@Ark-kun
Contributor

Ark-kun commented Aug 19, 2020

I think it might be best for you to avoid VolumeOp for now. Try making the containers and data passing work. If the system throws an error asking you to create emptyDir volumes under the output paths, then it might be possible to post-process the compiled workflow to add them. We've suggested to the Argo developers that they auto-mount the emptyDir volumes if Argo needs them for operation.
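As a sketch of that post-processing idea — plain Python over the compiled workflow dict (e.g. loaded with `yaml.safe_load`); the volume name and mount path are assumptions, not KFP or Argo APIs:

```python
# Hypothetical post-processing step: mount an emptyDir volume into every
# container template of a compiled Argo workflow so the k8sapi executor
# can write output files. The mount path must cover the declared output
# paths; "/tmp/outputs" is an assumed convention.
def add_output_empty_dirs(workflow, mount_path="/tmp/outputs"):
    for template in workflow["spec"].get("templates", []):
        container = template.get("container")
        if container is None:
            continue  # skip steps/DAG/resource templates
        template.setdefault("volumes", []).append(
            {"name": "outputs", "emptyDir": {}}
        )
        container.setdefault("volumeMounts", []).append(
            {"name": "outputs", "mountPath": mount_path}
        )
    return workflow
```

One would run this between `kfp.compiler.Compiler().compile(...)` and submission, then dump the modified dict back to YAML.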

@nakfour
Member Author

nakfour commented Aug 19, 2020

@Ark-kun yes, Argo works; we have actually been using it in our production cluster for quite some time. The key is to use volumes for passing parameters; once that is done, it all works. We are trying to find a way to make KFP Python code work by allowing users to create volumes in the Python code, instead of telling them to run `oc create pv` first and then write their Python KFP code.

@nakfour
Member Author

nakfour commented Aug 19, 2020

@Ark-kun is it even possible to use emptyDir from Python KFP?

@nakfour
Member Author

nakfour commented Aug 20, 2020

@Ark-kun thanks for the suggestion in the KFP weekly meeting of a component to create a volume from Python. Is it possible to collaborate on creating this feature?

@eterna2
Contributor

eterna2 commented Sep 30, 2020

Hey @nakfour

I am trying to set up Kubeflow on our OpenShift cluster too, and I wouldn't mind helping out.

It would be great if you could also tag me on any OpenShift-related issues, so I can watch out for them (I am still at a very early stage of getting KF to work on our cluster).

@Ark-kun
Contributor

Ark-kun commented Sep 30, 2020

yes Argo works

I was interested in whether all of Argo works (e.g., whether the artifact- and parameter-passing examples work). Data passing is a core Argo feature that distinguishes it from many other orchestrators.

We are trying to find a way to make KFP Python code work by allowing users to create volumes in the Python code, instead of telling them to run `oc create pv` first and then write their Python KFP code.

We somewhat discourage volumes because they often make components non-reusable (they make them depend on cluster state). Also, volumes are global (like global variables), which makes components less pure and less reproducible. We also discourage volume creation as part of the pipeline, because it affects global state: if some pipeline task deletes a volume that a future task needs, there will be problems.

is it possible to even use emptyDir from python KFP?

Every KFP task has `task.add_volume()` and `task.add_volume_mount()`. They support all Kubernetes volume types, including emptyDir.

The key is to use volumes in passing parameters, once that is done, it all works.

AFAIK, for container components/templates mounting emptyDir volumes under the outputs should work around this Argo issue.
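Concretely, a hand-written illustration of what that workaround could look like in the compiled workflow; the template name, image, and mount path are made up:

```yaml
templates:
- name: my-step
  container:
    image: my-image
    volumeMounts:
    - {name: outputs, mountPath: /tmp/outputs}   # must cover the declared output paths
  volumes:
  - {name: outputs, emptyDir: {}}
```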

BTW, we've developed an experimental way to switch KFP's data passing from Argo's artifact repository to volumes. Please check:

from kfp.dsl import PipelineConf, data_passing_methods
from kubernetes.client.models import V1Volume, V1PersistentVolumeClaim

pipeline_conf = PipelineConf()
pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        persistent_volume_claim=V1PersistentVolumeClaim('data-volume'),
    ),
    path_prefix='artifact_data/',
)

@nakfour
Member Author

nakfour commented Oct 1, 2020

@Ark-kun wanted to see if there is any update on this issue? Thanks

@elikatsis
Member

elikatsis commented Oct 2, 2020

Hi @nakfour, this issue had totally slipped my mind!

If your only problem using VolumeOps is the outputs, then you can disable them in your DSL code as follows:

vop = kfp.dsl.VolumeOp(...)
vop.outputs = {}

If this helps you, we could add a constructor argument to disable the outputs, and then all that would be required is:

vop = kfp.dsl.VolumeOp(..., disable_outputs=True)

We also discourage volume creation as part of the pipeline, because it's affecting global state. If some pipeline task deletes a volume while some future task will need it, there would be problems.

Creating resources as part of the pipeline is essential to ML, and there are components and samples that do similar things. If I'm not mistaken, the taxi cab example creates deployments and services using pure components, doesn't it? The bad thing about those components is that users don't really have a say in the resources created 😄
Creating volumes is similar and comes with lots of benefits (I'll omit them since this issue isn't the place to discuss them).

Regarding the experimental data passing method, this code is wrong:

pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        persistent_volume_claim=V1PersistentVolumeClaim('data-volume'),
    ),
    path_prefix='artifact_data/',
)

The correct way to do it is

pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=V1Volume(
        name='data',
        persistent_volume_claim=V1PersistentVolumeClaimVolumeSource(claim_name='data-volume'),
    ),
    path_prefix='artifact_data/',
)

But all of this is too much boilerplate code. You can always use a PipelineVolume which makes things much easier since it is a V1Volume with some extra magic:

pipeline_conf.data_passing_method = data_passing_methods.KubernetesVolume(
    volume=kfp.dsl.PipelineVolume(pvc="data-volume"),
    path_prefix='artifact_data/',
)

However, this data-passing method is not easy to maintain. All your pipelines will be using the same PVC, or you will have to ask your users which PVC to use every time they want to run a pipeline. And what size should these PVCs be? How can you know how much data your users will be passing around?
Creating PVCs with VolumeOps at the beginning of the pipeline and (optionally) deleting them at the end seems like the easiest, fastest, and most sane way to go.

@zsaladin
Contributor

zsaladin commented Oct 8, 2020

In my case I need the outputs.parameters, because other containers depend on the PVC name.

So I changed my code.

Old code

volume = dsl.VolumeOp(
    name="volume",
    resource_name="volume",
    modes=dsl.VOLUME_MODE_RWO,
    size="1Gi"
)
merge_op = func_to_container_op(merge)
merge_task = merge_op()
merge_task.apply(
    mount_pvc(
        pvc_name=volume.volume.persistent_volume_claim.claim_name,
        volume_mount_path="/mnt"
    )
)

New code

volume = dsl.VolumeOp(
    name="volume",
    resource_name="volume",
    modes=dsl.VOLUME_MODE_RWO,
    size="1Gi"
)
volume.outputs = {}  # To exclude outputs

merge_op = func_to_container_op(merge)
merge_task = merge_op()
merge_task.apply(
    mount_pvc(
        pvc_name="{{workflow.name}}-volume",  # Hardcoded name instead of the output parameter
        volume_mount_path="/mnt"
    )
)

The PVC name is composed of the workflow name and the VolumeOp's name.
This works.
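To illustrate the naming convention, a toy pure-Python sketch of Argo's `{{workflow.name}}` substitution — not actual KFP or Argo code, just a demonstration of how the hardcoded reference above resolves at runtime:

```python
# VolumeOp names the created PVC "{{workflow.name}}-<name>"; Argo
# substitutes the workflow name when the workflow runs. A toy
# substitution shows how the hardcoded reference resolves:
def resolve(template: str, workflow_name: str) -> str:
    return template.replace("{{workflow.name}}", workflow_name)

print(resolve("{{workflow.name}}-volume", "my-pipeline-abc12"))
# -> my-pipeline-abc12-volume
```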

@nakfour
Member Author

nakfour commented Dec 9, 2020

Thanks @elikatsis, I will try your suggestion of disabling the outputs; if that works and we can create volumes with VolumeOp, then we should be good. @zsaladin thanks for the code example as well.

@stale

stale bot commented Jun 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jun 9, 2021
@stale

stale bot commented Apr 19, 2022

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Apr 19, 2022