
[sdk] Task.after() on Condition OpsGroup vs ContainerOp #5422

Closed
Tomcli opened this issue Apr 3, 2021 · 9 comments
Labels
area/sdk/dsl/compiler, kind/question, lifecycle/stale, upstream_issue

Comments

@Tomcli
Member

Tomcli commented Apr 3, 2021

We have a few questions about how the condition is constructed in Argo, e.g. using the condition example below:
https://github.com/kubeflow/pipelines/blob/master/samples/core/condition/condition.py

When we define a ContainerOp that runs after a dsl.Condition, or after a ContainerOp inside the condition block, both cases produce the same result when compiled to Argo. We are wondering whether this is the expected DSL behavior, i.e. that the task dependencies should point to the Argo condition task in both cases?

@dsl.pipeline(
    name='Conditional execution pipeline',
    description='Shows how to use dsl.Condition().'
)
def flipcoin_pipeline():
    flip = flip_coin_op()
    with dsl.Condition(flip.output == 'heads') as cond_a:
        random_num_head = random_num_op(0, 9)
        random_num_head2 = random_num_op(0, 9)

    with dsl.Condition(flip.output == 'tails') as cond_b:
        random_num_tail = random_num_op(10, 19)
        random_num_tail2 = random_num_op(10, 19)

    # print_output and print_output2 produce the same dependencies
    print_output = print_op('after cond_a cond_b').after(cond_a).after(cond_b)
    print_output2 = print_op('after random_num_head random_num_tail').after(random_num_head).after(random_num_tail)

In the compiled Argo DAG:

    dag:
      tasks:
      - name: condition-1
        template: condition-1
        when: '"{{tasks.flip-coin.outputs.parameters.flip-coin-output}}" == "heads"'
        dependencies: [flip-coin]
      - name: condition-2
        template: condition-2
        when: '"{{tasks.flip-coin.outputs.parameters.flip-coin-output}}" == "tails"'
        dependencies: [flip-coin]
      - {name: flip-coin, template: flip-coin}
      - name: print
        template: print
        dependencies: [condition-1, condition-2]
      - name: print-2
        template: print-2
        dependencies: [condition-1, condition-2]
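For context, each dsl.Condition group becomes its own nested-DAG template, and an Argo DAG task can only list dependencies on tasks in the same template, so the outer DAG can only reference condition-1/condition-2 themselves. A rough sketch of such a nested template (a sketch only; task and template names are illustrative, not taken from the compiled output):

    - name: condition-1
      dag:
        tasks:
        # the ops created inside "with dsl.Condition(...) as cond_a:" exist only
        # in this nested DAG; the outer DAG's dependencies lists cannot reference them
        - {name: generate-random-number, template: generate-random-number}
        - {name: generate-random-number-2, template: generate-random-number-2}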

Thanks.

related docs: https://docs.google.com/document/d/1QPWKoeiPFDcI1JWH-nMe7x_xIJekL11bhu7Fp3fNT24/edit#

@Tomcli
Member Author

Tomcli commented Apr 3, 2021

/kind question

@Tomcli
Member Author

Tomcli commented Apr 3, 2021

cc @Udiknedormin

@Udiknedormin
Contributor

@Tomcli Per my understanding:

  • running an op after a condition group should run whether the predicate passed or not, as I want to run it after the whole DAG
  • running an op after some op inside the group should run only if the predicate passed, as I only want to run it if the actual conditional task was executed

Of course it should work with data passing too, in the op-after-op case (the op-after-DAG case cannot use it, for obvious reasons). That is, unless the condition's status is exposed as a PipelineParam, which would introduce additional interesting possibilities: a component which runs after the condition could check whether the condition DAG was run or not.

@zijianjoy zijianjoy assigned zijianjoy and unassigned zijianjoy Apr 9, 2021
@zijianjoy zijianjoy added this to Needs triage in KFP SDK Triage via automation Apr 9, 2021
@capri-xiyue
Contributor

/assign @chensun

@capri-xiyue
Contributor

/assign @neuromage

@Ark-kun
Contributor

Ark-kun commented Apr 10, 2021

We are wondering whether this is the expected DSL behavior, i.e. that the task dependencies should point to the Argo condition task in both cases?

This is unexpected behavior.

I ran my own experiment and found several compiler bugs:

import kfp
from kfp import components

@components.create_component_from_func
def produce_str() -> str:
    return "Hello world"

@components.create_component_from_func
def consume_str(message: str):
    print(message)

def my_pipeline():
    hello_str = produce_str().output
    with kfp.dsl.Condition(hello_str == "Hello world"):
        produce_task = produce_str()
    # consume the conditional task's output outside the condition block
    consume_str(produce_task.output)

if __name__ == '__main__':
    kfp.compiler.Compiler().compile(my_pipeline, '2021-04-09 Check conditionals.yaml')

Error: invalid spec: templates.my-pipeline.tasks.condition-1 templates.condition-1.outputs failed to resolve {{tasks.produce-str-2.outputs.parameters.produce-str-2-Output}}
This is a compiler bug: the data passing rewriter did not detect that produce-str-2-Output was used as a parameter downstream, so only the artifact output remained.
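Concretely, the rewriter left the inner produce-str-2 template with only the artifact output, while condition-1's outputs still reference the parameter. A rough sketch of what the inner template would need to keep for the reference to resolve (a sketch only; the container spec is omitted and the path is just the usual KFP default):

    - name: produce-str-2
      outputs:
        artifacts:
        - {name: produce-str-2-Output, path: /tmp/outputs/Output/data}
        # this parameter output is what the rewriter dropped and what
        # templates.condition-1.outputs tries to resolve
        parameters:
        - name: produce-str-2-Output
          valueFrom: {path: /tmp/outputs/Output/data}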

Now trying the same with artifacts:

from kfp.components import InputPath, OutputPath, create_component_from_func
@create_component_from_func
def produce_file(output_path: OutputPath()):
    from pathlib import Path
    Path(output_path).write_text(output_path)

@create_component_from_func
def consume_file(message_path: InputPath()):
    from pathlib import Path
    message = Path(message_path).read_text()
    print(message)

# reuses produce_str and the imports from the previous snippet
def my_file_pipeline():
    hello_str = produce_str().output
    with kfp.dsl.Condition(hello_str == "Hello world"):
        produce_file_task = produce_file()
    consume_file(produce_file_task.output)

    with kfp.dsl.Condition(hello_str == "Tails"):
        produce_file2_task = produce_file()
    consume_file(produce_file2_task.output)


if __name__ == '__main__':
    import kfp
    kfp.compiler.Compiler().compile(my_file_pipeline, '2021-04-09 Check conditionals.artifacts.yaml')

The first part works fine, but the second consume task fails with "This step is in Error state with this message: Unable to resolve: {{tasks.condition-2.outputs.artifacts.produce-file-2-output}}" (presumably because condition-2 was skipped, so its output artifact was never produced).

To sum it up, I've found 3 issues:

  1. Argo seems to treat Skipped tasks the same as Succeeded when evaluating dependencies. I consider this surprising and against the spirit of the dependency concept. We can try contacting Argo about changing this on their side. Meanwhile, as a workaround, we should fix it at the compiler level: a dependency should mean "dependency on success", as sketched after this list. (Issue 3 complicates the fix, though.)
  2. The KFP compiler (data passing rewriter) seems to miss output value usage in the when clause. This is a bug.
  3. The KFP compiler seems to compile a dependency on a conditional task as a dependency on the condition DAG. This does not seem correct.
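Regarding issue 1, a minimal sketch of what a compiler-level "dependency on success" could look like, using Argo's enhanced depends field (available since Argo Workflows 2.9; this is an assumption about a possible fix, not what the compiler currently emits):

      - name: print
        template: print
        # a bare task name in depends (like plain dependencies) is also satisfied
        # when the upstream task is Skipped; spelling out .Succeeded makes this
        # task run only if the condition group actually executed
        depends: "condition-1.Succeeded"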

P.S. I'm not sure that using .after() on an OpsGroup/Condition was ever intended to be supported.

@Tomcli
Member Author

Tomcli commented Apr 12, 2021

Thanks @Ark-kun. Good to know this is a bug in the KFP Argo compiler. We were able to work around it in the KFP-Tekton compiler because we decided not to put conditional tasks into a sub-DAG, as an optimization for Tekton.

We want to make sure the KFP Argo compiler also handles dependencies on conditional tasks correctly, so that the same KFP DSL does not diverge in runtime behavior between backends.

@stale

stale bot commented Jul 11, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the lifecycle/stale The issue / pull request is stale, any activities remove this label. label Jul 11, 2021
@stale

stale bot commented Apr 18, 2022

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@stale stale bot closed this as completed Apr 18, 2022
KFP SDK Triage automation moved this from Needs triage to Closed Apr 18, 2022