Client.wait_for_run_completion and "Terminate" in the UI #3576

RunOrVeith · 2020-04-21T17:07:14Z

What steps did you take:

Start a pipeline from Python code and wait for its execution:

from kfp import Client

client = Client()
# Fill this in with whatever works for you; any pipeline that does not return immediately will do.
run = client.run_pipeline(experiment_id="test",
                          job_name="test job",
                          pipeline_id="123456")
# We don't care how long it takes, just wait.
completed_run = client.wait_for_run_completion(run_id=run.id, timeout=float("inf"))

Once it is running, click "Terminate" at the top right of the Kubeflow Pipelines UI.
In my case, I hit "Terminate" while the run was stuck on "ImagePullBackOff" (some permissions in our Docker registry had changed), so nothing had actually started running yet and the step was still in the "Pending" state.

What happened:

The code now blocks forever and is never notified of the termination.

What did you expect to happen:

The code returns with a message that the run was terminated, or crashes with the same message. Either is fine.
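
To make the expected behaviour concrete, here is a minimal sketch of a polling loop that treats termination as a terminal outcome. It is not the SDK's implementation: `wait_for_terminal_status`, `RunTerminated`, the `fetch_status` callable, and the status strings in `TERMINAL_STATES` are all hypothetical names chosen for illustration, and the strings a real backend reports may differ.

```python
import time
from typing import Callable, Optional

# Assumed status names, for illustration only; a real backend may report others.
TERMINAL_STATES = {"succeeded", "failed", "skipped", "error", "terminated"}


class RunTerminated(Exception):
    """Raised when the polled run reports that it was terminated."""


def wait_for_terminal_status(
    fetch_status: Callable[[], Optional[str]],
    timeout: float,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll fetch_status() until it reports a terminal state or the timeout expires.

    fetch_status is a hypothetical callable that returns the current run status
    (or None if no status is known yet).
    """
    deadline = clock() + timeout  # float("inf") means "never time out"
    while True:
        status = fetch_status()
        if status is not None and status.lower() in TERMINAL_STATES:
            if status.lower() == "terminated":
                # Crash with an explicit message instead of blocking forever.
                raise RunTerminated("the run was terminated")
            return status
        if clock() >= deadline:
            raise TimeoutError("run did not finish within the timeout")
        sleep(poll_interval)
```

With a helper like this, the caller either gets the final status back or sees `RunTerminated` / `TimeoutError`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.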

I am not sure whether this is the same bug as #1992 and #2588, since their symptom also applies here (the run is still in the pending state after hitting "Terminate"). Nevertheless, I wanted to point out that the termination does not cause "wait_for_run_completion" to return either.

Environment:

KFP version: Build commit: ee207f2

KFP SDK version: 0.4.0

/kind bug
/area backend
/area sdk

Bobgy · 2020-05-07T06:07:27Z

@RunOrVeith Do you know if this still exists in the latest KFP version?
It looks like your deployment is very old (Sep. 25th, 2019).

RunOrVeith · 2020-05-07T07:43:24Z

@Bobgy Not sure. We upgraded our deployment to ca58b22 last weekend, but I can't manually trigger the ImagePullBackOff error to check.
If you tell me how to do that, I can try it again.

Bobgy · 2020-05-07T08:27:12Z

@RunOrVeith You can follow the doc at https://www.kubeflow.org/docs/pipelines/sdk/build-component/#create-a-python-function-to-wrap-your-component to create a container op with an invalid image.

RunOrVeith · 2020-05-29T15:30:00Z

I just tried it again (or rather I just got stuck in the pending state again), and it seems to work now.
I'll close this issue.

k8s-ci-robot added kind/bug area/backend area/sdk labels Apr 21, 2020

Bobgy added status/triaged priority/p1 labels May 7, 2020

Bobgy added this to To do in KFP Post 1.0 Backlog via automation May 7, 2020

RunOrVeith closed this as completed May 29, 2020

KFP Post 1.0 Backlog automation moved this from To do to Done May 29, 2020
