KF 1.7 Run metadata not updating #7579

savemuri · 2024-05-17T16:32:22Z

/kind bug

What steps did you take and what happened:
I am running Kubeflow 1.7. When I clone an existing run and open /pipeline/?ns=${SOME_NAMESPACE}#/runs/details/${UUID}, the run status takes too long to update (around 300s). Until then, the run is stuck in the Unknown status.

Note:

The workflow pod does get created and executes correctly.
Runs created by a scheduled workflow do not have the same issue.

What did you expect to happen:
I expected the run status to update with task information.

Anything else you would like to add:
I tried scaling up the ml-pipeline and persistence-agent replicas, but it did not help.

Environment:

Kubeflow version: 1.7
kfctl version: n/a
Kubernetes platform: EKS
Kubernetes version: not provided
OS: not provided

savemuri · 2024-05-23T14:03:40Z

Resolved after cleaning up old workflows that had not been deleted. The persistence agent had become a bottleneck because workflow GC was not working as intended.
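For anyone hitting the same symptom, here is a minimal sketch of how leftover workflows could be identified before cleanup. It is not the exact procedure used above. It assumes the workflows are Argo Workflow resources exported as a JSON list (for example from `kubectl get workflows -n <namespace> -o json`) and that each finished workflow carries a `status.finishedAt` timestamp. The function name `stale_workflows` and the 7-day threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_workflows(workflow_list, max_age, now=None):
    """Return names of workflows that finished more than max_age ago.

    workflow_list is assumed to be the parsed JSON of a Kubernetes list
    of Argo Workflow objects (a dict with an "items" array).
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for item in workflow_list.get("items", []):
        finished_at = item.get("status", {}).get("finishedAt")
        if not finished_at:
            # No finish timestamp: treat as still running and skip it.
            continue
        finished = datetime.strptime(
            finished_at, "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if now - finished > max_age:
            stale.append(item["metadata"]["name"])
    return stale


# Illustrative data only; these workflow names are made up.
sample = {
    "items": [
        {"metadata": {"name": "old-run"},
         "status": {"finishedAt": "2024-05-01T00:00:00Z"}},
        {"metadata": {"name": "recent-run"},
         "status": {"finishedAt": "2024-05-22T00:00:00Z"}},
        {"metadata": {"name": "running"}, "status": {}},
    ]
}
reference_time = datetime(2024, 5, 23, tzinfo=timezone.utc)
candidates = stale_workflows(sample, timedelta(days=7), now=reference_time)
print(candidates)
```

The resulting names are only candidates for deletion. Review them before removing anything, since deleting a workflow that the persistence agent has not yet reported may lose run status.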

google-oss-prow bot added the kind/bug label May 17, 2024

kubeflow-bot added this to To Do in Needs Triage May 17, 2024

savemuri closed this as completed May 23, 2024

Needs Triage automation moved this from To Do to Closed May 23, 2024

kubeflow-bot removed this from Closed in Needs Triage May 23, 2024
