
fix sync PodGroup logic #1012

Merged
merged 1 commit into from May 28, 2019


wackxu
Contributor

@wackxu wackxu commented May 27, 2019

fix #1011

/assign @gaocegege @richardsliu



@coveralls

coveralls commented May 27, 2019


Coverage remained the same at 76.744% when pulling 36dd05a on wackxu:fixevent into e031485 on kubeflow:master.

@johnugeorge
Member

Are events regenerated for a terminated job during reconcile because of #965?

@wackxu
Contributor Author

wackxu commented May 27, 2019

@johnugeorge When gang scheduling is enabled in tf-operator, every TFJob that has succeeded or failed goes through the same steps in a single sync: the PodGroup is created and then deleted, and two events are regenerated. In other words, each sync issues at least four useless API requests, which increases the load on the API server.
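
To make the request count in the comment above concrete, here is a minimal Go sketch of the two sync orderings. It is not the tf-operator code and the actual diff is not reproduced in this conversation; `fakeAPI`, `syncBefore`, `syncAfter`, and `JobPhase` are hypothetical names invented for illustration, and the "after" ordering is only an assumption about the general shape of the fix (check for termination before creating the PodGroup).

```go
package main

import "fmt"

// JobPhase is a hypothetical, simplified stand-in for a TFJob's status.
type JobPhase string

const (
	Running   JobPhase = "Running"
	Succeeded JobPhase = "Succeeded"
	Failed    JobPhase = "Failed"
)

// fakeAPI counts the requests that would otherwise reach the API server.
type fakeAPI struct {
	requests       int
	podGroupExists bool
}

func (a *fakeAPI) createPodGroup()           { a.requests++; a.podGroupExists = true }
func (a *fakeAPI) deletePodGroup()           { a.requests++; a.podGroupExists = false }
func (a *fakeAPI) recordEvent(reason string) { a.requests++ }

func isTerminated(p JobPhase) bool { return p == Succeeded || p == Failed }

// syncBefore models the behavior described in the comment: the PodGroup is
// created first, and only afterwards is the terminated state handled.
func syncBefore(api *fakeAPI, phase JobPhase, gang bool) {
	if gang && !api.podGroupExists {
		api.createPodGroup()
		api.recordEvent("PodGroupCreated")
	}
	if isTerminated(phase) {
		if gang && api.podGroupExists {
			api.deletePodGroup()
			api.recordEvent("PodGroupDeleted")
		}
		return
	}
	// Pod and service reconciliation would happen here.
}

// syncAfter is an assumed shape of the fix: handle terminated jobs first, so
// a PodGroup is never created for a job that has already finished.
func syncAfter(api *fakeAPI, phase JobPhase, gang bool) {
	if isTerminated(phase) {
		if gang && api.podGroupExists {
			api.deletePodGroup()
			api.recordEvent("PodGroupDeleted")
		}
		return
	}
	if gang && !api.podGroupExists {
		api.createPodGroup()
		api.recordEvent("PodGroupCreated")
	}
	// Pod and service reconciliation would happen here.
}

func main() {
	before, after := &fakeAPI{}, &fakeAPI{}
	// Three resyncs of a job that has already succeeded.
	for i := 0; i < 3; i++ {
		syncBefore(before, Succeeded, true)
		syncAfter(after, Succeeded, true)
	}
	fmt.Println("requests with old ordering:", before.requests)
	fmt.Println("requests with new ordering:", after.requests)
}
```

In this toy model the old ordering costs four requests per sync of a finished job (create, event, delete, event), while the reordered version costs none once the PodGroup is gone.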

@johnugeorge
Member

/lgtm

Member

@gaocegege gaocegege left a comment


/lgtm

Can you make a similar change to v1beta2?

@wackxu
Contributor Author

wackxu commented May 28, 2019

@gaocegege Yes, done

Member

@gaocegege gaocegege left a comment


/lgtm
/approve

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gaocegege

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wackxu
Contributor Author

wackxu commented May 28, 2019

/test kubeflow-tf-operator-presubmit

@k8s-ci-robot k8s-ci-robot merged commit 63a3c3c into kubeflow:master May 28, 2019
@gaocegege
Member

Thanks for your contribution!

Development

Successfully merging this pull request may close these issues.

Podgroup is constantly created and deleted after tfjob is success or failure
6 participants