-
Notifications
You must be signed in to change notification settings - Fork 143
use init container for worker pod to wait master pod ready #187
Conversation
Hi @zlcnju. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
add a init container default use busybox image, add wait master running by check dns master addr. And this initcontainer config can be configured in the pytorch configmap. If aggred, I will add the configmap config in the kubeflow project |
pkg/common/config/config.go
Outdated
- name: init-pytorch | ||
image: busybox | ||
command: ['sh', '-c', 'until nslookup {{.MasterAddr}}; do echo waiting for master; sleep 2; done;']` | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define the default init container template as a const string will be better, no need to add a function here.
var initContainerTemplate string | ||
|
||
func init() { | ||
bytes, err := ioutil.ReadFile("/etc/config/initContainer.yaml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why read init container template from a file of hard coded path? Is it better to add a new arg to the cmd to accept the init container template file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified
/ok-to-test |
var initContainerTemplate string | ||
|
||
func init() { | ||
bytes, err := ioutil.ReadFile("/etc/config/initContainer.yaml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to expose InitContainerTemplate to user? Any particular use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least, the image used in init container will be configured in the private cloud. And to avoid the other use case not in mind, expose InitContainerTemplate.
/cc @gaocegege |
pkg/common/config/config.go
Outdated
|
||
var initContainerTemplate = ` | ||
- name: init-pytorch | ||
image: busybox |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a tag for the image and set IfNotPresent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified
@johnugeorge @gaocegege any other suggestions? |
/lgtm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@zlcnju @tossmilestone Thanks for your review! |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: richardsliu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
May I ask which release will include this fix? |
Can you use latest 0.7 release |
I use the 0.7.0 image from gcr.io, do this image contain this fix? |
you can check the pod create by pyjob, If init container exists, this fix included |
it do contain this fix, thank you very much |
use init container for worker pod to wait master pod ready
fix 186