This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[openmpi] Introduce a sidecar container for inter-pod synchronization
* openmpi-controller monitors the master pod's status and creates a semaphore file "term.sig" to signal openmpi-job to terminate
* openmpi-job is now decoupled from Kubernetes
* openmpi-controller and openmpi-job share a volume for inter-container communication
* openmpi-controller can be extended in the future to support data snapshots
Jie Zhang committed Apr 23, 2018
1 parent 5eb0d72, commit 3abe08d
Showing 12 changed files with 229 additions and 137 deletions.
@@ -0,0 +1,2 @@
.openmpi-controller
env
@@ -0,0 +1,10 @@
FROM python:3.6

USER root

ENV HOME /root

ADD requirements.txt $HOME
ADD controller $HOME/controller

RUN pip3 install -r $HOME/requirements.txt
@@ -0,0 +1,11 @@
# TODO: move to kubeflow
IMAGE=jiez/openmpi-controller
TAG=latest

build:
	docker build --pull -t ${IMAGE}:${TAG} .

push: build
	docker push ${IMAGE}:${TAG}

.PHONY: build push
Empty file.
@@ -0,0 +1,63 @@
from time import sleep
from pathlib import Path

from retrying import retry
from kubernetes import client, config
from kubernetes.client.rest import ApiException
from kubernetes.config.config_exception import ConfigException


SIG_DIR = '.openmpi-controller'
SIG_TERM = f'{SIG_DIR}/term.sig'
POD_MASTER = 'openmpi-master'
POLL_STATUS_INTERVAL = 10
TERMINATED_PHASES = ('Succeeded', 'Failed')


class Controller:
    """
    Controller is a sidecar container that extends the "main" container (openmpi-job).
    It communicates with the main container using a shared volume mounted at the working directory.
    Right before it finishes its work, it creates a semaphore file "term.sig" to signal the main container to terminate.
    """

    def __init__(self, namespace):
        self.namespace = namespace
        Path(SIG_DIR).mkdir()

    def __enter__(self):
        log('controller entered')
        try:
            config.load_incluster_config()
        except ConfigException:
            config.load_kube_config()

        self.api = client.CoreV1Api()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        log('controller exited')
        Path(SIG_TERM).touch()

    def wait_master_terminated(self):
        while True:
            phase = self._get_master_phase()
            log(f'{POD_MASTER} is in "{phase}" phase')
            if phase in TERMINATED_PHASES:
                break

            sleep(POLL_STATUS_INTERVAL)

    @retry(stop_max_attempt_number=5,
           wait_exponential_multiplier=1000,
           retry_on_exception=lambda e: isinstance(e, ApiException),)
    def _get_master_phase(self):
        pod = self.api.read_namespaced_pod(POD_MASTER, self.namespace)
        return pod.status.phase


def log(msg):
    print(msg, flush=True)
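The `@retry` decorator on `_get_master_phase` retries the pod lookup up to five times with an exponentially growing wait, and only for `ApiException`. As a rough illustration of that pattern, here is a minimal standard-library sketch. The names `call_with_backoff`, `TransientError`, and `make_flaky` are hypothetical and not part of this commit, and the delay schedule (doubling from a base delay) is an assumption of the sketch; consult the `retrying` documentation for the library's exact timing.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for kubernetes.client.rest.ApiException."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff.

    Stdlib approximation of the retry pattern, not the retrying library's code.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # delay doubles on each attempt


def make_flaky(failures, result):
    """Return a callable that raises `failures` times, then returns `result`."""
    state = {'left': failures}

    def flaky():
        if state['left'] > 0:
            state['left'] -= 1
            raise TransientError('temporary API failure')
        return result

    return flaky


# Record the requested delays instead of actually sleeping.
delays = []
phase = call_with_backoff(make_flaky(2, 'Running'),
                          retry_on=(TransientError,),
                          sleep=delays.append)
print(phase, delays)  # prints: Running [2.0, 4.0]
```

Exceptions outside `retry_on` propagate immediately, which mirrors the intent of `retry_on_exception` in the decorator above: only API errors are treated as transient.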
@@ -0,0 +1,16 @@
from argparse import ArgumentParser

from controller import Controller


def main():
    parser = ArgumentParser()
    parser.add_argument('--namespace', type=str, required=True)
    args = parser.parse_args()

    with Controller(args.namespace) as ctl:
        ctl.wait_master_terminated()


if __name__ == '__main__':
    main()
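Because the semaphore file is touched in `__exit__`, it is created whenever the `with` block ends, including when `wait_master_terminated` raises after its retries are exhausted. How openmpi-job consumes the file is not shown in this excerpt, so the following is only a sketch under assumptions: `SignalOnExit` is a hypothetical stand-in for `Controller` without the Kubernetes calls, and `wait_for_signal` is a hypothetical polling loop the main container could run against the shared volume.

```python
import tempfile
import time
from pathlib import Path


class SignalOnExit:
    """Hypothetical stand-in for Controller: touches the file on block exit."""

    def __init__(self, sig_file):
        self.sig_file = Path(sig_file)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs on normal exit and on exceptions alike.
        self.sig_file.parent.mkdir(parents=True, exist_ok=True)
        self.sig_file.touch()
        return False  # do not swallow the exception


def wait_for_signal(sig_file, poll_interval=0.05, timeout=5.0):
    """Poll until sig_file exists; return True if it appeared before timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if Path(sig_file).exists():
            return True
        time.sleep(poll_interval)
    return Path(sig_file).exists()


with tempfile.TemporaryDirectory() as shared:
    # Same relative layout as SIG_TERM, placed in a temporary "shared volume".
    sig = Path(shared) / '.openmpi-controller' / 'term.sig'

    before = wait_for_signal(sig, timeout=0.1)  # nothing signalled yet
    try:
        with SignalOnExit(sig):
            raise RuntimeError('polling failed')  # simulate a controller error
    except RuntimeError:
        pass
    after = wait_for_signal(sig, timeout=0.1)

print(before, after)  # prints: False True
```

The sketch creates the signal directory with `exist_ok=True`, whereas `Controller.__init__` calls `Path(SIG_DIR).mkdir()` without it and would raise `FileExistsError` if the directory were already present on the shared volume, for example after a container restart.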
@@ -0,0 +1,2 @@
kubernetes==6.0.0
retrying==1.3.3