This project prepares a container that can run on OLCF's Kubernetes infrastructure and provides YAML pod specification templates that can be used to spawn pods that mount OLCF's GPFS filesystem and provide access to the batch schedulers of Summit, Rhea, and the DTN.
bootstrap.sh. This script generates the personalized Dockerfile and the Kubernetes pod and service specifications for your deployment. It fills in the template files with your automation user account details and saves the results under the Docker and Specs folders.
Docker/Dockerfile. Dockerfile used to prepare a container with Pegasus and Condor, omitting Pegasus' R support.
Specs/pegasus-submit-build.yml. Contains Kubernetes build specifications for the pegasus-olcf image.
Specs/pegasus-submit-service.yml. Contains the Kubernetes service specification that can be used to spawn a NodePort service that exposes the HTCondor Gridmanager Service running in your submit pod to the outside world.
Specs/pegasus-submit-pod.yml. Contains the Kubernetes pod specification that can be used to spawn a Pegasus/Condor pod that has access to Summit's GPFS filesystem and its batch scheduler.
- OpenShift's origin client (oc): https://github.com/openshift/origin/releases
- A working RSA Token to access OLCF's systems
- An automation user for OLCF's systems
- Allocation on OLCF's Kubernetes Cluster
In bootstrap.sh, update the section "ENV Variables For User and Group" with your automation user's name, user id, group name, group id, and the Gridmanager Service port, which must be in the range 30000-32767.
More specifically, replace:
- USER, with the username of your automation user (e.g. csc001_auser)
- USER_ID, with the user id of your automation user (e.g. 20001)
- USER_GROUP, with the name of the project your automation user belongs to (e.g. csc001)
- USER_GROUP_ID, with the group id of the project your automation user belongs to (e.g. 10001)
- GRIDMANAGER_SERVICE_PORT, with the Kubernetes NodePort port number the Gridmanager Service should use (e.g. 32752)
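Since the port must fall in the 30000-32767 range, it may help to check the value before running the bootstrap script. The following is a minimal sketch; the `is_valid_nodeport` helper is hypothetical and is not part of bootstrap.sh, and the port value is the example from the list above.

```shell
# Hypothetical helper (not part of bootstrap.sh): succeed only if the
# argument is a whole number in the 30000-32767 range.
is_valid_nodeport() {
    port="$1"
    # Reject empty input and anything containing a non-digit character.
    case "$port" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$port" -ge 30000 ] && [ "$port" -le 32767 ]
}

# Example value taken from the list above; substitute your own port.
GRIDMANAGER_SERVICE_PORT=32752
if is_valid_nodeport "$GRIDMANAGER_SERVICE_PORT"; then
    echo "port $GRIDMANAGER_SERVICE_PORT is in range"
else
    echo "port $GRIDMANAGER_SERVICE_PORT is outside 30000-32767" >&2
fi
```

A value that fails this check should be corrected in bootstrap.sh before the spec files are generated.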
Generate the Dockerfile and the Spec files for your deployment.
bash bootstrap.sh
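To confirm that the script produced its outputs, a quick check along the following lines can be run from the repository root. This is only a sketch: the `missing_outputs` helper is hypothetical, and the file list assumes the generated files use the paths named in the file descriptions above.

```shell
# Hypothetical helper: print each expected output file that is missing
# under the given root directory (defaults to the current directory).
missing_outputs() {
    root="${1:-.}"
    for f in Docker/Dockerfile \
             Specs/pegasus-submit-build.yml \
             Specs/pegasus-submit-service.yml \
             Specs/pegasus-submit-pod.yml; do
        [ -f "$root/$f" ] || echo "$f"
    done
}

missing=$(missing_outputs .)
if [ -n "$missing" ]; then
    echo "missing generated files:" >&2
    echo "$missing" >&2
else
    echo "all generated files are present"
fi
```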
oc login -u YOUR_USERNAME https://marble.ccs.ornl.gov/
oc create -f Specs/pegasus-submit-build.yml
oc start-build pegasus-olcf --from-file=Docker/Dockerfile
You can trace the log of the build by running:
oc logs -f build/pegasus-olcf-1
oc create -f Specs/pegasus-submit-service.yml
oc create -f Specs/pegasus-submit-pod.yml
oc exec -it pegasus-submit -- /bin/bash
cd $HOME #Execute this in the interactive shell
If this is the first time you are using the service, configure batch submissions by running the following command:
bash /opt/remote_bosco_setup.sh #Execute this in the interactive shell
To delete the pod, exit the interactive shell by typing "exit" and then run:
oc delete pod pegasus-submit
To delete the service, run:
oc delete svc pegasus-submit-service