Adds docs for k8s deployment
pritchardn committed Mar 14, 2022
1 parent 3fb9f0e commit 2493ab6
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions docs/deployment.rst
@@ -66,13 +66,14 @@ Deployment with OpenOnDemand

`OpenOnDemand <https://openondemand.org>`_ (OOD) is a system providing an interactive interface to remote compute resources. It is becoming increasingly popular with a number of HPC centers around the world. The two Australian research HPC centers Pawsey and NCI are planning to roll it out for their users. Independently, we had realized that |daliuge| lacks an authentication, authorization and session management system, and started looking into OOD as a solution. After a short evaluation we started integrating OOD into the deployment for our small in-house compute cluster. To make this work we needed to implement an additional interface between the translator running on an external server (e.g. AWS) and OOD, and from there into the (SLURM) batch job system. This interface code is currently in a separate private git repository, but will be released as soon as we have finished testing it. The code mimics the |daliuge| data island manager's REST interface, but instead of launching the workflow directly it prepares a SLURM job submission script and places it into the queue. Users can then use the standard OOD web pages to monitor the jobs and access the logs and results of the workflow execution. OOD allows the integration of multiple compute resources, including Kubernetes and also (to a certain degree) GCP, AWS and Azure. Once configured, users can choose to submit their jobs to any of those. Our OOD interface code has been implemented as an OOD-embedded `Phusion Passenger <https://www.phusionpassenger.com/>`_ `Flask <https://flask.palletsprojects.com/en/2.0.x/>`_ application, which is `WSGI <https://wsgi.readthedocs.io>`_ compliant. Very little inside that application is OOD-specific, so it can easily be ported to other deployment scenarios.
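
The interface code itself is not yet public, so the following is only a minimal sketch of the general idea described above: render a SLURM batch script for a workflow session and hand it to ``sbatch`` rather than launching the workflow directly. The function names, the ``session_id`` parameter, the log file layout and the command passed in are all hypothetical and are not taken from the |daliuge| code base; only the ``#SBATCH`` directives and the ``sbatch`` command are standard SLURM.

```python
import shlex
import subprocess
from pathlib import Path


def render_slurm_script(session_id, nodes, command, log_dir="logs"):
    """Return the text of a SLURM batch script that runs `command`.

    All parameters are illustrative; `command` is a list of arguments
    for whatever program should start the workflow.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={session_id}",
        f"#SBATCH --nodes={nodes}",
        # %j is expanded by SLURM to the job ID.
        f"#SBATCH --output={log_dir}/{session_id}-%j.log",
        "",
        shlex.join(command),
    ]
    return "\n".join(lines) + "\n"


def submit(script_text, script_path):
    """Write the script to disk and queue it with sbatch (requires SLURM)."""
    path = Path(script_path)
    path.write_text(script_text)
    result = subprocess.run(
        ["sbatch", str(path)], check=True, capture_output=True, text=True
    )
    # sbatch reports the ID of the queued job on stdout.
    return result.stdout.strip()
```

In a sketch like this, a REST handler would call ``render_slurm_script`` with the details of the incoming session and then ``submit`` the result, leaving monitoring and log access to the standard OOD pages.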

Deployment with Kubernetes (Coming Soon)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Deployment with Kubernetes (Experimental)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Kubernetes is a canonical container orchestration system.
We are building support to deploy workflows as Helm charts, which will enable easier and more reliable deployments across more computing facilities.
Support is currently limited but watch this space.
Multi-node Kubernetes clusters are now supported. To get started, see `start_helm_cluster.py <https://github.com/ICRAR/daliuge/blob/master/daliuge-engine/dlg/deploy/start_helm_cluster.py>`_ for an example of usage.
Your environment will need to have ``kubectl`` properly configured to point to your desired cluster.
See `daliuge-k8s/README.md <https://github.com/ICRAR/daliuge/tree/master/daliuge-k8s>`_ for a more detailed setup guide.
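
Before attempting a deployment, it can help to confirm that ``kubectl`` is installed and has an active context. The following is a small, hypothetical pre-flight check; it is not part of |daliuge| and the function names are illustrative. It relies only on the standard ``kubectl config current-context`` command.

```python
import shutil
import subprocess


def current_kube_context(run=subprocess.run):
    """Return the active kubectl context name, or None if none is set.

    `run` is injectable so the function can be exercised without kubectl.
    """
    result = run(
        ["kubectl", "config", "current-context"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


def check_kubectl():
    """Return a short human-readable status message."""
    if shutil.which("kubectl") is None:
        return "kubectl not found on PATH"
    context = current_kube_context()
    if context is None:
        return "kubectl has no current context"
    return f"kubectl points at context '{context}'"


if __name__ == "__main__":
    print(check_kubectl())
```

If the reported context is not the cluster you intend to deploy to, it can be changed with ``kubectl config use-context``.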


Component Deployment