Armada

Armada is a CNCF Sandbox project used in production at G-Research. It was written by G-Research, with contributions from G-Research Open Source members and others.

For an overview of Armada, see these videos:

The Armada Project adheres to the CNCF Code of Conduct.

Armada is a batch job scheduler that runs across multiple Kubernetes clusters.

Armada is designed to address the following issues:

A single Kubernetes cluster cannot be scaled indefinitely, and managing very large Kubernetes clusters is challenging. Hence, Armada is a multi-cluster scheduler built on top of several Kubernetes clusters.
Achieving very high throughput using the in-cluster storage backend, etcd, is challenging. Hence, queueing and scheduling are performed partly out-of-cluster using a specialized storage layer.

Armada is designed primarily for machine learning, AI, and data analytics workloads, and to:

Manage compute clusters composed of tens of thousands of nodes in total.
Schedule a thousand or more pods per second, on average.
Enqueue tens of thousands of jobs over a few seconds.
Divide resources fairly between users.
Provide visibility for users and admins.
Ensure near-constant uptime.
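
As an illustration of the fairness goal above, here is a minimal sketch of max-min fair sharing, a common textbook way to divide a shared resource between competing users. It is not Armada's scheduling algorithm; the function name, the user names, and the numbers are hypothetical and chosen only for the example.

```python
def max_min_fair_share(capacity, demands):
    """Divide `capacity` among users by max-min fairness (water-filling).

    `demands` maps a user name to the amount of resource that user wants.
    Illustrative only: this is NOT how Armada computes fair shares.
    """
    allocation = {user: 0.0 for user in demands}
    remaining = float(capacity)
    # Users whose demand has not yet been fully met.
    unsatisfied = {user for user, demand in demands.items() if demand > 0}

    while unsatisfied and remaining > 1e-12:
        # Offer every unsatisfied user an equal slice of what is left.
        share = remaining / len(unsatisfied)
        satisfied = set()
        for user in unsatisfied:
            need = demands[user] - allocation[user]
            grant = min(share, need)
            allocation[user] += grant
            remaining -= grant
            if demands[user] - allocation[user] <= 1e-12:
                satisfied.add(user)
        if not satisfied:
            # Everyone took a full slice, so the capacity is used up.
            break
        unsatisfied -= satisfied

    return allocation


if __name__ == "__main__":
    # Hypothetical example: 100 CPU cores shared by three users.
    shares = max_min_fair_share(100, {"alice": 20, "bob": 50, "carol": 80})
    for user, cores in sorted(shares.items()):
        print(f"{user}: {cores:.1f} cores")
```

In this example the user with the smallest demand is served in full, and the capacity left over is split evenly between the two users who asked for more than an equal share.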

Documentation

For an overview of the architecture and design of Armada, and instructions for submitting jobs, see:

We expect readers of the documentation to have a basic understanding of Docker and Kubernetes; see, e.g., the following links:
