Genie provides scalable, federated job and resource management for users of computational resources.
From the end user's perspective, Genie abstracts away the physical details of various (potentially transient) computational resources, such as YARN, Spark, and Mesos clusters. It then provides APIs to submit and monitor jobs on these clusters, so users do not have to install any clients themselves or know the details of the clusters and commands.
Administrators use the configuration APIs to register clusters, and the commands and applications that run on them, with Genie. Genie nodes can have all the clients pre-installed, or, if properly configured, Genie will download and install them at runtime. Users can then look up which clusters and commands are available and submit jobs to be processed. Once jobs are submitted, users can query Genie for job status and output.
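The register, look up, submit, and query flow can be pictured with a small in-memory model. This is only a sketch of the idea, not Genie's actual API: the `ToyGenie` class, its method names, the cluster and command names, and the `RUNNING`/`SUCCEEDED` status strings are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyGenie:
    """Hypothetical in-memory stand-in for the workflow; not the real Genie API."""
    clusters: dict = field(default_factory=dict)  # cluster name -> set of command names
    jobs: dict = field(default_factory=dict)      # job id -> job record
    _ids: count = field(default_factory=lambda: count(1))

    # Administrator side: register a cluster and the commands it can run.
    def register_cluster(self, name, commands):
        self.clusters[name] = set(commands)

    # User side: look up which clusters can run a given command.
    def find_clusters(self, command):
        return sorted(c for c, cmds in self.clusters.items() if command in cmds)

    # User side: submit a job; this toy simply picks the first matching cluster.
    def submit_job(self, command, args):
        matches = self.find_clusters(command)
        if not matches:
            raise LookupError(f"no registered cluster runs {command!r}")
        job_id = next(self._ids)
        self.jobs[job_id] = {"cluster": matches[0], "command": command,
                             "args": list(args), "status": "RUNNING", "output": None}
        return job_id

    # Stand-in for the cluster finishing the work and reporting back.
    def complete_job(self, job_id, output):
        self.jobs[job_id].update(status="SUCCEEDED", output=output)

    # User side: query status and output by job id.
    def job_status(self, job_id):
        return self.jobs[job_id]["status"]

    def job_output(self, job_id):
        return self.jobs[job_id]["output"]


genie = ToyGenie()
genie.register_cluster("yarn-prod", ["hive", "spark-submit"])  # administrator
job_id = genie.submit_job("hive", ["-f", "query.q"])           # user
print(genie.job_status(job_id))                                # RUNNING
genie.complete_job(job_id, "3 rows")                           # simulated cluster
print(genie.job_status(job_id), genie.job_output(job_id))      # SUCCEEDED 3 rows
```

The point of the sketch is that the user only names a command and its arguments. Which cluster runs the job, and where its clients live, stays behind the service.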
A big advantage of this model is the scalability it provides for client resources. It solves a common problem in which a single machine, configured as the entry point for submitting jobs to large clusters, becomes overloaded. Genie instead allows a group of machines, whose number can grow and shrink with the load, to serve as that entry point.
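As a rough illustration of a pool that grows and shrinks with load, the sketch below computes how many nodes a given number of active jobs would call for. The `nodes_needed` function, the per-node capacity of 20 jobs, and the minimum pool size are assumptions made for this example; they do not describe how Genie or any particular autoscaler sizes a deployment.

```python
import math


def nodes_needed(active_jobs: int, jobs_per_node: int, min_nodes: int = 1) -> int:
    """Hypothetical sizing rule: enough nodes to hold all active jobs,
    but never fewer than min_nodes."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    return max(min_nodes, math.ceil(active_jobs / jobs_per_node))


# Load rises and then falls; the pool size follows it.
for load in (5, 40, 120, 12):
    print(load, nodes_needed(load, jobs_per_node=20))
# 5 1
# 40 2
# 120 6
# 12 1
```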
Please browse this wiki to get acquainted with Genie. If you want to try it out quickly, check out our Docker image, which runs an example without a full installation.