Andrew Berls edited this page Apr 5, 2016 · 6 revisions

Getting up and running with Overseer is easy. Here we'll present a sample application that defines some basic tasks to run with dependencies between them, and show how to submit that to Overseer for execution.

First we'll need to get dependencies in place. Overseer stores its operational data in Datomic, so we'll need to include that.

(defproject myapp "0.1.0-SNAPSHOT"
  :aot [myapp.core]
  :main myapp.core
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [com.datomic/datomic-free "0.9.5130"]
                 [io.framed/overseer "0.7.2"]])

This example uses the free edition of Datomic, but Datomic Pro is supported as well: substitute your Datomic Pro dependency for datomic-free, and replace the Overseer dependency shown here with [io.framed/overseer "0.7.2" :exclusions [com.datomic/datomic-free]]. The myapp.core namespace contains our main entry point, so we make sure to AOT-compile it.
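For reference, a Datomic Pro variant of the project file might look like the sketch below. The datomic-pro version and the my.datomic.com repository entry are assumptions, not something Overseer prescribes; use the coordinates and credentials from your own Datomic account.

```clojure
;; project.clj: sketch of a Datomic Pro setup.
;; ASSUMPTIONS: the datomic-pro version and the repository entry below are
;; placeholders; take the real values from your Datomic account.
(defproject myapp "0.1.0-SNAPSHOT"
  :aot [myapp.core]
  :main myapp.core
  ;; Datomic Pro artifacts typically come from a private, credentialed repository.
  :repositories {"my.datomic.com" {:url "https://my.datomic.com/repo"
                                   :creds :gpg}}
  :dependencies [[org.clojure/clojure "1.7.0"]
                 [com.datomic/datomic-pro "0.9.5130"]
                 ;; Exclude the free edition that Overseer would otherwise pull in.
                 [io.framed/overseer "0.7.2" :exclusions [com.datomic/datomic-free]]])
```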

Next up we'll set up Datomic and install Overseer's schema, so fire up lein repl. If you already have a Datomic database set up and running, substitute its URI below and skip the create-database call; Overseer will integrate with your existing DB. Otherwise you'll need to make sure Datomic is running and create a database:

myapp.core=> (require '[datomic.api :as d])
myapp.core=> (def uri "datomic:free://localhost:4334/myapp")
myapp.core=> (d/create-database uri)

With the database in place, we can install Overseer's schema:

myapp.core=> (require '[overseer.schema])
myapp.core=> (overseer.schema/install (d/connect uri))

If everything went smoothly, you should see the :ok return value. At this point, Overseer is fully installed and ready to go. We'll first include the complete code to specify a job dependency graph and some handlers, and then walk through it. Here's our entire example namespace:

(ns myapp.core
  (:require [overseer.api :as overseer])
  (:gen-class)) ; generates the class that `java -jar` uses as its entry point

(def job-graph
  {:start []
   :result1 [:start]
   :result2 [:start]
   :finish [:result1 :result2]})

(def job-handlers
  {:start (fn [job] (println "start"))
   :result1 (fn [job] (println "result1"))
   :result2 (fn [job] (println "result2"))
   :finish (fn [job] (println "finish"))})

(defn -main [& args]
  (let [config {:datomic {:uri "datomic:free://localhost:4334/myapp"}}]
    (overseer/start config job-handlers)))

There are a few important components at play here. First is job-graph: this is an ordinary Clojure map that abstractly describes your jobs and the dependencies between them; you'll see that job types are specified as keywords. Each job lists the jobs it relies on, i.e. its "parents" (this is the reverse of graph notations in which each node lists its children, with all arrows pointing downward). Here the :start job type has no dependencies, and so is eligible for execution as soon as we create an instance to be run. The :result1 and :result2 jobs both depend on :start, and they will not run until the :start job successfully completes. Similarly, the :finish job depends on both :result1 and :result2 and waits for both of them. Overseer handles scheduling and execution for you, so if any job fails unexpectedly, its children will not run.
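To make the "parents" rule concrete, here is a minimal sketch in plain Clojure of how eligibility could be computed from such a map. The ready-jobs helper is hypothetical and only illustrates the rule described above; it is not Overseer's scheduler.

```clojure
(require '[clojure.set :as set])

;; Same shape as the example graph: job type -> vector of parent job types.
(def job-graph
  {:start   []
   :result1 [:start]
   :result2 [:start]
   :finish  [:result1 :result2]})

(defn ready-jobs
  "Returns the set of job types that have not finished yet and whose parents
  are all in `finished`. Hypothetical helper, not part of Overseer."
  [graph finished]
  (set (for [[job parents] graph
             :when (and (not (contains? finished job))
                        (set/subset? (set parents) finished))]
         job)))

(ready-jobs job-graph #{})                 ;; => #{:start}
(ready-jobs job-graph #{:start})           ;; :result1 and :result2
(ready-jobs job-graph #{:start :result1})  ;; => #{:result2}
```

In this model a failed job simply never enters the finished set, which is why its children never become eligible.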

Overseer also parallelizes where it can: since :result1 and :result2 do not depend on each other, both may start executing as soon as the :start job finishes, potentially in parallel on different machines in your cluster, depending on your configuration!
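One way to see which jobs could overlap is to group the graph into successive "waves", where every job in a wave has all of its parents in earlier waves. The execution-waves function below is a hypothetical illustration in plain Clojure; how Overseer actually distributes work across workers depends on your configuration.

```clojure
(def job-graph
  {:start   []
   :result1 [:start]
   :result2 [:start]
   :finish  [:result1 :result2]})

(defn execution-waves
  "Groups job types into successive waves. Jobs in the same wave do not depend
  on each other, so they could run in parallel. Hypothetical helper."
  [graph]
  (loop [finished #{}
         waves    []]
    (let [wave (set (for [[job parents] graph
                          :when (and (not (contains? finished job))
                                     (every? finished parents))]
                      job))]
      (if (empty? wave)
        waves ; nothing left that can run
        (recur (into finished wave) (conj waves wave))))))

(execution-waves job-graph)
;; => [#{:start} #{:result1 :result2} #{:finish}]
```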

The next important concept is job-handlers. This is a Clojure map where the keys are job types corresponding to the dependency graph from before, and the values are ordinary Clojure functions to run. Overseer will automatically call these functions and pass in a job argument, which is a map of information about the current job. The example functions just print a simple message to stdout and don't do anything meaningful; real jobs of course will likely perform computation and persist their results to external storage, enabling data dependencies between jobs.
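Because handlers are ordinary functions of one job argument, they compose like any other Clojure function. As a small sketch, the hypothetical wrap-logging helper below decorates every handler in a map with start and finish messages; it assumes nothing about the contents of the job map.

```clojure
(defn wrap-logging
  "Wraps `handler` so it reports when it starts and finishes, and returns
  whatever the handler returns. Hypothetical helper, not part of Overseer."
  [job-type handler]
  (fn [job]
    (println "starting" job-type)
    (let [result (handler job)]
      (println "finished" job-type)
      result)))

(def job-handlers
  {:start   (fn [job] (println "start"))
   :result1 (fn [job] (println "result1"))})

;; Wrap every handler, keeping the same job-type keys.
(def logged-handlers
  (into {} (for [[job-type handler] job-handlers]
             [job-type (wrap-logging job-type handler)])))

;; Calling a wrapped handler by hand; the empty map stands in for a real job.
((:start logged-handlers) {})
```

The resulting map has the same keys as the original, so it could be passed wherever job-handlers is expected.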

That's all we need to start Overseer running! An instance of Overseer running on a machine is usually called a "worker"; you can scale a cluster to an arbitrary number of workers, which coordinate through a central Datomic installation. We can now compile our code and start a worker running:

lein uberjar
java -jar target/myapp-0.1.0-SNAPSHOT-standalone.jar myapp.core

You should see a startup message printed to stdout, and then a loop of "waiting" messages. When a worker starts up, it connects to Datomic and looks for jobs to run; since we haven't entered any jobs into the system yet, it will simply loop: check for jobs, sleep for a while, and check again.
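The check-and-sleep behaviour can be pictured with a toy polling loop. Everything in this sketch is an assumption made for illustration: run-worker, the :type key, and the in-memory queue standing in for Datomic are all hypothetical, and Overseer's real worker is more involved.

```clojure
(defn run-worker
  "Polls `next-job!` for work. When it returns a job, runs the handler
  registered for the job's :type; when it returns nil, sleeps before checking
  again. Stops after `max-polls` polls so the sketch terminates."
  [next-job! handlers sleep-ms max-polls]
  (dotimes [_ max-polls]
    (if-let [job (next-job!)]
      ((get handlers (:type job)) job)
      (do (println "waiting")
          (Thread/sleep sleep-ms)))))

;; An in-memory queue stands in for Datomic; .poll returns nil when it is empty.
(def queue (java.util.concurrent.ConcurrentLinkedQueue. [{:type :start}]))

;; Runs the single queued job, then finds nothing on the remaining two polls.
(run-worker #(.poll queue)
            {:start (fn [job] (println "start"))}
            10
            3)
```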

In a separate terminal, we can fire up lein repl and insert some jobs into the system. Jobs are entered with ordinary Datomic transactions, and Overseer provides helpers to construct them; once a transaction commits, workers will pick up the jobs and execute them. Here's how to insert an entire job graph at once. This will generate unstarted instances of :start, :result1, :result2, and :finish, which will be executed by workers as their dependencies become satisfied.

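myapp.core=> (require '[datomic.api :as d])
myapp.core=> (def uri "datomic:free://localhost:4334/myapp")
myapp.core=> (require '[overseer.api :as overseer])
myapp.core=> (require '[myapp.core :as myapp])
myapp.core=> (def txns (overseer/->graph-txn myapp/job-graph))
myapp.core=> @(d/transact (d/connect uri) txns)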
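Before transacting a graph, it can be handy to check that it lines up with the handler map. The graph-problems helper below is a hypothetical sanity check in plain Clojure, not something Overseer provides: it reports parents that are not defined as job types and job types that have no handler.

```clojure
(require '[clojure.set :as set])

(defn graph-problems
  "Returns a map of inconsistencies between a job graph and a handler map.
  Both sets are empty when everything lines up. Hypothetical helper."
  [graph handlers]
  (let [job-types (set (keys graph))
        parents   (set (mapcat val graph))]
    {:undefined-parents (set/difference parents job-types)
     :missing-handlers  (set/difference job-types (set (keys handlers)))}))

;; A deliberately broken example: :typo is never defined, :finish has no handler.
(graph-problems {:start [] :finish [:start :typo]}
                {:start identity})
;; => {:undefined-parents #{:typo}, :missing-handlers #{:finish}}
```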

That's it! If you return to your worker you should see it executing the sample jobs and printing messages to the console.