Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time


What is it?


Hawk is a Hybrid Data Center Scheduler presented at Usenix ATC 2015

It takes the best of both worlds combining centralized and distributed schedulers. It has the following main features:

  1. Hybrid Scheduling. Schedules Long jobs in a centralized way (better scheduling decisions) and Short jobs in a distributed way (better scheduling latency).

  2. Work stealing. To do better load balance when a node is free it will contact another one and 'steal' the short-latency-sensitive jobs in the queue.

  3. Partitioning. It prevents Long jobs from taking all the resources in the cluster so that Short jobs do not experience head-of-line blocking.


Eagle is currently work in progress. A beta version is available here. Eagle aims to avoid the Head-of-Line blocking that short jobs experience in distributed schedulers by providing and approximate/fast view of the Long jobs.


In order to run Hawk and Eagle you need to have Java JDK 1.7 installed and Maven.

This command installs Hawk/Eagle locally.

$ mvn install -DskipTests 

Getting started

Create a configuration file with the following parameters.


deployment.mode = configbased		# currently only this mode is supported
static.node_monitors =<hostname_1>:20502	# comma sepparated list of nodes where jobs will run = spark			# the application name, this can also be changed as a java opt
system.memory =10240000			# 
system.cpus=1				# currently only one slot per machine is supported
sample.ratio=2				# number of probes per task for distributed schedulers, 'power of two'
cancellation=no				# after a job finishes will cancel the rest of the probes, in practice makes no difference
scheduler.centralized=<centralized_scheduler_ip>	# centralized scheduler IP, if no centralized scheduler set
big.partition=80				# the percentage of nodes where Long jobs can run
small.partition=100				# the percentage of nodes where Short jobs can run
nodemonitor.stealing=yes			# enable Hawk stealing
nodemonitor.stealing_attempts=10		# number of stealing attempts
eagle.piggybacking=no			# enable Eagle
eagle.retry_rounds=0			# number of rounds distributed schedulers should try before going to small partition


To enable Eagle you need to change the following parameters

nodemonitor.stealing=no			# enable Hawk stealing
nodemonitor.stealing_attempts=0		# number of stealing attempts
eagle.piggybacking=yes			# enable Eagle
eagle.retry_rounds=3			# number of rounds distributed schedulers should try before going to small partition

After creating the configuration file you can run Hawk/Eagle daemon with the following command (replace JAVA_DIR, EAGLE_JAR and CONF_FILE with their corresponding paths):

$ JAVA_DIR -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCTimeStamps -Xmx2046m -XX:+PrintGCDetails -cp EAGLE_JAR ch.epfl.eagle.daemon.EagleDaemon -c CONF_FILE

Now you need to run a front end application, you can test it with a Spark program for example.

Spark plugin

We also have a plugin for Spark, you can find it here.

You can compile it using the following command, provided you installed Eagle first.

$ build/sbt assembly

You can run an example with JavaSleep, for that you need to create a file with the jobs sleeping time. The input file should have the following format:

[Each line a job]
Col1: job arrival time
Col2: number of tasks in job
Col3: estimated job runtime (we use normally the mean)
Col4: (and as many cols as needed) the real duration of each task for the job (for the sleep)

570    2 2722 2722 2722 
  1. Start the driver.

    $ spark/bin/spark-run<driver_hostname>
    org.apache.spark.examples.JavaSleep "eagle@$SCHEDULER:20503" 5 3 hostname $SMALL "<path_to_input_file>"

SMALL can take the values: "small" or "big" depending on if its the centralized or the distributed (centralized --> big)

  1. Start the backends. This should run in each of the nodes

    $ spark/bin/spark-run
    org.apache.spark.scheduler.eagle.EagleExecutorBackend --driver-url spark://EagleSchedulerBackend@<driver_hostname>:60501


Hawk and Eagle are meant to improve job completion times in large clusters, to simulate with tens of thousands of nodes we used a simulator. This simulator is in Python, please refer to its README for further information.



No description, website, or topics provided.






No releases published


No packages published