This repository has been archived by the owner on Sep 23, 2020. It is now read-only.

docs for backfill/spot (spot client guide coming next)
timf committed Jan 6, 2011
1 parent e23da51 commit 3c422fd
Showing 2 changed files with 95 additions and 0 deletions.
92 changes: 92 additions & 0 deletions docs/src/admin/reference.html
@@ -1335,6 +1335,10 @@ <h2>Resource pool and pilot configurations _NAMELINK(resource-pool-and-pilot)</h
The "resource pool" mode is where the service has direct control of a pool
of VMM nodes. The service assumes it can start VMs
</p>
<p>
Both modes are explained below. To learn about backfill and spot instances, see the
<a href="#backfill-and-spot-instances">backfill and spot instances overview</a>.
</p>
<p>
The "pilot" mode is where the service makes a request to a cluster's
Local Resource Management System (LRMS) such as PBS. The VMMs are equipped
@@ -2209,6 +2213,94 @@ <h3>LANTorrent Configuration _NAMELINK(lantorrent-config)</h3>
</li>
</ol>



<!-- *********************************************************************** -->
<!-- *********************************************************************** -->
<!-- *********************************************************************** -->

<br />

<a name="backfill-and-spot-instances"> </a>
<h2>Backfill and Spot Instances _NAMELINK(backfill-and-spot-instances)</h2>
<p>
Backfill and Spot Instances are two related features: both deal with
<i>asynchronous</i> instance requests, that is, requests that may only be
started at appropriate times, if ever.
</p>
<p>
Spot instances are requested by remote users with a particular bid
(represented in Nimbus as a discount applied to minutes charged to your
account). Users may consult the spot price history before bidding. If the
bid is accepted (equal to or higher than the current spot price), the instances
are started. They may be stopped at a moment's notice. The implementation
follows EC2's 2010-08-31 WSDL; see the
<a href="http://aws.amazon.com/ec2/spot-instances/">Amazon EC2 Spot
Instances</a> guide for more background.
</p>
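<p>
As a rough illustration of the acceptance rule above, the following Python
sketch compares bids against a spot price. The function names, user names and
numeric values are hypothetical; this is not Nimbus code, and real bids are
expressed as a discount on charged minutes rather than the plain numbers used
here.
</p>

```python
def bid_is_accepted(bid: float, spot_price: float) -> bool:
    """Hypothetical helper: accepted when the bid is equal to or higher than the spot price."""
    return bid >= spot_price


def accepted_requests(bids: dict, spot_price: float) -> list:
    """Return the names of requesters whose bids would be accepted."""
    return [name for name, bid in bids.items() if bid_is_accepted(bid, spot_price)]


# Made-up bids for illustration only.
bids = {"alice": 0.30, "bob": 0.10, "carol": 0.20}
print(accepted_requests(bids, spot_price=0.20))
```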
<p>
Backfill is a mechanism that the <i>administrator</i> configures to keep
idle resources busy. You pick a particular VM image that will be launched
when the nodes would otherwise be idle. This works nicely with systems
such as <a href="http://www.cs.wisc.edu/condor/">Condor</a> that can
gracefully deal with being preempted.
</p>
<p>
To jump to the precise semantics and configurations that are possible, see
the comments in the
<a href="https://github.com/nimbusproject/nimbus/blob/HEAD/service/service/java/source/etc/workspace-service/async.conf">async.conf</a>
file.
</p>
<p>
To start using spot instances as a user, follow the
<a href="../elclients.html#spot">spot instance user's guide</a>.
</p>
<p>
To get started with backfill, first read over the above conf file. Then
choose an existing administrator account to launch the image from, or create
one with the "--dn" flag like so:
</p>

<div class="screen">
<b>$</b> ./bin/nimbus-new-user --dn BACKFILL-SUPERUSER backfill@localhost<br>
</div>

<p>
... where "BACKFILL-SUPERUSER" is the user configured in async.conf (that
is the default value).
</p>
<p>
Backfill responds <i>immediately</i> to changes in the resource pool, such as
the arrival of higher priority requests (which include spot instances if
they are configured, as well as regular requests of course). This also
applies when the resource pool is fundamentally changed by the
<tt class="literal">nimbus-nodes</tt> program. When you add and remove
nodes, you are changing the overall capacity, which is a critical piece
of information for mapping asynchronous requests like backfill and spot
instances.
</p>
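<p>
The priority relationship described above can be pictured with a small Python
model. This is only a sketch under stated assumptions, not how the service is
implemented: capacity is counted in identical "slots", regular requests are
assumed to be served before spot requests, and backfill is assumed to take
every slot left over. The <tt class="literal">allocate</tt> function is
hypothetical.
</p>

```python
def allocate(capacity: int, regular: int, spot: int) -> dict:
    """Split `capacity` slots by priority; backfill takes only what is left over."""
    regular_running = min(regular, capacity)
    remaining = capacity - regular_running
    spot_running = min(spot, remaining)
    backfill_running = remaining - spot_running
    return {"regular": regular_running, "spot": spot_running, "backfill": backfill_running}


print(allocate(capacity=10, regular=3, spot=2))  # backfill fills the idle slots
print(allocate(capacity=10, regular=9, spot=4))  # backfill is squeezed out entirely
print(allocate(capacity=12, regular=9, spot=4))  # same demand after two nodes are added
```

<p>
Changing either the demand or the capacity changes the backfill share, which
is why node adjustments and new requests both affect backfill instances.
</p>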
<p>
One ramification is that, since adding nodes is always done while the service
is running, having backfill enabled while you make adjustments might
get in your way during this period. It can be easier to disable backfill,
make node adjustments, and then re-enable backfill. This applies especially
to <i>removing</i> nodes since the <tt class="literal">nimbus-nodes</tt>
program does not allow nodes to be removed that have instances running on
them.
</p>
<p>
The same could be said for spot instances. It is slightly trickier with
spot instances because they are remote user requests and you may not want to
disable things abruptly for people. On the other hand, they are spot
instances, so they should be able to deal with sudden terminations.
</p>
<p>
Another thing to watch out for is the "max.instances=0" configuration.
Zero means as many as possible, but this maximum number is only calculated
at service startup time. So if you add a bunch of new nodes to the resource
pool and the backfill instances do not immediately start consuming them
greedily, this is the reason. After adding the nodes, stop and
then start the service and the backfill configuration will recalibrate.
</p>
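<p>
The startup-time behavior described above can be modeled in a few lines of
Python. This is a sketch, not the service's code: the
<tt class="literal">BackfillLimit</tt> class is hypothetical, and it assumes
for simplicity that one backfill instance fits on each node.
</p>

```python
class BackfillLimit:
    """Hypothetical model of how a max.instances value could be resolved."""

    def __init__(self, max_instances: int, pool_size_at_startup: int):
        # 0 means "as many as possible", resolved once from the pool size seen at startup.
        self.limit = pool_size_at_startup if max_instances == 0 else max_instances


pool_size = 4
running_service = BackfillLimit(max_instances=0, pool_size_at_startup=pool_size)

pool_size += 4  # nodes added while the service keeps running
print(running_service.limit)  # still based on the pool size seen at startup

restarted_service = BackfillLimit(max_instances=0, pool_size_at_startup=pool_size)
print(restarted_service.limit)  # recalculated from the larger pool
```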

<br />
<br />
<br />
3 changes: 3 additions & 0 deletions service/service/java/source/etc/workspace-service/async.conf
@@ -5,6 +5,9 @@
#
################################################################################


# NOTE: There is extra documentation on these features online, see:
# http://www.nimbusproject.org/docs/current/admin/reference.html#backfill-and-spot-instances



# SI ENABLED
#
