It seems that the default behaviour of SGE is to use "load_formula = np_load_avg" (see qconf -ssconf), which balances jobs across nodes by preferring the least-loaded host. That works against shutting down idle nodes. For example:
1. My cluster currently has three nodes up and the queue is empty.
2. Three new jobs come in. These will most likely be spread across the three nodes, one job per node.
3. Since all three nodes now have processes on them, the load balancer cannot shut down any of them, even though the cluster is under-utilised.
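To make the scenario concrete, here is a toy Python sketch (not SGE code) that places jobs one at a time under two policies. The function names, the policy labels, and the figure of 4 slots per node are my own assumptions for illustration, and "least_loaded" is only a rough stand-in for what np_load_avg balancing does in practice.

```python
def place_jobs(num_jobs, num_nodes, slots_per_node, policy):
    """Place jobs one at a time and return the job count on each node.

    Toy model only: 'load' here is just jobs / slots, whereas the real
    scheduler works from reported load averages.
    """
    jobs = [0] * num_nodes
    for _ in range(num_jobs):
        # Only nodes with a free slot can accept another job.
        candidates = [i for i in range(num_nodes) if jobs[i] < slots_per_node]
        if not candidates:
            break  # cluster is full; remaining jobs would wait in the queue
        if policy == "least_loaded":
            # Balancing: pick the node with the lowest load per slot.
            target = min(candidates, key=lambda i: jobs[i] / slots_per_node)
        elif policy == "fill_up":
            # Packing: pick the busiest node that still has room.
            target = max(candidates, key=lambda i: jobs[i])
        else:
            raise ValueError("unknown policy: %s" % policy)
        jobs[target] += 1
    return jobs


def idle_nodes(jobs):
    """Count nodes with no jobs, i.e. candidates for shutdown."""
    return sum(1 for count in jobs if count == 0)


if __name__ == "__main__":
    # Three nodes, three jobs, 4 slots per node (assumed).
    for policy in ("least_loaded", "fill_up"):
        jobs = place_jobs(3, 3, 4, policy)
        print(policy, jobs, "idle nodes:", idle_nodes(jobs))
```

With balancing, every node ends up holding one job and none can be removed; with packing, the same three jobs fit on one node and the other two stay idle.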
I'd suggest modifying the SGE setup to use the "fill up host" configuration according to:
Even better would be to configure SGE to send jobs to the most recently booted node first, so that the older nodes drain and can be shut down first (hopefully before their hour is up). I'm not yet sure if this is possible.
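As a rough illustration of the ordering I have in mind, here is a small Python sketch. It is not SGE configuration and says nothing about whether SGE can do this; the node dictionaries, field names, and function names are all hypothetical.

```python
from datetime import datetime


def pick_newest_node(nodes):
    """Return the name of the most recently booted node with a free slot.

    `nodes` is a list of dicts with hypothetical keys:
    'name', 'booted' (datetime) and 'free_slots' (int).
    Returns None if no node has a free slot.
    """
    candidates = [n for n in nodes if n["free_slots"] > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n["booted"])["name"]


def minutes_left_in_hour(booted, now):
    """Minutes until the node's current hour since boot is up."""
    elapsed_minutes = (now - booted).total_seconds() / 60
    return 60 - (elapsed_minutes % 60)


if __name__ == "__main__":
    now = datetime(2011, 1, 1, 12, 0)
    nodes = [
        {"name": "node001", "booted": datetime(2011, 1, 1, 9, 10), "free_slots": 4},
        {"name": "node002", "booted": datetime(2011, 1, 1, 11, 40), "free_slots": 2},
        {"name": "node003", "booted": datetime(2011, 1, 1, 11, 55), "free_slots": 0},
    ]
    # node003 is newest but full, so the job goes to node002.
    print(pick_newest_node(nodes))
    # node001 stays empty and has this long before its hour is up.
    print(minutes_left_in_hour(nodes[0]["booted"], now))
```

The point is only that new work lands on the youngest node with room, leaving the oldest nodes empty so they can be removed before their next hour starts.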
Example code that applies the "fill up host" change here: