Pilot updates #32

Merged
merged 6 commits into master

2 participants

@oldpatricka
Collaborator

This is a set of patches to add three things to Nimbus Pilot:

  • Add an option to disable pilot's memory bubbling. The Xen Best Practices wiki page recommends disabling bubbling in a production environment, and we've had some reliability problems with it, so there is now an option to turn it off.

  • Allow the number of cores requested for a VM to be forwarded to the PBS ppn request. This replaces the previous behaviour where an admin would specify the ppn in the pilot.conf file, typically to ensure that only one VM ran per node. You can still get that behaviour with a non-zero ppn in the config file. If you set ppn to zero, you get the new behaviour where ppn is the number of cores requested for the VM.

  • Configurable PBS accounting strings. Previously, pilot would always send the user's cert DN. Now you can send either the Nimbus display name, the user's group name, or the user's DN. This is configured in pilot.conf; a short configuration sketch follows this list.
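
To make this concrete, here is a minimal sketch of the new knobs (values are illustrative): bubble_mem goes in the pilot's own configuration on the VMM node ([xen] section), while pbs.ppn and pbs.accounting.type go in the service-side pilot.conf.

    # VMM node, pilot configuration, [xen] section;
    # set to 'no' to disable memory bubbling:
    bubble_mem: no

    # service node, etc/workspace-service/pilot.conf;
    # 0 means the ppn sent to PBS is the number of cores requested for the VM:
    pbs.ppn=0
    # one of 'dn', 'displayname', or 'group' (leave empty to send no accounting string):
    pbs.accounting.type=group

With pbs.ppn=0, a request for a 2-core VM produces a pilot job asking for ppn=2, and with pbs.accounting.type=group the user's authz group name is used as the qsub accounting string.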

Feel free to ask me any questions or tell me to do something differently. I'm not very comfortable with the Spring stuff, so if there's a better way to implement what I have let me know.

I could also split this into three pull requests if that would be easier.

@priteau
s/reccommends/recommends/
@priteau
Owner

I don't know much about pilot, but I didn't see anything suspicious in the commit series.
Good to know about the bubbling issues by the way.

@oldpatricka oldpatricka merged commit 44aae3e into master
6 authzdb/src/org/nimbus/authz/UserAlias.java
@@ -50,4 +50,10 @@ public int getAliasType() {
public String getAliasTypeData() {
return aliasTypeData;
}
+
+ public String toString() {
+
+ return "userID: '" + userId + "' aliasName: '" + aliasName + "' friendlyName: '" + friendlyName
+ + "' aliasType: '" + aliasType + "' aliasTypeData: '" + aliasTypeData + "'";
+ }
}
94 pilot/workspacepilot.py
@@ -35,27 +35,27 @@
# result of "generate-index.py < workspacepilot.py"
INDEX = """
I. Globals (lines 10-69)
- II. Embedded, default configuration file (lines 71-191)
- III. Imports (lines 193-220)
- IV. Exceptions (lines 222-348)
- V. Logging (lines 350-569)
- VI. Signal handlers (lines 571-673)
- VII. Timer (lines 675-700)
- VIII. Path/system utilities (lines 702-1073)
- IX. Action (lines 1075-1126)
- X. ReserveSlot(Action) (lines 1128-1732)
- XI. KillNine(ReserveSlot) (lines 1734-1812)
- XII. ListenerThread(Thread) (lines 1814-1919)
- XIII. StateChangeListener (lines 1921-2147)
- XIV. XenActions(StateChangeListener) (lines 2149-2877)
- XV. FakeXenActions(XenActions) (lines 2879-2993)
- XVI. XenKillNine(XenActions) (lines 2995-3126)
- XVII. VWSNotifications(StateChangeListener) (lines 3128-3743)
- XVIII. Configuration objects (lines 3745-3981)
- XIX. Convert configurations (lines 3983-4245)
- XX. External configuration (lines 4247-4317)
- XXI. Commandline arguments (lines 4319-4534)
- XXII. Standalone entry and exit (lines 4536-4729)
+ II. Embedded, default configuration file (lines 71-205)
+ III. Imports (lines 207-234)
+ IV. Exceptions (lines 236-362)
+ V. Logging (lines 364-583)
+ VI. Signal handlers (lines 585-687)
+ VII. Timer (lines 689-714)
+ VIII. Path/system utilities (lines 716-1087)
+ IX. Action (lines 1089-1140)
+ X. ReserveSlot(Action) (lines 1142-1746)
+ XI. KillNine(ReserveSlot) (lines 1748-1826)
+ XII. ListenerThread(Thread) (lines 1828-1933)
+ XIII. StateChangeListener (lines 1935-2161)
+ XIV. XenActions(StateChangeListener) (lines 2163-2898)
+ XV. FakeXenActions(XenActions) (lines 2900-3014)
+ XVI. XenKillNine(XenActions) (lines 3016-3153)
+ XVII. VWSNotifications(StateChangeListener) (lines 3155-3770)
+ XVIII. Configuration objects (lines 3772-4011)
+ XIX. Convert configurations (lines 4013-4285)
+ XX. External configuration (lines 4287-4357)
+ XXI. Commandline arguments (lines 4359-4574)
+ XXII. Standalone entry and exit (lines 4576-4769)
"""
RESTART_XEND_SECONDS_DEFAULT = 2.0
@@ -146,6 +146,20 @@
# If unconfigured, default is 2.0 seconds
#restart_xend_secs: 0.3
+
+# This option determines whether pilot will attempt to bubble down memory for
+# VMs. The Xen Best Practices wiki page at
+# http://wiki.xensource.com/xenwiki/XenBestPractices recommends that you set a
+# fixed amount of memory for dom0 because:
+#
+# 1. (dom0) Linux kernel calculates various network related parameters based
+# on the boot time amount of memory.
+# 2. Linux needs memory to store the memory metadata (per page info structures),
+# and this allocation is also based on the boot time amount of memory.
+#
+# Set this to 'no' to disable bubbling; any other value (or leaving it unset)
+# is taken as 'yes', which is the default
+#bubble_mem: no
+
[systempaths]
# This is only necessary if using SSH as a backup notification mechanism
@@ -2295,8 +2309,11 @@ def reserving(self, timeout=None):
if not self.initialized:
raise ProgrammingError("not initialized")
-
-
+
+ if not self.conf.bubble_mem:
+ log.debug("Memory bubbling disabled. No reservation neccessary.")
+ return
+
memory = self.conf.memory
if self.common.trace:
log.debug("XenActions.reserving(), reserving %d MB" % memory)
@@ -2355,6 +2372,10 @@ def unreserving(self, timeout=None):
if self.common.trace:
log.debug("XenActions.unreserving(), unreserving %d MB" % memory)
+ if not self.conf.bubble_mem:
+ log.debug("Memory bubbling disabled. No unreservation neccessary.")
+ return
+
# Be sure to unlock for every exit point.
lockhandle = _get_lockhandle(self.conf.lockfile)
_lock(lockhandle)
@@ -2450,7 +2471,7 @@ def unreserving(self, timeout=None):
raise UnexpectedError(errmsg)
_unlock(lockhandle)
-
+
if raiseme:
raise raiseme
@@ -3069,6 +3090,7 @@ def unreserving(self, timeout=None):
else:
log.info("XenKillNine unreserving, releasing %d MB" % memory)
+
curmem = self.currentAllocation_MB()
log.info("current memory MB = %d" % curmem)
@@ -3085,6 +3107,10 @@ def unreserving(self, timeout=None):
killedVMs = self.killAll()
if killedVMs:
raiseme = KilledVMs(killedVMs)
+
+ if not self.conf.bubble_mem:
+ log.debug("Memory bubbling disabled. No return of memory neccessary.")
+ return
if memory == XenActionsConf.BESTEFFORT:
targetmem = freemem + curmem
@@ -3120,6 +3146,7 @@ def unreserving(self, timeout=None):
else:
raise UnexpectedError(errmsg)
+
if raiseme:
raise raiseme
@@ -3839,7 +3866,7 @@ class XenActionsConf:
BESTEFFORT = "BESTEFFORT"
- def __init__(self, xmpath, xendpath, xmsudo, sudopath, memory, minmem, xend_secs, lockfile):
+ def __init__(self, xmpath, xendpath, xmsudo, sudopath, memory, minmem, xend_secs, lockfile, bubble_mem):
"""Set the configurations.
Required parameters:
@@ -3861,6 +3888,8 @@ def __init__(self, xmpath, xendpath, xmsudo, sudopath, memory, minmem, xend_secs
* xend_secs -- If xendpath is configured, amount of time to
wait after a restart before checking if it booted.
+
+ * bubble_mem -- If set to False, pilot will not attempt memory bubbling
Raise InvalidConfig if there is a problem with parameters.
@@ -3871,6 +3900,7 @@ def __init__(self, xmpath, xendpath, xmsudo, sudopath, memory, minmem, xend_secs
self.sudopath = sudopath
self.xendpath = xendpath
self.lockfile = lockfile
+ self.bubble_mem = bubble_mem
log.debug("Xenactions lockfile: %s" % lockfile)
if memory == None:
@@ -4116,9 +4146,19 @@ def getXenActionsConf(opts, config):
except:
msg = "restart_xend_secs ('%s') is not a number" % xend_secs
raise InvalidConfig(msg)
+
+ bubble_mem = True
+ try:
+ bubble_mem_val = config.get("xen", "bubble_mem")
+ if bubble_mem_val:
+ if bubble_mem_val.lower() == 'no':
+ bubble_mem = False
+ except Exception, e:
+ log.debug("No bubble_mem attribute set, assuming True ")
+ log.info("Bubbling set to false!")
if not opts.killnine:
- return XenActionsConf(xm, xend, xmsudo, sudo, opts.memory, minmem, xend_secs, lockfile)
+ return XenActionsConf(xm, xend, xmsudo, sudo, opts.memory, minmem, xend_secs, lockfile, bubble_mem)
else:
alt = "going to kill all guest VMs (if they exist) and give dom0 "
alt += "their memory (which may or may not be the maximum available) "
@@ -4146,7 +4186,7 @@ def getXenActionsConf(opts, config):
log.info(msg + ", %s" % alt)
dom0mem = XenActionsConf.BESTEFFORT
- return XenActionsConf(xm, xend, xmsudo, sudo, dom0mem, minmem, xend_secs, lockfile)
+ return XenActionsConf(xm, xend, xmsudo, sudo, dom0mem, minmem, xend_secs, lockfile, bubble_mem)
def getVWSNotificationsConf(opts, config):
"""Return populated VWSNotificationsConf object or raise InvalidConfig
23 service/service/java/source/etc/workspace-service/other/resource-locator-pilot.xml
@@ -6,7 +6,27 @@
http://www.springframework.org/schema/beans/spring-beans-2.0.xsd">
<import resource="main.conflocator.xml" />
+ <import resource="authz-callout-ACTIVE.xml" />
+ <bean id="other.AuthzDataSource"
+ class="org.apache.commons.dbcp.BasicDataSource">
+ <property name="driverClassName" value="org.sqlite.JDBC" />
+ <property name="maxActive" value="10" />
+ <property name="maxIdle" value="4" />
+ <property name="maxWait" value="2000" />
+ <property name="poolPreparedStatements" value="true" />
+
+ <property name="url"
+ value="jdbc:sqlite://$CUMULUS{cumulus.authz.db}" />
+ <property name="username" value="nimbus"/>
+ <property name="password" value="nimbus"/>
+ </bean>
+
+
+ <bean id="other.authzDBAdapter" class="org.nimbus.authz.AuthzDBAdapter">
+ <constructor-arg ref="other.AuthzDataSource"/>
+ </bean>
+
<bean id="nimbus-rm.scheduler.SlotManagement"
class="org.globus.workspace.scheduler.defaults.pilot.PilotSlotManagement"
init-method="validate">
@@ -100,6 +120,7 @@
<property name="extraProperties" value="$PILOT{pbs.extra.properties}" />
<property name="destination" value="$PILOT{pbs.destination}" />
<property name="grace" value="$PILOT{pbs.grace}" />
+ <property name="accounting" value="$PILOT{pbs.accounting.type}" />
<!-- Needed workspace service modules -->
@@ -107,6 +128,8 @@
<constructor-arg ref="nimbus-rm.loglevels" />
<constructor-arg ref="other.MainDataSource" />
<constructor-arg ref="other.timerManager" />
+ <constructor-arg ref="other.authzDBAdapter" />
+ <constructor-arg ref="nimbus-rm.service.binding.AuthorizationCallout" />
<!-- set after object creation time to avoid circular dep with home -->
<property name="instHome" ref="nimbus-rm.home.instance" />
30 service/service/java/source/etc/workspace-service/pilot.conf
@@ -45,7 +45,13 @@ contact.socket=1.2.3.4:41999
#
################################################################################
-# The path to the pilot program on the VMM nodes:
+# The path to the pilot program on the VMM nodes.
+#
+# If you would like to use a configuration file rather than the embedded
+# configuration, append -p and the path to your configuration file to pilot.path.
+# For example:
+#
+# pilot.path=/opt/workspacepilot.py -p /etc/workspacepilot.conf
pilot.path=/opt/workspacepilot.py
@@ -79,12 +85,14 @@ pbs.submit.path=qsub
pbs.delete.path=qdel
-# Processors per node, right now this should be set to be the maximum processors
-# on each cluster node. If it set too high, pilot job submissions will fail.
-# If it is set too low, the pilot may end up not being the only LRM job on the
-# node at a time and that is unpredictable/unsupported right now.
-
-pbs.ppn=2
+# Processors per node. If this is set to 0, your pilot job will request
+# as many processors as are requested for a VM. For example, if a user requests
+# a 2 core VM, ppn will be set to 2.
+#
+# On some installations, you may wish to hardcode this to a specific value
+# to ensure that each pilot job reserves a whole node for a VM. In this case,
+# choose a non-zero value.
+pbs.ppn=0
# If the pilot job should be submitted to a special queue/server, configure
@@ -110,6 +118,14 @@ pbs.grace=8
pbs.extra.properties=
+# Optional. If you would like to append an accounting string to your qsub
+# invocation, you can use either the user's certificate DN, the user's display
+# name as shown by nimbus-list-users, or the user's authz DB accounting group.
+#
+# You can select these with 'dn', 'displayname', or 'group'
+
+pbs.accounting.type=
+
# Optional, if configured this is prepended to the pilot exe invocation if
# nodes needed are greater than one. Torque uses pbsdsh for this.
2  service/service/java/source/src/org/globus/workspace/cmdutils/TorqueUtil.java
@@ -91,7 +91,7 @@ public ArrayList constructQsub(String destination,
throw new WorkspaceException(err);
}
- if (ppn < 1) {
+ if (ppn < 0) {
final String err = "invalid processors per node " +
"request: " + Integer.toString(ppn);
throw new WorkspaceException(err);
3  ...e/service/java/source/src/org/globus/workspace/creation/defaults/CreationManagerImpl.java
@@ -850,6 +850,7 @@ protected Reservation scheduleImpl(VirtualMachine vm,
}
final int memory = dep.getIndividualPhysicalMemory();
+ final int cores = dep.getIndividualCPUCount();
final int duration = dep.getMinDuration();
// list of associations should be in the DB, perpetuation of
@@ -860,7 +861,7 @@ protected Reservation scheduleImpl(VirtualMachine vm,
assocs = assocStr.split(",");
}
- return this.scheduler.schedule(memory, duration, assocs, numNodes,
+ return this.scheduler.schedule(memory, cores, duration, assocs, numNodes,
groupid, coschedid, vm.isPreemptable(), callerID);
}
15 service/service/java/source/src/org/globus/workspace/groupauthz/GroupAuthz.java
@@ -370,6 +370,21 @@ public Integer isRootPartitionUnpropTargetPermitted(URI target,
throw new AuthorizationException(NO_POLICIES_MESSAGE);
}
+ public String getGroupName(String caller) {
+
+
+ for (int i = 0; i < this.groups.length; i++) {
+
+ final GroupRights rights = getRights(caller, this.groups[i]);
+ // only first inclusion of DN is considered
+ if (rights != null) {
+ return this.groups[i].getName();
+ }
+ }
+
+ return null;
+ }
+
// -------------------------------------------------------------------------
// FOR CLOUD AUTOCONFIG
2  service/service/java/source/src/org/globus/workspace/scheduler/Scheduler.java
@@ -38,6 +38,7 @@
* @see #proceedCoschedule for handling separate requests together
*
* @param memory MB needed
+ * @param cores CPU cores needed
* @param duration seconds needed
* @param neededAssociations networks needed
* @param numNodes number needed
@@ -49,6 +50,7 @@
* @throws SchedulingException internal problem
*/
public Reservation schedule(int memory,
+ int cores,
int duration,
String[] neededAssociations,
int numNodes,
3  ...vice/java/source/src/org/globus/workspace/scheduler/defaults/DefaultSchedulerAdapter.java
@@ -224,6 +224,7 @@ public long getSweeperDelay() {
}
public Reservation schedule(int memory,
+ int cores,
int duration,
String[] neededAssociations,
int numNodes,
@@ -263,7 +264,7 @@ public Reservation schedule(int memory,
this.creationPending.pending(ids);
final NodeRequest req =
- new NodeRequest(ids, memory, duration, assocs, groupid, creatorDN);
+ new NodeRequest(ids, memory, cores, duration, assocs, groupid, creatorDN);
try {
15 service/service/java/source/src/org/globus/workspace/scheduler/defaults/NodeRequest.java
@@ -19,6 +19,7 @@
public class NodeRequest {
private int memory; // MBs
+ private int cores;
private int duration; // seconds
private int[] ids = null;
@@ -41,12 +42,14 @@ public NodeRequest(int memory,
public NodeRequest(int[] ids,
int memory,
+ int cores,
int duration,
String[] neededAssociations,
String groupid,
String creatorDN) {
this(memory, duration);
+ this.cores = cores;
this.ids = ids;
this.neededAssociations = neededAssociations;
this.groupid = groupid;
@@ -80,6 +83,18 @@ public int getNumNodes() {
return this.ids.length;
}
+ public int getCores() {
+ // Java sets ints to 0 if they're never initialized
+ if (this.cores == 0) {
+ return 1;
+ }
+ return this.cores;
+ }
+
+ public void setCores(int cores) {
+ this.cores = cores;
+ }
+
public int getMemory() {
return this.memory;
}
125 ...ce/java/source/src/org/globus/workspace/scheduler/defaults/pilot/PilotSlotManagement.java
@@ -20,12 +20,16 @@
import edu.emory.mathcs.backport.java.util.concurrent.ExecutorService;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
+import org.globus.workspace.groupauthz.GroupAuthz;
import org.globus.workspace.scheduler.NodeExistsException;
import org.globus.workspace.scheduler.NodeInUseException;
import org.globus.workspace.scheduler.NodeManagement;
import org.globus.workspace.scheduler.NodeManagementDisabled;
import org.globus.workspace.scheduler.NodeNotFoundException;
import org.globus.workspace.scheduler.defaults.ResourcepoolEntry;
+import org.globus.workspace.service.binding.authorization.CreationAuthorizationCallout;
+import org.nimbus.authz.AuthzDBAdapter;
+import org.nimbus.authz.UserAlias;
import org.nimbustools.api.services.rm.DoesNotExistException;
import org.nimbustools.api.services.rm.ResourceRequestDeniedException;
import org.nimbustools.api.services.rm.ManageException;
@@ -118,6 +122,8 @@
private TorqueUtil torque;
+ private AuthzDBAdapter authzDBAdapter;
+ private CreationAuthorizationCallout authzCallout;
// set from config
private String contactPort;
@@ -138,6 +144,7 @@
private String destination = null; // only one for now
private String extraProperties = null;
private String multiJobPrefix = null;
+ private String accounting;
// -------------------------------------------------------------------------
// CONSTRUCTOR
@@ -146,7 +153,9 @@
public PilotSlotManagement(WorkspaceHome home,
Lager lager,
DataSource dataSource,
- TimerManager timerManager) {
+ TimerManager timerManager,
+ AuthzDBAdapter authz,
+ CreationAuthorizationCallout authzCall) {
if (home == null) {
throw new IllegalArgumentException("home may not be null");
@@ -168,6 +177,9 @@ public PilotSlotManagement(WorkspaceHome home,
throw new IllegalArgumentException("lager may not be null");
}
this.lager = lager;
+
+ this.authzDBAdapter = authz;
+ this.authzCallout = authzCall;
}
@@ -268,6 +280,20 @@ public void setLogdirResource(Resource logdirResource) throws IOException {
this.logdirPath = logdirResource.getFile().getAbsolutePath();
}
+ public AuthzDBAdapter getAuthzDBAdapter() {
+ return authzDBAdapter;
+ }
+
+ public void setAuthzDBAdapter(AuthzDBAdapter authzDBAdapter) {
+ this.authzDBAdapter = authzDBAdapter;
+ }
+
+ public void setAccounting(String accounting) {
+ if (accounting != null && accounting.trim().length() != 0) {
+ this.accounting = accounting;
+ }
+ }
+
// -------------------------------------------------------------------------
// IoC INIT METHOD
// -------------------------------------------------------------------------
@@ -369,8 +395,8 @@ public synchronized void validate() throws Exception {
"Is the configuration present?");
}
- if (this.ppn < 1) {
- throw new Exception("processors per node (ppn) is less than one, " +
+ if (this.ppn < 0) {
+ throw new Exception("processors per node (ppn) is less than zero, " +
"invalid. Is the configuration present?");
}
@@ -492,6 +518,7 @@ public Reservation reserveSpace(NodeRequest request, boolean preemptable)
this.reserveSpace(request.getIds(),
request.getMemory(),
+ request.getCores(),
request.getDuration(),
request.getGroupid(),
request.getCreatorDN());
@@ -520,6 +547,7 @@ public Reservation reserveCoscheduledSpace(NodeRequest[] requests,
// capacity vs. mapping and we will get more sophisticated here later)
int highestMemory = 0;
+ int highestCores = 0;
int highestDuration = 0;
final ArrayList idInts = new ArrayList(64);
@@ -533,6 +561,12 @@ public Reservation reserveCoscheduledSpace(NodeRequest[] requests,
highestMemory = thisMemory;
}
+ final int thisCores = requests[i].getCores();
+
+ if (highestCores < thisCores) {
+ highestCores = thisCores;
+ }
+
final int thisDuration = requests[i].getDuration();
if (highestDuration < thisDuration) {
@@ -563,7 +597,7 @@ public Reservation reserveCoscheduledSpace(NodeRequest[] requests,
// Assume that the creator's DN is the same for each node
final String creatorDN = requests[0].getCreatorDN();
- this.reserveSpace(all_ids, highestMemory, highestDuration, coschedid, creatorDN);
+ this.reserveSpace(all_ids, highestMemory, highestCores, highestDuration, coschedid, creatorDN);
return new Reservation(all_ids, null, all_durations);
}
@@ -579,6 +613,7 @@ public Reservation reserveCoscheduledSpace(NodeRequest[] requests,
* than one VM is mapped to the same node, the returned node
* assignment array will include duplicates.
* @param memory megabytes needed
+ * @param requestedCores needed
* @param duration seconds needed
* @param uuid group ID, can not be null if vmids is length > 1
* @param creatorDN the DN of the user who requested creation of the VM
@@ -587,6 +622,7 @@ public Reservation reserveCoscheduledSpace(NodeRequest[] requests,
*/
private void reserveSpace(final int[] vmids,
final int memory,
+ final int requestedCores,
final int duration,
final String uuid,
final String creatorDN)
@@ -604,6 +640,16 @@ private void reserveSpace(final int[] vmids,
throw new ResourceRequestDeniedException(msg);
}
+ // When there is no core request, the default is -1,
+ // we would actually like one core.
+ int cores;
+ if (requestedCores <= 0) {
+ cores = 1;
+ }
+ else {
+ cores = requestedCores;
+ }
+
if (vmids.length > 1 && uuid == null) {
logger.error("cannot make group space request without group ID");
throw new ResourceRequestDeniedException("internal " +
@@ -628,13 +674,14 @@ private void reserveSpace(final int[] vmids,
}
}
- this.reserveSpaceImpl(memory, duration, slotid, vmids, creatorDN);
+ this.reserveSpaceImpl(memory, cores, duration, slotid, vmids, creatorDN);
// pilot reports hostname when it starts running, not returning an
// exception to signal successful best effort pending slot
}
private void reserveSpaceImpl(final int memory,
+ final int cores,
final int duration,
final String uuid,
final int[] vmids,
@@ -646,20 +693,34 @@ private void reserveSpaceImpl(final int memory,
final int dur = duration + this.padding;
final long wallTime = duration + this.padding;
+
+ // If the pbs.ppn option in pilot.conf is 0, we should send
+ // the number of CPU cores used by the VM as the ppn string,
+ // otherwise, use the defined ppn value
+ int ppnRequested;
+ if (this.ppn == 0) {
+ ppnRequested = cores;
+ }
+ else {
+ ppnRequested = this.ppn;
+ }
+
+ String account = getAccountString(creatorDN, this.accounting);
+
// we know it's torque for now, no casing
final ArrayList torquecmd;
try {
torquecmd = this.torque.constructQsub(this.destination,
memory,
vmids.length,
- this.ppn,
+ ppnRequested,
wallTime,
this.extraProperties,
outputFile,
false,
false,
- creatorDN);
-
+ account);
+
} catch (WorkspaceException e) {
final String msg = "Problem with Torque argument construction";
if (logger.isDebugEnabled()) {
@@ -1670,4 +1731,52 @@ public boolean removeNode(String hostname)
public String getVMMReport() {
return "No VMM report when pilot is configured.";
}
+
+ public String getAccountString(String userDN, String accountingType) {
+
+ String accountString = null;
+ if (accountingType == null) {
+ accountString = null;
+ }
+ else if (accountingType.equalsIgnoreCase("dn")) {
+
+ accountString = userDN;
+ }
+ else if (accountingType.equalsIgnoreCase("displayname")) {
+
+ try {
+ String userID = authzDBAdapter.getCanonicalUserIdFromDn(userDN);
+ final List<UserAlias> aliasList = authzDBAdapter.getUserAliases(userID);
+ for (UserAlias alias : aliasList) {
+ if (alias.getAliasType() == AuthzDBAdapter.ALIAS_TYPE_DN) {
+
+ accountString = alias.getFriendlyName();
+ }
+ }
+ logger.error("Can't find display name for '" + userDN + "'. "
+ + "No accounting string will be sent to PBS.");
+ }
+ catch (Exception e) {
+ logger.error("Can't connect to authzdb db. No accounting string will be sent to PBS.");
+ }
+ }
+ else if (accountingType.equalsIgnoreCase("group")) {
+
+ try {
+ GroupAuthz groupAuthz = (GroupAuthz)this.authzCallout;
+ accountString = groupAuthz.getGroupName(userDN);
+ }
+ catch (Exception e) {
+ logger.error("Problem getting group string. Are you sure you're using Group or SQL authz?");
+ logger.debug("full error: " + e);
+ }
+ }
+ else {
+
+ logger.error("'" + accountingType + "' isn't a valid accounting string type. "
+ + "No accounting string will be sent to PBS.");
+ }
+
+ return accountString;
+ }
}
22 ...ice/service/java/tests/suites/basic/home/services/etc/nimbus/workspace-service/pilot.conf
@@ -45,7 +45,13 @@ contact.socket=1.2.3.4:41999
#
################################################################################
-# The path to the pilot program on the VMM nodes:
+# The path to the pilot program on the VMM nodes.
+#
+# If you would like to use a configuration file rather than the embedded
+# configuration, append -p and the path to your configuration file to pilot.path.
+# For example:
+#
+# pilot.path=/opt/workspacepilot.py -p /etc/workspacepilot.conf
pilot.path=/opt/workspacepilot.py
@@ -79,12 +85,14 @@ pbs.submit.path=qsub
pbs.delete.path=qdel
-# Processors per node, right now this should be set to be the maximum processors
-# on each cluster node. If it set too high, pilot job submissions will fail.
-# If it is set too low, the pilot may end up not being the only LRM job on the
-# node at a time and that is unpredictable/unsupported right now.
-
-pbs.ppn=2
+# Processors per node. If this is set to 0, your pilot job will request
+# as many processors as are requested for a VM. For example, if a user requests
+# a 2 core VM, ppn will be set to 2.
+#
+# On some installations, you may wish to hardcode this to a specific value
+# to ensure that each pilot job reserves a whole node for a VM. In this case,
+# choose a non-zero value.
+pbs.ppn=0
# If the pilot job should be submitted to a special queue/server, configure
22 ...ice/java/tests/suites/spotinstances/home/services/etc/nimbus/workspace-service/pilot.conf
@@ -45,7 +45,13 @@ contact.socket=1.2.3.4:41999
#
################################################################################
-# The path to the pilot program on the VMM nodes:
+# The path to the pilot program on the VMM nodes.
+#
+# If you would like to use a configuration file rather than the embedded
+# configuration, append -p and the path to your configuration file to pilot.path.
+# For example:
+#
+# pilot.path=/opt/workspacepilot.py -p /etc/workspacepilot.conf
pilot.path=/opt/workspacepilot.py
@@ -79,12 +85,14 @@ pbs.submit.path=qsub
pbs.delete.path=qdel
-# Processors per node, right now this should be set to be the maximum processors
-# on each cluster node. If it set too high, pilot job submissions will fail.
-# If it is set too low, the pilot may end up not being the only LRM job on the
-# node at a time and that is unpredictable/unsupported right now.
-
-pbs.ppn=2
+# Processors per node. If this is set to 0, your pilot job will request
+# as many processors as are requested for a VM. For example, if a user requests
+# a 2 core VM, ppn will be set to 2.
+#
+# On some installations, you may wish to hardcode this to a specific value
+# to ensure that each pilot job reserves a whole node for a VM. In this case,
+# choose a non-zero value.
+pbs.ppn=0
# If the pilot job should be submitted to a special queue/server, configure