Better task balancing #1482

Merged
merged 73 commits into from Jun 8, 2017

2 participants
@darcatron
Contributor

darcatron commented Mar 30, 2017

🚧 This is a WIP for task balancing.

The general idea is that a pendingTask will calculate the best offer by scoring each offer and choosing the best score. Right now, the top score is 1.00 assuming the slave has nothing on it.

Score is weighted based on 2 real criteria: current resource usage for the same request type and current resource availability. I chose the weights based on what I thought might be important, but they can be changed. I thought mem would be the most important, so I weighted it a bit higher:

requestTypeCpuWeight = 0.20;
requestTypeMemWeight = 0.30;
freeCpuWeight = 0.20;
freeMemWeight = 0.30;

This only scores based on what's running on the slave. It does not look at the acceptedPendingTasks for an offer.
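A minimal sketch of how a weighted score along these lines could be computed, using the four weights quoted above. The class name, method signature, and the exact way each term enters the sum are assumptions for illustration, not the PR's implementation; the only property taken from the description is that an empty slave scores 1.00.

```java
// Hypothetical sketch; the class, method, and parameter names are not from the PR.
final class OfferScoreSketch {
  // Weights quoted in the description above; they sum to 1.00.
  static final double REQUEST_TYPE_CPU_WEIGHT = 0.20;
  static final double REQUEST_TYPE_MEM_WEIGHT = 0.30;
  static final double FREE_CPU_WEIGHT = 0.20;
  static final double FREE_MEM_WEIGHT = 0.30;

  /**
   * Every argument is a fraction in [0, 1] of the slave's total capacity.
   * sameTypeCpuUsed / sameTypeMemUsed: share consumed by tasks of the same request type.
   * cpuFree / memFree: share that is still available.
   */
  static double score(double sameTypeCpuUsed, double sameTypeMemUsed, double cpuFree, double memFree) {
    // Assumption: same-type usage counts against the offer, free capacity counts for it.
    return REQUEST_TYPE_CPU_WEIGHT * (1 - sameTypeCpuUsed)
        + REQUEST_TYPE_MEM_WEIGHT * (1 - sameTypeMemUsed)
        + FREE_CPU_WEIGHT * cpuFree
        + FREE_MEM_WEIGHT * memFree;
  }

  public static void main(String[] args) {
    System.out.println("empty slave: " + score(0, 0, 1, 1));
    System.out.println("half-used slave: " + score(0.5, 0.5, 0.5, 0.5));
  }
}
```

With this shape, changing the balance between memory and CPU only requires adjusting the four constants, as long as they keep summing to 1.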

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
double score = score(offerHolder, stateCache, tasksPerOfferPerRequest, taskRequestHolder, getUsagesPerRequestTypePerSlave());
if (score > 0) {
// todo: can short circuit here if score is high enough
scorePerOffer.put(offerHolder, score);

darcatron Mar 30, 2017

Contributor

Thought we might want to have a value that's definitely good enough to just accept instead of continuing to evaluate

ssalinas Mar 30, 2017

Member

I like the idea of a scoring system overall. Some comments on specific logic things I'll make later since this is still WIP. Overall comments though:

  • The time past due for a task should probably also factor in to the scoring. (i.e. if it was supposed to run 5 minutes ago, the definition of 'good enough' is different than if it was supposed to start 2 seconds ago)
  • There should probably also be a measure of how many offers we have looked at while trying to schedule this task. We may only get 1, 2, etc. offers at a time and not have a holistic view of resources. So if we have looked at a number of them already, the bar of 'good enough' should start to get lower.
  • We should definitely be aware of computation time. The offer processing loop is already one of our slower areas. Anything we can do to pre-process this data and require less calculation at offer evaluation time will be a big help
...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
taskManager.createTaskAndDeletePendingTask(zkTask);
private double minScore(SingularityTaskRequest taskRequest) {
double minScore = 0.80;

darcatron Apr 5, 2017

Contributor

this can be adjusted as necessary. I thought an 80% match might be a good starting point, but we could def reduce it

darcatron Apr 5, 2017

Contributor

This has been updated as follows:

Before

All offers would be scored and of all scores above 0, the best scored offer would be accepted by the task

After

All offers are still scored, but the minimum score acceptable depends on the task's overdue milliseconds and number of offers a task has not accepted.

Currently, the overdue time and offer count have a max of 10 min and 20 attempts, respectively. The min score is based on the ratio of curOverdueTime:maxOverdueTime and curAttempts:maxAttempts

The offer attempts count is any offer that was considered, not just offers that scored too low. So an offer that didn't have enough resources to satisfy the task will still be counted.

I picked these numbers based on generalizations. We will likely have to tune these
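A rough sketch of how the minimum score could shrink with the two ratios described above. The 0.80 starting point, the 10 minute cap, and the 20 attempt cap come from this thread; the way the two ratios are combined (multiplying the remaining fractions) is an assumption, since the comment does not spell it out, and all names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; how the two ratios are combined is an assumption.
final class MinScoreSketch {
  static final double BASE_MIN_SCORE = 0.80;
  static final int MAX_OFFER_ATTEMPTS = 20;
  static final long MAX_MILLIS_PAST_DUE = TimeUnit.MINUTES.toMillis(10);

  static double minScore(long millisPastDue, int offerAttempts) {
    double overdueRatio = clamp((double) millisPastDue / MAX_MILLIS_PAST_DUE);
    double attemptRatio = clamp((double) offerAttempts / MAX_OFFER_ATTEMPTS);
    // Either ratio reaching 1.0 drops the bar to 0, so any matching offer is accepted.
    return BASE_MIN_SCORE * (1 - overdueRatio) * (1 - attemptRatio);
  }

  private static double clamp(double value) {
    return Math.max(0.0, Math.min(1.0, value));
  }

  public static void main(String[] args) {
    System.out.println("on time, first offer: " + minScore(0, 0));
    System.out.println("5 min late, 10 offers seen: " + minScore(TimeUnit.MINUTES.toMillis(5), 10));
  }
}
```

Because every considered offer counts as an attempt, a task that keeps seeing unsuitable offers relaxes its bar even when it is not yet overdue.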

return SlaveMatchState.SLAVE_ATTRIBUTES_DO_NOT_MATCH;
}
final SlavePlacement slavePlacement = taskRequest.getRequest().getSlavePlacement().or(configuration.getDefaultSlavePlacement());
if (!taskRequest.getRequest().isRackSensitive() && slavePlacement == SlavePlacement.GREEDY) {
// todo: account for this or let this behavior continue?
return SlaveMatchState.NOT_RACK_OR_SLAVE_PARTICULAR;

darcatron Apr 5, 2017

Contributor

I didn't know if we would need to account for any rack sensitivity outside of the existing checks done in the scheduler

darcatron Apr 14, 2017

Contributor

@ssalinas Got some specific logic tests in. Let me know if there's a piece I missed that should be added. I'm going to continue to try to get some full logic tests in as well

...ityBase/src/main/java/com/hubspot/singularity/SingularitySlaveUsage.java
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
public class SingularitySlaveUsage {
public static final String CPU_USED = "cpusUsed";
public static final String MEMORY_BYTES_USED = "memoryRssBytes";
public static final long BYTES_PER_MEGABYTE = 1024L * 1024L;

ssalinas Apr 20, 2017

Member

was about to comment that there must be some type of easy class/enum for this like there is with TimeUnit, but apparently there isn't... weird...

darcatron Apr 20, 2017

Contributor

Yeah, I was sad to see there wasn't a lib method for this too 😢

...ityBase/src/main/java/com/hubspot/singularity/SingularitySlaveUsage.java
}
public double getCpusUsedForRequestType(RequestType type) {
return usagePerRequestType.get(type).get(CPU_USED).doubleValue();

ssalinas Apr 20, 2017

Member

Maybe another enum is more appropriate for CPU_USED/MEMORY_BYTES_USED ?

darcatron Apr 20, 2017

Contributor

agreed, mapping would be clearer then too 👍
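For illustration, the enum-keyed mapping being discussed might look roughly like the sketch below. `ResourceUsageType` mirrors the name that shows up later in this PR's tests; the two-value `RequestType` is only a stand-in for the real enum, and the class and its methods are hypothetical. The getter also returns 0 for a request type with no recorded usage instead of dereferencing a missing map entry.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch; RequestType here is a two-value stand-in for the real enum.
final class UsageByTypeSketch {
  enum ResourceUsageType { CPU_USED, MEMORY_BYTES_USED }
  enum RequestType { SERVICE, SCHEDULED }

  private final Map<RequestType, Map<ResourceUsageType, Number>> usagePerRequestType =
      new EnumMap<RequestType, Map<ResourceUsageType, Number>>(RequestType.class);

  /** Adds one task's usage to the running totals for its request type. */
  void add(RequestType type, double cpusUsed, long memoryBytesUsed) {
    double cpus = getCpusUsed(type) + cpusUsed;
    long memBytes = getMemBytesUsed(type) + memoryBytesUsed;
    Map<ResourceUsageType, Number> usage = usagePerRequestType.get(type);
    if (usage == null) {
      usage = new EnumMap<ResourceUsageType, Number>(ResourceUsageType.class);
      usagePerRequestType.put(type, usage);
    }
    usage.put(ResourceUsageType.CPU_USED, cpus);
    usage.put(ResourceUsageType.MEMORY_BYTES_USED, memBytes);
  }

  double getCpusUsed(RequestType type) {
    Number value = lookup(type, ResourceUsageType.CPU_USED);
    return value == null ? 0.0 : value.doubleValue();
  }

  long getMemBytesUsed(RequestType type) {
    Number value = lookup(type, ResourceUsageType.MEMORY_BYTES_USED);
    return value == null ? 0L : value.longValue();
  }

  private Number lookup(RequestType type, ResourceUsageType usageType) {
    Map<ResourceUsageType, Number> usage = usagePerRequestType.get(type);
    return usage == null ? null : usage.get(usageType);
  }

  public static void main(String[] args) {
    UsageByTypeSketch sketch = new UsageByTypeSketch();
    sketch.add(RequestType.SERVICE, 1.5, 1024L * 1024L);
    System.out.println(sketch.getCpusUsed(RequestType.SERVICE));
  }
}
```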

...ityBase/src/main/java/com/hubspot/singularity/SingularitySlaveUsage.java
return usagePerRequestType;
}
public double getCpusUsedForRequestType(RequestType type) {

ssalinas Apr 20, 2017

Member

this and getMemBytesUsedForRequestType are unused methods

.../main/java/com/hubspot/singularity/scheduler/SingularityUsagePoller.java
}
@Override
public void runActionOnPoll() {
final long now = System.currentTimeMillis();
Map<RequestType, Map<String, Number>> usagesPerRequestType = new HashMap<>();

ssalinas Apr 20, 2017

Member

wouldn't we want this to be per-slave, not overall?

darcatron Apr 20, 2017

Contributor

This should be per slave. This poller loops through each slave and creates a new SingularitySlaveUsage with the stats for that slave

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
Map<SingularityOfferHolder, Double> scorePerOffer = new HashMap<>();
double minScore = minScore(taskRequestHolder.getTaskRequest(), offerMatchAttemptsPerTask, System.currentTimeMillis());
LOG.info("Minimum score for task {} is {}", taskRequestHolder.getTaskRequest().getPendingTask().getPendingTaskId().getId(), minScore);

ssalinas Apr 20, 2017

Member

probably can be lower than info level here

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
continue;
}
double score = score(offerHolder, stateCache, tasksPerOfferPerRequest, taskRequestHolder, getSlaveUsage(currentSlaveUsages, offerHolder.getOffer().getSlaveId().getValue()));

ssalinas Apr 20, 2017

Member

for clarity, maybe something like 'hostScore' here? The score is for the particular slave, not necessarily about the offer

darcatron Apr 20, 2017

Contributor

I'm not sure about the naming here. We do look at the slave's utilization to score the offer, but we are still scoring the offer itself since offers aren't uniquely 1:1 for a slave (e.g. 2 offers for the same slave).

The slave utilization weight will be the same for all offers on the same slave, but the offer resources will be different per offer. So, it seems to me that we're scoring the offer in this class rather than the slave itself

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
@VisibleForTesting
double minScore(SingularityTaskRequest taskRequest, Map<String, Integer> offerMatchAttemptsPerTask, long now) {
double minScore = 0.80;
int maxOfferAttempts = 20;

ssalinas Apr 20, 2017

Member

another that would be nice to have configurable

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
final SingularityTask task = mesosTaskBuilder.buildTask(offerHolder.getOffer(), offerHolder.getCurrentResources(), taskRequest, taskRequestHolder.getTaskResources(), taskRequestHolder.getExecutorResources());
@VisibleForTesting
double score(Offer offer, SingularityTaskRequest taskRequest, Optional<SingularitySlaveUsageWithId> maybeSlaveUsage) {
double requestTypeCpuWeight = 0.20;

ssalinas Apr 20, 2017

Member

Let's make these configurable, maybe another object in the configuration yaml?

darcatron Apr 20, 2017

Contributor

Yup, I was in progress on this (now committed), but I kept the fields under SingularityConfiguration since I saw a lot of other stuff in there as well (e.g. caching). We could pull it into an OfferConfiguration file if you think that'd be better for organization

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
if (matchesResources && slaveMatchState.isMatchAllowed()) {
final SingularityTask task = mesosTaskBuilder.buildTask(offerHolder.getOffer(), offerHolder.getCurrentResources(), taskRequest, taskRequestHolder.getTaskResources(), taskRequestHolder.getExecutorResources());
@VisibleForTesting
double score(Offer offer, SingularityTaskRequest taskRequest, Optional<SingularitySlaveUsageWithId> maybeSlaveUsage) {

ssalinas Apr 20, 2017

Member

Let's go over this one in-person, think we are getting close, just easier to chat than typing a novel in github ;)

.../main/java/com/hubspot/singularity/scheduler/SingularityUsagePoller.java
updateUsagesPerRequestType(usagesPerRequestType, getRequestType(taskUsage), usage.getMemoryRssBytes(), taskCpusUsed);
if (getRequestType(taskUsage).isLongRunning() || isConsideredLongRunning(taskUsage)) {
updateLongRunningTasksUsage(longRunningTasksUsage, usage.getMemoryRssBytes(), taskCpusUsed);
}

darcatron Apr 27, 2017

Contributor

Added this to also include the usage for tasks that are non long running, but run for a considerable amount of time to warrant adding their usage

darcatron Apr 27, 2017

Contributor

This is not outdated GitHub.....
bitmoji

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
// usage reduces score
return calculateScore(longRunningMemUsedScore, memTotalScore, longRunningCpusUsedScore, cpusTotalScore, freeResourceWeight, usedResourceWeight * -1);
}

darcatron Apr 27, 2017

Contributor

These changes will be easiest to discuss in person.

Essentially, long-running tasks weigh an offer 50:50 between usage and free space. Non-long-running tasks vary that ratio depending on how long they've been running, with a default at the middle (25:75). Non-long-running tasks favor free space and will score an offer lower if they run for a long time and there are long-running tasks on the offered machine
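One way to read the ratios above, as a hedged sketch: the used-resource weight is fixed at 0.50 for long-running tasks, and for other tasks it slides between 0 and 0.50 with their typical run time, falling back to 0.25 when no run-time statistics exist. The linear interpolation, the "negative means unknown" convention, and every name below are assumptions for illustration; the PR's actual rule is not shown in this thread.

```java
// Hypothetical sketch; names and the interpolation rule are assumptions.
final class ResourceWeightSketch {
  static final double MAX_USED_WEIGHT = 0.50;     // long-running tasks: 50:50
  static final double DEFAULT_USED_WEIGHT = 0.25; // non-long-running default: 25:75

  /**
   * Returns {usedResourceWeight, freeResourceWeight}; the two always sum to 1.
   * A negative avgRunMillis is treated as "no run-time statistics available".
   */
  static double[] weights(boolean longRunning, long avgRunMillis, long consideredLongRunningMillis) {
    double used;
    if (longRunning) {
      used = MAX_USED_WEIGHT;
    } else if (avgRunMillis < 0 || consideredLongRunningMillis <= 0) {
      used = DEFAULT_USED_WEIGHT;
    } else {
      // The longer a task typically runs, the more existing usage matters to it.
      double fraction = Math.min(1.0, (double) avgRunMillis / consideredLongRunningMillis);
      used = MAX_USED_WEIGHT * fraction;
    }
    return new double[] {used, 1.0 - used};
  }

  public static void main(String[] args) {
    double[] w = weights(false, 30_000, 60_000);
    System.out.println("used=" + w[0] + " free=" + w[1]);
  }
}
```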

...va/com/hubspot/singularity/mesos/SingularityMesosOfferSchedulerTest.java
setDeployStatistics(TimeUnit.MINUTES, 5);
longRunningTasksUsage.put(ResourceUsageType.CPU_USED, 0);
longRunningTasksUsage.put(ResourceUsageType.MEMORY_BYTES_USED, 0);
assertScoreIs(.993, scheduler.score(getOffer(10, 10, slaveId), taskRequest, Optional.of(getUsage(10, 10, longRunningTasksUsage))));

darcatron Apr 27, 2017

Contributor

something interesting to note here. A perfect offer for non long running tasks would be a task that runs for 0 seconds on an empty slave. Anything after that will reduce the score. This doesn't hurt us b/c it'll score each offer in the same regard so an emptier slave will score higher as we might expect, but it will likely mean that non long running tasks should have a lower minimum score. They are less picky than long running tasks

...c/main/java/com/hubspot/singularity/config/SingularityConfiguration.java
private int maxOfferAttemptsPerTask = 20;
private long maxMillisPastDuePerTask = TimeUnit.MINUTES.toMillis(10);

ssalinas Apr 27, 2017

Member

thinking something shorter would be a better default. We can check in testing, but I think after even 5 minutes we shouldn't be worrying about score and just scheduling asap

@darcatron darcatron added the hs_qa label May 18, 2017

@@ -31,22 +32,30 @@
private static final String SLAVE_PATH = ROOT_PATH + "/slaves";
private static final String TASK_PATH = ROOT_PATH + "/tasks";
private static final String USAGE_SUMMARY_PATH = ROOT_PATH + "/summary";

darcatron May 23, 2017

Contributor

@ssalinas can you take a look at this piece? Just wanna make sure I got this set up right

...rityService/src/main/java/com/hubspot/singularity/data/UsageManager.java
@@ -123,6 +140,10 @@ public SingularityCreateResult saveSpecificTaskUsage(String taskId, SingularityT
return save(getSpecificTaskUsagePath(taskId, usage.getTimestamp()), usage, taskUsageTranscoder);
}
public SingularityCreateResult saveSpecificClusterUtilization(SingularityClusterUtilization utilization) {
return save(getSpecificClusterUtilizationPath(utilization.getTimestamp()) , utilization, clusterUtilizationTranscoder);

ssalinas May 23, 2017

Member

is there any point at which we would want a history of these? If we don't need a history of summaries, we might as well save the data to the summary path (without timestamp) and just overwrite when there is new data.

darcatron May 23, 2017

Contributor

I didn't see a reason to at this point. Saw the others were saving them so it seemed okay since we only save up to 5 points. I think it's safe to just have the one point. Keeps it simpler

...rityService/src/main/java/com/hubspot/singularity/data/UsageManager.java
@@ -95,6 +104,10 @@ private String getCurrentSlaveUsagePath(String slaveId) {
return ZKPaths.makePath(getSlaveUsagePath(slaveId), CURRENT_USAGE_NODE_KEY);
}
private String getSpecificClusterUtilizationPath(long timestamp) {

ssalinas May 23, 2017

Member

why SpecificClusterUtilization instead of just getUsageSummaryPath or something like that?

darcatron May 23, 2017

Contributor

the "specific" keyword was the pattern the other items were using so I kept that since the timestamp specified which one to grab. I can drop the historical data and then rename it

} catch (InvalidSingularityTaskIdException e) {
LOG.error("Couldn't get SingularityTaskId for {}", taskUsage);
continue;
}

darcatron May 23, 2017

Contributor

added this try catch for the potentially incorrect taskId types

darcatron May 23, 2017

Contributor

Before

Min score was configurable by the user and would be reduced as tasks were delayed and offer match attempts were rejected.

After

Min score is calculated based on the overall utilization of the cluster. memUsed / memTotal and cpuUsed / cpuTotal. This will help give us a min score closer to the actual offer scores. Task delays and match attempts will still reduce the min score
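As a sketch of the idea, a starting minimum score could be derived from the two cluster-wide ratios named above. The comment does not say how the ratios are turned into a score, so the choice here (one minus the average utilization, so a busier cluster sets a lower bar) is an assumption, and the names are hypothetical. Delay and attempt reductions would then be applied on top of this base value.

```java
// Hypothetical sketch; the mapping from utilization to a base min score is an assumption.
final class ClusterMinScoreSketch {
  static double baseMinScore(double memUsed, double memTotal, double cpuUsed, double cpuTotal) {
    if (memTotal <= 0 || cpuTotal <= 0) {
      return 0.0; // no usable cluster data: do not hold tasks back
    }
    double memUtilization = clamp(memUsed / memTotal);
    double cpuUtilization = clamp(cpuUsed / cpuTotal);
    // A busier cluster produces lower offer scores, so the bar is lowered to match.
    return 1.0 - (memUtilization + cpuUtilization) / 2.0;
  }

  private static double clamp(double value) {
    return Math.max(0.0, Math.min(1.0, value));
  }

  public static void main(String[] args) {
    System.out.println(baseMinScore(50, 100, 2, 8));
  }
}
```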

...n/java/com/hubspot/singularity/mesos/SingularityMesosOfferScheduler.java
@@ -130,12 +131,13 @@ public SingularityMesosOfferScheduler(MesosConfiguration mesosConfiguration,
while (!pendingTaskIdToTaskRequest.isEmpty() && addedTaskInLastLoop && canScheduleAdditionalTasks(taskCredits)) {
addedTaskInLastLoop = false;
double maxTaskMillisPastDue = maxTaskMillisPastDue(SingularityScheduledTasksInfo.getInfo(taskManager.getPendingTasks(), configuration.getDeltaAfterWhichTasksAreLateMillis()).getMaxTaskLagMillis());

darcatron May 31, 2017

Contributor

I think we'll need a better name for this variable. It's the max possible task lag before we decide we're going to take any offer we can get. I feel maxTaskMillisPastDue has some confusing overlap with maxTaskLag, which is the current highest lag time for pending tasks

darcatron May 31, 2017

Contributor

Before

We had a configured value for the max possible task lag before we accepted any offer that matched a task. The default was 5 minutes.

After

The max possible task lag (maxTaskMillisPastDue) is determined by a simple point-slope formula that uses the maxTaskLag and the existing deltaAfterWhichTasksAreLateMillis.
maxTaskMillisPastDue = (-180,000 / deltaAfterWhichTasksAreLateMillis) * maxTaskLag + 180,000, where 180,000 is 3 min in milliseconds

A task with no lag will result in a 3 min maxTaskMillisPastDue. Any lag will linearly reduce the maxTaskMillisPastDue. The minimum possible maxTaskMillisPastDue is 1 millisecond.

Since we reduce the minScore by how long a task is past due: higher maxTaskLag => lower maxTaskMillisPastDue => lower minScore

It's also important to note that we will actually start accepting any matching offer (minScore == 0) before the maxTaskLag reaches deltaAfterWhichTasksAreLateMillis. We can use this formula to see what the values would be for different deltas.
180,000 = l + (180,000 / d) * l, where l = the task lag at which we'll accept any offer and d = deltaAfterWhichTasksAreLateMillis
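The two formulas above can be sketched in code. Solving 180,000 = l + (180,000 / d) * l for l gives l = 180,000 * d / (d + 180,000). The class and method names are hypothetical, and the delta values used in `main` are example inputs rather than project defaults.

```java
// Hypothetical sketch of the point-slope formula and its break-even lag.
final class TaskLagSketch {
  static final double THREE_MINUTES_MILLIS = 180_000;

  /** Linear in maxTaskLag: 3 min at zero lag, floored at 1 ms as described above. */
  static double maxTaskMillisPastDue(long maxTaskLagMillis, long deltaAfterWhichTasksAreLateMillis) {
    double slope = -THREE_MINUTES_MILLIS / deltaAfterWhichTasksAreLateMillis;
    return Math.max(1.0, slope * maxTaskLagMillis + THREE_MINUTES_MILLIS);
  }

  /** Lag l at which l equals maxTaskMillisPastDue, i.e. any matching offer is accepted. */
  static double lagAtWhichAnyOfferIsAccepted(long deltaAfterWhichTasksAreLateMillis) {
    double d = deltaAfterWhichTasksAreLateMillis;
    return THREE_MINUTES_MILLIS * d / (d + THREE_MINUTES_MILLIS);
  }

  public static void main(String[] args) {
    long[] exampleDeltas = {30_000L, 60_000L, 180_000L}; // example inputs only
    for (long delta : exampleDeltas) {
      System.out.println("delta=" + delta + " ms, accept-anything lag="
          + lagAtWhichAnyOfferIsAccepted(delta) + " ms");
    }
  }
}
```

Since d / (d + 180,000) is always below 1, the break-even lag stays below d, which matches the note that any matching offer is accepted before maxTaskLag reaches deltaAfterWhichTasksAreLateMillis.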

darcatron added some commits Jun 1, 2017

ssalinas Jun 8, 2017

Member

@darcatron I think this one is good to merge. Going to get this in master so we can continue to build off of the resource usage updates without endless merge conflicts 👍. Thanks for all the work on this one

@ssalinas ssalinas merged commit af223b1 into master Jun 8, 2017

2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed

@ssalinas ssalinas deleted the task-juggling branch Jun 8, 2017