Conversation
```java
double score = score(offerHolder, stateCache, tasksPerOfferPerRequest, taskRequestHolder, getUsagesPerRequestTypePerSlave());
if (score > 0) {
  // todo: can short circuit here if score is high enough
  scorePerOffer.put(offerHolder, score);
```
Thought we might want to have a value that's definitely good enough to just accept instead of continuing to evaluate.
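As a sketch of that short-circuit idea (the class name, map shape, and the 0.90 cutoff here are all hypothetical, not from this PR): once an offer clears a "good enough" threshold we could accept it immediately instead of scoring the rest.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ShortCircuitScoring {
  static final double GOOD_ENOUGH_SCORE = 0.90; // assumed cutoff

  // Returns the first offer whose score clears the cutoff (short circuit),
  // otherwise the best-scoring offer seen.
  static String pickOffer(Map<String, Double> scorePerOffer) {
    String best = null;
    double bestScore = 0;
    for (Map.Entry<String, Double> entry : scorePerOffer.entrySet()) {
      if (entry.getValue() >= GOOD_ENOUGH_SCORE) {
        return entry.getKey(); // good enough: stop evaluating the rest
      }
      if (entry.getValue() > bestScore) {
        bestScore = entry.getValue();
        best = entry.getKey();
      }
    }
    return best;
  }

  public static void main(String[] args) {
    Map<String, Double> scores = new LinkedHashMap<>();
    scores.put("offer-1", 0.40);
    scores.put("offer-2", 0.95);
    scores.put("offer-3", 0.99);
    // offer-2 is the first to clear 0.90, so offer-3 is never inspected
    System.out.println(pickOffer(scores));
  }
}
```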
I like the idea of a scoring system overall. I'll make some comments on specific logic later since this is still a WIP. Overall comments though:
```java
taskManager.createTaskAndDeletePendingTask(zkTask);

private double minScore(SingularityTaskRequest taskRequest) {
  double minScore = 0.80;
```
This can be adjusted as necessary. I thought an 80% match might be a good starting point, but we could definitely reduce it.
This has been updated as follows:

**Before:** All offers would be scored and, of all scores above …

**After:** All offers are still scored, but the minimum acceptable score depends on the task's overdue milliseconds and the number of offers a task has not accepted. Currently, the overdue time and offer count have a max of … The offer attempts count is any offer that was considered, not just offers that scored too low. So an offer that didn't have enough resources to satisfy the task will still be counted. I picked these numbers based on generalizations; we will likely have to tune these.
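A minimal sketch of that shape (the caps and the linear reduction below are assumptions for illustration, not the committed values): start from the base min score and subtract penalties that grow with failed offer attempts and overdue time, each capped at a maximum.

```java
public class MinScoreSketch {
  static final double BASE_MIN_SCORE = 0.80;
  static final int MAX_OFFER_ATTEMPTS = 20;         // assumed cap on counted attempts
  static final long MAX_MILLIS_PAST_DUE = 300_000L; // assumed cap on overdue time (5 min)

  // Linearly reduce the acceptable score as a task accrues offer attempts
  // and overdue time; once both hit their caps, any positive score matches.
  static double minScore(int offerAttempts, long millisPastDue) {
    double attemptPenalty = Math.min(offerAttempts, MAX_OFFER_ATTEMPTS)
        / (double) MAX_OFFER_ATTEMPTS * (BASE_MIN_SCORE / 2);
    double overduePenalty = Math.min(millisPastDue, MAX_MILLIS_PAST_DUE)
        / (double) MAX_MILLIS_PAST_DUE * (BASE_MIN_SCORE / 2);
    return Math.max(0.0, BASE_MIN_SCORE - attemptPenalty - overduePenalty);
  }

  public static void main(String[] args) {
    System.out.println(minScore(0, 0));        // fresh task demands the full base score
    System.out.println(minScore(10, 150_000)); // partially overdue: lower bar
    System.out.println(minScore(20, 300_000)); // fully overdue: accepts any positive score
  }
}
```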
```java
if (!taskRequest.getRequest().isRackSensitive() && slavePlacement == SlavePlacement.GREEDY) {
  // todo: account for this or let this behavior continue?
  return SlaveMatchState.NOT_RACK_OR_SLAVE_PARTICULAR;
```
I didn't know if we would need to account for any rack sensitivity outside of the existing checks done in the scheduler
@ssalinas Got some specific logic tests in. Let me know if there's a piece I missed that should be added. I'm going to continue to try to get some full logic tests in as well
```java
public static final String CPU_USED = "cpusUsed";
public static final String MEMORY_BYTES_USED = "memoryRssBytes";
public static final long BYTES_PER_MEGABYTE = 1024L * 1024L;
```
was about to comment that there must be some type of easy class/enum for this like there is with TimeUnit, but apparently there isn't... weird...
Yeah, I was sad to see there wasn't a lib method for this too 😢
```java
public double getCpusUsedForRequestType(RequestType type) {
  return usagePerRequestType.get(type).get(CPU_USED).doubleValue();
```
Maybe another enum is more appropriate for CPU_USED/MEMORY_BYTES_USED?
agreed, mapping would be clearer then too 👍
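Something like this (the enum and class names are hypothetical, not from the PR) would replace the string constants with a type-safe key, and an `EnumMap` makes the mapping explicit:

```java
import java.util.EnumMap;
import java.util.Map;

public class UsageEnumSketch {
  // Type-safe replacement for the CPU_USED / MEMORY_BYTES_USED string keys
  enum ResourceUsageType { CPU_USED, MEMORY_BYTES_USED }

  public static void main(String[] args) {
    Map<ResourceUsageType, Number> usage = new EnumMap<>(ResourceUsageType.class);
    usage.put(ResourceUsageType.CPU_USED, 1.5);
    usage.put(ResourceUsageType.MEMORY_BYTES_USED, 512L * 1024 * 1024);
    // Callers look up by enum constant instead of a raw string key
    System.out.println(usage.get(ResourceUsageType.CPU_USED).doubleValue());
  }
}
```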
```java
return usagePerRequestType;
}

public double getCpusUsedForRequestType(RequestType type) {
```
this and getMemBytesUsedForRequestType are unused methods
```java
@Override
public void runActionOnPoll() {
  final long now = System.currentTimeMillis();
  Map<RequestType, Map<String, Number>> usagesPerRequestType = new HashMap<>();
```
wouldn't we want this to be per-slave, not overall?
This should be per slave. This poller loops through each slave and creates a new SingularitySlaveUsage with the stats for that slave
```java
Map<SingularityOfferHolder, Double> scorePerOffer = new HashMap<>();
double minScore = minScore(taskRequestHolder.getTaskRequest(), offerMatchAttemptsPerTask, System.currentTimeMillis());

LOG.info("Minimum score for task {} is {}", taskRequestHolder.getTaskRequest().getPendingTask().getPendingTaskId().getId(), minScore);
```
probably can be lower than info level here
```java
continue;
}

double score = score(offerHolder, stateCache, tasksPerOfferPerRequest, taskRequestHolder, getSlaveUsage(currentSlaveUsages, offerHolder.getOffer().getSlaveId().getValue()));
```
for clarity, maybe something like 'hostScore' here? The score is for the particular slave, not necessarily about the offer
I'm not sure about the naming here. We do look at the slave's utilization to score the offer, but we are still scoring the offer itself, since offers aren't uniquely 1:1 with a slave (e.g. there can be 2 offers for the same slave).
The slave utilization weight will be the same for all offers on the same slave, but the offer resources will differ per offer. So it seems to me that we're scoring the offer in this class rather than the slave itself.
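To illustrate the distinction (all names, weights, and numbers below are made up for this sketch): two offers from the same slave share one utilization weight, but they carry different resources, so each offer still gets its own score.

```java
public class OfferVsSlaveScore {
  // Combine a per-slave utilization factor (identical for every offer on
  // that slave) with a per-offer resource factor (differs per offer).
  static double scoreOffer(double slaveUtilizationWeight, double offerResourceFraction) {
    return 0.5 * slaveUtilizationWeight + 0.5 * offerResourceFraction;
  }

  public static void main(String[] args) {
    double slaveWeight = 0.6; // same for both offers on this slave
    // The offer with more resources scores higher despite the shared slave weight
    System.out.println(scoreOffer(slaveWeight, 0.9));
    System.out.println(scoreOffer(slaveWeight, 0.2));
  }
}
```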
```java
@VisibleForTesting
double minScore(SingularityTaskRequest taskRequest, Map<String, Integer> offerMatchAttemptsPerTask, long now) {
  double minScore = 0.80;
  int maxOfferAttempts = 20;
```
Another one that would be nice to have configurable.
```java
final SingularityTask task = mesosTaskBuilder.buildTask(offerHolder.getOffer(), offerHolder.getCurrentResources(), taskRequest, taskRequestHolder.getTaskResources(), taskRequestHolder.getExecutorResources());

@VisibleForTesting
double score(Offer offer, SingularityTaskRequest taskRequest, Optional<SingularitySlaveUsageWithId> maybeSlaveUsage) {
  double requestTypeCpuWeight = 0.20;
```
Let's make these configurable, maybe another object in the configuration yaml?
Yup, I was in progress on this (now committed), but I kept the fields under SingularityConfiguration since I saw a lot of other stuff in there as well (e.g. caching). We could pull it into an OfferConfiguration file if you think that'd be better for organization
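A hypothetical sketch of what those fields could look like (the field names and defaults are illustrative assumptions, not the committed SingularityConfiguration fields), with matching yaml keys like `requestTypeCpuWeightForOffer: 0.20`:

```java
public class OfferScoringConfiguration {
  // Assumed defaults; mem is weighted a bit higher than cpu per the PR description
  private double requestTypeCpuWeightForOffer = 0.20;
  private double requestTypeMemWeightForOffer = 0.30;
  private double freeCpuWeightForOffer = 0.20;
  private double freeMemWeightForOffer = 0.30;

  public double getRequestTypeCpuWeightForOffer() { return requestTypeCpuWeightForOffer; }
  public double getRequestTypeMemWeightForOffer() { return requestTypeMemWeightForOffer; }
  public double getFreeCpuWeightForOffer() { return freeCpuWeightForOffer; }
  public double getFreeMemWeightForOffer() { return freeMemWeightForOffer; }

  public static void main(String[] args) {
    OfferScoringConfiguration config = new OfferScoringConfiguration();
    double sum = config.getRequestTypeCpuWeightForOffer() + config.getRequestTypeMemWeightForOffer()
        + config.getFreeCpuWeightForOffer() + config.getFreeMemWeightForOffer();
    System.out.println(sum); // weights are chosen to sum to ~1.0
  }
}
```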
```java
if (matchesResources && slaveMatchState.isMatchAllowed()) {
  final SingularityTask task = mesosTaskBuilder.buildTask(offerHolder.getOffer(), offerHolder.getCurrentResources(), taskRequest, taskRequestHolder.getTaskResources(), taskRequestHolder.getExecutorResources());

@VisibleForTesting
double score(Offer offer, SingularityTaskRequest taskRequest, Optional<SingularitySlaveUsageWithId> maybeSlaveUsage) {
```
Let's go over this one in-person, think we are getting close, just easier to chat than typing a novel in github ;)
```java
private static final String SLAVE_PATH = ROOT_PATH + "/slaves";
private static final String TASK_PATH = ROOT_PATH + "/tasks";
private static final String USAGE_SUMMARY_PATH = ROOT_PATH + "/summary";
```
@ssalinas can you take a look at this piece? Just wanna make sure I got this set up right
```java
}

public SingularityCreateResult saveSpecificClusterUtilization(SingularityClusterUtilization utilization) {
  return save(getSpecificClusterUtilizationPath(utilization.getTimestamp()), utilization, clusterUtilizationTranscoder);
```
is there any point at which we would want a history of these? If we don't need a history of summaries, we might as well save the data to the summary path (without timestamp) and just overwrite when there is new data.
I didn't see a reason to at this point. I saw the others were saving them, so it seemed okay since we only save up to 5 points. I think it's safe to just have the one point; it keeps things simpler.
```java
return ZKPaths.makePath(getSlaveUsagePath(slaveId), CURRENT_USAGE_NODE_KEY);
}

private String getSpecificClusterUtilizationPath(long timestamp) {
```
why SpecificClusterUtilization instead of just getUsageSummaryPath or something like that?
The "specific" keyword was the pattern the other items were using, so I kept that since the timestamp specifies which one to grab. I can drop the historical data and then rename it.
```java
} catch (InvalidSingularityTaskIdException e) {
  LOG.error("Couldn't get SingularityTaskId for {}", taskUsage);
  continue;
}
```
Added this try/catch for the potentially incorrect taskId types.
**Before:** Min score was configurable by the user and would be reduced as tasks were delayed and offer match attempts were rejected.

**After:** Min score is calculated based on the overall utilization of the cluster.

```java
while (!pendingTaskIdToTaskRequest.isEmpty() && addedTaskInLastLoop && canScheduleAdditionalTasks(taskCredits)) {
  addedTaskInLastLoop = false;
  double maxTaskMillisPastDue = maxTaskMillisPastDue(SingularityScheduledTasksInfo.getInfo(taskManager.getPendingTasks(), configuration.getDeltaAfterWhichTasksAreLateMillis()).getMaxTaskLagMillis());
```
I think we'll need a better name for this variable. It's the max possible task lag before we decide we're going to take any offer we can get. I feel maxTaskMillisPastDue has some confusing overlap with maxTaskLag, which is the current highest lag time for pending tasks.
**Before:** We had a configured value for the max possible task lag before we accepted any offer that matched a task. The default was 5 minutes.

**After:** The max possible task lag (…) … A task with no lag will result in a 3 min … Since we reduce the … It's also important to note that we will actually start accepting any matching offer (minScore == 0) before the …
@darcatron I think this one is good to merge. Going to get this in master so we can continue to build off of the resource usage updates without endless merge conflicts 👍 . Thanks for all the work on this one |
🚧 This is a WIP for task balancing.
The general idea is that a pendingTask will calculate the best offer by scoring each offer and choosing the best score. Right now, the top score is 1.00 assuming the slave has nothing on it.
Score is weighted based on 2 real criteria: current resource usage for the same request type and current resource availability. I chose weights based on what I thought might be important, but they can be changed. I thought mem would be the most important, so I weighted it a bit higher.
This only scores based on what's running on the slave. It does not look at the acceptedPendingTasks for an offer.
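The weighting described above could be sketched like this (the exact weights, names, and 0..1 "free fraction" inputs are assumptions for illustration; an empty slave, where every fraction is 1.0, earns the top score of ~1.00):

```java
public class OfferScoreSketch {
  // Assumed weights; mem weighted a bit higher than cpu, summing to ~1.0
  static final double REQUEST_TYPE_CPU_WEIGHT = 0.20;
  static final double REQUEST_TYPE_MEM_WEIGHT = 0.30;
  static final double FREE_CPU_WEIGHT = 0.20;
  static final double FREE_MEM_WEIGHT = 0.30;

  // Each input is a 0..1 fraction of the resource still free; the score
  // blends per-request-type usage with overall availability on the slave.
  static double score(double freeCpuFraction, double freeMemFraction,
                      double requestTypeCpuFreeFraction, double requestTypeMemFreeFraction) {
    return requestTypeCpuFreeFraction * REQUEST_TYPE_CPU_WEIGHT
        + requestTypeMemFreeFraction * REQUEST_TYPE_MEM_WEIGHT
        + freeCpuFraction * FREE_CPU_WEIGHT
        + freeMemFraction * FREE_MEM_WEIGHT;
  }

  public static void main(String[] args) {
    System.out.println(score(1.0, 1.0, 1.0, 1.0)); // empty slave: top score, ~1.00
    System.out.println(score(0.5, 0.5, 0.5, 0.5)); // half-utilized slave scores lower
  }
}
```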
@ssalinas