Combine offers to schedule tasks more efficiently #1561

Merged
ssalinas merged 18 commits into master from offer_combination on Jun 23, 2017

3 participants
@ssalinas
Member

ssalinas commented Jun 7, 2017

This updates the scheduler to group offers by host and use the total resources available across all of a host's offers when scheduling tasks. Especially when using the offer cache, it has been common for us to have 10+ offers from a single host in the cache at a time. For tasks that require larger amounts of resources, this should get them scheduled much more quickly.

As far as I can tell, the Mesos master will add up all resources on its side when a task or tasks are launched using multiple offers. So this code does not currently try to match each task with a portion of the available resources; it launches all tasks with all offers for the host. A future improvement could be to save chunks of unused offers so the offer cache can hold on to them for the next run.
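The grouping described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `SimpleOffer`, `CombinedOffers`, and `OfferGrouping` are hypothetical stand-ins that track only cpus and memory, whereas real Mesos offers carry a list of typed resources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Mesos offer (assumption: only cpus and memory matter here).
class SimpleOffer {
  final String id;
  final String hostname;
  final double cpus;
  final double memoryMb;

  SimpleOffer(String id, String hostname, double cpus, double memoryMb) {
    this.id = id;
    this.hostname = hostname;
    this.cpus = cpus;
    this.memoryMb = memoryMb;
  }
}

// All offers from one host, plus the running total of their resources.
class CombinedOffers {
  final List<SimpleOffer> offers = new ArrayList<>();
  double cpus;
  double memoryMb;

  void add(SimpleOffer offer) {
    offers.add(offer);
    cpus += offer.cpus;
    memoryMb += offer.memoryMb;
  }

  // A task fits if the summed resources of every offer on the host cover it.
  boolean fits(double taskCpus, double taskMemoryMb) {
    return cpus >= taskCpus && memoryMb >= taskMemoryMb;
  }
}

class OfferGrouping {
  // Group offers by hostname, preserving the order in which hosts were first seen.
  static Map<String, CombinedOffers> groupByHost(List<SimpleOffer> offers) {
    Map<String, CombinedOffers> byHost = new LinkedHashMap<>();
    for (SimpleOffer offer : offers) {
      byHost.computeIfAbsent(offer.hostname, h -> new CombinedOffers()).add(offer);
    }
    return byHost;
  }

  public static void main(String[] args) {
    Map<String, CombinedOffers> byHost = groupByHost(Arrays.asList(
        new SimpleOffer("o1", "host-a", 2, 1024),
        new SimpleOffer("o2", "host-a", 3, 2048),
        new SimpleOffer("o3", "host-b", 4, 4096)));
    CombinedOffers hostA = byHost.get("host-a");
    // Neither host-a offer alone covers a 4 cpu task, but the two together do.
    System.out.println(hostA.offers.size() + " offers, fits 4 cpus: " + hostA.fits(4, 3000));
  }
}
```

In this sketch a task needing 4 cpus would be rejected by each `host-a` offer individually but accepted against the combined total, which is the case the PR description is aimed at.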

@ssalinas ssalinas modified the milestone: 0.16.0 Jun 8, 2017

@ssalinas ssalinas added the hs_staging label Jun 9, 2017

ssalinas added some commits Jun 9, 2017

@ssalinas ssalinas added the hs_qa label Jun 9, 2017

this.offers = offers;
this.roles = MesosUtils.getRoles(offers.get(0));
this.acceptedTasks = Lists.newArrayListWithExpectedSize(taskSizeHint);
this.currentResources = offers.size() > 1
    ? MesosUtils.combineResources(offers.stream().map(Protos.Offer::getResourcesList).collect(Collectors.toList()))
    : offers.get(0).getResourcesList();

@darcatron

darcatron Jun 9, 2017

Contributor

Wonder if it's possible to have two currentResources for this? One for the individual offers (as originally given by Mesos) and one for the collection of same-host offers? The check could then try each single offer first and fall back to the combined offers if no single offer was good enough.

It could help launch more tasks if the smaller offers were used instead of declined. The tough part is that combinedCurrentResources would need to be recalculated each time before it's checked, since single offers may get used for tasks as the iterations proceed.
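A rough sketch of that "single offers first, combined as a fallback" idea is below. It is only an illustration of the suggestion, not code from this PR: `SingleThenCombinedMatcher` and `OfferMatch` are hypothetical names, and resources are reduced to cpus for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result type: how a task was matched, if at all.
enum OfferMatch { SINGLE, COMBINED, NONE }

class SingleThenCombinedMatcher {
  // Remaining cpus on each individual offer from one host (assumption: cpus only).
  private final List<Double> remainingPerOffer;

  SingleThenCombinedMatcher(List<Double> offerCpus) {
    this.remainingPerOffer = new ArrayList<>(offerCpus);
  }

  // Recomputed on every call, because earlier matches may have consumed single offers.
  double combinedRemaining() {
    double total = 0;
    for (double cpus : remainingPerOffer) {
      total += cpus;
    }
    return total;
  }

  OfferMatch match(double taskCpus) {
    // First pass: try to satisfy the task from one offer alone.
    for (int i = 0; i < remainingPerOffer.size(); i++) {
      if (remainingPerOffer.get(i) >= taskCpus) {
        remainingPerOffer.set(i, remainingPerOffer.get(i) - taskCpus);
        return OfferMatch.SINGLE;
      }
    }
    // Fallback: check the freshly summed remainder of all offers on the host.
    if (combinedRemaining() >= taskCpus) {
      double needed = taskCpus;
      for (int i = 0; i < remainingPerOffer.size() && needed > 0; i++) {
        double used = Math.min(remainingPerOffer.get(i), needed);
        remainingPerOffer.set(i, remainingPerOffer.get(i) - used);
        needed -= used;
      }
      return OfferMatch.COMBINED;
    }
    return OfferMatch.NONE;
  }
}
```

The extra bookkeeping per match is the cost mentioned in the comment: the combined total cannot be cached across iterations once single offers start being consumed.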

@ssalinas

ssalinas Jun 9, 2017

Member

I think a more general solution might be to sort out which offers to accept/decline/cache when launching the tasks at the end, i.e. if we have offers of 2, 3, and 4 cpus, and matched tasks that use 3 and 3, we can hold on to the 3 cpu offer and use the other two. There's no reason to calculate that as we go along since the total resources are the same, but we can certainly do it at the end with the goal of maximizing usage of the current offers.

#lets-do-the-math
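The 2/3/4 cpu example can be worked through with a small greedy sketch. This is an assumption-laden illustration rather than the PR's implementation: `OfferSelection` and `OfferSplit` are hypothetical names, only cpus are considered, and visiting the largest offers first is just one heuristic that happens to reproduce the example; it does not guarantee an optimal split in general.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical result: which offers to launch with and which to hold on to.
class OfferSplit {
  final List<Double> toUse = new ArrayList<>();
  final List<Double> toKeep = new ArrayList<>();
}

class OfferSelection {
  // Decide, after all tasks are matched, which offers can be set aside.
  // Assumes the caller already verified that the offers cover cpusNeeded.
  static OfferSplit split(List<Double> offerCpus, double cpusNeeded) {
    List<Double> sorted = new ArrayList<>(offerCpus);
    sorted.sort(Comparator.reverseOrder()); // heuristic: consider the largest offers first

    double total = 0;
    for (double cpus : sorted) {
      total += cpus;
    }
    double surplus = total - cpusNeeded;

    OfferSplit result = new OfferSplit();
    for (double cpus : sorted) {
      if (cpus <= surplus) {
        // Setting this offer aside still leaves enough for the matched tasks.
        result.toKeep.add(cpus);
        surplus -= cpus;
      } else {
        result.toUse.add(cpus);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Offers of 2, 3, and 4 cpus; matched tasks need 3 + 3 = 6 cpus.
    OfferSplit split = split(Arrays.asList(2.0, 3.0, 4.0), 6.0);
    System.out.println("use " + split.toUse + ", keep " + split.toKeep);
  }
}
```

With 9 cpus offered and 6 needed, the surplus is 3, so the 3 cpu offer is the one held back and the 2 and 4 cpu offers are used for the launch.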

.../src/main/java/com/hubspot/singularity/mesos/SingularityOfferHolder.java
List<Offer> neededOffers = offers.stream().filter(o -> {
  List<Resource> remainingAfterSavingOffer = MesosUtils.subtractResources(currentResources, o.getResourcesList());
  if (MesosUtils.allResourceCountsNonNegative(remainingAfterSavingOffer)) {
    cache.cacheOffer(driver, System.currentTimeMillis(), o); // TODO: do we need this timestamp to be something specific, e.g. the time at which we got the offer originally?

@ssalinas

ssalinas Jun 15, 2017

Member

Keep in mind here that the offer may already be from the cache to start with; we might need to return the offer rather than cache it here.

@ssalinas ssalinas merged commit 0d09f7e into master Jun 23, 2017

2 checks passed

continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed

@ssalinas ssalinas deleted the offer_combination branch Jun 23, 2017