Marathon should set a non-default filter when declining offers (on master) #1931

Closed
ConnorDoyle opened this Issue Aug 4, 2015 · 9 comments

7 participants
@ConnorDoyle
Contributor

ConnorDoyle commented Aug 4, 2015

Use case: As a cluster operator I would like to run many frameworks (possibly multiple Marathon instances) on the same Mesos cluster. I'd like to mitigate offer starvation; that is, schedulers with work to do should receive offers from the Mesos master.

From the SchedulerDriver documentation, Marathon can use the overloaded definition of declineOffer to set filters:

  /**
   * Declines an offer in its entirety and applies the specified
   * filters on the resources (see mesos.proto for a description of
   * Filters). Note that this can be done at any time, it is not
   * necessary to do this within the {@link Scheduler#resourceOffers}
   * callback.
   *
   * @param offerId The ID of the offer to be declined.
   * @param filters The filters to set for any remaining resources.
   *
   * @return        The state of the driver after the call.
   *
   * @see OfferID
   * @see Filters
   * @see Status
   */
  Status declineOffer(OfferID offerId, Filters filters);

and use reviveOffers to clear them later on:

  /**
   * Removes all filters, previously set by the framework (via {@link
   * #launchTasks}). This enables the framework to receive offers
   * from those filtered slaves.
   *
   * @return    The state of the driver after the call.
   *
   * @see Status
   */
  Status reviveOffers();

Proposal: When the task queue becomes empty, decline subsequent offers with a long timeout (say, max double value). When the task queue becomes non-empty, invoke reviveOffers. (per offline conversation with @kolloch, @aquamatthias, and @gkleiman).
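
The proposal above could be sketched roughly as follows. This is a minimal illustration, not Marathon's actual code: `OfferDriver`, `RecordingDriver`, and `IdleAwareDecliner` are hypothetical names, the driver interface is a simplified stand-in for the two `SchedulerDriver` methods quoted above (the real ones take `OfferID` and `Filters` messages), and offer matching is reduced to "take the next queued task".

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical, simplified stand-in for the two SchedulerDriver calls quoted above.
interface OfferDriver {
    void declineOffer(String offerId, double refuseSeconds);
    void reviveOffers();
}

// Test double that records what the scheduler asked the driver to do.
class RecordingDriver implements OfferDriver {
    int revives = 0;
    double lastRefuseSeconds = -1.0;

    @Override
    public void declineOffer(String offerId, double refuseSeconds) {
        lastRefuseSeconds = refuseSeconds;
    }

    @Override
    public void reviveOffers() {
        revives++;
    }
}

// Declines with a long filter while the queue is empty; revives when work arrives.
class IdleAwareDecliner {
    // Assumption: "max double value" is taken literally from the proposal text.
    static final double LONG_REFUSE_SECONDS = Double.MAX_VALUE;

    private final OfferDriver driver;
    private final Queue<String> taskQueue = new ArrayDeque<>();

    IdleAwareDecliner(OfferDriver driver) {
        this.driver = driver;
    }

    // Revive only on the empty -> non-empty transition of the queue.
    void enqueue(String taskId) {
        boolean wasEmpty = taskQueue.isEmpty();
        taskQueue.add(taskId);
        if (wasEmpty) {
            driver.reviveOffers();
        }
    }

    // Returns the task matched to this offer, or null if the offer was declined.
    String onOffer(String offerId) {
        if (taskQueue.isEmpty()) {
            driver.declineOffer(offerId, LONG_REFUSE_SECONDS);
            return null;
        }
        return taskQueue.poll();
    }
}
```

With an empty queue, every incoming offer is declined with the long filter; the first `enqueue` call after that triggers a single `reviveOffers`, and later enqueues on a non-empty queue do not.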

@aquamatthias aquamatthias added this to the 0.9.0 milestone Aug 4, 2015

@adam-mesos

adam-mesos commented Aug 4, 2015

Couldn't you also just abort() the SchedulerDriver and then create a new one (sends ReregisterFrameworkMessage) when you have tasks again? This abort() results in a DeactivateFrameworkMessage, which will temporarily deactivate the framework in the Mesos master allocator, preventing any new offers, while still leaving tasks running. Since abort() doesn't actually terminate the SchedulerDriver, you might still get status updates, etc. (I'd have to double-check). Reregistering will start the offers again.
Also note that other users have found that an excess of filters (one per each of 100 frameworks per each of 1000 nodes) can slow down the master's allocation/offer cycle. So I'd be wary of abusing the refuse filters too much. See https://issues.apache.org/jira/browse/MESOS-3052 and https://issues.apache.org/jira/browse/MESOS-3075 and https://issues.apache.org/jira/browse/MESOS-3157

@ConnorDoyle

Contributor

ConnorDoyle commented Aug 4, 2015

@adam-mesos we could do that; however, I'm concerned about coordinating a reconciliation cycle after each re-registration, since handling StatusUpdate messages can become a bottleneck for Marathon. Thanks for pointing out the potential issue with setting too many filters though, it's good to know.

@kolloch

Contributor

kolloch commented Aug 4, 2015

Hey @adam-mesos, just to save me the time to understand all this: I thought that declineOffer without a filters argument would actually use a default filter?

@adam-mesos

adam-mesos commented Aug 4, 2015

@kolloch That's correct. Decline sets a default 5sec filter timeout for offers from that slave.
But why bother setting a filter on each slave as you receive each offer when you could just deactivate your framework's offers entirely until you're ready to get more offers? What you really want is the QuiesceOffers/SUPPRESS call that MESOS-3075/MESOS-3037 propose. You still get all your status updates, can even send framework messages, but you're just not getting any offers until you REVIVE.
This mechanism already exists using DeactivateFramework/ReregisterFramework, and would be easy to implement using pure bindings. I'm not sure if the native libmesos' SchedulerDriver's abort() method does exactly what we want here.

kolloch pushed a commit that referenced this issue Aug 4, 2015

Peter Kolloch
Fixes #1931 - Flag for reviveOffers and the duration for which to reject offers

`--revive_offers_for_new_apps` if true, revive offers is called when a new app
is added to the `TaskQueue` or if a task of an app with constraints dies. The
latter is necessary, for example, if there is a constraint that only allows
one task per host. In this case, we might accept an offer that we rejected
previously, simply because no task is running on the host anymore.

Also note that Mesos only filters offers that are a strict subset of
a rejected offer.

ATTENTION: This patch does not yet throttle calls to reviveOffers. Depending
on the usage of Marathon, `reviveOffers` might be called many times.

`--reject_offer_duration` allows configuring the duration for which offers
are declined if not matched in mesos.
@kolloch

Contributor

kolloch commented Aug 4, 2015

I agree with @ConnorDoyle here, at least in my current state (= tired), I will not be able to cover all code paths.

It does sound kind of hacky to me but that is subjective.

kolloch pushed a commit that referenced this issue Aug 4, 2015

Peter Kolloch
Fixes #1931 - Flag for reviveOffers and the duration for which to reject offers

`--revive_offers_for_new_apps` if true, revive offers is called when a new app
is added to the `TaskQueue` or if a task of an app with constraints dies. The
latter is necessary, for example, if there is a constraint that only allows
one task per host. In this case, we might accept an offer that we rejected
previously, simply because no task is running on the host anymore.

Also note that Mesos only filters offers that are a strict subset of
a rejected offer.

`--reject_offer_duration` allows configuring the duration for which offers
are declined if not matched in mesos.

`--min_revive_offers_interval` if `--revive_offers_for_new_apps` is specified,
do not call reviveOffers more often than this interval. It defaults to 5 seconds.
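
The throttling described for `--min_revive_offers_interval` might look something like the sketch below. It is an assumption-laden illustration rather than the code from the commit: `ReviveThrottle` is a hypothetical name, the clock and the revive action are injected so the behavior can be checked without a Mesos driver, and a request that arrives too early is simply dropped here (a real implementation might instead defer it until the interval has elapsed).

```java
import java.util.function.LongSupplier;

// Forwards revive requests at most once per minimum interval.
class ReviveThrottle {
    private final long minIntervalMillis;
    private final LongSupplier clock;   // supplies the current time in milliseconds
    private final Runnable revive;      // e.g. a call to the driver's reviveOffers()
    private long lastReviveMillis = 0L;
    private boolean revivedBefore = false;

    ReviveThrottle(long minIntervalMillis, LongSupplier clock, Runnable revive) {
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
        this.revive = revive;
    }

    // Returns true if the revive was forwarded, false if it was suppressed.
    boolean requestRevive() {
        long now = clock.getAsLong();
        if (revivedBefore && now - lastReviveMillis < minIntervalMillis) {
            return false; // too soon after the previous revive
        }
        revive.run();
        lastReviveMillis = now;
        revivedBefore = true;
        return true;
    }
}
```

Constructed with a 5000 ms interval (matching the 5-second default mentioned above), the first request is forwarded, requests within the next 5 seconds are suppressed, and the next request at or after the 5-second mark is forwarded again.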

@aquamatthias aquamatthias assigned ConnorDoyle and unassigned kolloch Aug 4, 2015

kolloch pushed a commit that referenced this issue Aug 4, 2015

@kolloch kolloch assigned kolloch and unassigned ConnorDoyle Aug 5, 2015

@air

Contributor

air commented Aug 5, 2015

Going forward, what should the general advice be to framework authors that want to avoid starvation in the presence of many competing frameworks?

kolloch pushed a commit that referenced this issue Aug 5, 2015

kolloch pushed a commit that referenced this issue Aug 5, 2015

kolloch pushed a commit that referenced this issue Aug 5, 2015

kolloch pushed a commit that referenced this issue Aug 5, 2015

@kolloch

Contributor

kolloch commented Aug 5, 2015

@air, good point!

@kolloch kolloch added ready for review and removed in progress labels Aug 5, 2015

kolloch pushed a commit that referenced this issue Aug 5, 2015

kolloch pushed a commit that referenced this issue Aug 5, 2015

kolloch pushed a commit that referenced this issue Aug 5, 2015

gkleiman added a commit that referenced this issue Aug 5, 2015

Merge pull request #1949 from mesosphere/pk/revive-offers
Fixes #1931 - Flag for reviveOffers and offer reject duration

@kolloch kolloch modified the milestones: 0.11.0, 0.9.0 Aug 11, 2015

@kolloch kolloch added ready and removed ready for review labels Aug 11, 2015

@kolloch kolloch changed the title from Marathon should set a non-default filter when declining offers to Marathon should set a non-default filter when declining offers (on master) Aug 11, 2015

@kolloch kolloch assigned gkleiman and unassigned kolloch Aug 20, 2015

@gkleiman gkleiman removed their assignment Aug 26, 2015

@gkleiman gkleiman added in progress and removed ready labels Aug 26, 2015

@gkleiman gkleiman self-assigned this Aug 26, 2015

@gkleiman gkleiman added the service label Aug 27, 2015

aquamatthias added a commit that referenced this issue Aug 30, 2015

aquamatthias added a commit that referenced this issue Aug 31, 2015

gkleiman added a commit that referenced this issue Aug 31, 2015

Merge pull request #2126 from mesosphere/mv/fix_1931_b
Fix #1931 decline with configurable duration.
@anthonyrisinger

anthonyrisinger commented Sep 1, 2015

@kolloch what was the recommendation? did I miss it somewhere?

We are experiencing Marathon offer starvation when running in the presence of multiple instances of our custom framework. Once 5 instances plus Marathon are running, even if our framework rejects every single offer and never schedules a thing, Marathon is never able to schedule again, while our frameworks appear unaffected. Still investigating.

I've read some infos:

https://issues.apache.org/jira/browse/MESOS-3202
https://issues.apache.org/jira/browse/MESOS-3037

Any idea what the best way to mitigate this is? Does boosting the filter timeout appear to work? Is it possible to use a massive timeout by default and then RequestOffer?

@anthonyrisinger

anthonyrisinger commented Sep 1, 2015

Specifically, to framework writers and thus marathon, what should be considered when implementing a general purpose framework designed to coexist with other, likely foreign, frameworks? Since marathon is a well-known/reference implementation, I basically just want to copy your current/upcoming strategy.

Right now I think refuse_seconds=3 within our framework, all the time, is killing marathon, but this was really only done to ensure tasks could be started in a timely manner.

If this discussion is better on-list, let me know.