Change Checkpoint default to true. Mesos already did so, we might as well also #730

Merged: @tpetr merged 1 commit into HubSpot:master from stevenschlansker:checkpoint-by-default on Oct 13, 2015

@stevenschlansker commented Oct 9, 2015

No description provided.

@tpetr commented Oct 13, 2015

To enable this in a pre-existing cluster:

Mesos does not currently handle a change to FrameworkInfo correctly when the FrameworkID is kept. To propagate FrameworkInfo.checkpoint throughout the cluster, you currently need to do the following:

--> Update FrameworkInfo inside your framework and re-register with the master. (The old FrameworkInfo is still cached on the master and slaves.)
--> Fail over the leading master. (The new leading master will cache the new FrameworkInfo.)
--> Hard restart your slaves in batches (kill each slave and wipe its metadata).
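
The steps above can be laid out as an ordered checklist before touching the cluster. This is only a minimal sketch in Python: the helper names `batches` and `rollout_plan`, the hostnames, and the batch size are all hypothetical, and the code only prints a plan. It does not call Mesos or restart anything.

```python
from typing import Iterator, List, Sequence


def batches(hosts: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` hosts (hypothetical helper)."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(hosts), size):
        yield list(hosts[start:start + size])


def rollout_plan(hosts: Sequence[str], size: int) -> List[str]:
    """Return the steps from the comment as an ordered checklist of strings."""
    plan = [
        "Update FrameworkInfo (checkpoint=true) and re-register with the master",
        "Fail over the leading master",
    ]
    # One checklist entry per batch of slaves to hard restart.
    for number, group in enumerate(batches(hosts, size), start=1):
        plan.append(f"Batch {number}: hard restart {', '.join(group)}")
    return plan


if __name__ == "__main__":
    # Hostnames and batch size are made up for illustration.
    for line in rollout_plan(["slave-1", "slave-2", "slave-3"], size=2):
        print(line)
```

Restarting in batches keeps part of the cluster available at every point; a suitable batch size depends on how much capacity the cluster can lose at once.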

The proper fix for this is tracked at: https://issues.apache.org/jira/browse/MESOS-703

tpetr added a commit that referenced this pull request Oct 13, 2015
Change Checkpoint default to true.  Mesos already did so, we might as well also
@tpetr merged commit c00c5e2 into HubSpot:master Oct 13, 2015
1 check passed
continuous-integration/travis-ci/pr: The Travis CI build passed
@tpetr removed hs_qa labels Oct 13, 2015
@tpetr added this to the 0.4.6 milestone Oct 13, 2015
@stevenschlansker deleted the stevenschlansker:checkpoint-by-default branch Oct 13, 2015