Evaluate what conditions should automatically trigger a service election #7486

Open
jsirex opened this issue Feb 19, 2020 · 6 comments
Labels
Focus:Supervisor (Related to the Habitat Supervisor (core/hab-sup) component)
Type: Bug (Issues that describe broken functionality)


@jsirex
Contributor

jsirex commented Feb 19, 2020

This issue is to discuss the best behavior for Chef Habitat elections.

I see that stopping and starting a service doesn't trigger an election.
Currently, an election is triggered only when the leader's supervisor goes down.

Probably, we should trigger a new election:

  1. Every time the service topology changes (for example, when a service is stopped or its configuration changes while the supervisor is still running)
  2. When a health check fails (not sure we should, but it looks interesting)
  3. Periodically (not sure we should, but it looks interesting). Example: a pool of workers whose leader changes on the fly depending on current member capacity
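
To make the three proposals easier to compare, here is a minimal Python sketch of them as an opt-in policy. It is only a toy model: the event names, the `Policy` fields, and `should_trigger_election` are all hypothetical and do not correspond to anything in the Habitat source. The only behavior taken from this issue is that a leader supervisor going down always triggers an election.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Event(Enum):
    # Hypothetical event names for this sketch; not Habitat identifiers.
    LEADER_SUPERVISOR_DOWN = auto()  # the only trigger today, per this issue
    SERVICE_STOPPED = auto()         # proposal 1: topology change
    CONFIG_CHANGED = auto()          # proposal 1: topology change
    HEALTH_CHECK_FAILED = auto()     # proposal 2
    TIMER_TICK = auto()              # proposal 3: periodic re-election


@dataclass(frozen=True)
class Policy:
    # Each proposal is a separate switch so it can be evaluated on its own.
    on_topology_change: bool = False
    on_health_check_failure: bool = False
    periodic: bool = False


def should_trigger_election(event, policy):
    """Return True if this event should start a new election under `policy`."""
    if event is Event.LEADER_SUPERVISOR_DOWN:
        return True  # current behavior, independent of the policy
    if event in (Event.SERVICE_STOPPED, Event.CONFIG_CHANGED):
        return policy.on_topology_change
    if event is Event.HEALTH_CHECK_FAILED:
        return policy.on_health_check_failure
    if event is Event.TIMER_TICK:
        return policy.periodic
    return False
```

With the default `Policy()`, stopping a service does not trigger an election, which matches the behavior described above; `Policy(on_topology_change=True)` models proposal 1.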

My real use case right now: I'm building a PostgreSQL streaming replication service, and I need the master node to always be the master. If it goes down, a new temporary leader will be elected (postgres stays in read-only mode). When the master comes back online, it should become the leader again.
To achieve this, I created a suitability hook that always returns 10 for a leader, 5 for a follower, and 0 when uninitialized.
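
The scoring idea can be sketched as a small Python model. This is not the actual hook (a real suitability hook is a script shipped with the package) and it is not how the supervisor runs elections; the `Role` enum, `suitability`, and `elect` are hypothetical names used only for illustration, and the name-based tie-break is an assumption made to keep the example deterministic.

```python
from enum import Enum


class Role(Enum):
    # Hypothetical roles for this sketch; not Habitat identifiers.
    UNINITIALIZED = "uninitialized"
    FOLLOWER = "follower"
    PRIMARY = "primary"


# Scores from the comment above: 10 for the designated leader,
# 5 for a follower, 0 for a node that is not initialized yet.
SUITABILITY = {Role.PRIMARY: 10, Role.FOLLOWER: 5, Role.UNINITIALIZED: 0}


def suitability(role):
    """Return the score a node with this role would report."""
    return SUITABILITY[role]


def elect(members):
    """Pick the live member with the highest suitability score.

    `members` maps a member name to a (role, alive) tuple. Ties are broken
    by name, an assumption of this sketch rather than Habitat behavior.
    Returns None when no member is alive.
    """
    alive = {name: role for name, (role, up) in members.items() if up}
    if not alive:
        return None
    return max(alive, key=lambda name: (suitability(alive[name]), name))
```

In this model, with the primary down and two followers up, `elect` returns one of the followers; once the primary is marked alive again, a fresh call to `elect` returns the primary. The problem described in this issue is that nothing causes that fresh election to run when the primary's service comes back.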

The picture gets more complicated when you start thinking about when the suitability hook should be triggered: before service start, after start, after the health check passes, or after post-run?

@jsirex jsirex added the C-bug label Feb 19, 2020
@jsirex
Contributor Author

jsirex commented Apr 17, 2020

I've ended up concluding that it is currently impossible to use the leader-follower topology at all.
Here is a classic, simple use case:
I have 3 PostgreSQL servers running in the leader topology. I set them up so that there is a primary and 2 hot standbys with synchronous replication.

Now I want to do a minor upgrade of the services. But stopping the service does not trigger an election, so it is not possible to upgrade without downtime. I also cannot stop the supervisor itself, because it is running other services.

Digging into the source code, I found a "hack": with HAB_FEAT_TRIGGER_ELECTION enabled, an election can be triggered manually. I assume this flag is for debugging/development and should not be considered best practice.

@davidMcneil
Contributor

Relates to #5325. I think it would be useful to have a command to manually trigger a service election (#7787). I also think it would be useful to evaluate under what conditions we should automatically trigger an election. I am going to change the title of this issue to reflect that broader scope.

@davidMcneil davidMcneil changed the title from "Re-Trigger Election when service topology changes" to "Evaluate what conditions should automatically trigger a service election" Jul 9, 2020
@christophermaier christophermaier added the Focus:Supervisor (Related to the Habitat Supervisor (core/hab-sup) component) and Type: Bug (Issues that describe broken functionality) labels and removed A-supervisor labels Jul 24, 2020
@stale

stale bot commented Jul 26, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.

@stale stale bot added the Stale label Jul 26, 2021
@mwrock
Contributor

mwrock commented Jul 26, 2021

not stale

@stale stale bot removed the Stale label Jul 26, 2021
@stale

stale bot commented Jul 31, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.

@stale

stale bot commented Aug 12, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. We value your input and contribution. Please leave a comment if this issue still affects you.


4 participants