[elasticsearch] configure the default replicas to 0 #2871

Merged
ryancragun merged 4 commits into master from ryan/es-default-replicas on Mar 3, 2020

Conversation

@ryancragun
Member

ryancragun commented Feb 14, 2020

  • Make the refresh_interval and number_of_replicas configurable
  • Change the number_of_replicas from 1 to 0 for single node installs
    because it's impossible to replicate a shard with a single node

Signed-off-by: Ryan Cragun ryan@chef.io
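
For readers who want to see what these two settings look like on the wire, here is a minimal sketch that builds the settings body and sends it through Elasticsearch's update index settings API. The function names, the `_all` target, and the example URL are assumptions made for illustration; this is not necessarily how the PR itself applies the configured values.

```bash
# Hypothetical helpers, not taken from the PR.

# Build the JSON body for the update index settings API.
#   $1: number_of_replicas (0 is the sensible value on a single node)
#   $2: refresh_interval   (for example "1s")
settings_payload() {
  printf '{"index":{"number_of_replicas":%d,"refresh_interval":"%s"}}' "$1" "$2"
}

# Apply the settings to all existing indices.
#   $1: Elasticsearch base URL (assumed, e.g. http://localhost:9200)
#   $2: number_of_replicas
#   $3: refresh_interval
apply_index_settings() {
  curl -sS --fail -X PUT \
    -H 'Content-Type: application/json' \
    -d "$(settings_payload "$2" "$3")" \
    "$1/_all/_settings"
}

# Example (not run here):
#   apply_index_settings "http://localhost:9200" 0 "1s"
```

With one node and `number_of_replicas` set to 1, the replica shards can never be assigned, so the cluster reports yellow health indefinitely; a value of 0 lets a single-node install reach green.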

@ryancragun ryancragun self-assigned this Feb 14, 2020
Signed-off-by: Ryan Cragun <ryan@chef.io>

source {{pkg.svc_config_path}}/health_check

if [[ $(wait_until_healthy 300) -ne 0 ]]; then

@jaym

jaym Feb 18, 2020

Contributor

I think past experience has shown that blocking in hooks is not good. habitat-sh/habitat#6705 indicates that it will retry if you exit with non-zero. The HA services also did something similar to this but ended up ditching it because it caused some problems.

@ryancragun

ryancragun Feb 24, 2020

Author Member

For posterity: previous versions of hab ran the hooks in the main supervisor thread so any long operation in them would block the supervisor. Now hooks are run async so it's okay to block.
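
To make the blocking behaviour concrete, here is a rough sketch of a polling helper with a hard timeout. The real `wait_until_healthy` lives in the `health_check` file sourced by the hook and is not shown in this thread, so the signature below (timeout in seconds, followed by the check command to run) is an assumption. Note also that this version reports success through its exit status rather than through its output.

```bash
# Hypothetical stand-in for the helper sourced from health_check.
# Usage: wait_until_healthy <timeout_seconds> <check command...>
# Returns 0 as soon as the check command succeeds, 1 once the timeout passes.
wait_until_healthy() {
  local timeout="$1"
  shift
  local deadline=$(( $(date +%s) + timeout ))

  while [ "$(date +%s)" -lt "$deadline" ]; do
    if "$@"; then
      return 0
    fi
    # Poll once per second until the deadline.
    sleep 1
  done
  return 1
}

# Example (not run here; the URL is an assumption):
#   wait_until_healthy 300 curl -sf "http://localhost:9200/_cluster/health"
```

Because the hook now runs asynchronously, a loop like this only ties up the hook process, not the supervisor.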

@@ -0,0 +1,41 @@
#!{{pkgPathFor "core/bash"}}/bin/bash

@jaym

jaym Feb 18, 2020

Contributor

Do we need to set -x?

@stevendanna

stevendanna Feb 18, 2020

Member

Do you mean set -e?

@jaym

jaym Feb 18, 2020

Contributor

Yes, that, the error one.

@ryancragun

ryancragun Feb 24, 2020

Author Member

I'm intentionally not failing at all in the post-run hook because it's a best effort attempt. Since there is no backoff of retries and hab doesn't currently reap hooks when a service is unloaded, that seems like the safest approach.
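
As an illustration of the best-effort approach, the sketch below runs a step, logs a failure to stderr, and always returns 0, so the hook never exits non-zero and never triggers a retry. The `best_effort` name is made up for this example and is not part of the PR.

```bash
# Hypothetical wrapper: run a command, note any failure, never propagate it.
best_effort() {
  if ! "$@"; then
    echo "post-run: '$*' failed, continuing anyway" >&2
  fi
  return 0
}

# Example (not run here; apply_settings is a placeholder):
#   best_effort apply_settings
#   exit 0
```

This is also why `set -e` is left out: with it, the first failing command would end the hook with a non-zero status.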

Signed-off-by: Ryan Cragun <ryan@chef.io>

# We give it a default timeout of 600 seconds to ensure that the post-run
# hook will eventually time out. We do this because Habitat does not currently
# reap hooks if the service is unloaded. We don't want to end up in a place

@jaym

jaym Feb 27, 2020

Contributor

I think weird stuff is going to happen here if ES is in a failure loop: these post-run hooks will likely run longer than it takes ES to start again and fire off another post-run hook. It'll probably be fine if it gets fixed quickly, but if it goes on for a while, we're going to have a lot of health checks running against ES every second.

@jaym

jaym Feb 27, 2020

Contributor

A better strategy might be to check the health, sleep for a few seconds if it fails, then return 1. That should get around 7494
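
A minimal sketch of that strategy, assuming the supervisor re-runs the hook whenever it exits non-zero: check once, and on failure sleep briefly before returning 1, so the sleep acts as the backoff between retries. The function name and arguments are hypothetical.

```bash
# Hypothetical helper: one health check per hook run, with a pause on failure.
# Usage: check_then_backoff <backoff_seconds> <check command...>
check_then_backoff() {
  local backoff="$1"
  shift

  if "$@"; then
    return 0
  fi

  # Pause before reporting failure so retries are spaced out.
  sleep "$backoff"
  return 1
}

# Example (not run here; the URL is an assumption):
#   check_then_backoff 5 curl -sf "http://localhost:9200/_cluster/health" || exit 1
```

Each hook run then lives for at most one check plus the backoff, which keeps long-running hooks from piling up if ES is stuck in a restart loop.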

@jaym
jaym approved these changes Feb 27, 2020
@yzl
yzl approved these changes Feb 27, 2020
Signed-off-by: Ryan Cragun <ryan@chef.io>
@ryancragun ryancragun force-pushed the ryan/es-default-replicas branch from cddc169 to 0df2b57 Mar 3, 2020
@ryancragun ryancragun merged commit 8616e4f into master Mar 3, 2020
4 checks passed
  • DCO: This commit has a DCO Signed-off-by
  • buildkite/chef-automate-master-verify: Build #9644 passed (10 minutes, 45 seconds)
  • buildkite/chef-automate-master-verify-private: Build #9946 passed (1 hour, 8 minutes, 56 seconds)
  • expeditor/config-validation: Validated your Expeditor config file
@chef-expeditor chef-expeditor bot deleted the ryan/es-default-replicas branch Mar 3, 2020