This repository has been archived by the owner on May 16, 2023. It is now read-only.

Rewrite elasticsearch multi example to use child charts. #22

Closed

Conversation

jordansissel
Contributor

As a newbie to helm, I found child charts interesting. This change makes the multi example use a single release with child charts instead of two separate helm releases.
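For readers new to the pattern, here is a rough sketch of what the parent chart's dependency list could look like (a requirements.yaml under helm 2); the alias names, version, and repository path are assumptions for illustration, not the exact contents of this PR:

# requirements.yaml (sketch): one elasticsearch child chart per node role,
# distinguished by an alias so each role gets its own values block.
dependencies:
  - name: elasticsearch
    version: "1.0.0"                          # assumed version
    repository: "file://../../../elasticsearch"
    alias: master
  - name: elasticsearch
    version: "1.0.0"
    repository: "file://../../../elasticsearch"
    alias: data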

In practice, I don't know if this is the best route, because we'll want to upgrade masters before upgrading data nodes, for example, so this PR is mostly food-for-thought.

Thoughts? :)

@jordansissel
Contributor Author

jordansissel commented Dec 29, 2018

we'll want to upgrade masters before upgrading data nodes

I suppose in this example, we could set imageTag separately for each child chart in the values.yaml and do an upgrade that way:

  • set imageTag on master to new version
  • make install to upgrade just the masters
  • set imageTag on data role to new version
  • make install to upgrade just the data nodes

It'd be identical to the upgrade process for the current code (in master), which has two files (data + master) and separate helm releases, probably?
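For example, the values.yaml mid-upgrade might look like this (a sketch; the alias names and version numbers are assumed):

master:
  imageTag: "6.5.4"   # already bumped; masters roll first
data:
  imageTag: "6.5.1"   # bumped in a later step, once the masters are healthy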

Edit: I just tried this, and it works as expected (set imageTag: ... on the master, then make install results in a rolling upgrade of the masters)

Contributor

@Crazybus left a comment


LGTM apart from needing a rebase to fix the integration tests and pull in the latest changes and version from master. The tests are currently failing because Google dropped support for GKE 1.9 clusters, so the build fails when trying to create that cluster.

Apart from the needed rebase, I love everything about this. This is a much nicer way to handle clusters with multiple node groups in a "helm native" way. Thanks a lot for trying this out; it has been on the todo list for too long, and I'm glad to see it is really simple and easy to set up!

This approach has made me realise something else that could be changed to improve managing a full stack with helm charts using this method. In this example you are setting masterService: "multi-master". If you also wanted to add Kibana to this setup you would now also need to add elasticsearchURL: "http://multi-master:9200" which would be a bit silly.
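To illustrate the duplication, a hypothetical combined values.yaml would end up restating the same service name under two different keys:

# hypothetical sketch of the duplication
es-master:
  masterService: "multi-master"
kibana:
  elasticsearchURL: "http://multi-master:9200"   # same service, spelled out again by hand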

So I think this means we should standardise on the connection variables for Elasticsearch so that they can be reused for Elasticsearch, Kibana, Logstash, Beats + whatever else in the future. I'll create a separate issue for it, to keep this PR clean.

elasticsearch/examples/multi/Chart.yaml (resolved)
@jordansissel
Contributor Author

If you also wanted to add Kibana to this setup

Indeed! I'll share my stack chart, which includes Kibana, with you after New Year's.

@jordansissel force-pushed the example/multi-use-child-charts branch 2 times, most recently from e92b7e2 to 03c5d6f on December 31, 2018 22:43
@jordansissel
Contributor Author

Rebased!

@Crazybus
Contributor

Crazybus commented Jan 4, 2019

CI tests are failing because helm isn't initialised.

22:54:45 make[1]: Entering directory `/app/elasticsearch/examples/multi'
22:54:45 helm dep update
22:54:45 Error: Couldn't load repositories file (/app/.helm/repository/repositories.yaml).
22:54:45 You might need to run `helm init` (or `helm init --client-only` if tiller is already installed)

I think adding `helm init --client-only` to the testing Dockerfile would be the cleanest way to fix it. That Dockerfile is defined in helpers/terraform/Dockerfile.

@rendhalver

Doesn't this go against the original design of the Chart?
One of the original intents for having multiple releases was to be able to swap out the data nodes without needing to delete the data. If we have one release for a cluster, we could have the same problem.
https://github.com/elastic/helm-charts/tree/master/elasticsearch#usage-notes-and-getting-started

@Crazybus
Contributor

Crazybus commented Jan 8, 2019

Doesn't this go against the original design of the Chart?
One of the original intents for having multiple releases was to be able to swap out the data nodes without needing to delete the data. If we have one release for a cluster, we could have the same problem.

This still functions identically to the previous example. Instead of adding a new release and a new upgrade command, you can add a new node block.

es-data2:
  <<: *defaults # include with the default settings
  nodeGroup: "data2"

  roles:
    master: "false"
    ingest: "true"
    data: "true"

To demote the old nodes, you would then bump the replicas down one at a time, or manually move the shards off them and then remove the node block.
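For example, stepping a group down might look like this in the values.yaml (a sketch; the replica counts are assumed):

es-data:                # old group being phased out
  <<: *defaults         # same anchor as the other node blocks
  nodeGroup: "data"
  replicas: 2           # was 3; lower this one node at a time, then delete the block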

While this is functionally the same, the terminology in the readme indeed no longer matches. I think we can replace the term "release" with "install". The important piece is that the chart does not attempt to handle multiple groups of nodes itself.

Upgrading the master group first would involve bumping the imageTag for only the masters, and it still works in the same way as it would have before.

The motivation for doing it this way was to remove code duplication when specifying configuration that is shared between multiple node groups. This could also be solved by including multiple values files.

One advantage of using this method is that it gives you a single version for your cluster. If you want to roll back the previous change there is only one version to roll back, rather than having to manually figure out the order in which you made the changes to the separate releases.

@rendhalver If you do find this example confusing, though, maybe it is a sign that we shouldn't merge this. Or maybe we should instead have it as a separate example of how to manage multiple node groups in a single release.

@rendhalver

It's not confusing; I understand what you are trying to do.

I do admit I didn't read what the code does before I put in my comment.

I think having it as a separate example is a better idea.

If we start enforcing opinions then we are at risk of breaking the un-opinionated nature of the Chart.

@rendhalver

On a side note, I copied this Chart into our local repo, we have used it to deploy 4 new clusters, and it works really well.
I vote for adding this as a new top level chart so people can make use of it.

@jordansissel
Contributor Author

@rendhalver I have some other ideas to make this chart style even more useful (so that settings can be inherited across parent->child chart values). Hopefully I will have a PR demoing it this week. After that, we can discuss any future support for this style.

And thank you for testing! :)

@rendhalver

That sounds pretty cool!
Tag me on your new PR. :)

@jordansissel
Contributor Author

@Crazybus I think this is ready for another round of reviews. The CI test failure seems unrelated (a timeout waiting for the GKE cluster to come online).

jordansissel added a commit that referenced this pull request Feb 8, 2019
The parent/child chart relationship is roughly:
inheritance -> secure-elasticsearch -> elasticsearch

The objective is to demonstrate a parent chart (secure-elasticsearch)
which sets broad business-specific defaults for Elasticsearch, and this
chart is then used to deploy master and data nodes, inheriting those
settings.

This example covers a few areas:

1) Having a child chart (secure-elasticsearch) add some baseline
   settings for the cluster.
2) A parent chart to inherit and amend the settings of the child
   secure-elasticsearch chart.
3) Using chart aliases to deploy master and data nodes independently.
   Same result as multi in PR #22, but instead of using yaml anchors,
   this uses helm child charts.
4) Add a job which deploys the license payload.
5) Add a job for deploying cluster and index settings. This can be
   useful for setting cluster-wide things like concurrent_recoveries, or
   for index-level settings such as node_left recovery delay. These
   settings live alongside the node settings, for easier reading.
6) (From previous commit) Elasticsearch node settings
   (elasticsearch.yml) are now specified by a structured value instead of
   a text block.

The `Makefile` does a few things to help make this testable outside of
Elastic's infrastructure. Mainly, all secrets and key material are
generated by `make secrets`, rather than with the `vault read` method
that other examples use.

Note: The `secure-elasticsearch` chart is in a parent directory because,
for some reason, helm doesn't work if it is placed in
./charts/secure-elasticsearch, due to some behavior in helm itself:
helm/helm#2887
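A rough sketch of the alias layout that commit describes; the chart names come from the commit message, while the version and exact repository paths are assumptions:

# requirements.yaml of the top-level inheritance chart (sketch)
dependencies:
  - name: secure-elasticsearch
    version: "0.1.0"                           # assumed version
    repository: "file://../secure-elasticsearch"
    alias: master
  - name: secure-elasticsearch
    version: "0.1.0"
    repository: "file://../secure-elasticsearch"
    alias: data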
@desaintmartin
Contributor

desaintmartin commented Aug 27, 2019

Any update? Do you need help on this one? I can help by bringing a multi example with several node groups (hot/warm/cold).

# base settings for both node types (data + master)
es-defaults: &defaults
  clusterName: "multi"
  masterService: "multi-master"
Contributor

@desaintmartin commented Aug 27, 2019


I think we don't need this, as the masterService is computed from the cluster name, not the group name.

ingest: "false"
data: "false"

# To define common settings, it may be useful to use YAML anchors to use the same
Contributor


In helm 2.14.3, the anchor has to be defined BEFORE it is used, so moving this part to the top of the file works.
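In other words, this ordering works (a minimal sketch):

# anchor defined first...
es-defaults: &defaults
  clusterName: "multi"

# ...and only referenced afterwards
es-data:
  <<: *defaults
  nodeGroup: "data"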

@Crazybus
Contributor

Any update? Do you need help on this one? I can help by bringing a multi example with several node groups (hot/warm/cold).

Please do! (In a separate PR.)

I think having it as a separate example is a better idea.

I agree with @rendhalver and think that we should add this as a separate example, alongside the existing multi one that is being used for testing.

@jmlrt added the enhancement label Oct 17, 2019
@botelastic

botelastic bot commented Jan 7, 2020

This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. To track this PR (even if closed), please open a corresponding issue if one does not already exist.

@desaintmartin
Contributor

desaintmartin commented Jan 20, 2020

Any update? Do you need help on this one? I can help by bringing a multi example with several node groups (hot/warm/cold).

Please do! (In a separate PR.)

5 months later, PR at #452

botelastic bot removed the triage/stale label Jan 20, 2020
@botelastic

botelastic bot commented Apr 19, 2020

This PR has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. To track this PR (even if closed), please open a corresponding issue if one does not already exist.

@botelastic

botelastic bot commented May 19, 2020

This PR has been automatically closed because it has not had recent activity since being marked as stale. Please reopen when work resumes.

botelastic bot closed this May 19, 2020
Labels: enhancement, triage/stale

5 participants