
[helm loki/promtail] update of DaemonSet takes a long time with a lot of nodes #1878

Closed
stefanandres opened this issue Apr 1, 2020 · 1 comment · Fixed by #1898
Assignees
Labels
type/enhancement Something existing could be improved

Comments

@stefanandres

Is your feature request related to a problem? Please describe.
When deploying a change to the promtail DaemonSet, the configured default is to roll over only one pod at a time.
This may take about one minute per pod. In a cluster with more than a few nodes, this takes an immense amount of time for simple changes.

Current default:

  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate

Describe the solution you'd like
Since a slow rollout does not provide any additional availability, we could allow more pods to be rolled over in parallel.

Additional context

I'd suggest changing the maxUnavailable setting in the updateStrategy.

Now we can also discuss a sane default for the Helm chart:

  1. The option should be configurable.
  2. What default should we use here? If we keep 1, updates are slow, but a misconfiguration would break only one logging Pod rather than all of them. If we use something like 100%, updates are fast, but a misconfigured Pod would break log collection everywhere at once.
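As a sketch of what the configurable option could look like (the key names and template path here are assumptions for illustration, not necessarily what the chart ends up using), values.yaml could expose the whole updateStrategy block and the DaemonSet template could render it verbatim:

```yaml
# values.yaml (hypothetical key, mirroring the Kubernetes updateStrategy field)
updateStrategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 3   # Kubernetes accepts an absolute number or a percentage
---
# templates/daemonset.yaml (sketch): render the user-supplied block as-is
# spec:
#   updateStrategy:
# {{ toYaml .Values.updateStrategy | indent 4 }}
```

Passing the block through toYaml keeps the chart agnostic about which strategy fields the user sets, so future Kubernetes fields need no chart change.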

Let me know what you think and I can whip up a PR ;)

@cyriltovena
Contributor

I think 25% is a sane default. And yes we would love a contribution to improve this.
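With that suggestion, the shipped default would presumably become something like the following (a sketch; Kubernetes allows maxUnavailable on a DaemonSet to be a percentage of the desired pod count rather than an absolute number):

```yaml
updateStrategy:
  rollingUpdate:
    maxUnavailable: 25%
  type: RollingUpdate
```

On a 100-node cluster this would let roughly 25 promtail pods update in parallel, while a misconfigured image or config would still leave about three quarters of nodes shipping logs mid-rollout.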

Thank you :)

@slim-bean slim-bean added the type/enhancement Something existing could be improved label Apr 2, 2020
stefanandres pushed a commit to syseleven/loki that referenced this issue Apr 6, 2020
This commit makes UpdateStrategy of the promtail daemonset configurable.
On large installations, you may want to increase the value of maxUnavailable.

This fixes grafana#1878
cyriltovena pushed a commit that referenced this issue Apr 6, 2020
This commit makes UpdateStrategy of the promtail daemonset configurable.
On large installations, you may want to increase the value of maxUnavailable.

This fixes #1878
torstenwalter pushed a commit to torstenwalter/grafana-helm-charts that referenced this issue Oct 3, 2020
This commit makes UpdateStrategy of the promtail daemonset configurable.
On large installations, you may want to increase the value of maxUnavailable.

This fixes grafana/loki#1878
cyriltovena pushed a commit to cyriltovena/loki that referenced this issue Jun 11, 2021

* querier.sum-shards
* addresses pr comments
* instruments frontend sharding, splitby
* LabelsSeriesID unexported again
* removes unnecessary codec interface in astmapping
* simplifies VectorSquasher as we never use matrices
* combines queryrange series & value files
* removes noops struct embedding strategy in schema, provides noop impls on all schemas instead
* NewSubtreeFolder no longer can return an error as it inlines the jsonCodec
* account for QueryIngestersWithin renaming
* fixes rebase import collision
* fixes rebase conflicts
* marks absent as non parallelizable
* upstream promql compatibility changes
* addresses pr comments
* import collisions
* linting - fixes goimports -local requirement
* fixes merge conflicts
* addresses pr comments
* stylistic changes
* s/downstream/sharded/
* s/sum_shards/parallelise_shardable_queries/
* query-audit docs
* notes sharded parallelizations are only supported by chunk store

Signed-off-by: Owen Diehl <ow.diehl@gmail.com>
mraboosk pushed a commit to mraboosk/loki that referenced this issue Oct 7, 2024
This commit makes UpdateStrategy of the promtail daemonset configurable.
On large installations, you may want to increase the value of maxUnavailable.

This fixes grafana#1878