Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[tempo-distributed] refactor and introduce Enterprise Traces support #1759

Merged
merged 72 commits into from
Oct 27, 2022

Conversation

zalegrala
Copy link
Contributor

@zalegrala zalegrala commented Aug 31, 2022

With this change, I aim to align the helpers used in the tempo-distributed chart with those of the other database teams, namely the mimir-distributed chart for improved consistency and long term maintainability. This is also in prep to implement enterprise-traces into the tempo-distributed chart.

As part of this alignment, the following structural changes have been implemented.

  • import mimir helpers and begin alignment
  • centralize image reference handling
  • centralize name reference handling
  • BREAKING CHANGE centralize selector label handling -- users who wish to keep old values should still be able to use the nameOverride and fullNameOverride top level keys in their values.
  • centralize pod label handling
  • centralize service monitor around metamonitoring
  • BREAKING CHANGE serviceMonitor has been nested under metaMonitoring -- metamonitoring can be used scrape services as well as install the operator with the following values. Note also that the port names have changed from http to http-metrics.
metaMonitoring:
  serviceMonitor:
    enabled: true
  grafanaAgent:
    enabled: true
    installOperator: true
  • minio can now be enabled as part of this chart using the following values
minio:
  enabled: true

As part of the above changes, many helpers now expect that a context is passed. This pattern is meant to be used in the following way.

First declare the dictionay for the given component, and optionally set memberlist to true based on the component needs.

{{ $dict := dict "ctx" . "component" "compactor" "memberlist" true }}

Then you can pass the $dict when rendering a named template.

name: {{ template "tempo.resourceName" $dict }}

This allows us to use one template and centralize the logic passing in the
variables that change.

Additionally, the configuration can now be stored in a secret. See the documentation for useExternalConfig and configStorageType in the values file for more details.

@zalegrala zalegrala force-pushed the tempoDistEnterpriseTracesRebase branch 6 times, most recently from ef61691 to bea0e1b Compare August 31, 2022 22:58
@zalegrala zalegrala marked this pull request as ready for review August 31, 2022 23:13
podAnnotations: {}

nodeSelector: {}
affinity: {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original services in the same file use a string here and template it with tpl, this is a bit inconsistent now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made an additional breaking change to align. How's that look?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some discussion in slack, I expect to roll this back the other direction to template strings for the affinity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been rolled back. Now we use a template for all affinity.

charts/tempo-distributed/values.yaml Outdated Show resolved Hide resolved
# -- Config file contents for Tempo distributed. Passed through the `tpl` function to allow templating

# @default -- See values.yaml
config: |
multitenancy_enabled: {{ .Values.multitenancyEnabled }}
search_enabled: {{ .Values.search.enabled }}
metrics_generator_enabled: {{ .Values.metricsGenerator.enabled }}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: this is subtle stuff, but in mimir-distributed I reordered the yaml in config template alphabetically because the processing we do will output the result in an ordered way.

Copy link
Contributor Author

@zalegrala zalegrala Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah very interesting.

charts/tempo-distributed/templates/_helpers.tpl Outdated Show resolved Hide resolved
charts/tempo-distributed/templates/_helpers.tpl Outdated Show resolved Hide resolved
tempo-query.yaml: {{ tpl .Values.queryFrontend.query.config . | b64enc }}
overrides.yaml: {{ include "tempo.calculatedOverridesConfig" . | b64enc }}
tempo.yaml: {{ include "tempo.calculatedConfig" . | b64enc }}
{{- else }}
data:
tempo-query.yaml: |
{{- tpl .Values.queryFrontend.query.config . | nindent 4 }}
overrides.yaml: |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit weird. Don't you reload overrides.yaml automatically in tempo? If it's automatically reloaded from a file, then you shouldn't have it next to the static config, because any change to the overrides will cause the config checksum to change and restart the services for no reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe we reload on config change, but I could be wrong.

Aside from Tempo, are you saying a configmap updates the running pod without restart? I'd no idea if that's true. Interesting. I'd been thinking that the annotations were in place specifically to restart the pod on change, but now I wonder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marty confirmed that we will reload the overrides config. I'll come back to tidy that up. Thanks for the pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could follow up in another PR though, what do you think @krajorama ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds ok to me as long as it's not forgotten

@@ -1,10 +1,11 @@
{{- if .Values.gateway.enabled }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this be

Suggested change
{{- if .Values.gateway.enabled }}
{{- if and .Values.gateway.enabled (not .Values.enterprise.enabled) }}

Because now it seems like you run both gateways when enterprise is enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't suppose a user would want to run both, but since both gateways default to false, I thought just switching in the specific component for each one reduced the logic. Users who want to run only one gateway will need to enable only one gateway in the config.

Alternatively, this could be the kind of thing we put in a validation template, but is not implemented in this pr.

@zalegrala zalegrala force-pushed the tempoDistEnterpriseTracesRebase branch 2 times, most recently from 523d52e to a31685b Compare September 2, 2022 15:01
@zalegrala zalegrala force-pushed the tempoDistEnterpriseTracesRebase branch 2 times, most recently from 2ae0f4f to f8c6d31 Compare September 19, 2022 19:26
Copy link
Contributor

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Just a few small nits, and I'll leave the ✔️ for @krajorama as I see he requested some changes.

{{/*
Return the appropriate apiVersion for ingress.
*/}}
{{- define "tempo.ingress.apiVersion" -}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI We've in Mimir recently made the breaking change to only support kube >= 1.20. Since we are not actually testing on older versions - so we could get rid of this check altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds good to me. Perhaps a follow up breaking change to upgrade the required version for the chart as well as drop the api version comparisons.


# Settings for the gateway service providing authentication and authorization via the admin_api.
# Can only be enabled if enterprise.enabled is true - requires license.
enterpriseGateway:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recently we got a feature request from a customer to make migrating from Mimir OSS to Enterprise Metrics easier, so now we're working on merging our "nginx" and "gateway" configuration under the "gateway" section.

So in case you want to add "nginx" for OSS users later, the "enterpriseGateway" naming will bite you. I'd consider calling it just "gateway".

See for ref: grafana/mimir#3181

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think we'll need to follow up here. We've currently got a gateway: section for the nginx gateway configuration. The linked PR here is quite a large change. Perhaps we can follow up.


{{/*
Calculate values.yaml section name from component name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Mimir we ended up making this include validation, a map and returning the full section instead of the name. https://github.com/grafana/mimir/blob/8962290b64ebcc4c9ba91a8ff1455f7a600c3a29/operations/helm/charts/mimir-distributed/templates/_helpers.tpl#L378

zalegrala and others added 22 commits October 25, 2022 17:00
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Co-authored-by: George Krajcsovits <krajorama@users.noreply.github.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
This reverts commit 22ed554.

Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
…etrics-servmon.yaml

Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
…isor-servmon.yaml

Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
…e-usernames-secret.yaml

Co-authored-by: Trevor Whitney <trevorjwhitney@gmail.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
Signed-off-by: Zach Leslie <zach.leslie@grafana.com>
@zalegrala zalegrala force-pushed the tempoDistEnterpriseTracesRebase branch from 12bc96c to 8a3443b Compare October 25, 2022 17:03
@zalegrala zalegrala merged commit 6ec320b into grafana:main Oct 27, 2022
@cajual
Copy link

cajual commented Oct 27, 2022

3534217#r88097126

@joshuasimon-taulia
Copy link
Contributor

should minio really be enabled by default?

@zalegrala
Copy link
Contributor Author

Fair question. After spending a bit more time with it, I think no. Would you like to put up a PR for it @joshuasimon-taulia ?

@zalegrala zalegrala deleted the tempoDistEnterpriseTracesRebase branch February 28, 2023 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants