Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/grafana8 unified alerting #559

Merged
merged 6 commits into from
Nov 23, 2021

Conversation

apryor6
Copy link
Contributor

@apryor6 apryor6 commented Oct 12, 2021

Description

This implements parameter support to grafana.ini for Unified Alerting in Grafana 8 -> docs here

Relevant issues/tickets

None but was discussed in #grafana-operator slack

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • This change requires a documentation update
  • I have added tests that prove my fix is effective or that my feature works
  • I have added a test case that will be used to verify my changes
  • Verified independently on a cluster by reviewer

Verification steps

A minimal deployment that leverages this would be something like

apiVersion: integreatly.org/v1alpha1
kind: Grafana
metadata:
  name: db
  namespace: grafana
spec:
  baseImage: grafana/grafana:8.2.0
  deployment:
  config:
    alerting:
      enabled: False
    unified_alerting:
      enabled: True
      min_interval: 10s

Note, per the docs, you MUST simultaneously enable unified_alerting and disable alerting

@NissesSenap
Copy link
Collaborator

Is it worth adding some logic to give an error if both unified_alerting and alerting is enabled at the same time?
Could you create a new document or use a existing one explaining/linking to the documents that you describe in the PR.

@apryor6
Copy link
Contributor Author

apryor6 commented Oct 12, 2021

Is it worth adding some logic to give an error if both unified_alerting and alerting is enabled at the same time? Could you create a new document or use a existing one explaining/linking to the documents that you describe in the PR.

I think this is a good idea. I would ask for a bit of guidance as a) I'm very new to golang and have no clue what proper exception throwing/handling practices are and b) am unsure of any conventions or contextual implications of doing so within this system based on where the code executes

Behaviorally, would we want this to error if the user provides mutual-inclusive settings that won't work? Or we could assume that the more opt-in nature of unified_alerting is what they wanted and log a warning that we are unsetting the alerting flag and do that for them.

Conceptually something like

	if i.cfg.Alerting != nil {
		config = i.cfgAlerting(config)
	}

	if i.cfg.UnifiedAlerting != nil {
		config = i.cfgUnifiedAlerting(config)
	}
      
	if (i.cfg.Alerting != nil) && (i.cfg.UnifiedAlerting != nil){
		if (i.cfg.Alerting.Enabled && i.cfg.UnifiedAlerting.Enabled) {
			// Then we have violated a Grafana requirement that only one be set. Defaulting
			// to assuming they want UnifiedAlerting
			i.cfg.Alerting.Enabled = false
		}
	}

I get that the syntax of i.cfg.Alerting.Enabled = false isn't right, consider it pseudocode from a clueless dev :)
Thoughts?

@david-martin david-martin added the v4 Major version 4 label Oct 12, 2021
@NissesSenap
Copy link
Collaborator

My general philosophy when it comes to microservices is to fail loudly, rather then generating logs so I don't think the grafana instance should be created at all if both the values are set.

It possible to solve this in OpenAPIv3 so we don't have to add any logic at all
https://kubernetes.io/blog/2019/06/20/crd-structural-schema/#structural-schema
but sadly it doesn't seem like it's supported kubernetes-sigs/controller-tools#461 so far.

Another solution would be to use a controller level//admission webhook level but I think that would create more complexity then we want right now.

I think to automatically set i.cfg.Alerting.Enabled = false in the operator even
if the user provides it would be a big source of confusion and possible issues. In general it's good to stay away from "magic" even though I understand your point.

I have been looking for a good place where we could put some kind of logic to return a error in a good way but I'm unable to do so from the top of my head.
Here @HubertStefanski or @pb82 will probably have some good ideas.

@apryor6
Copy link
Contributor Author

apryor6 commented Oct 13, 2021

Your points make good sense. Given that the error logic here depends upon the fully built configuration (wanting to avoid any order-of-operations issues), perhaps at the end of config construction a single error check call could be made wrapping these various confirmations. In this case, if both flags are set you return an error. If the check returns no errors (aka they are nil) you proceed - otherwise yell loudly and fatally exit.

@HVBE
Copy link
Collaborator

HVBE commented Oct 28, 2021

I like the idea of a loud failure in the case of the ini, I think we could possibly add some logic that would fatal out in case of a misconfiguration, we already have similar logic when we define certain flags at the same time, so we can definitely do it here as well. My one requirement would be that we give a decent description of the misconfiguration issue and how to fix it in the log message.

@Cellebyte
Copy link
Contributor

Will this one be backported to v3?

@NissesSenap
Copy link
Collaborator

Only critical bug fixes are backported to v3, so no.

@NissesSenap
Copy link
Collaborator

We have had some more internal discussions and in general the operator don't provide logic for users to not make errors. Instead we forward all config to grafana, we will keep on doing this in this case as well.
This might be something that we decide to change in the future but it will be a more general work that needs to be done.

We will release 4.0.2 (a few bug fixes that we want released) first when that is done we will merge this PR after a second review.

@WesleyCharlesBlake
Copy link

is there a way to override the config to have this enabled using a custom.ini config with this operator?

@Cellebyte
Copy link
Contributor

@NissesSenap any update as v4.0.2 is now released?

@NissesSenap
Copy link
Collaborator

I will look through later tonight but if I remember correctly all is fine.
Then we will have a internal talk but I think we should be able to cut a new release rather quickly after this to make it available to you as quickly as possible.

@NissesSenap NissesSenap self-requested a review November 23, 2021 17:22
Copy link
Collaborator

@NissesSenap NissesSenap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM i removed the update of go.mod, and running go mod tidy didn't bring em back and I can't see why your change would need em.
I do enjoy the \n in .gitignore.

Great job and sorry for the delay @apryor6

@NissesSenap NissesSenap merged commit 4e6f4c3 into grafana:master Nov 23, 2021
@apryor6
Copy link
Contributor Author

apryor6 commented Nov 23, 2021

Woo hoo. Thanks very much

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest v4 Major version 4
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants