Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

alerts config basic implementation #15

Merged
merged 4 commits into from
Aug 19, 2021
Merged

Conversation

mnkg561
Copy link
Collaborator

@mnkg561 mnkg561 commented Aug 17, 2021

Signed-off-by: nmogulla Naveen_Mogulla@intuit.com

close

Could you share the solution in high level?

  1. This is just a basic implementation and there is still lot to be done
  2. I've created pending tasks as issues in the repo
  3. Will be refactoring lot of code in the next issue to use the same function in both the controllers instead of duplicate code.

Could you share the test results?

  1. Single Alert Config creating multiple alerts
mtvl15367e28a:~ nmogulla$ k get alertsconfig -A -o yaml
apiVersion: v1
items:
- apiVersion: alertmanager.keikoproj.io/v1alpha1
  kind: AlertsConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"alertmanager.keikoproj.io/v1alpha1","kind":"AlertsConfig","metadata":{"annotations":{},"name":"alertsconfig-sample","namespace":"alert-manager-system"},"spec":{"alerts":[{"alertName":"wavefrontalert-sample2","params":{"bar":"warn","foo":"status.health"}},{"alertName":"wavefrontalert-sample3","params":{"bar":"severe","foo":"status.health","zzz":"status.health"}}],"globalGVK":{"group":"alertmanager.keikoproj.io","kind":"WavefrontAlert","version":"v1alpha1"}}}
    creationTimestamp: "2021-08-16T17:23:32Z"
    finalizers:
    - alertsconfig.finalizers.alertmanager.keikoproj.io
    generation: 4
    name: alertsconfig-sample
    namespace: alert-manager-system
    resourceVersion: "3781104"
    uid: 56a2fc50-e2ae-44db-89c5-a5fb412c714f
  spec:
    alerts:
    - alertName: wavefrontalert-sample2
      params:
        bar: warn
        foo: status.health
    - alertName: wavefrontalert-sample3
      params:
        bar: severe
        foo: status.health
        zzz: status.health
    globalGVK:
      group: alertmanager.keikoproj.io
      kind: WavefrontAlert
      version: v1alpha1
  status:
    alertStatus:
      wavefrontalert-sample2:
        alertName: test-alert3
        associatedAlert:
          CR: wavefrontalert-sample2
        id: "1629134612607"
        lastChangeChecksum: 8732d013103b6ac26805d6db1376c8de
        link: https://try.wavefront.com/alerts/1629134612607
        state: Ready
      wavefrontalert-sample3:
        alertName: test-alert4
        associatedAlert:
          CR: wavefrontalert-sample3
        id: "1629220616623"
        lastChangeChecksum: 19a7690ea0baf9b0e9a6af54b5897265
        link: https://try.wavefront.com/alerts/1629220616623
        state: Ready
    alertsCount: 0
    retryCount: 0
    state: Ready
  1. Two different alert configs for same wavefront alert
mtvl15367e28a:~ nmogulla$ k get alertsconfig -A -o yaml
apiVersion: v1
items:
- apiVersion: alertmanager.keikoproj.io/v1alpha1
  kind: AlertsConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"alertmanager.keikoproj.io/v1alpha1","kind":"AlertsConfig","metadata":{"annotations":{},"name":"alertsconfig-sample","namespace":"alert-manager-system"},"spec":{"alerts":[{"alertName":"wavefrontalert-sample2","params":{"bar":"warn","foo":"status.health"}}],"globalGVK":{"group":"alertmanager.keikoproj.io","kind":"WavefrontAlert","version":"v1alpha1"}}}
    creationTimestamp: "2021-08-16T17:23:32Z"
    finalizers:
    - alertsconfig.finalizers.alertmanager.keikoproj.io
    generation: 3
    name: alertsconfig-sample
    namespace: alert-manager-system
    resourceVersion: "3713767"
    uid: 56a2fc50-e2ae-44db-89c5-a5fb412c714f
  spec:
    alerts:
    - alertName: wavefrontalert-sample2
      params:
        bar: warn
        foo: status.health
    globalGVK:
      group: alertmanager.keikoproj.io
      kind: WavefrontAlert
      version: v1alpha1
  status:
    alertStatus:
      wavefrontalert-sample2:
        alertName: test-alert3
        associatedAlert:
          CR: wavefrontalert-sample2
        id: "1629134612607"
        lastChangeChecksum: 8732d013103b6ac26805d6db1376c8de
        link: https://try.wavefront.com/alerts/1629134612607
        state: Ready
    alertsCount: 0
    retryCount: 0
    state: Ready
- apiVersion: alertmanager.keikoproj.io/v1alpha1
  kind: AlertsConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"alertmanager.keikoproj.io/v1alpha1","kind":"AlertsConfig","metadata":{"annotations":{},"name":"alertsconfig-sample2","namespace":"alert-manager-system"},"spec":{"alerts":[{"alertName":"wavefrontalert-sample2","params":{"bar":"severe","foo":"status.health"}}],"globalGVK":{"group":"alertmanager.keikoproj.io","kind":"WavefrontAlert","version":"v1alpha1"}}}
    creationTimestamp: "2021-08-17T17:11:56Z"
    finalizers:
    - alertsconfig.finalizers.alertmanager.keikoproj.io
    generation: 2
    name: alertsconfig-sample2
    namespace: alert-manager-system
    resourceVersion: "3780589"
    uid: 75ba9517-6447-4d35-a64a-fac5a643de37
  spec:
    alerts:
    - alertName: wavefrontalert-sample2
      gvk: {}
      params:
        bar: severe
        foo: status.health
    globalGVK:
      group: alertmanager.keikoproj.io
      kind: WavefrontAlert
      version: v1alpha1
  status:
    alertStatus:
      wavefrontalert-sample2:
        alertName: test-alert3
        associatedAlert:
          CR: wavefrontalert-sample2
        id: "1629220316625"
        lastChangeChecksum: a97f34f9b4566e0cdf703ea304940824
        link: https://try.wavefront.com/alerts/1629220316625
        state: Ready
    alertsCount: 0
    retryCount: 0
    state: Ready
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

mnkg561 and others added 2 commits August 17, 2021 16:25
Copy link
Collaborator

@lilwan lilwan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one question regarding creating alert, if a single alert was created through wavefrontalert controller and not associated with any AlertConfig, what does it mean for our cluster alerts use case? I know we need that logic for namespace alerts use case or other standalone alert use case

api/v1alpha1/alertsconfig_types.go Show resolved Hide resolved
api/v1alpha1/alertsconfig_types.go Show resolved Hide resolved
for _, config := range alertsConfig.Spec.Alerts {
// Calculate checksum and compare it with the status checksum
exist, reqChecksum := utils.CalculateAlertConfigChecksum(ctx, config)
// if request and status checksum matches then there is change in this specific alert config
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment should be there is no change based on your logic?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. Let me change it

controllers/alertsconfig_controller.go Show resolved Hide resolved
if err := wavefront.ConvertAlertCRToWavefrontRequest(ctx, wfAlert.Spec, &alert); err != nil {
errMsg := "unable to convert the wavefront spec to Alert API request. will not be retried"
log.Error(err, errMsg)
return r.PatchIndividualAlertsConfigError(ctx, &alertsConfig, config.AlertName, err)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did you distinguish retry or not retry here? I do not see any difference between line 162 and 154

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this time, we are retrying for every error case- we can probably improved that during testing to see if REQUEUE is not the best option for specific Usecase and update it accordingly

controllers/alertsconfig_controller.go Show resolved Hide resolved
controllers/alertsconfig_controller.go Show resolved Hide resolved
state := alertmanagerv1alpha1.Error
if strings.Contains(err.Error(), "Exceeded limit setting") {
// For ex: error is "Exceeded limit setting: 100 alerts allowed per customer"
state = alertmanagerv1alpha1.Error
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

controllers/alertsconfig_controller.go Show resolved Hide resolved
controllers/alertsconfig_controller.go Show resolved Hide resolved
Signed-off-by: nmogulla <Naveen_Mogulla@intuit.com>
Signed-off-by: nmogulla <Naveen_Mogulla@intuit.com>
@mnkg561 mnkg561 merged commit ab69dc1 into keikoproj:master Aug 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants