'notifier.url' always set even when 'notifier.blackhole' is set to true #894

Closed
hconnan opened this issue Feb 26, 2024 · 7 comments
Labels: bug

hconnan commented Feb 26, 2024

Hello,

The `notifier.url` parameter is no longer required by default in VictoriaMetrics.

To disable notifications, the `notifier.blackhole` parameter must be set to true. Since VictoriaMetrics 1.96, when this parameter is set, the `notifier.url` parameter cannot be set at the same time (source).
However, in the server-deployment template of the victoria-metrics-alert chart, `notifier.url` is always set regardless of the extra arguments.

How to reproduce it? Deploy vmalert with the `notifier.blackhole` extra argument set to true.

Fix suggestion: add a condition so that the `notifier.url` parameter is only set when neither the `-notifier.blackhole` nor the `-notifier.config` extra argument is present.
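A minimal sketch of such a guard, assuming a Helm deployment template where extra flags come from `.Values.server.extraArgs` and the notifier URL from `.Values.server.notifier.alertmanager.url` (both value paths are illustrative assumptions, not the chart's confirmed structure):

```yaml
# Hypothetical excerpt from the server deployment template:
# only emit -notifier.url when the user did not request blackhole mode or a notifier config file.
{{- $extraArgs := default dict .Values.server.extraArgs }}
{{- if not (or (hasKey $extraArgs "notifier.blackhole") (hasKey $extraArgs "notifier.config")) }}
- -notifier.url={{ .Values.server.notifier.alertmanager.url }}
{{- end }}
```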

zekker6 (Contributor) commented Feb 28, 2024

@hconnan Chart release v0.9.2 allows using -notifier.blackhole correctly. Could you check if this release helps in your case?
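For reference, a hedged sketch of how the flag would be passed via that chart's values (the `server.extraArgs` path is an assumption about the victoria-metrics-alert chart's layout, not confirmed here):

```yaml
# Hypothetical values fragment for the victoria-metrics-alert chart:
# run vmalert in blackhole mode without configuring any notifier URL.
server:
  extraArgs:
    notifier.blackhole: "true"
```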

hconnan (Author) commented Mar 5, 2024

Hey! Thanks for the quick response!
Could you please release a new version of [victoria-metrics-k8s-stack](https://github.com/VictoriaMetrics/helm-charts/tree/master/charts/victoria-metrics-k8s-stack)? I need an updated version of it in order to check whether it helps in my case.

EDIT
OK, I just saw that a new version of victoria-metrics-k8s-stack has been released. Let me check.

hconnan (Author) commented Mar 7, 2024

It does not work. I got this error: `failed to init: failed to init notifier: only one of -notifier.blackhole, -notifier.url and -notifier.config flags must be specified`
I saw you added a fix, but it seems that, somewhere, `vmalert.alertmanager.urls` is set and it is empty by default. With your fix, I still get the `notifier.url=` parameter. I am not sure the existing condition is enough.
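To illustrate the conflict (an assumed rendering, not actual chart output), the vmalert container ends up with both flags, which is exactly the combination the error message rejects; rendering the chart locally with `helm template` and inspecting the generated args is a quick way to confirm which flags end up set:

```yaml
# Hypothetical args rendered for the vmalert container:
args:
  - -notifier.blackhole=true   # requested via extraArgs
  - -notifier.url=             # empty default still counts as "specified" and triggers the error
```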

zekker6 (Contributor) commented Mar 7, 2024

@hconnan Could you please share the values file that reproduces this error for you (with any sensitive information removed)?

hconnan (Author) commented Mar 7, 2024

Sure.

```yaml
alertmanager:
  enabled: false
coreDns:
  enabled: false
defaultDashboardsEnabled: false
defaultRules:
  create: false
  rules:
    alertmanager: false
    etcd: false
    general: false
    k8s: false
    kubeApiserver: false
    kubeApiserverAvailability: false
    kubeApiserverBurnrate: false
    kubeApiserverHistogram: false
    kubeApiserverSlos: false
    kubePrometheusGeneral: false
    kubePrometheusNodeRecording: false
    kubeScheduler: false
    kubeStateMetrics: false
    kubelet: false
    kubernetesApps: false
    kubernetesResources: false
    kubernetesStorage: false
    kubernetesSystem: false
    network: false
    node: false
    vmagent: false
    vmcluster: false
    vmhealth: false
    vmsingle: false
fullnameOverride: vm-cluster
grafana:
  enabled: false
kube-state-metrics:
  enabled: true
  nameOverride: kube-state-metrics-staging
  namespaces: monitoring-staging
  rbac:
    useClusterRole: true
  replicas: 1
kubeApiServer:
  enabled: false
kubeControllerManager:
  enabled: false
kubeEtcd:
  enabled: false
kubeProxy:
  enabled: false
kubeScheduler:
  enabled: false
kubelet:
  enabled: false
prometheus-node-exporter:
  enabled: false
serviceAccount:
  annotations:
    iam.gke.io/gcp-service-account: xxx
  create: true
  name: vm-cluster
victoria-metrics-operator:
  enabled: false
vmagent:
  enabled: true
  spec:
    externalLabels:
      cluster: xxx
      entity: xxx
      environment: staging
    extraArgs:
      enableTCP6: "true"
      promscrape.suppressScrapeErrorsDelay: 120s
    ignoreNamespaceSelectors: false
    image:
      tag: v1.99.0
    inlineRelabelConfig:
    - source_labels:
      - service
      target_label: gce_instance
    nodeSelector:
      cloud.google.com/gke-nodepool: xxx
    replicaCount: 2
    resources:
      limits:
        cpu: 100m
        memory: 300Mi
      requests:
        cpu: 50m
        memory: 200Mi
    scrapeInterval: 30s
    selectAllByDefault: true
    serviceScrapeNamespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: monitoring-staging
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: monitoring
vmalert:
  enabled: true
  ingress:
    enabled: true
    hosts:
    - vmalert-staging.victoria-metrics
    ingressClassName: traefik
    tls:
    - secretName: tls-certs
  spec:
    extraArgs:
      notifier.blackhole: "true"
    image:
      tag: v1.99.0
    inlineRelabelConfig:
    - source_labels:
      - service
      target_label: gce_instance
    logFormat: json
    logLevel: INFO
    nodeSelector:
      cloud.google.com/gke-nodepool: xxx
    replicaCount: 2
    resources:
      limits:
        cpu: 100m
        memory: 200Mi
      requests:
        cpu: 50m
        memory: 100Mi
    selectAllByDefault: true
    tolerations:
    - effect: NoSchedule
      key: dedicated
      operator: Equal
      value: monitoring
vmcluster:
  enabled: true
  ingress:
    insert:
      enabled: true
      hosts:
      - vminsert-staging.victoria-metrics
      ingressClassName: traefik
      tls:
      - secretName: tls-certs
    select:
      enabled: true
      hosts:
      - vmselect-staging.victoria-metrics
      ingressClassName: traefik
      tls:
      - secretName: tls-certs
    storage:
      enabled: false
  spec:
    replicationFactor: 2
    retentionPeriod: "1"
    serviceAccountName: vm-cluster
    vminsert:
      extraArgs:
        maxLabelsPerTimeseries: "10000000"
      image:
        tag: v1.99.0-cluster
      nodeSelector:
        cloud.google.com/gke-nodepool: xxx
      replicaCount: 2
      resources:
        limits:
          cpu: "1.5"
          memory: 1000Mi
        requests:
          cpu: "1"
          memory: 500Mi
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: monitoring
    vmselect:
      extraArgs:
        search.maxSeries: "1000000"
        search.maxUniqueTimeseries: "0"
      image:
        tag: v1.99.0-cluster
      nodeSelector:
        cloud.google.com/gke-nodepool: xxx
      replicaCount: 2
      resources:
        limits:
          cpu: "1"
          memory: 1000Mi
        requests:
          cpu: "0.5"
          memory: 500Mi
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: monitoring
    vmstorage:
      containers:
      - command:
        - /bin/sh
        - -c
        - |
          sleep 40
          while true; do
              # every hour we create a snapshot and upload it to latest
              /vmbackup-prod \
                  -storageDataPath=/vm-data \
                  -snapshot.createURL=http://localhost:8482/snapshot/create \
                  -dst=gs://xxx/vmstorage-snapshots/latest/monitoring-staging-$POD_NAME
              # if it's 5am we also upload the daily snapshot
              if [ $(date +%H) -eq "05" ]; then
                 /vmbackup-prod \
                    -storageDataPath=/vm-data \
                    -snapshot.createURL=http://localhost:8482/snapshot/create \
                    -dst=gs://xxx/vmstorage-snapshots/daily-$(date +%d-%m-%Y)/monitoring-staging-$POD_NAME
              fi
              sleep 1h
          done
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        image: victoriametrics/vmbackup:v1.99.0
        name: hourly-sidecar-backup
        volumeMounts:
        - mountPath: /vm-data
          name: vmstorage-db
      extraArgs:
        search.maxUniqueTimeseries: "0"
      image:
        tag: v1.99.0-cluster
      nodeSelector:
        cloud.google.com/gke-nodepool: xxx
      replicaCount: 3
      resources:
        limits:
          cpu: "4"
          memory: 12000Mi
        requests:
          cpu: "4"
          memory: 12000Mi
      storage:
        volumeClaimTemplate:
          spec:
            resources:
              requests:
                storage: 1000Gi
      tolerations:
      - effect: NoSchedule
        key: dedicated
        operator: Equal
        value: monitoring
vmsingle:
  enabled: false
```

So there is no `alertmanager.urls` set anywhere, and you can see that `notifier.blackhole` is set to true.

zekker6 (Contributor) commented Mar 7, 2024

@hconnan The chart I referred to in this comment was actually victoria-metrics-alert, not victoria-metrics-k8s-stack.
Let me also check the k8s-stack chart and apply a similar fix there.

zekker6 transferred this issue from VictoriaMetrics/helm-charts Mar 7, 2024
zekker6 added a commit that referenced this issue Mar 7, 2024
…blackhole is set by user

Previously, setting "notifier.blackhole" and not using any notifiers would lead to CrashLoopBackoff because vmalert would receive an empty "notifier.url" value.

Updates:
- #894
- #813
zekker6 added a commit that referenced this issue Mar 7, 2024
…blackhole is set by user

Previously, setting "notifier.blackhole" and not using any notifiers would lead to CrashLoopBackoff because vmalert would receive an empty "notifier.url" value.

Updates:
- #894
- #813
f41gh7 pushed a commit that referenced this issue Mar 7, 2024
…blackhole is set by user

Previously, setting "notifier.blackhole" and not using any notifiers would lead to CrashLoopBackoff because vmalert would receive an empty "notifier.url" value.

Updates:
- #894
- #813
hconnan (Author) commented Mar 15, 2024

All is good for me! Great job! Thank you very much 😄 🥳

hconnan closed this as completed Mar 15, 2024