No error when setting scrape_interval to an empty string #31

Open
Abuelodelanada opened this issue Oct 2, 2023 · 1 comment

Abuelodelanada commented Oct 2, 2023

Bug Description

Setting the scrape_interval config option to an empty string only produces an ERROR in debug-log; nothing is reported in juju status or on the command line.
In addition, part of the Prometheus config (the related scrape job) is silently removed.

To Reproduce

  1. Deploy this bundle:
bundle: kubernetes
applications:
  prometheus:
    charm: prometheus-k8s
    channel: edge
    revision: 150
    series: focal
    resources:
      prometheus-image: 128
    scale: 1
    constraints: arch=amd64
    storage:
      database: kubernetes,1,1024M
    trust: true
  scrape-config:
    charm: prometheus-scrape-config-k8s
    channel: edge
    revision: 42
    series: focal
    scale: 1
    constraints: arch=amd64
  zinc:
    charm: zinc-k8s
    channel: edge
    revision: 124
    resources:
      zinc-image: 120
    scale: 1
    constraints: arch=amd64
    storage:
      data: kubernetes,1,1024M
relations:
- - prometheus:metrics-endpoint
  - scrape-config:metrics-endpoint
- - zinc:metrics-endpoint
  - scrape-config:configurable-scrape-jobs
  2. Check that the zinc scrape job landed in the Prometheus config:
$ juju ssh --container prometheus prometheus/0 cat /etc/prometheus/prometheus.yml
global:
  evaluation_interval: 1m
  scrape_interval: 1m
  scrape_timeout: 10s
rule_files:
- /etc/prometheus/rules/juju_*.rules
scrape_configs:
- honor_timestamps: true
  job_name: prometheus
  metrics_path: /metrics
  relabel_configs:
  - regex: (.*)
    separator: _
    source_labels:
    - juju_model
    - juju_model_uuid
    - juju_application
    - juju_unit
    target_label: instance
  scheme: http
  scrape_interval: 5s
  scrape_timeout: 5s
  static_configs:
  - labels:
      host: localhost
      juju_application: prometheus
      juju_model: cos
      juju_model_uuid: 803f1f9b-8e3c-4414-8e9a-07966d834767
      juju_unit: prometheus-k8s
    targets:
    - prometheus-0.prometheus-endpoints.cos.svc.cluster.local:9090
- honor_labels: true
  job_name: juju_cos_803f1f9b_zinc_prometheus_scrape-0
  metrics_path: /metrics
  relabel_configs:
  - regex: (.*)
    separator: _
    source_labels:
    - juju_model
    - juju_model_uuid
    - juju_application
    - juju_unit
    target_label: instance
  static_configs:
  - labels:
      juju_application: zinc
      juju_charm: zinc-k8s
      juju_model: cos
      juju_model_uuid: 803f1f9b-8e3c-4414-8e9a-07966d834767
      juju_unit: zinc/0
    targets:
    - 10.1.38.114:4080
  3. Change scrape_interval: juju config scrape-config scrape_interval="12s"
  4. Check that this value landed in the Prometheus config:
$ juju ssh --container prometheus prometheus/0 cat /etc/prometheus/prometheus.yml | grep "scrape_interval: 12s"
  scrape_interval: 12s
  5. Set that config option to an empty string: $ juju config scrape-config scrape_interval=""
  6. Check that the command returns no error:
$ echo $?
0
  7. Check that the zinc scrape job is gone:
$ juju ssh --container prometheus prometheus/0 cat /etc/prometheus/prometheus.yml
global:
  evaluation_interval: 1m
  scrape_interval: 1m
  scrape_timeout: 10s
rule_files:
- /etc/prometheus/rules/juju_*.rules
scrape_configs:
- honor_timestamps: true
  job_name: prometheus
  metrics_path: /metrics
  relabel_configs:
  - regex: (.*)
    separator: _
    source_labels:
    - juju_model
    - juju_model_uuid
    - juju_application
    - juju_unit
    target_label: instance
  scheme: http
  scrape_interval: 5s
  scrape_timeout: 5s
  static_configs:
  - labels:
      host: localhost
      juju_application: prometheus
      juju_model: cos
      juju_model_uuid: 803f1f9b-8e3c-4414-8e9a-07966d834767
      juju_unit: prometheus-k8s
    targets:
    - prometheus-0.prometheus-endpoints.cos.svc.cluster.local:9090
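
The before/after comparison in the steps above can be made less manual with a small helper that lists which job names disappeared between two dumps of prometheus.yml. This is only a sketch: the function names `job_names` and `missing_jobs` are hypothetical, and it assumes each job appears as a `job_name:` key on its own line, as in the output shown here.

```python
import re
from typing import List

# Matches lines such as "  job_name: prometheus" or "- job_name: foo".
_JOB_NAME_RE = re.compile(r"^\s*(?:-\s+)?job_name:\s*(\S+)\s*$", re.MULTILINE)


def job_names(config_text: str) -> List[str]:
    """Return the job names found in a rendered prometheus.yml, in order."""
    return _JOB_NAME_RE.findall(config_text)


def missing_jobs(before: str, after: str) -> List[str]:
    """Return job names present in `before` but absent from `after`."""
    after_names = set(job_names(after))
    return [name for name in job_names(before) if name not in after_names]
```

Feeding it the two `cat /etc/prometheus/prometheus.yml` outputs from the steps above should report the zinc job as missing.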

Environment

  • juju 3.1.5

Relevant log output

unit-prometheus-0: 17:33:47.794 INFO unit.prometheus/0.juju-log metrics-endpoint:2: HTTP Request: GET https://10.152.183.1/api/v1/namespaces/cos/pods/prometheus-0 "HTTP/1.1 200 OK"
unit-prometheus-0: 17:33:47.835 ERROR unit.prometheus/0.juju-log metrics-endpoint:2: Validating scrape jobs failed: b'time="2023-10-02T20:33:47Z" level=fatal msg="parsing YAML file /tmp/tmpneye6lpy: empty duration string"\n'
unit-prometheus-0: 17:33:47.868 INFO unit.prometheus/0.juju-log metrics-endpoint:2: Pushed new configuration
unit-prometheus-0: 17:33:47.928 INFO unit.prometheus/0.juju-log metrics-endpoint:2: HTTP Request: GET https://10.152.183.1/api/v1/namespaces/cos/pods/prometheus-0 "HTTP/1.1 200 OK"

Additional context

No response

@dstathis
Contributor

There are two things to consider here. In prometheus-scrape-config we should validate the input and block if the data is wrong.
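
A minimal sketch of what such input validation could look like, assuming the check runs on the raw config value before any job is forwarded. The names `is_valid_duration` and `validate_scrape_interval` are hypothetical, and the pattern is only modeled on the Prometheus duration format (units in descending order, e.g. "12s" or "1h30m"); it is not taken from the charm.

```python
import re
from typing import Optional

# Assumption: units appear in descending order, each at most once.
_DURATION_RE = re.compile(
    r"(?:(\d+)y)?(?:(\d+)w)?(?:(\d+)d)?(?:(\d+)h)?"
    r"(?:(\d+)m)?(?:(\d+)s)?(?:(\d+)ms)?"
)


def is_valid_duration(value: str) -> bool:
    """Return True if value looks like a non-empty Prometheus duration."""
    if not value:
        # Corresponds to the "empty duration string" failure in the log.
        return False
    if value == "0":
        return True
    return _DURATION_RE.fullmatch(value) is not None


def validate_scrape_interval(value: str) -> Optional[str]:
    """Return an error message for a bad value, or None if it is acceptable."""
    if is_valid_duration(value):
        return None
    return f"invalid scrape_interval {value!r}: expected a duration such as '12s'"
```

In the charm, a non-None message could then drive a blocked status instead of forwarding the job; that wiring is left out here.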

Also, if Prometheus receives a bad scrape config, it should skip that job and continue working, which it does. An open question is whether it should also set a blocked status to notify the user that something is wrong.
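
To illustrate the "skip the bad job but tell the user" idea, here is a rough sketch that separates valid jobs from skipped ones and builds a message that could back a status. All names (`partition_jobs`, `status_message`, `has_interval`) are hypothetical, and `has_interval` is a deliberately simple stand-in for real validation.

```python
from typing import Callable, Dict, List, Tuple

Job = Dict[str, object]


def partition_jobs(
    jobs: List[Job], is_valid: Callable[[Job], bool]
) -> Tuple[List[Job], List[str]]:
    """Split jobs into those to keep and the names of those skipped."""
    kept: List[Job] = []
    skipped: List[str] = []
    for job in jobs:
        if is_valid(job):
            kept.append(job)
        else:
            skipped.append(str(job.get("job_name", "<unnamed>")))
    return kept, skipped


def status_message(skipped: List[str]) -> str:
    """Build a user-facing summary; an empty string means nothing to report."""
    if not skipped:
        return ""
    return "skipped invalid scrape jobs: " + ", ".join(sorted(skipped))


def has_interval(job: Job) -> bool:
    # Hypothetical check: an absent key is fine, an empty string is not.
    return job.get("scrape_interval", "1m") != ""
```

Whether a non-empty message should map to a blocked status or to something softer is exactly the open question above; the sketch only makes the skipped jobs visible.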
