
Bool Operator does not work as expected #5220

Closed
toninis opened this Issue Feb 14, 2019 · 1 comment

toninis commented Feb 14, 2019

Bug Report

What did you do?
Tried to compare two vectors using the bool modifier.
Vector 1

count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="engine"})

which has output:

{container_env_monitored_environment_name="value1"} | 1
{container_env_monitored_environment_name="value2"} | 1
{container_env_monitored_environment_name="value3"} | 1
{container_env_monitored_environment_name="value4"} | 1

Vector 2

count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="riskevolution"})  

which has output:

{container_env_monitored_environment_name="value1"} | 1
{container_env_monitored_environment_name="value2"} | 1

So I compared the two vectors with the bool option:

count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="engine"}) == bool count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="riskevolution"})  

What did you expect to see?
Based on docs here :

Comparison operators are defined between scalar/scalar, vector/scalar, and vector/vector value pairs. By default they filter. Their behavior can be modified by providing bool after the operator, which will return 0 or 1 for the value rather than filtering.

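For a vector/scalar comparison this seems clear enough. For example, with the built-in up metric (used here only for illustration):

up == 1
up == bool 1

The first keeps only the series whose value is 1, while the second keeps every series and returns 1 or 0 as the value.
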
Shouldn't the result of this operation with the bool modifier keep the unmatched series as well and produce the output below?

{container_env_monitored_environment_name="value4"} | 0
{container_env_monitored_environment_name="value2"} | 1
{container_env_monitored_environment_name="value3"} | 0
{container_env_monitored_environment_name="value1"} | 1

What did you see instead? Under which circumstances?
Instead, the series I expected to see with a value of 0 get filtered out, as shown below. Is this the correct behaviour?
What am I missing?

{container_env_monitored_environment_name="value1"} | 1
{container_env_monitored_environment_name="value2"} | 1

Environment

  • System information:
    Linux 4.15.0-1031-aws x86_64
  • Docker Info
Client:
 Version:           18.09.1
 API version:       1.39
 Go version:        go1.10.6
 Git commit:        4c52b90
 Built:             Wed Jan  9 19:35:31 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.1
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.6
  Git commit:       4c52b90
  Built:            Wed Jan  9 19:02:44 2019
  OS/Arch:          linux/amd64
  Experimental:     false
  • Prometheus version:
prometheus, version 2.6.1 (branch: HEAD, revision: b639fe140c1f71b2cbad3fc322b17efe60839e7e)
  build user:       root@4c0e286fe2b3
  build date:       20190115-19:12:04
  go version:       go1.11.4
  • Alertmanager version:
alertmanager, version 0.15.3 (branch: HEAD, revision: d4a7697cc90f8bce62efe7c44b63b542578ec0a1)
  build user:       root@4ecc17c53d26
  build date:       20181109-15:40:48
  go version:       go1.11.2
  • Prometheus configuration file:
# global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - scheme: http
    static_configs:
    - targets:
      - alertmanager:9093


# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "alert.rules"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  # EC2 Jobs
  - job_name: 'mss-production-nodes'
    ec2_sd_configs:
      - region: eu-west-1
        access_key: *******
        secret_key: *******
        port: 9100
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Monitored]
        regex: PrometheusProduction.*
        action: keep
      - source_labels: [__meta_ec2_tag_privateurl]
        regex:  '(.*)'
        target_label: __address__
        replacement: '${1}:9100'
      - source_labels: [__meta_ec2_tag_Name]
        regex:  '(.*)'
        target_label: hostname
        replacement: '${1}'
  - job_name: 'mss-production-nodes-services'
    ec2_sd_configs:
      - region: eu-west-1
        access_key: ******
        secret_key: ******
        port: 9265
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Monitored]
        regex: PrometheusProduction.*
        action: keep
      - source_labels: [__meta_ec2_private_ip]
        regex:  '(.*)'
        target_label: __address__
        replacement: '${1}:9256'
      - source_labels: [__meta_ec2_tag_Name]
        regex:  '(.*)'
        target_label: hostname
        replacement: '${1}'
  - job_name: 'dcos-cluster-nodes'
    ec2_sd_configs:
      - region: eu-west-1
        access_key: *****
        secret_key: *******
        port: 9100
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Monitored]
        regex: DCOSProduction.*
        action: keep
      - source_labels: [__meta_ec2_tag_privateurl]
        regex:  '(.*)'
        target_label: __address__
        replacement: '${1}:9100'
      - source_labels: [__meta_ec2_tag_Name]
        regex:  '(.*)'
        target_label: hostname
        replacement: '${1}'
  - job_name: 'elastic-node-monitoring'
    ec2_sd_configs:
      - region: eu-west-1
        access_key: *****
        secret_key: ******
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Monitored]
        regex: ElasticProduction.*
        action: keep
      - source_labels: [__meta_ec2_tag_privateurl]
        regex:  '(.*)'
        target_label: __address__
        replacement: '${1}:9100'
      - source_labels: [__meta_ec2_tag_Name]
        regex:  '(.*)'
        target_label: hostname
        replacement: '${1}'
  - job_name: 'elasticsearch'
    metrics_path: "/_prometheus/metrics"
    ec2_sd_configs:
      - region: eu-west-1
        access_key: ****
        secret_key: ******      
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Monitored]
        regex: ElasticProduction.*
        action: keep
      - source_labels: [__meta_ec2_private_ip]
        regex:  '(.*)'
        target_label: __address__
        replacement: '${1}:9200'
      - source_labels: [__meta_ec2_tag_Name]
        regex:  '(.*)'
        target_label: hostname
        replacement: '${1}'
  - job_name: 'dcos-docker-metrics'
    ec2_sd_configs:
      - region: eu-west-1
        access_key: ****
        secret_key: ***
    relabel_configs:
      - source_labels: [__meta_ec2_tag_Monitored]
        regex: DCOSProduction.*
        action: keep
      - source_labels: [__meta_ec2_tag_privateurl]
        regex:  '(.*)'
        target_label: __address__
        replacement: '${1}:11256'
      - source_labels: [__meta_ec2_tag_Name]
        regex:  '(.*)'
        target_label: hostname
        replacement: '${1}'
    metric_relabel_configs:
      - source_labels: [container_label_MESOS_TASK_ID]
        regex:  '(.*)\.(.*)'
        target_label: mesos_service_name
        replacement: '${1}'
      - source_labels: [container_env_spark_executor_opts]
        regex:  '-DCLIENT_NAME=(.*) -DEXECUTOR_TYPE=(.*)'
        target_label: container_env_monitored_environment_name
        replacement: '${1}'
      - source_labels: [container_env_spark_executor_opts]
        regex:  '-DCLIENT_NAME=(.*) -DEXECUTOR_TYPE=(.*)'
        target_label: container_env_executor_type
        replacement: '${2}'
  • Alertmanager configuration file:
global:
  resolve_timeout: 3m
  slack_api_url: *****************
templates:
- '/etc/alertmanager/templates/*.tmpl'
receivers:
- name: slack
  slack_configs:
  - api_url: ****************
    channel: '#******'
    send_resolved: true
  email_configs:
  - to: ********************
    from: *************************
    smarthost: ***********************
    auth_username: ********************
    auth_password: ****************
    send_resolved: true

route:
  group_by:
  - alertname
  - cluster
  - service
  group_interval: 2m
  group_wait: 30s
  receiver: slack
  repeat_interval: 4h

brian-brazil commented Feb 14, 2019

Normal matching behaviour is what's happening here; try it with values other than 1 and you'll see what's happening.

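In a vector/vector comparison the samples are first matched by their full label sets; series that only exist on one side have no counterpart to compare against, so they drop out of the result whether or not bool is given. bool only changes what happens to the matched pairs: instead of being kept or filtered, each pair returns 1 or 0.

If the goal is a 0 or 1 for every engine environment, one rough sketch (reusing the queries above, and assuming both sides are aggregated by the same label) is to or the bool comparison with the left-hand count multiplied by zero, so that environments missing from the riskevolution side still show up with value 0:

(
  count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="engine"})
== bool
  count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="riskevolution"})
)
or
(
  count by (container_env_monitored_environment_name) (container_last_seen{job="dcos-docker-metrics",container_env_driver_type="engine"}) * 0
)

The or keeps the bool result for the matched environments and adds a 0-valued series for each environment that only exists on the engine side.
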
It makes more sense to ask questions like this on the prometheus-users mailing list rather than in a GitHub issue. On the mailing list, more people are available to potentially respond to your question, and the whole community can benefit from the answers provided.
