Response time for ping test on netprobe quantiles metrics increases as time goes by #554

@manrodrigues

Description

When an agent with a netprobe tap runs a netprobe policy, the response time reported for the "ping" test keeps increasing over time, which is unexpected.

I have two agents running, and both show similar behavior:

image.png

I have another agent on the same machine that was started after the two mentioned above. The pattern looks similar, but the response times are different:

image

/api/v1/policies/policy-probe/metrics/prometheus data:

agent1_policy-probe.txt

agent0_policy-probe.txt

/api/v1/policies/__all/metrics/prometheus data:

agent1_all.txt

agent0_all.txt

To reproduce:

Agent tap:

  taps:
    default_netprobe:
      input_type: netprobe
      config:
        test_type: ping
      tags:
        virtual: true
        vhost: 1

Policy:

handlers:
  modules:
    default_netprobe:
      type: netprobe
      metric_groups:
        enable:
          - counters
          - quantiles
input:
  tap: default_netprobe
  input_type: netprobe
  config:
    test_type: ping
    packets_per_test: 5
    interval_msec: 2000
    timeout_msec: 1000
    packets_interval_msec: 10
    packet_payload_size: 56
    targets:
      www.google.com:
        target: www.google.com
      'orb live':
        target: orb.live
      orb_community:
        target: orb.community       
kind: collection

Grafana query: avg(netprobe_response_quantiles_us{quantile="0.5"}) by (agent, target, policy)

Note: I'm filtering the 'orb live' target out of the query because it has a separate problem: avg(netprobe_response_quantiles_us{quantile="0.5", target!="orb live"}) by (agent, target, policy)
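
For anyone who wants to compare the attached Prometheus dumps without Grafana, here is a minimal Python sketch that reads the text output of the metrics endpoint and averages the median per target, roughly mirroring the query above. The metric name and the `quantile` and `target` labels come from the query; the exact label set in the attached files is an assumption, and `SAMPLE` contains made-up values for illustration only, not data from the agents. The simple regex also assumes label values do not contain `}`.

```python
import re
from collections import defaultdict

# One Prometheus text-format sample line: name{labels} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
# One label pair inside the braces: key="value"
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Illustrative values only (hypothetical, not taken from the attached files).
SAMPLE = """\
# TYPE netprobe_response_quantiles_us summary
netprobe_response_quantiles_us{policy="policy-probe",target="www.google.com",quantile="0.5"} 1000
netprobe_response_quantiles_us{policy="policy-probe",target="www.google.com",quantile="0.9"} 5000
netprobe_response_quantiles_us{policy="policy-probe",target="orb live",quantile="0.5"} 7
netprobe_response_quantiles_us{policy="policy-probe",target="orb.community",quantile="0.5"} 3000
"""


def median_by_target(text, metric="netprobe_response_quantiles_us",
                     quantile="0.5", exclude=("orb live",)):
    """Average the chosen quantile per target, skipping excluded targets."""
    values = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        if labels.get("quantile") != quantile:
            continue
        target = labels.get("target")
        if target in exclude:
            continue
        values[target].append(float(match.group("value")))
    return {target: sum(v) / len(v) for target, v in values.items()}
```

Running `median_by_target(open("agent0_policy-probe.txt").read())` on dumps taken a few minutes apart should show whether the median keeps growing for each target.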

Labels: bug, netprobe
