
fix(1693): allow job label to be rewritten in prometheus receiver #3347

Closed
wants to merge 9 commits

Conversation

neufeldtech

Description:
Bugfix: allow the job label to be rewritten in the prometheus receiver via file_sd_configs and relabel_configs

When the 'job' label is added to the labels of file-based discovery targets (see the JSON below) or is rewritten by a relabel_configs block, the metadata service fails to look up the target by job, because the map of targets can only be indexed by the 'original' job name, not the rewritten one.

This PR solves the issue by iterating through all targets when looking up metadata, rather than indexing the job map directly. This lookup is done once per target per scrape, so the iteration takes longer when there is a large number of targets.
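
For illustration, here is a rough Go sketch of that lookup (not the actual diff; Target and findTarget are simplified stand-ins for the real scrape-manager types, and the instance/job matching mirrors the review snippets further down):

package main

import "fmt"

// Target is a simplified stand-in for a discovered scrape target; the real
// code reads these values from scrape.Target labels.
type Target struct {
	Instance string // value of the instance label
	Job      string // value of the job label after relabeling
}

// findTarget illustrates the approach taken in this PR: instead of indexing
// the map by the job label reported with the sample (which misses when the
// job label was rewritten), iterate every target group and match on the
// instance label and on either the original or the rewritten job name.
func findTarget(targetsByOriginalJob map[string][]Target, job, instance string) (*Target, bool) {
	for originalJob, group := range targetsByOriginalJob {
		for i, t := range group {
			if t.Instance != instance {
				continue
			}
			// Match the original job name (no relabeling) or the rewritten
			// job label carried by the target itself.
			if originalJob == job || t.Job == job {
				return &group[i], true
			}
		}
	}
	return nil, false
}

func main() {
	targets := map[string][]Target{
		// Discovered under scrape config "defaultJob", but the file_sd target
		// attached a job label of "pushgateway".
		"defaultJob": {{Instance: "demo.robustperception.io:9091", Job: "pushgateway"}},
	}
	if t, ok := findTarget(targets, "pushgateway", "demo.robustperception.io:9091"); ok {
		fmt.Printf("found target %s for job %s\n", t.Instance, t.Job)
	}
}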

Resolves #1693

Testing:

  • Added unit test to transaction_test.go

Manual testing:
Setup similar to the bug report in #1693

// examples/local/targets.json

[
  {
    "labels": {
      "job": "pushgateway"
    },
    "targets": [
      "demo.robustperception.io:9091"
    ]
  }
]

// examples/local/otel-config.yaml

receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "defaultJob"
          scrape_interval: 10s
          file_sd_configs:
            - refresh_interval: 10s
              files:
                - "./examples/local/targets.json"

exporters:
  prometheus:
    endpoint: "0.0.0.0:1234"
    namespace: test
    resource_to_telemetry_conversion:
      enabled: true

processors:
  batch:
    send_batch_size: 10000
    timeout: 1000ms
  
service:
  pipelines:
    metrics:
      receivers:
        - prometheus
      processors:
        - batch
      exporters:
        - prometheus

Run the collector

$ make run

Confirmed the collector is no longer throwing errors of the type "unable to find a target group with job=x", and confirmed that the job label is properly exposed as job="pushgateway" at http://localhost:1234/metrics

...
# HELP test_pushgateway_http_requests_total Total HTTP requests processed by the Pushgateway, excluding scrapes.
# TYPE test_pushgateway_http_requests_total counter
test_pushgateway_http_requests_total{code="200",handler="api/v1/metrics",host_name="demo.robustperception.io",instance="demo.robustperception.io:9091",job="pushgateway",method="get",port="9091",scheme="http",service_name="pushgateway"} 18
test_pushgateway_http_requests_total{code="200",handler="push",host_name="demo.robustperception.io",instance="demo.robustperception.io:9091",job="pushgateway",method="post",port="9091",scheme="http",service_name="pushgateway"} 5
...

@neufeldtech neufeldtech requested review from alolita and a team as code owners June 2, 2021 04:20
@linux-foundation-easycla

linux-foundation-easycla bot commented Jun 2, 2021

CLA Signed

The committers are authorized under a signed CLA.

@dashpole
Contributor

dashpole commented Jun 2, 2021

Could this be extended to support internal labels as well, such as __name__? Context: https://github.com/open-telemetry/opentelemetry-collector/issues/2297

@neufeldtech
Author

Could this be extended to support internal labels as well, such as __name__? Context: #2297

@dashpole, I think issue #2297 is best treated as a separate issue. The lookup piece in this PR doesn't have any context of __name__ and other hidden labels (that I'm aware of), i.e. if I attempt to log target.Labels() in metadata.go I don't ever see the __name__ label.
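
For reference, a hypothetical debug helper along those lines, assuming the Prometheus version vendored at the time, where (*scrape.Target).Labels() returns a labels.Labels slice; the package and function names are illustrative:

package internal

import (
	"github.com/prometheus/prometheus/scrape"
	"go.uber.org/zap"
)

// logTargetLabels dumps every label visible on a discovered target. Internal
// labels such as __name__ do not show up here, which is why this PR treats
// issue #2297 as out of scope.
func logTargetLabels(logger *zap.Logger, target *scrape.Target) {
	for _, lbl := range target.Labels() {
		logger.Debug("target label",
			zap.String("name", lbl.Name),
			zap.String("value", lbl.Value))
	}
}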

for _, target := range targetGroup {
	switch {
	case target.Labels().Get(model.InstanceLabel) == instance && originalJobLabel == job:
		// job label was not rewritten
Contributor

Since this will be the common case, can we handle it with the previous direct lookup method and fall back to the iterative search only if it isn't found?

Author

Totally - I'll take a look at refactoring to apply this optimization

Author

@dashpole @rakyll updated

for _, target := range targetGroup {
	if target.Labels().Get(model.InstanceLabel) == instance {
		return &mCache{target}, nil
for originalJobLabel, targetGroup := range s.sm.TargetsAll() {
Contributor

Keep targetGroup, ok := s.sm.TargetsAll()[job] here and fallback to the new loop if it's not found.
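
A minimal sketch of the suggested shape, reusing the stand-in Target type and findTarget helper from the sketch in the description; in the real code the map comes from s.sm.TargetsAll():

// lookupTarget sketches the suggested two-step lookup: try the direct map
// index first (the common case, where the job label was not rewritten), and
// only fall back to scanning every target group when that misses.
func lookupTarget(targetsByOriginalJob map[string][]Target, job, instance string) (*Target, bool) {
	// Fast path: analogous to targetGroup, ok := s.sm.TargetsAll()[job].
	if group, ok := targetsByOriginalJob[job]; ok {
		for i := range group {
			if group[i].Instance == instance {
				return &group[i], true
			}
		}
	}
	// Slow path: the job label was rewritten, so scan all groups.
	return findTarget(targetsByOriginalJob, job, instance)
}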

Contributor

Ah, just realized @dashpole already left the same feedback.

@github-actions
Contributor

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@punya
Member

punya commented Jun 18, 2021

@Aneurysm9 please take a look, when you get a chance.

@github-actions
Contributor

github-actions bot commented Jul 2, 2021

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jul 2, 2021
@github-actions
Contributor

github-actions bot commented Jul 9, 2021

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Jul 9, 2021
Successfully merging this pull request may close these issues: Prometheus file-based service discovery failed
7 participants