Agent Flow Convert Issues - Issues with Dashes in Job Name and Incorrect Scrape Targets #5077

Closed · SeamusGrafana opened this issue Sep 4, 2023 · 9 comments · Fixed by #5102 and #5117
Labels: bug (Something isn't working), frozen-due-to-age (Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.)
SeamusGrafana (Contributor) commented Sep 4, 2023:

What's wrong?

When using Grafana Agent (Flow) to convert a Prometheus config file to a Flow config file, the output is not what I would expect, and there are several issues.

The first is that if the config file contains scrape_classic_histograms, the conversion fails with the error below:

AGENT_MODE=flow grafana-agent convert --source-format=prometheus --output /home/srt/dev/docker-compose/converted_mimir_distributed.yaml /home/srt/dev/repos/grafana/mimir/development/mimir-microservices-mode/config/prometheus.yaml --bypass-errors
Error: (Critical) failed to parse Prometheus config: yaml: unmarshal errors:
  line 41: field scrape_classic_histograms not found in type config.ScrapeConfig

I would have expected --bypass-errors to skip this if it's not supported in Flow.
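
For reference, the field the parser rejects presumably looks something like this in the scrape config (a sketch of the relevant lines, not the full file):

scrape_configs:
  - job_name: mimir-microservices
    # Not recognized by the Prometheus version the converter uses:
    scrape_classic_histograms: true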

The next is that the conversion fails when the job name contains a - (dash/hyphen). While this is valid in Prometheus (the config works there), it fails in Flow with the error below:

AGENT_MODE=flow grafana-agent convert --source-format=prometheus --output /home/srt/dev/docker-compose/converted_mimir_distributed.yaml /home/srt/dev/repos/grafana/mimir/development/mimir-microservices-mode/config/prometheus.yaml --bypass-errors
Error: (Error) unsupported rule_files config was provided
(Info) Converted scrape_configs job_name "mimir-microservices" into...
        A prometheus.scrape.mimir-microservices component
        A discovery.relabel.mimir-microservices component
(Info) Converted 1 remote_write[s] "" into...
        A prometheus.remote_write.default component
(Critical) failed to render Flow config: 1:19: expected block label to be a valid identifier (and 1 more diagnostics)

Upon further testing, I found that changing the scrape job name from - job_name: mimir-microservices to - job_name: mimir_microservices (replacing the dash/hyphen with an underscore) resolved this, and the Flow config was then generated.
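
For clarity, the workaround is just that one-line rename in the scrape config:

scrape_configs:
  - job_name: mimir_microservices   # was: mimir-microservices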

Finally, in the Flow config that was output, the discovery section contains the same host for every item, even though there were multiple unique hosts in the original config. For example:

Original Prometheus config:

scrape_configs:
  - job_name: mimir-microservices
    static_configs:
      - targets:
          - 'distributor-1:8000'
          - 'distributor-2:8001'
          - 'ingester-1:8002'
          - 'ingester-2:8003'
          - 'querier:8004'

Output Flow config:

discovery.relabel "mimir_microservices" {
	targets = concat(
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],

load-generator:9900 is the last target in the original Prometheus configuration.

Steps to reproduce

1. Download this Prometheus configuration: https://github.com/grafana/mimir/blob/main/development/mimir-microservices-mode/config/prometheus.yaml
2. Verify that it works in Prometheus.
3. Try to convert the config to Agent Flow.

System information

Linux, Manjaro

Software version

grafana-agent --version
agent, version 0.35.4 (branch: , revision: unknown)
build user:
build date: 2023-08-24T12:01:47Z
go version: go1.21.0
platform: linux/amd64
tags: builtinassets,promtail_journal_enabled

Configuration

Output configuration:

discovery.relabel "mimir_microservices" {
	targets = concat(
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
	)

	rule {
		source_labels = ["__address__"]
		regex         = "([^:]+)(:[0-9]+)?"
		target_label  = "pod"
		replacement   = "${1}"
	}

	rule {
		source_labels = ["namespace", "pod"]
		separator     = "/"
		regex         = "(.+?)(-\\d+)?"
		target_label  = "job"
		replacement   = "${1}"
	}

	rule {
		source_labels = ["pod"]
		regex         = "(.+?)(-\\d+)?"
		target_label  = "container"
		replacement   = "${1}"
	}
}

prometheus.scrape "mimir_microservices" {
	targets         = discovery.relabel.mimir_microservices.targets
	forward_to      = [prometheus.remote_write.default.receiver]
	job_name        = "mimir_microservices"
	scrape_interval = "5s"
	scrape_timeout  = "5s"
}

prometheus.remote_write "default" {
	external_labels = {
		scraped_by = "prometheus",
	}

	endpoint {
		url = "http://distributor-1:8000/api/v1/push"

		queue_config {
			capacity             = 2500
			max_shards           = 200
			max_samples_per_send = 500
		}

		metadata_config {
			max_samples_per_send = 500
		}
	}
}

Logs

No response

SeamusGrafana added the bug label on Sep 4, 2023
SeamusGrafana (Contributor, Author) commented:

Another issue I noticed: the generated prometheus.scrape uses targets = discovery.relabel.mimir_microservices.targets, but when I used this I got an error that it was not valid. Changing .targets to .output worked (see the sketch below).

Is this an issue with the conversion or an option that should be available?
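
For reference, the expression that worked for me looks like this (a sketch trimmed from the generated config above, with only the export name changed):

prometheus.scrape "mimir_microservices" {
	targets    = discovery.relabel.mimir_microservices.output
	forward_to = [prometheus.remote_write.default.receiver]
}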

erikbaranowski self-assigned this on Sep 5, 2023
tpaschalis added this to the v0.37.0 milestone on Sep 5, 2023
erikbaranowski (Contributor) commented Sep 5, 2023:

There's a bunch here, so I will respond in multiple comments.

Error: (Critical) failed to parse Prometheus config: yaml: unmarshal errors: line 41: field scrape_classic_histograms not found in type config.ScrapeConfig

Critical failures mean that the converter cannot move forward and cannot be bypassed. This specific error indicates that the Prometheus config provided is not valid for the version of Prometheus we are using. This failure occurs before any of the converter-specific code executes. Would you be able to share your input config for this?

erikbaranowski (Contributor) commented:

(Critical) failed to render Flow config: 1:19: expected block label to be a valid identifier

This is an issue with the exact workaround you came up with; it will be resolved in v0.37.0. Here are the PRs for River and the Agent that resolve it:

grafana/river#18
#4998

To fix it properly, this work was necessary, since River identifiers are fairly strict in how they are constructed:

https://grafana.com/docs/agent/next/flow/config-language/syntax/#identifiers
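
As a rough illustration (assuming the identifier rules in the linked docs: letters, digits, and underscores, not starting with a digit), the block label must be a valid identifier, while attribute values such as job_name are ordinary strings:

// Rejected before the fix: the label contains a hyphen, so it is not a valid identifier.
prometheus.scrape "mimir-microservices" { }

// Accepted: the label is sanitized to an identifier, and the original job name
// can still be carried as a plain string attribute.
prometheus.scrape "mimir_microservices" {
	job_name = "mimir-microservices"
}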

erikbaranowski (Contributor) commented:

I will need another PR to fix the valid identifier issue everywhere for Prometheus (I missed a couple of spots in the above PR).

For the repeating targets (rather than the correct targets), I have a working fix and will create a PR for it.

erikbaranowski (Contributor) commented:

discovery.relabel having the wrong export name was already corrected in #4500 and should be fixed in v0.36.0.

erikbaranowski (Contributor) commented:

Here's the PR with the fixes discussed above: #5102

I tested the config you provided here: https://github.com/grafana/mimir/blob/main/development/mimir-microservices-mode/config/prometheus.yaml

Please let me know if the below output matches your expectation after my code changes.

(Error) unsupported rule_files config was provided

The River output is:

discovery.relabel "mimir_microservices" {
	targets = concat(
		[{
			__address__ = "distributor-1:8000",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "distributor-2:8001",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "ingester-1:8002",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "ingester-2:8003",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "querier:8004",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "ruler-1:8021",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "ruler-2:8022",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "compactor:8006",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "query-frontend:8007",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "store-gateway-1:8008",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "store-gateway-2:8009",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "query-scheduler:8011",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "memcached-exporter:9150",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
		[{
			__address__ = "load-generator:9900",
			cluster     = "docker-compose",
			namespace   = "mimir-microservices-mode",
		}],
	)

	rule {
		source_labels = ["__address__"]
		regex         = "([^:]+)(:[0-9]+)?"
		target_label  = "pod"
		replacement   = "${1}"
	}

	rule {
		source_labels = ["namespace", "pod"]
		separator     = "/"
		regex         = "(.+?)(-\\d+)?"
		target_label  = "job"
		replacement   = "${1}"
	}

	rule {
		source_labels = ["pod"]
		regex         = "(.+?)(-\\d+)?"
		target_label  = "container"
		replacement   = "${1}"
	}
}

prometheus.scrape "mimir_microservices" {
	targets         = discovery.relabel.mimir_microservices.output
	forward_to      = [prometheus.remote_write.default.receiver]
	job_name        = "mimir-microservices"
	scrape_interval = "5s"
	scrape_timeout  = "5s"
}

prometheus.remote_write "default" {
	external_labels = {
		scraped_by = "prometheus",
	}

	endpoint {
		url                    = "http://distributor-1:8000/api/v1/push"
		send_native_histograms = true

		queue_config { }

		metadata_config { }
	}
}

SeamusGrafana (Contributor, Author) commented:

There's a bunch here, so I will respond in multiple comments.

Error: (Critical) failed to parse Prometheus config: yaml: unmarshal errors: line 41: field scrape_classic_histograms not found in type config.ScrapeConfig

Critical failures mean that the converter cannot move forward and cannot be bypassed. This specific error indicates that the Prometheus config provided is not valid for the version of Prometheus we are using. This failure occurs before any of the converter-specific code executes. Would you be able to share your input config for this?

It should be the same config file:

https://github.com/grafana/mimir/blob/main/development/mimir-microservices-mode/config/prometheus.yaml

In particular, line 41.

SeamusGrafana (Contributor, Author) commented:

Please let me know if the below output matches your expectation after my code changes.

Yeah, that roughly matches what I was expecting. I tested it here and it works, with no obvious issues.

That's great. Thanks for the assistance here, much appreciated.

erikbaranowski (Contributor) commented:

I tested the new file with histograms and needed to add a couple more unsupported checks for histograms so they don't get silently ignored: #5117

Both fixes will be released today along with a v0.36.1 patch.

The github-actions bot added the frozen-due-to-age label on Feb 21, 2024.
The github-actions bot locked the conversation as resolved and limited it to collaborators on Feb 21, 2024.