vmagent: wrong relabeling via -remoteWrite.urlRelabelConfig #599

Closed
Allineer opened this issue Jul 2, 2020 · 14 comments
Labels
bug Something isn't working

Comments

Allineer commented Jul 2, 2020

Describe the bug

-remoteWrite.urlRelabelConfig:

- source_labels: [job, agent]
  regex: postgres;gate
  replacement: true
  target_label: __keep

- source_labels: [__keep]
  regex: true
  action: keep
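The intent of these two rules can be sketched in Go. This is a simplified model under standard Prometheus-style relabeling semantics (source label values joined with `;`, regex anchored to the whole string, `replace` as the default action), not vmagent's actual code; the `rule` type and the `anchored`, `apply`, and `issueRules` helpers are hypothetical names introduced only for illustration, and the `replacement` is treated as a literal string.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule is a simplified, hypothetical model of one relabeling rule.
type rule struct {
	sourceLabels []string
	regex        *regexp.Regexp
	action       string // "replace" or "keep"
	targetLabel  string
	replacement  string
}

// anchored compiles expr so that it must match the whole input string.
func anchored(expr string) *regexp.Regexp {
	return regexp.MustCompile("^(?:" + expr + ")$")
}

// apply runs the rules over a copy of one label set. It returns the
// resulting labels and false if a keep rule dropped the series.
func apply(rules []rule, labels map[string]string) (map[string]string, bool) {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		out[k] = v
	}
	for _, r := range rules {
		vals := make([]string, len(r.sourceLabels))
		for i, name := range r.sourceLabels {
			vals[i] = out[name] // a missing label reads as ""
		}
		matched := r.regex.MatchString(strings.Join(vals, ";"))
		switch r.action {
		case "replace":
			if matched {
				out[r.targetLabel] = r.replacement
			}
		case "keep":
			if !matched {
				return nil, false
			}
		}
	}
	return out, true
}

// issueRules mirrors the two rules from the config above.
func issueRules() []rule {
	return []rule{
		{sourceLabels: []string{"job", "agent"}, regex: anchored("postgres;gate"),
			action: "replace", targetLabel: "__keep", replacement: "true"},
		{sourceLabels: []string{"__keep"}, regex: anchored("true"), action: "keep"},
	}
}

func main() {
	for _, labels := range []map[string]string{
		{"job": "postgres", "agent": "gate"},
		{"job": "node", "agent": "gate"},
	} {
		_, kept := apply(issueRules(), labels)
		fmt.Println(labels["job"], labels["agent"], "kept:", kept)
	}
}
```

Under this model only series with `job="postgres"` and `agent="gate"` survive, which is the filtering the report expects.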

I see metrics for other jobs and agents in the database:

Screenshot 2020-07-02 at 15 34 29

Screenshot 2020-07-02 at 21 45 25

Another -remoteWrite.urlRelabelConfig:

- source_labels: [job, agent]
  regex: extra;shkoder
  replacement: true
  target_label: __keep

- source_labels: [__keep]
  regex: true
  action: keep

Screenshot 2020-07-02 at 15 30 25

Expected behavior

Correct filtering.

Version

vmagent-20200701-124503-tags-v1.37.4-0-g8da3f773a and v1.37.3

valyala (Collaborator) commented Jul 2, 2020

@Allineer , could you add {job!="victoria-metrics"} filter to the query from the first screenshot in order to see metrics for other jobs? Make sure you didn't enable self-scraping on VictoriaMetrics side with -selfScrapeInterval, because the self-scraped metrics aren't filtered by vmagent.

Allineer (Author) commented Jul 3, 2020

@valyala, no other metrics atm.

Screenshot 2020-07-03 at 07 55 21

Self-scraping is disabled.

Allineer (Author) commented Jul 3, 2020

Something new, from a third database. The postgres job is in the testing stage, so its metrics are routed by vmagent only to the @testing database. But...

Screenshot 2020-07-03 at 14 04 24

Allineer (Author) commented Jul 6, 2020

Updates:

Screenshot 2020-07-06 at 07 57 46

Screenshot 2020-07-06 at 07 58 17

Allineer (Author) commented Jul 7, 2020

vmagent's logs:

2020-07-07T07:53:00.466Z	error	VictoriaMetrics/app/vmagent/remotewrite/client.go:237	couldn't send a block with size 19520 bytes to "http://127.0.0.1:8427/api/v1/write": write tcp4 127.0.0.1:39608->127.0.0.1:8427: write: broken pipe; re-sending the block in 2.000 seconds
2020-07-07T08:08:00.472Z	error	VictoriaMetrics/app/vmagent/remotewrite/client.go:237	couldn't send a block with size 19512 bytes to "http://127.0.0.1:8427/api/v1/write": write tcp4 127.0.0.1:43568->127.0.0.1:8427: write: broken pipe; re-sending the block in 2.000 seconds
2020-07-07T08:13:00.466Z	error	VictoriaMetrics/app/vmagent/remotewrite/client.go:237	couldn't send a block with size 19530 bytes to "http://127.0.0.1:8427/api/v1/write": write tcp4 127.0.0.1:44794->127.0.0.1:8427: write: broken pipe; re-sending the block in 2.000 seconds

I have 20-30 messages with this error daily.
:8427 is the victoria-metrics@etc instance.

Allineer (Author) commented Jul 7, 2020

WOW!

Screenshot 2020-07-07 at 11 35 34

valyala (Collaborator) commented Jul 9, 2020

The last graph looks really weird: the scrape_samples_post_metric_relabeling metric shouldn't contain a type label unless this label is explicitly added via relabeling (I assume this isn't your case, judging by the relabeling configs posted in the initial message).

I'm still investigating this issue... It looks like some data race or improper cleanup of reused structures during relabeling.

Allineer (Author) commented Jul 10, 2020

I have a really extensive promscrape config with a lot of relabeling.
And even if I assume that I add the type label as a result of erroneous relabeling, then why does it appear only twice, for a short period and without my participation?

valyala added a commit that referenced this issue Jul 10, 2020
…ig` options are set

Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race
and improper relabeling. Now each goroutine has its own copy of tss during relabeling.

Updates #599
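
The fix described in the commit message (each goroutine gets its own copy of the series slice before relabeling) can be sketched as follows. This is a minimal illustration of the pattern, not the code from the commit: the `series` type and the `keepJob` and `fanOut` functions are hypothetical, and filtering by a single `job` field stands in for real relabeling.

```go
package main

import (
	"fmt"
	"sync"
)

// series is a hypothetical stand-in for a time series with one label.
type series struct {
	job string
}

// keepJob filters tss in place, reusing its backing array, and returns
// the kept prefix. Running this on a slice shared between goroutines
// would be a data race.
func keepJob(tss []series, job string) []series {
	kept := tss[:0]
	for _, ts := range tss {
		if ts.job == job {
			kept = append(kept, ts)
		}
	}
	return kept
}

// fanOut filters the same input once per job, concurrently. Each
// goroutine copies tss first, so in-place filtering cannot corrupt
// what the other goroutines read.
func fanOut(tss []series, jobs []string) map[string]int {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		counts = make(map[string]int)
	)
	for _, job := range jobs {
		wg.Add(1)
		go func(job string) {
			defer wg.Done()
			own := append([]series(nil), tss...) // private copy
			n := len(keepJob(own, job))
			mu.Lock()
			counts[job] = n
			mu.Unlock()
		}(job)
	}
	wg.Wait()
	return counts
}

func main() {
	tss := []series{{"postgres"}, {"extra"}, {"postgres"}}
	counts := fanOut(tss, []string{"postgres", "extra"})
	fmt.Println(counts["postgres"], counts["extra"]) // prints "2 1"
}
```

Without the copy, one goroutine's in-place filtering could overwrite elements that another goroutine is still reading, which would be consistent with series leaking into the wrong destination as reported above.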

valyala (Collaborator) commented Jul 10, 2020

@Allineer , the commit a730b3f should fix the issue. Could you try building vmagent from this commit according to these instructions and verify whether the issue is gone?

Allineer (Author) commented Jul 12, 2020

@valyala

vmagent-20200712-133719-heads-master-0-ga730b3f6 (only on the instance with multiple -remoteWrite.urlRelabelConfig options):

panic: interface conversion: interface {} is []prompbmarshal.TimeSeries, not *[]prompbmarshal.TimeSeries

goroutine 33 [running]:
github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite.(*remoteWriteCtx).Push(0xc000023fc0, 0xc0004fc600, 0x5, 0x8)
	/tmp/VictoriaMetrics/app/vmagent/remotewrite/remotewrite.go:218 +0x46c
github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite.Push(0xc0004d4718)
	/tmp/VictoriaMetrics/app/vmagent/remotewrite/remotewrite.go:145 +0xfb
github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape.(*scrapeWork).scrapeInternal(0xc0004d4600, 0x1734359e903, 0x0, 0xf1e8c0)
	/tmp/VictoriaMetrics/lib/promscrape/scrapework.go:230 +0x3f0
github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape.(*scrapeWork).scrapeAndLogError(0xc0004d4600, 0x1734359e903)
	/tmp/VictoriaMetrics/lib/promscrape/scrapework.go:183 +0x4d
github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape.(*scrapeWork).run(0xc0004d4600, 0xc0002d0f60)
	/tmp/VictoriaMetrics/lib/promscrape/scrapework.go:157 +0x3b7
github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape.(*scraperGroup).update.func1(0xc000092140, 0xc0004d4600, 0xc0004b1988)
	/tmp/VictoriaMetrics/lib/promscrape/scraper.go:284 +0x69
created by github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape.(*scraperGroup).update
	/tmp/VictoriaMetrics/lib/promscrape/scraper.go:282 +0x3e2

vmagent-20200708-173353-tags-v1.38.0-0-gb66c7c13a has no errors with some configs.
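
The panic message says a value of type `[]prompbmarshal.TimeSeries` was stored in an `interface{}` and then type-asserted as `*[]prompbmarshal.TimeSeries`. The thread does not show the offending line, so the following is only a guess at the general shape of such a bug: one common way to hit it is a `sync.Pool` where `Put` receives the slice while `Get` asserts a pointer to the slice. The sketch below shows the consistent pointer-based pattern; `bufPool`, `collect`, and `asPointer` are hypothetical names, and `[]string` stands in for the real slice type.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool stores *[]string consistently: New, Get and Put all use the
// pointer type, so the type assertion in collect cannot fail.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]string, 0, 8)
		return &b
	},
}

// collect borrows a buffer from the pool, fills it and returns a copy.
func collect(items ...string) []string {
	bp := bufPool.Get().(*[]string) // panics if a plain []string was Put
	buf := append((*bp)[:0], items...)
	out := append([]string(nil), buf...)
	*bp = buf
	bufPool.Put(bp) // Put the pointer, not *bp
	return out
}

// asPointer reports whether v holds a *[]string. The comma-ok form
// turns a type mismatch into false instead of a panic.
func asPointer(v interface{}) bool {
	_, ok := v.(*[]string)
	return ok
}

func main() {
	fmt.Println(collect("a", "b")) // prints "[a b]"
	s := []string{"x"}
	fmt.Println(asPointer(s), asPointer(&s)) // prints "false true"
}
```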

valyala (Collaborator) commented Jul 14, 2020

@Allineer , thanks for testing! The panic should be fixed in the commit b442a42

Allineer (Author) commented

Already testing...

valyala (Collaborator) commented Jul 14, 2020

@Allineer , FYI, all the bugfix commits mentioned in this issue have been included in v1.38.1.

Allineer (Author) commented

It's working! Thanks!
