Test output:
=== CONT TestRemoteWrite/telegraf/Invalid
up.go:39:
Error Trace: up.go:39
main_test.go:94
main_test.go:62
Error: Should be true
Test: TestRemoteWrite/telegraf/Invalid
Messages: found zero samples for up{job="test"}
=== CONT TestRemoteWrite/telegraf/Staleness
staleness.go:43:
Error Trace: staleness.go:43
main_test.go:94
main_test.go:62
Error: Should be true
Test: TestRemoteWrite/telegraf/Staleness
Messages: found no staleness markers for stale{job="test"}
=== CONT TestRemoteWrite/telegraf/Up
up.go:23:
Error Trace: up.go:23
main_test.go:94
main_test.go:62
Error: Should be true
Test: TestRemoteWrite/telegraf/Up
Messages: found zero samples for up{job="test"}
So I don't understand how Telegraf could ever reasonably generate Staleness markers. We could do this on a time-delay basis, but this doesn't provide any actual information about whether the metric has gone away or is just missing from the reported data. Telegraf is merely relaying data from elsewhere and knows nothing about the underlying data in a metric.
A Telegraf Prometheus output would not know when a scrape fails, nor would it know about the collection interval, so it has no reasonable way to interleave staleness markers. It's also not reasonable for a Prometheus input to generate the staleness markers, as these would end up going to all outputs, even non-Prometheus ones, where they have dubious value and meaning.
Closing, as we have fixed the tests that are in line with how Telegraf works. If there is more work to be done on this, we can reopen this issue or open a new, separate one.
Prometheus remote write compliance test cases:
https://github.com/tomwilkie/remote-write-compliance
Run the Telegraf tests with:
go test -run "TestRemoteWrite/telegraf/.+" -v ./
Initial results were 6/13 passing
Removed up as a dependency on NameLabel & RepeatedLabels: "Don't require the up metric for non-up-metric tests." (prometheus/compliance#16)
Configuration option for instance label (could add to our Prometheus README): "Setup telegraf to set the instance tag correctly." (prometheus/compliance#17)
Fixed repeated labels: "Integration tests for Prometheus Remote Write" (#9166)
Fixed empty labels: "Integration tests for Prometheus Remote Write" (#9166)
PASS: TestRemoteWrite/telegraf/Counter (10.03s)
PASS: TestRemoteWrite/telegraf/RepeatedLabels (10.03s)
PASS: TestRemoteWrite/telegraf/SortedLabels (10.03s)
PASS: TestRemoteWrite/telegraf/EmptyLabels (10.03s)
PASS: TestRemoteWrite/telegraf/Timestamp (10.03s)
FAIL: TestRemoteWrite/telegraf/Staleness (10.03s)
PASS: TestRemoteWrite/telegraf/Histogram (10.03s)
PASS: TestRemoteWrite/telegraf/NameLabel (10.03s)
PASS: TestRemoteWrite/telegraf/JobLabel (10.03s)
PASS: TestRemoteWrite/telegraf/InstanceLabel (10.03s)
FAIL: TestRemoteWrite/telegraf/Up (10.03s)
Telegraf does not create up metrics (this is unlikely to change)
PASS: TestRemoteWrite/telegraf/Gauge (10.03s)
FAIL: TestRemoteWrite/telegraf/Invalid (10.03s)
We drop the metric when it is invalid but do not create an up metric, which causes this test to fail.