Duplicate sample for timestamp issue when remote writing to Prometheus server #134

Open
immahi79 opened this issue May 22, 2022 · 1 comment

Comments

@immahi79

Hi

Thank you for the Prometheus Flask exporter and the Grafana dashboard example, which were very helpful for ingesting metrics from microservices running in an ECS cluster. I followed the article https://aws.amazon.com/blogs/opensource/metrics-collection-from-amazon-ecs-using-amazon-managed-service-for-prometheus/ to deploy and scrape the metrics from each Flask target and push them into the Amazon Prometheus workspace.

I got the error below when pushing the data collected from the Prometheus Flask exporter.

ts=2022-05-18T19:54:33.370Z caller=dedupe.go:112 component=remote level=error remote_name=3f2135 url=http://localhost:8080/workspaces/xxxxx/api/v1/remote_write msg="non-recoverable error" count=500 err="server returned HTTP status 400 Bad Request: user=xxxxx: err: duplicate sample for timestamp. timestamp=2022-05-18T19:54:33.251Z, series={name="flask_http_request_duration_seconds_bucket", cluster="test", instance="10.0.x.xxx", job="ecs_services", le="0.01", method="GET", path="/api/test", service="test-api", status="201", taskid="b71155a6004445f2900fb294ca382eec"}"

  • I tried various approaches to change the scrape interval but that did not help.

  • I observed that the average response duration is also shown incorrectly. The same sample appears again at later timestamps even though that API was not called. I expected rate to reset when the endpoint is not called, but /metrics appears to return the same sample for the series flask_http_request_duration_seconds_sum and flask_http_request_duration_seconds_count every time it is scraped. As a result I get the duplicate sample for timestamp error, and the average response duration is incorrect.

Can you please help me understand the timestamp issue with the Prometheus Flask exporter? Do I need to reset the counter explicitly after every /metrics call?

Thanks & regards
Mahesh

@rycus86
Owner

rycus86 commented May 22, 2022

Hi Mahesh,

I'm not familiar with the AWS Prometheus workspace, so I'm not sure whether it has some configuration that needs tweaking. The duplicate sample error is a good lead if you can figure out what key it is comparing things with: is it the metric name and all tags?
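
One way to check that idea offline is to treat the metric name plus the full label set as the series key and look for two samples that share both the key and the timestamp. The sketch below is only an illustration: `series_key` and `find_duplicates` are hypothetical helpers, the timestamp and values are made up, and whether the AWS workspace compares exactly this key is an assumption.

```python
def series_key(labels):
    """Identity of a series: every label, including the metric name.

    Prometheus stores the metric name under the reserved label "__name__".
    """
    return tuple(sorted(labels.items()))


def find_duplicates(samples):
    """Return (series_key, timestamp) pairs that occur more than once."""
    seen = set()
    duplicates = []
    for labels, timestamp, _value in samples:
        key = (series_key(labels), timestamp)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Hypothetical samples, loosely modelled on the labels in the error message.
base = {
    "__name__": "flask_http_request_duration_seconds_bucket",
    "job": "ecs_services",
    "le": "0.01",
    "path": "/api/test",
}
samples = [
    (base, 1000, 3.0),
    (dict(base), 1000, 4.0),           # same labels and timestamp: a duplicate
    ({**base, "le": "0.1"}, 1000, 5.0),  # different "le" label: a different series
]
dupes = find_duplicates(samples)
```

If two scrape targets end up with identical label sets after relabeling, a check like this would flag them, which might be worth ruling out in the scrape configuration.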

You shouldn't need to reset counters. In Prometheus they keep their values regardless of scraping, so it makes sense that they return those same values even without additional requests to the endpoints.
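
A toy model of that behaviour, not the actual `prometheus_client` implementation: `ToyCounter` and `increase` are hypothetical names, and the `increase` helper ignores counter resets, which real Prometheus handles. It shows that scraping does not change the counter and that the growth between two idle scrapes is zero.

```python
class ToyCounter:
    """Hypothetical stand-in for a Prometheus counter: it only ever goes up."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        self.value += amount

    def scrape(self):
        # Reading the counter does not reset it.
        return self.value


def increase(previous, current):
    """Growth between two scrapes (simplified: no counter-reset handling)."""
    return current - previous


requests_count = ToyCounter()
requests_count.inc()
requests_count.inc()

first = requests_count.scrape()
second = requests_count.scrape()         # no requests in between
idle_increase = increase(first, second)  # nothing happened, so no growth

requests_count.inc()
third = requests_count.scrape()
busy_increase = increase(second, third)  # one request since the last scrape
```

The repeated value is expected; it is the difference between scrapes, not the raw counter, that a rate-style query works from.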
