sinks::prometheus::remote_write::integration_tests::insert_metrics_over_http CI failure #5612
Comments
This looks like a race condition between two tests: both create the same InfluxDB database, and then one deletes it before the other has finished using it. However, each test is supposed to create its own database, serialized by an atomic, so this should be impossible.
It is a race condition, though not between multiple Vector threads as I first thought. Rather, it is between creating the test database, populating it, and querying the results. I have reproduced it a few times on my system, but only by running the test repeatedly in a loop. I'm still trying to figure out exactly what the ordering problem is.
After running these tests for a long time again, this time using unique database names instead of fixed ones, I have hit a failure. It appears that the query to create the database returns before the database actually exists, so the first (and only) insert of metrics into the database fails. Rust debug logs (a couple of newlines added for readability):
InfluxDB logs:
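For illustration, unique per-test database names like the ones mentioned above could be generated from a process-wide atomic counter. This is only a sketch: `NEXT_DB_ID` and `unique_database_name` are hypothetical names, not helpers taken from Vector's test suite.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical process-wide counter; every call hands out a fresh id.
static NEXT_DB_ID: AtomicUsize = AtomicUsize::new(0);

/// Returns a database name that no other caller in this process receives.
fn unique_database_name(prefix: &str) -> String {
    // fetch_add is atomic, so concurrent tests never observe the same id.
    let id = NEXT_DB_ID.fetch_add(1, Ordering::Relaxed);
    format!("{}_{}", prefix, id)
}
```

Unique names rule out two tests sharing a database, but they would not help if the create-database request returns before the database exists.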
Interesting, that is very odd and might warrant an upstream issue on InfluxDB to see whether that is expected behavior.
I don't know if we already have a pattern for this in the test suite, but one way to deal with these eventual-consistency issues in integration tests is to retry the requests with a back-off and a cap, rather than using a fixed delay.
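A minimal sketch of that retry idea, using only the standard library. The helper name `retry_with_backoff` and its parameters are assumptions made for illustration, not an existing Vector test utility; the closure stands in for whatever request needs to be retried (for example, the first insert after creating the database).

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` until it succeeds or `max_attempts` calls have been made.
/// The delay doubles after each failure and never exceeds `max_delay`.
fn retry_with_backoff<T, E>(
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0, "max_attempts must be at least 1");
    let mut delay = initial_delay.min(max_delay);
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                sleep(delay);
                // Exponential back-off, capped at `max_delay`.
                delay = delay.saturating_mul(2).min(max_delay);
                attempt += 1;
            }
        }
    }
}
```

Compared with a fixed delay, this keeps the common case fast (the first attempt usually succeeds) while bounding the total wait when the database takes longer to appear.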
This is the first time I've seen this failure, but putting it here to track. It passes locally for me.
https://github.com/timberio/vector/runs/1578308476?check_suite_focus=true#step:9:566