Tempo leaves behind data files for blocks with no metadata #2754
Comments
Adding that this situation will cause repeated errors until all files are cleaned up. In this situation it should not be considered a failure and propagated up; it is falsely triggering the TempoTenantIndexFailure. I think this should be an easy fix by ignoring …
This issue has been automatically marked as stale because it has not had any activity in the past 60 days.
The tenant index deletion was originally put in as a TCO win, but it did not have the desired effect and surfaced other issues in the system. Related: grafana#2678, grafana#2754, grafana#2781, grafana#2878, grafana#3115, grafana#3223. Given the number of issues here and the considerable noise they cause on the pager, perhaps the right thing to do is back out the tenant index deletion. Raising here for discussion.
Describe the bug
In some circumstances, a `data.parquet` file is the only object in a block path, which means the block shows up in the list but its metadata is not available. As of #2678, Tempo deletes the tenant index when the tenant is found to have zero blocks. Because these paths still show up, the tenant is not completely deleted, which causes index failures to occur repeatedly for these tenants and results in additional calls to the backend that aren't helpful.
I believe the index deletion only uncovered this issue, which was dormant because the index was left in place prior to #2678.
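To make the failure condition concrete, here is a minimal sketch of how such orphaned block paths could be detected from a flat object listing. It is not Tempo's code. The key layout `<tenant>/<block-id>/<object>`, the metadata names `meta.json` and `meta.compacted.json`, and the function name `find_orphaned_blocks` are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed metadata object names; adjust to match the actual backend layout.
META_NAMES = {"meta.json", "meta.compacted.json"}


def find_orphaned_blocks(object_keys):
    """Return sorted (tenant, block_id) pairs that have objects but no metadata.

    Assumes keys look like "<tenant>/<block-id>/<object>". Keys with any other
    depth (for example a tenant-level index object) are ignored.
    """
    objects_by_block = defaultdict(set)
    for key in object_keys:
        parts = key.strip("/").split("/")
        if len(parts) != 3:
            continue  # not a block-level object under the assumed layout
        tenant, block_id, name = parts
        objects_by_block[(tenant, block_id)].add(name)

    # A block is orphaned when none of its objects is a metadata file.
    return sorted(
        block
        for block, names in objects_by_block.items()
        if not names & META_NAMES
    )


if __name__ == "__main__":
    listing = [
        "tenant-a/block-1/data.parquet",  # data only: orphaned
        "tenant-a/block-2/data.parquet",
        "tenant-a/block-2/meta.json",     # has metadata: healthy
    ]
    print(find_orphaned_blocks(listing))
```

The resulting list could feed either of the outcomes proposed under Expected behavior: rebuilding the block meta from the data, or removing the leftover data.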
To Reproduce
Running `r106`, we see this in environments where a tenant index was deleted due to no blocks being found. My hunch is that this has something to do with unclean shutdown, but I have no data to back this up.
Expected behavior
Block meta is either reconstructed from the data, or the data is removed.
Environment: