tsdb: Checkpoint closes mmaped chunk file despite open ChunkQuerier query; causing SIGSEGV #8217
Comments
Inception
Sorry, it was late. Edited (: -> thanos-io/thanos#3497
I'm not sure how checkpointing would close an m-mapped chunk file; we would have hit this panic ourselves if that were the case. I also don't see the panic pointing into the TSDB codebase (was the trace truncated?).
Are you by any chance running the checkpointing in parallel?
TODO: Double-check whether the simple iterator is affected by this truncation & chunkDiskMapper bug.
To potentially add: pending reader tracking, as we have for blocks.
Covered by #5877, I think.
To add more info: once we close the m-mapped file, the byte slice that was m-mapped is no longer valid. Hence the panic: the query has already obtained the chunk, and while it is still reading, the m-mapped file is closed and truncated.
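As an illustration of the "pending reader tracking" idea mentioned above, here is a minimal sketch in Go. It is not the Prometheus implementation: `mappedFile`, `Acquire`, `Close`, and `ErrClosed` are hypothetical names, and a plain byte slice stands in for the m-mapped region. The point it tries to show is only that closing should wait until every reader that already holds the bytes has released them.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrClosed is a hypothetical sentinel returned once the file is closing.
var ErrClosed = errors.New("mapped file is closed")

// mappedFile is a hypothetical stand-in for an m-mapped chunk file.
// data plays the role of the m-mapped byte slice.
type mappedFile struct {
	mtx     sync.Mutex
	closing bool
	pending sync.WaitGroup // readers that still hold data
	data    []byte
}

// Acquire registers a pending reader and returns the bytes together with
// a release function that the reader must call when it is done.
func (f *mappedFile) Acquire() ([]byte, func(), error) {
	f.mtx.Lock()
	defer f.mtx.Unlock()
	if f.closing {
		return nil, nil, ErrClosed
	}
	f.pending.Add(1)
	var once sync.Once // makes release safe to call more than once
	return f.data, func() { once.Do(f.pending.Done) }, nil
}

// Close rejects new readers, waits for pending ones to release, and only
// then invalidates the bytes (where a real implementation would unmap).
func (f *mappedFile) Close() {
	f.mtx.Lock()
	f.closing = true
	f.mtx.Unlock()

	f.pending.Wait()

	f.mtx.Lock()
	f.data = nil
	f.mtx.Unlock()
}

func main() {
	f := &mappedFile{data: []byte("chunk")}

	b, release, err := f.Acquire()
	if err != nil {
		panic(err)
	}

	done := make(chan struct{})
	go func() {
		f.Close() // blocks until release is called
		close(done)
	}()

	fmt.Println(string(b)) // the reader's bytes are still valid here
	release()
	<-done

	_, _, err = f.Acquire()
	fmt.Println(err) // new readers are rejected after Close
}
```

The trade-off in this sketch is that a slow query delays truncation instead of crashing; whether that is acceptable for the head chunk mapper is a design question the sketch does not answer.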
We are still experiencing this issue in Thanos, exactly as reported here: thanos-io/thanos#3497. The Prometheus version is 2.40. I see that #8723 is merged, but either it does not fix the root cause, or there is another place where already-released memory is accessed.
Prometheus version used:
v1.8.2-0.20201029103703-63be30dceed9
Details: thanos-io/thanos#3497
Funnily enough, we hit this issue on ALL Thanos receivers every 16h ;p Exactly every 16h.