Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tsdb: Block Head GC till pending readers are done reading #9081

Merged
merged 4 commits into from
Jul 20, 2021

Conversation

codesome
Copy link
Member

@codesome codesome commented Jul 13, 2021

Closes #5877 #8221 #9079

This PR tracks the in-flight readers on Head block and block Head truncation till those readers are done reading. And while the in-memory truncation is happening (and not the m-mapped stuff), getting querier on Head block is blocked.

I have copied the tests from #9077 and #9080 into this while modifying slightly to take care of blocking Compact()/Querier() calls.

I have opened this as draft since the current approach is very naive without considering the time ranges of the queriers. I will be looking at improving it further to only block (and get blocked on) the queriers that would be affected with the truncation.

Only blocks memory truncation on overlapping queries and if truncation is going on or has started, only block queries that are overlapping with the truncation.

The querier adjusts the time range based on the on-going truncation without taking a lock. The memory truncation waits for in-flight queries that overlap with the truncation to finish.

@roidelapluie
Copy link
Member

Couldn't we reuse the isolation tracking here?

@codesome
Copy link
Member Author

Couldn't we reuse the isolation tracking here?

I think we can, we will need to additionally store and mint and maxt of the reads. Let me try that.

@codesome
Copy link
Member Author

codesome commented Jul 14, 2021

I am now using the existing isolation tracking with mint and maxt in the isolation states. I have also added enhancements to only block memory truncation on overlapping queries and if truncation is going on or started, only block queries that are overlapping with the truncation.

The PR is ready for review now. I will try adding more tests in case any part of it is untested yet tests added.

@codesome codesome marked this pull request as ready for review July 14, 2021 06:30
@codesome codesome requested a review from bwplotka July 14, 2021 06:30
@codesome codesome marked this pull request as draft July 14, 2021 09:05
@codesome codesome marked this pull request as ready for review July 14, 2021 10:08
@codesome
Copy link
Member Author

I have added more tests cases now with some small improvements and fixes. Now it is finally ready for review.

tsdb/head.go Outdated Show resolved Hide resolved
@codesome codesome marked this pull request as draft July 14, 2021 12:02
@codesome codesome marked this pull request as ready for review July 15, 2021 08:16
@codesome codesome force-pushed the head-gc-fix branch 3 times, most recently from 574906b to 8ae3f55 Compare July 15, 2021 08:40
@codesome
Copy link
Member Author

I have updated the PR with a better lock-free approach for this, and much simpler to understand. I have added detailed comments in the code explaining what is it trying to do. PR is ready for review again.

Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @codesome , thanks for doing this. I have tried to fully understand this, but so far I have only gained a partial understanding. I have written down a few concerns and questions. Perhaps they are dumb, coming from my merely partial or even wrong understanding, but at least then I have the opportunity to improve my understanding.

tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
tsdb/isolation.go Outdated Show resolved Hide resolved
tsdb/db.go Outdated Show resolved Hide resolved
@codesome
Copy link
Member Author

@beorn7 those are all valid questions and I agree few things can get confusing. I have answered them in the comments.

Copy link
Member

@beorn7 beorn7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot say I'm 100% confident I have understood everything, but I guess it's good enough to give approval from my side. Just a few nits about doc comments.

tsdb/db.go Outdated Show resolved Hide resolved
tsdb/db.go Outdated Show resolved Hide resolved
tsdb/head.go Outdated Show resolved Hide resolved
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
@codesome
Copy link
Member Author

Thanks for the reviews! I made 1 additional fix to take care of exclusiveness of the truncation time in WaitForPendingReadersInTimeRange and added a test case for it. I will merge this on green.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tsdb: GC does not respect pending readers
3 participants