Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable purging of BufferPool pages based on time-since-last-unpinned #11441

Merged
merged 1 commit into from May 4, 2024

Conversation

jkub
Copy link
Contributor

@jkub jkub commented Mar 30, 2024

The bufferpool currently uses a fixed-size LRU, but will not evict pages unless it is at the size limit. Stale pages can accumulate indefinitely despite not having been recently referenced. Because these pages are 'anonymous' the OS can only reclaim memory by swapping them, which is often counterproductive, compared to simply evicting them. This PR enables evicting pages that have not been referenced in some recency window, defined by the number of seconds since they were last unpinned.

Overhead

The overhead involves 8 bytes added to the BlockHandle (which is insignificant on this already heavy data structure) and a call to steady_clock::now (which should have a latency measured in ~10 nanoseconds on most platforms, and again be dwarfed by the other work happening when we queue a page for eviction).

Status

I haven't included tests, in part because I'm not sure the best way to expose this feature. As a DUCKDB_API on the database object? As a table function, similar to checkpoint()? I'd appreciate suggestions. Thanks.

@Tishj
Copy link
Contributor

Tishj commented Mar 30, 2024

This PR added the sleep command to the sqllogic tester
#10426

Perhaps we can make a table function that gives us insight into the active/freed buffers and using sleep we can ensure that this feature works as intended?

But that will probably require a statement to force trigger this purging

@github-actions github-actions bot marked this pull request as draft March 30, 2024 23:15
@jkub
Copy link
Contributor Author

jkub commented Mar 30, 2024

I might have misjudged the overhead of steady_clock. I'll verify before re-publishing.

@carlopi
Copy link
Contributor

carlopi commented Mar 31, 2024

Question: is this ever relevant mid-query or only at query boundaries?

If the second, maybe this could be exposed to SQL via a SET max_age_seconds = X + calling the clean up after each query is completed (or just before it's executed)?
Then SET max_age_seconds = -1 (the default) would disable it, 0 trigger the full cleanup, N would clean after N seconds.

@jkub jkub force-pushed the age_based_eviction branch 4 times, most recently from 3946ed7 to 7f4680d Compare April 6, 2024 04:14
@jkub
Copy link
Contributor Author

jkub commented Apr 6, 2024

@carlopi: Queries can run quite a while. I think it should be possible for a C++ client to invoke cleanup independently of query execution, while queries are running, either on a background thread or something else. For the purposes of testing though, I think a more constrained path for exercising the code would probably be adequate.

@jkub jkub marked this pull request as ready for review April 6, 2024 04:17
@github-actions github-actions bot marked this pull request as draft April 6, 2024 16:29
@jkub jkub marked this pull request as ready for review April 8, 2024 22:34
Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks good - comments below:

@@ -17,4 +17,9 @@ using std::chrono::high_resolution_clock;
using std::chrono::milliseconds;
using std::chrono::system_clock;
using std::chrono::time_point;

// The portable way to do this (steady_clock) is not performant on all platforms.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expand on what platforms this isn't performant, and why it isn't performant? Could we use steady_clock::now() on platforms where it is performant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial implementation used steady_clock because initial research suggested that it should be broadly performant. However, when I submitted the PR, there was an unmistakable perf regression reported by CI. I switched to this implementation and the regression went away.

Unfortunately, since steady_clock makes no actual guarantees about performance, it seemed safer to provide an implementation that uses the most performant method on each platform.

Copy link
Collaborator

@Mytherin Mytherin May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you point out to where the performance regression is in the CI? The regression tests sometimes have false positives. Unless the perf regression is consistent/severe I would prefer to stick to the portable implementation.

Copy link
Contributor Author

@jkub jkub May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like github may have purged the runs from the original commit on this PR, which used steady_clock.

I created a new PR that uses steady clock, for perf comparison. #11904 If that one isn't worse, I will clean up this PR to use steady_clock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That other PR didn't complain about perf. Should I switch this back to steady_clock?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's switch it to steady_clock, we can always switch back

@@ -50,7 +56,8 @@ bool BufferPool::AddToEvictionQueue(shared_ptr<BlockHandle> &handle) {
// or the block handle is still a local variable (ConvertToPersistent)

D_ASSERT(handle->readers == 0);
auto ts = ++handle->eviction_timestamp;
auto ts = ++handle->eviction_seq_num;
handle->lru_timestamp_msec = GetCoarseMonotonicMillisecondTimestamp();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the eviction isn't enabled by default and PurgeAgedBlocks is never called in the common case - can we avoid making this call in all cases?

For example can we have a setting (e.g. buffer_manager_eviction_timestamps) - that defaults to false - and when turned off we will avoid making this call here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can make this change.

@jkub jkub marked this pull request as ready for review May 3, 2024 20:12
@jkub
Copy link
Contributor Author

jkub commented May 3, 2024

The test failure is almost certainly unrelated to this PR. It's a concurrency test, so it's possibly a real issue in trunk that repros only occasionally. Rerun might pass, but I don't want to do anything that would lose state, if it's useful for debugging.

@Mytherin
Copy link
Collaborator

Mytherin commented May 4, 2024

Indeed it's most likely related to my recent checkpoint rework - I'll have a look.

@Mytherin Mytherin merged commit 540b194 into duckdb:main May 4, 2024
41 checks passed
@Mytherin
Copy link
Collaborator

Mytherin commented May 4, 2024

Thanks!

github-actions bot pushed a commit to duckdb/duckdb-r that referenced this pull request May 6, 2024
Merge pull request duckdb/duckdb#11441 from jkub/age_based_eviction
Mytherin added a commit that referenced this pull request May 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants