Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vinyl: implement page cache #7602

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

quaiion
Copy link

@quaiion quaiion commented Aug 24, 2022

No description provided.

@quaiion quaiion changed the title Pull request dedicated to implement vinyl page cache Implement page cache for Vinyl Aug 24, 2022
@quaiion quaiion force-pushed the qn/gh-3202-page-cache-implementation branch from 1718afe to b483d7b Compare September 27, 2022 16:25
@CuriousGeorgiy CuriousGeorgiy marked this pull request as ready for review September 29, 2022 12:11
src/box/vy_run.c Outdated
@@ -250,7 +250,7 @@ vy_page_info_create(struct vy_page_info *page_info, uint64_t offset,
{
memset(page_info, 0, sizeof(*page_info));
page_info->offset = offset;
page_info->unpacked_size = 0;
page_info->unpacked_size = 0; //temp/ why?..
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving this line untouched

src/box/vy_run.c Outdated
@@ -710,6 +715,8 @@ vy_page_new(const struct vy_page_info *page_info)
"load_page", "page cache");
return NULL;
}
page->ri_refs = 0; //temp/ maybe should already set to 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently removing any refs-related code so this won't be presented in the final version anyway

src/box/vy_run.c Outdated
vy_page_ref(struct vy_page *page)
{
assert(page != NULL);
assert(page->ri_refs != 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at our other reference counted objects, this looks strange. You should rather call it after vy_page_new.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently removing any refs-related code so this function won't be presented in the final version anyway

src/box/vy_run.c Outdated
* Decrease memory-stored page's reference counter.
*/
static inline void
vy_page_unref(struct vy_page_info *page_info)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API asymmetry looks weird. Please make it symmetric and probably rename it to vy_page_info_page_*ref.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently removing any refs-related code so this function won't be presented in the final version anyway

src/box/vy_run.c Outdated
* End iteration and unref cached pages.
*/
static void
vy_run_iterator_stop(struct vy_run_iterator *itr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you move it here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was moved here because it needed the vy_page_unref() function to be defined. Now we abandon using refs, so it will be moved back.

src/box/vy_run.c Outdated
* percentage of pages in current page cache is deleted.
*/
static void //temp/ maybe should add error codes (what errors?..)
vy_page_cache_mem_cutback(struct vy_run_env *run_env, uint8_t del_percent)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use percentage instead of ratio.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess you meant "ratio instead of percentage". Anyway, we already abandoned the percentage-based cutbacks and switched to a classic "rotating LRU list" scheme

src/box/vy_run.c Outdated
*/
static inline void
vy_page_ref(struct vy_page *page)
vy_page_ref(struct vy_run *run, struct vy_page_info *page_info)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess you should be passing just the page number here.

Copy link
Author

@quaiion quaiion Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked simplier to me as we often already have page_info in use and there is no need to re-access it via number + run. Anyways, already deleted all the ref-related functions

src/box/vy_run.c Outdated
*/
static inline void
vy_page_unref(struct vy_page_info *page_info)
vy_page_unref(struct vy_run *run, struct vy_page_info *page_info)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess you should be passing just the page number here.

Copy link
Author

@quaiion quaiion Oct 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked simplier to me as we often already have page_info in use and there is no need to re-access it via number + run. Anyways, already deleted all the ref-related functions

src/box/vy_run.c Outdated
assert(tail_page->ri_refs == 0);
run->run_env->pc_env.mem_used -= (tail_item->unpacked_size +
sizeof(struct vy_page) + tail_item->row_count *
sizeof(*tail_page->row_index));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you use this in two places, please refactor it out into a separate function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, see the vy_page_cache_account_page_mem() and vy_page_cache_unaccount_page_mem() functions in last commits

src/box/vy_run.c Outdated
{
env->mem_quota = 0; //temp/ looks tricky, but done just as for tuple cache
env->mem_used = 0;
env->del_percent = 20; //temp/ literally should be changed at least to a constant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use ratio instead of percentage.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already abandoned the percentage-based cutbacks and switched to a classic "rotating LRU list" scheme

@quaiion quaiion force-pushed the qn/gh-3202-page-cache-implementation branch 6 times, most recently from 5b4c590 to b081ba6 Compare October 6, 2022 13:30
@quaiion quaiion changed the title Implement page cache for Vinyl vinyl: implement page cache Oct 8, 2022
@quaiion quaiion force-pushed the qn/gh-3202-page-cache-implementation branch 5 times, most recently from c5d33c6 to 019bcd4 Compare February 23, 2023 21:10
Every vinyl's run iterator independently keeps 2 uncompressed pages
stored in memory - the current one and the previous one used by this
exact iterator. 2 most recently requested pages stored for each run
iterator do not provide much of caching effect and are used mainly to
boost working with multi-page data chunks. So in practice if a tuple is
not found in tuple cache, the engine almost always has to read run files.
The latter incurs severe overhead even if the pages being read happen to
be found in system page cache, because run files are compressed and we
have to decompress them on each read. Besides, relying on system page
cache eviction algorithm may result in sub-optimal performance.

To address this problem, we make vinyl store some amount of unused pages
in memory (up to page cache memory quota), giving preference to most
recently requested (among all runs) ones. These pages are stored in
uncompressed state, so in case of page cache hit no reading from the disk
or decompression is required.

As an additional effect, by making the way of storing pages in memory
suitable for page cache usage, we prevent duplicating pages in memory.
Currently, if an iterator receives a request on a page that is already
stored in memory but attached to another iterator, it will re-read this
page from the disk and decompress it, creating a page's copy. We make run
iterators access pages via run-wise centralized metadata structures, so
that no more duplicates of existing (in memory) pages are created.

Closes tarantool#3202

@TarantoolBot document
Title: new box.cfg parameter: 'vinyl_page_cache'

`vinyl_page_cache` puts a limit on the amount of RAM that can be occupied
by currently unused runs' pages, stored in Vinyl's page cache.

The `vinyl_page_cache` limit can be periodically exceeded by up to the
approximate size of one page + it's metadata, so that pages are not
deleted from the cache after being loaded from the disk but before being
processed by the engine and pages are not deleted right before being
loaded again upon request.

`vinyl_page_cache` parameter can be changed dynamically. If the amount of
RAM already being used by the page cache is greater than the new quota,
Vinyl will immediately free enough cache's memory to fit it in the new
limit.

If `vinyl_page_cache` is set to 0, page cache keeps only 1 page in memory
(for the entire engine!), so its performance (and the engine's
performance as well) drops significantly. Therefore it is strongly
recommended not to set `vinyl_page_cache` to 0 if RAM resources are not
exhausted.
@quaiion quaiion force-pushed the qn/gh-3202-page-cache-implementation branch from 019bcd4 to 25c3055 Compare February 23, 2023 23:48
@kyukhin kyukhin marked this pull request as draft August 25, 2023 07:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants