Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

avoid memcpy from mmapped plain table tables #8375

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bkgood
Copy link

@bkgood bkgood commented Jun 8, 2021

Intended to fix #8374.

Things might be a bit rough now (in particularly I've left various comments I left along the way that I don't intend to keep, the commit message is garbage, ...), but I wanted to get feedback on the implementation before I clean it all up.

plain_table_db_test passes as patched (it no longer copies when files are mapped in and is updated as such). I've seen stress testing tools that I intend to run; any suggestions and guidance is very much welcome.

@bkgood bkgood force-pushed the avoid-memcpy-on-mmapped-plain-tables branch from 4fd0214 to ac1012e Compare June 8, 2021 19:37
@bkgood
Copy link
Author

bkgood commented Jun 8, 2021

It appears I've tickled something in MultiGet. working on sorting that out.

the refcount of Version is not atomically updated so I'm guessing I need to hold on to something wider to keep it alive.

@bkgood bkgood marked this pull request as draft June 8, 2021 20:41
// guessing these files can be used by multiple versions, since they are
// immutable). We can ensure this lives by holding a ref to this
// version.
get_context.set_table_pinner(&table_pinner);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking on the problem. I think you are making a very interesting optimization: prevent memcopy for mmapped, uncompressed data (potentially for both of plain table and block based table), so I appreciate your effort. I'm just curious, why do we need another pinner for this, rather than directly registering cleaner to GetContext.pinnable_val_?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind. Ignore my previous comment. I misunderstood the code and figured out why it is done this way. Let me think about it a little bit more whether there is a potentially better way.

@ajkr
Copy link
Contributor

ajkr commented Jun 18, 2021

Agreed, thanks for working on this. I had forgotten we are currently unable to do zero copy point lookups with mmap. Does this limitation apply to read-only mode too?

I made public another known issue (#8422) with mmap efficiency that has been sitting a while in case you find it interesting. Perhaps such a MappedRead() could be used on pinned files.

@siying
Copy link
Contributor

siying commented Sep 8, 2021

@ajkr I thought about it and feel this approach is reasonable. Thought?

@bkgood
Copy link
Author

bkgood commented Sep 12, 2021

Agreed, thanks for working on this. I had forgotten we are currently unable to do zero copy point lookups with mmap. Does this limitation apply to read-only mode too?

I don't believe so -- there are branches for 'immortal' tables that are taken in read-only mode that provide direct pointers into the mapped region without any associated cleaner required (this is my recollection from some months ago).

I made public another known issue (#8422) with mmap efficiency that has been sitting a while in case you find it interesting. Perhaps such a MappedRead() could be used on pinned files.

My understanding is that such a fix for immortal tables won't aid reads from tables in normally opened DBs (i.e. not opened as secondary or RO) because such tables can't be immortal as they may be eventually discarded as compactions cause them to be superseded by new tables. But this is again a months-old recollection, so I may be off?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Get on mmapped PlainTables should be pinnable/zero-copy
4 participants