Improve unordered table scans on rotational drives #1552

Open
coffeemug opened this issue Oct 17, 2013 · 3 comments

Comments

@coffeemug
Contributor

An unordered table scan in Rethink is very slow because we do a b-tree traversal without any awareness of disk layout. This makes database dumps extremely slow on rotational storage because of seeks.

We can improve this by making unordered scans disk-layout aware (this is obviously hard), or by somehow signaling the serializer to allow returning some blocks in a different order.
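To make the seek cost concrete, here is a toy model rather than anything taken from the Rethink codebase. It assumes each leaf block sits at a random disk offset (a hypothetical layout) and compares total head travel when blocks are visited in key order versus offset order. The names `head_travel` and `simulate` are made up for this illustration.

```python
import random


def head_travel(offsets):
    """Total distance the head moves when visiting offsets in the given order."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


def simulate(num_blocks=1000, seed=0):
    """Return (key_order_travel, disk_order_travel) for a shuffled layout."""
    rng = random.Random(seed)
    # Assumption: the i-th leaf in key order lives at a random offset on disk.
    offsets = list(range(num_blocks))
    rng.shuffle(offsets)
    key_order = head_travel(offsets)            # layout-unaware b-tree traversal
    disk_order = head_travel(sorted(offsets))   # layout-aware traversal
    return key_order, disk_order


if __name__ == "__main__":
    key_order, disk_order = simulate()
    print("key order travel: ", key_order)
    print("disk order travel:", disk_order)
```

In this model the layout-aware pass moves the head exactly `num_blocks - 1` units, while the key-order pass jumps back and forth across the whole range. Real drives and real layouts are messier, so treat the numbers as directional only.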

Leaving this in backlog.

@jdoliner
Contributor

This isn't actually tough from a traversal perspective. This is basically already what backfilling does. We just need to make a way to hook that up to a ReQL API. So it's still kind of tough, but not as tough as OP makes it out to be.

@danielmewes
Member

@jdoliner Can you explain a bit more how backfilling does that? Does it bypass the serializer interface or is there another trick which makes disk access more sequential (on average)?

I imagine that the solution to this would essentially be some kind of smart rethinkdb extract, probably on a per-extent level.

@jdoliner
Contributor

Well, actually I'm overselling this quite a bit. It's just that backfilling works correctly with blocks returned in whatever order the serializer finds convenient. So an unordered scan built the same way would probably already be a bit faster, and with some more optimization it could request big batches of blocks and try to read them sequentially. It seems like we can only do so much here without changing the layout on disk, though.
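A rough sketch of the batching idea, under stated assumptions: the caller already knows each block's disk offset and does not care about result order. `unordered_scan`, `block_offsets`, and `read_block` are hypothetical names for illustration, not the serializer's actual interface.

```python
from typing import Callable, Dict, Iterator, Tuple


def unordered_scan(
    block_offsets: Dict[int, int],
    read_block: Callable[[int], bytes],
    batch_size: int = 64,
) -> Iterator[Tuple[int, bytes]]:
    """Yield (block_id, data) pairs, reading each batch in disk-offset order.

    block_offsets maps a block id to its (assumed known) offset on disk.
    The consumer must tolerate blocks arriving in any order.
    """
    block_ids = list(block_offsets)
    for start in range(0, len(block_ids), batch_size):
        batch = block_ids[start:start + batch_size]
        # Sort within the batch so reads sweep the disk in one direction.
        batch.sort(key=block_offsets.__getitem__)
        for block_id in batch:
            yield block_id, read_block(block_id)
```

A larger `batch_size` gives the sort more blocks to order at the price of holding more pending requests. As noted above, this only helps so far if the blocks within a batch are scattered across the disk.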


3 participants