
Persistency (at least for fast recovery - maybe some log of requests?) #5

Open
dumblob opened this issue Jun 21, 2021 · 3 comments

dumblob commented Jun 21, 2021

This question sounds rather dumb in the context of an in-memory DB, but I'm asking it anyway 😉.

Do you plan to support some kind of persistence that survives a power outage, with the following two guarantees?

  1. the DB can deal with any corrupted data (e.g. by detecting it and discarding it)
  2. the DB "returns" to the caller only after the data has been safely persisted (i.e. any corruption happening after this "return" is guaranteed not to affect that data)

Or maybe just abstract the low-level storage API into some "VFS" and let the community create their own backends? In that case the "VFS" API should let the client choose, per request, whether to treat it as best effort (minimizing latency and maximizing throughput, with no power-outage guarantee but still with all transactional guarantees) or as a persistence guarantee.
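To make the idea concrete, such a "VFS"-style backend interface with a per-request durability choice could look roughly like the sketch below. All names (`Backend`, `DurabilityLevel`, the in-memory backend) are illustrative assumptions, not anything from the library:

```go
package main

import "fmt"

// DurabilityLevel is a hypothetical per-request knob: the caller picks,
// for each commit, between best effort and a full persistence guarantee.
type DurabilityLevel int

const (
	BestEffort DurabilityLevel = iota // ack before the data hits stable storage
	Persistent                        // ack only after the data is safely on disk
)

// Backend is a hypothetical "VFS"-style storage interface that the
// community could implement (file, S3, Kafka, another DB, ...).
type Backend interface {
	Append(commit []byte, level DurabilityLevel) error
	Replay(fn func(commit []byte) error) error
}

// memBackend is a trivial in-memory implementation for illustration.
type memBackend struct{ log [][]byte }

func (m *memBackend) Append(c []byte, _ DurabilityLevel) error {
	m.log = append(m.log, append([]byte(nil), c...)) // copy, then store
	return nil
}

func (m *memBackend) Replay(fn func([]byte) error) error {
	for _, c := range m.log {
		if err := fn(c); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var b Backend = &memBackend{}
	b.Append([]byte("insert row 1"), Persistent)
	b.Append([]byte("delete row 1"), BestEffort)
	n := 0
	b.Replay(func([]byte) error { n++; return nil })
	fmt.Println("replayed", n, "commits") // prints "replayed 2 commits"
}
```

A real file-backed implementation would presumably `fsync` before returning when `Persistent` is requested and skip the sync for `BestEffort`.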

Any thoughts?

@kelindar (Owner)

@dumblob There are no dumb questions. You are absolutely right; I'm planning to add optional, pluggable durability to the store. The main idea is to keep it pluggable, so the community can build on top of the columnar engine and use anything else (e.g. Kafka, an RDBMS, S3, any other DB, ...) to persist it as well.

This is the reason I've focused on the change stream first. I still need to do some refactoring, but by moving the change stream to the beginning of each transaction commit (after the lock is taken), it becomes a write-ahead log: when someone commits a transaction, you can write it to disk before it's reflected in the store. If this write fails for some reason, the transaction fails. If the write succeeds, you can recover.

The "out-of-the-box" implementation I'm thinking about is simple write-ahead-log support, which appends updates/deletes/inserts to a log on disk that is then periodically compacted/compressed to avoid running out of disk space.
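The compaction step mentioned here could, in its simplest form, keep only the latest surviving operation per key. The sketch below is a hypothetical illustration that ignores transaction boundaries, which is exactly the part a real implementation would need to handle carefully:

```go
package main

import "fmt"

// op is one logged operation.
type op struct {
	kind string // "insert", "update" or "delete"
	key  string
	val  string
}

// compact rewrites a write-ahead log so it holds only the latest
// surviving value per key: superseded updates collapse into a single
// insert and deleted keys disappear, bounding the log's size on disk.
func compact(log []op) []op {
	latest := make(map[string]op)
	var order []string // preserve first-seen key order
	for _, e := range log {
		if _, seen := latest[e.key]; !seen {
			order = append(order, e.key)
		}
		latest[e.key] = e
	}
	var out []op
	for _, k := range order {
		if e := latest[k]; e.kind != "delete" {
			out = append(out, op{"insert", e.key, e.val})
		}
	}
	return out
}

func main() {
	log := []op{
		{"insert", "a", "1"},
		{"update", "a", "2"},
		{"insert", "b", "3"},
		{"delete", "b", ""},
	}
	fmt.Println(compact(log)) // [{insert a 2}]
}
```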


dumblob commented Jun 22, 2021

Thanks for the valuable insight!

> The "out-of-the-box" implementation I'm thinking about is simple write-ahead-log support, which appends updates/deletes/inserts to a log on disk that is then periodically compacted/compressed to avoid running out of disk space.

That would also be my choice. The only difficulty is efficient compaction/compression in the presence of transactions, but I'm sure it's still manageable with your design.

I'm curious how this will evolve. Does column actually have any commercial backing?

Feel free to close this issue if you think there isn't much to discuss or track.

@kelindar (Owner)

There's no commercial backing, but I want this library to first reach a stable state where I'm very happy with the programmer's UX, and then keep it focused on one thing while staying pluggable and extendable. I'm also planning to use it at my workplace for certain use cases.

Ideally, I would like to see the community build other things/products on top, such as game engines, full SQL databases, replicated columnar databases, or sharding. I don't believe any of those belong in this project. Go very much follows the UNIX philosophy of "do one thing and do it well", and the library would follow the same philosophy.
