Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtualize WAL methods #53

Merged
merged 16 commits into from
Dec 21, 2022
Merged

Virtualize WAL methods #53

merged 16 commits into from
Dec 21, 2022

Conversation

psarna
Copy link
Collaborator

@psarna psarna commented Oct 21, 2022

This preparatory commit moves all WAL routines to a virtual table.
Also, a helper libsql_open() function is provided. It allows passing
a new parameter - name of the custom WAL methods implementation.

@psarna
Copy link
Collaborator Author

psarna commented Oct 30, 2022

v2:

src/main.c Show resolved Hide resolved
src/main.c Show resolved Hide resolved
src/main.c Outdated
@@ -3127,6 +3132,11 @@ int sqlite3ParseUri(
*pzErrMsg = sqlite3_mprintf("no such vfs: %s", zVfs);
rc = SQLITE_ERROR;
}
*ppWal = libsql_wal_methods_find(zWal);
if (*ppWal == NULL) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: Do we want to keep the formatting consistent?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a fan of this style, but it's trivial to fix here, so let's :)

src/sqlite.h.in Outdated
** ^Names are case sensitive.
** ^Names are zero-terminated UTF-8 strings.
** ^If there is no match, a NULL pointer is returned.
** ^If zVfsName is NULL then the default implementation is returned.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/zVfsName/zName/

src/pager.c Show resolved Hide resolved
@@ -4667,6 +4670,7 @@ int sqlite3PagerFlush(Pager *pPager){
*/
int sqlite3PagerOpen(
sqlite3_vfs *pVfs, /* The virtual file system to use */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultra nit: formatting of comments got destroyed.

src/pager.c Outdated Show resolved Hide resolved
src/pager.c Outdated
@@ -7391,7 +7403,9 @@ int sqlite3PagerOkToChangeJournalMode(Pager *pPager){
i64 sqlite3PagerJournalSizeLimit(Pager *pPager, i64 iLimit){
if( iLimit>=-1 ){
pPager->journalSizeLimit = iLimit;
sqlite3WalLimit(pPager->pWal, iLimit);
if (pagerUseWal(pPager)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have to check whether pWal is NULL after virtualising the WAL interface?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, sqlite3WalLimit simply checked if pPager->pWal is not null and that was sufficient. Now it is not, because WAL support can even be compiled out by defining SQLITE_OMIT_WAL, and then pWalMethods is null and we can't call the virtual interface, because it does not exist. An alternative would be to spray #ifdefs on each call site, but I find if + pagerUseWal more readable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah wait, it's not enough, with SQLITE_OMIT_WAL this code won't compile, because the pWalMethods field is not even there. I'll recompile with SQLITE_OMIT_WAL and double check all the missing places, and fix accordingly

src/wal.c Outdated Show resolved Hide resolved
** An open write-ahead log file is represented by an instance of the
** following object.
*/
struct Wal {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to make this opaque for upper layers (Pager)? It seems that a lot of details are being exposed. Can this become a problem for some very exotic WAL implementations?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pager doesn't assume anything about struct Wal, it used to be opaque before this patch. To be honest, I don't remember why this struct is itself exposed, as we should only care about exposing libsql_wal_methods to the user. Perhaps it's an artifact from previous iterations that should just be removed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I refreshed my knowledge a little bit - wal.h is still a header private to libSQL files and no module assumes anything about it. In particular, pager.c does not include it. It is however needed for people who implement their own WAL routines, because their own implementations need to know the WAL structure, that's why this definition is exposed in a header.

Wal *pWal; /* Write-ahead log used by "journal_mode=wal" */
char *zWal; /* File name for write-ahead log */
Wal *pWal; /* Write-ahead log used by "journal_mode=wal" */
libsql_wal_methods *pWalMethods; /* Virtual methods for interacting with WAL */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to store pWalMethods here when pWal already has them in pMethods field ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the comment above - Wal used to (and I'm pretty sure still should be) opaque to upper layers, pager included. And once struct Wal is opaque, we need to somehow propagate WAL methods from libsql_open down to WAL, and pager is the only way through.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above - struct Wal is still opaque to its upper layers (pager), but it's not opaque for the authors of custom virtual WAL implementations

Copy link
Contributor

@haaawk haaawk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I left some nits and questions just to prove I've actually read all that code :)

@psarna
Copy link
Collaborator Author

psarna commented Nov 11, 2022

Thanks for reviewing, I need to take a fresh look at this PR next week and validate - e.g. I'm pretty sure moving struct Wal definition from .c file to a public .h header is unnecessary and we should avoid it indeed

@psarna psarna force-pushed the virtual_wal branch 3 times, most recently from 7068201 to bccbe71 Compare November 14, 2022 11:26
This preparatory commit moves all WAL routines to a virtual table.
Also, a helper libsql_open() function is provided. It allows passing
a new parameter - name of the custom WAL methods implementation.
Before enabling WAL journaling, libSQL detects whether WAL
is fully supported. Historically it meant either operating
in exclusive mode or supporting shared memory locks in the
underlying VFS, but it may not be the case with custom WAL
implementations. Thus, custom WAL methods can declare whether
they rely on shared memory - if not, it will also not get
verified when checking for WAL support.
It makes little sense to store the WAL file name
if the virtual WAL implementation is not based
on a single file. Therefore, WAL pathname handling
is also virtualized, with the original implementation
still producing a <dbname>-wal file.
This commit adds a stub implementation of custom WAL methods
in ext/vwal subdirectory. It can be used as a starting point
for implementing custom WAL routines.
The test case registers a new naive virtual WAL coded on top
of Rust's std::collections::HashMap. The WAL implementation
only covers reading and writing pages, without checkpointing
or savepoints, but it's enough to validate that the virtual
method table works. After registering custom WAL, a few simple
operations are performed on the database to validate that
pages were indeed stored and retrieved properly.
For extra logs, run the test with -- --nocapture.
The command is heavily inspired by .vfslist
and lists all the registered custom WAL methods.
These are customarily run early, before a call to libsql_open,
so it makes sense to auto-initialize.
With this patch applied, WAL methods can be registered
from a loadable extension module dynamically.
Not critical, as WAL methods would customarily live as long
as the program that runs it, but it's good practice to be able
to unregister during a cleanup.
@psarna psarna force-pushed the virtual_wal branch 4 times, most recently from 1b76a01 to db10473 Compare December 13, 2022 10:25
WAL is open lazily on first access, while it is useful to be able
to set up ground for it even before the main db file is open.
This optional hook allows it.
Similarly to how other interfaces work, the version number in WAL methods
lets the user know which functions are supported, and which aren't yet.
@psarna psarna force-pushed the virtual_wal branch 2 times, most recently from 57247c9 to 18d4cd3 Compare December 13, 2022 10:28
The test now properly sets the iVersion and pre-main-db-open hook.
ABI should be consistent regardless of the compilation options,
so the optional functions are in there anyway - they can be simply
set to nulls if the user did not compile support for them in libSQL.
It will be useful for any state that custom WAL methods could
potentially have.
@penberg
Copy link
Collaborator

penberg commented Dec 20, 2022

@psarna I think this is good to go. Should we document this in doc/libsql_extensions.md?

@psarna
Copy link
Collaborator Author

psarna commented Dec 20, 2022

@penberg yes, a docs entry is definitely needed. I'll provide one soon, and then we can proceed with this PR

In amalgamation mode it compiled just fine, but otherwise
it relies on the wal.h header now.
The paragraph briefly explains the newly introduced mechanism.
@psarna
Copy link
Collaborator Author

psarna commented Dec 20, 2022

@penberg the checks are still ongoing, but I wrote the doc entry, green light from my end

@penberg penberg merged commit 273a494 into tursodatabase:main Dec 21, 2022
MarinPostma added a commit that referenced this pull request Oct 17, 2023
53: Rename crate folders to match crate names r=psarna a=MarinPostma

It is good practice to have the crate name match that of its folder, this is what people expect, and is less surprising when working with cargo commands, such as the `-p` flag to specify a target package


Co-authored-by: ad hoc <postma.marin@protonmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants