Skip to content
This repository was archived by the owner on Oct 18, 2023. It is now read-only.

Conversation

@MarinPostma
Copy link
Contributor

@MarinPostma MarinPostma commented Sep 24, 2023

A series of improvements to the frame injection process:

  • in-place injection: there is no need anymore to have a separate thread for the injector connection
  • faster replication from snapshots: the replicator was re-performing a handshake in between snapshot
  • Frame API refactor to allow in-place frame header mutation (this one might need a bit of rework later on. We force the alignment of the whole frame when only the header needs to be aligned, meaning we have to perform a full copy on the frame each time. Thinking about it now, we can do better, work with unaligned frames, and mutate the header by writing it back to the frame data)

@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch 8 times, most recently from 1c52981 to 30ec997 Compare September 26, 2023 09:33
@MarinPostma MarinPostma marked this pull request as ready for review September 26, 2023 14:57
Copy link
Contributor

@psarna psarna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is too vast for my tiny brain to review in one go, so I'll do it in stages. First batch of comments:


Ok(())
} else {
anyhow::bail!("failed to apply pages");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we bother updating is_txn here, should we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't bother, because this is caught as a fatal replication error anyway, and everything blows up

OpenFlags::SQLITE_OPEN_READ_WRITE
| OpenFlags::SQLITE_OPEN_CREATE
| OpenFlags::SQLITE_OPEN_URI
| OpenFlags::SQLITE_OPEN_NO_MUTEX,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: once we migrate to libsql crate, we need to audit that NO_MUTEX is still ok, because there's been discussions around which threading model we should use.

impl Frame {
/// size of a single frame
pub const SIZE: usize = size_of::<FrameHeader>() + WAL_PAGE_SIZE as usize;
/// Owned version of a frame, on the heap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing the purpose of the new FrameMut and FrameBorrowed. Why do we need it? Feel free to also post the answer in the commit message

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, only got back to this PR now. So basically, FrameBorrowed is a stack data structure that contains the frame's raw data. But since it's quite big, you don't really ever get an owned version. Instead, you get a Frame and a FrameMut that are the stack-allocated equivalents. Both deref to a FrameBorrowed, with the difference between FrameMut and Frame being that Frame is a shared pointer that is cheaply clonable, and FrameMut is an exclusive reference to the Frame.

@MarinPostma MarinPostma marked this pull request as draft October 6, 2023 09:43
@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch 2 times, most recently from 43c0012 to 2d645cd Compare October 10, 2023 14:59
@MarinPostma MarinPostma marked this pull request as ready for review October 10, 2023 16:11
&TRANSPARENT_METHODS,
// safety: hook is dropped after connection
(),
DEFAULT_AUTO_CHECKPOINT,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Autocheckpoint should be preventively disabled in this case. If we somehow end up having 999 pages in WAL while recreating this table (e.g. because the user mistakenly dropped it), this call will trigger an autocheckpoint outside of our virtual WAL. And that we don't want, especially if we later introduce checkpoint logic that only allows checkpointing frames that were backed up in S3.

let connection = self.connection.lock();
connection.execute("INSERT INTO libsql_temp_injection VALUES (42)", ())?;
// force call to xframe
match connection.cache_flush() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarinPostma did you observe xFrames not being called without it? I found it weird, the docs https://www.sqlite.org/c3ref/db_cacheflush.html suggest that it's for flushing pages mid-transaction, and a single insert is a whole transaction on its own. It should be flushed to disk the moment you finish calling execute. And rusqlite's cache_flush is just a wrapper over sqlite3_db_cacheflush.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are in the context of a transaction here, so the call to INSERT never causes xFrame to get called

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are in the context of a transaction here

Ok, that's new, and proves I didn't follow the idea closely enough.. I see the begin_txn now. And I suppose we take it so that we can inject frames in multiple xFrames calls, but we need it to happen atomically anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea sir, this way we can apply very large transactions, such as snapshots

@MarinPostma MarinPostma requested a review from psarna October 11, 2023 12:34
Copy link
Contributor

@psarna psarna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok, although I'm still anxious if we never leave this new transaction hanging forever, preventing further injections and checkpoints. I suppose the idea is that either we apply a finite number of frames, or exit with a fatal error, right? Is there are non-fatal error that would leave a transaction hanging, or do we ever wait for the network in transaction context?

@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch 2 times, most recently from e581066 to 742fe9d Compare October 13, 2023 06:52
@MarinPostma MarinPostma force-pushed the same-thread-frame-injector branch from 742fe9d to f749b1d Compare October 13, 2023 07:37
@MarinPostma MarinPostma enabled auto-merge October 13, 2023 07:38
@MarinPostma MarinPostma added this pull request to the merge queue Oct 13, 2023
Merged via the queue into main with commit ebb7ae9 Oct 13, 2023
@MarinPostma MarinPostma deleted the same-thread-frame-injector branch October 13, 2023 08:02
LucioFranco added a commit that referenced this pull request Oct 13, 2023
MarinPostma pushed a commit that referenced this pull request Oct 15, 2023
github-merge-queue bot pushed a commit that referenced this pull request Oct 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants