Expose current position to application #323

Closed
benbjohnson opened this issue Feb 17, 2022 · 3 comments


@benbjohnson
Owner

Currently, Litestream maintains its replication position in memory only, but it would be useful for application developers to have access to this position. In addition to the generation file, Litestream should keep a position file in the db-litestream directory that stores the generation (x), index (y), & offset (z) in the litestream.Pos.String() format:

xxxxxxxxxxxxxxxx/yyyyyyyyyyyyyyyy:zzzzzzzzzzzzzzzz

This file doesn't need to be fsync'd to disk as it is only meant to be used as a simple IPC mechanism (similar to the -shm SQLite file). Optionally, it could hold a checksum to avoid partial write/read issues, but that may be overkill since it's smaller than a sector of data.
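For illustration, here is a minimal sketch of how an application might parse the contents of such a position file. It assumes the three fields are fixed-width, 16-character lowercase hexadecimal values, as the x/y/z layout above suggests; the names `Pos` and `parse_pos` are hypothetical and not part of Litestream. Rejecting anything that does not match the full pattern gives the reader a cheap way to detect a partial write and retry.

```python
import re
from typing import NamedTuple


class Pos(NamedTuple):
    """Hypothetical application-side view of a replication position."""
    generation: str
    index: int
    offset: int


# Assumption: each field is exactly 16 lowercase hex characters.
_POS_RE = re.compile(r"^([0-9a-f]{16})/([0-9a-f]{16}):([0-9a-f]{16})$")


def parse_pos(text: str) -> Pos:
    """Parse 'generation/index:offset'; raise ValueError if malformed."""
    match = _POS_RE.match(text.strip())
    if match is None:
        # A short or torn read fails here, so the caller can simply retry.
        raise ValueError(f"malformed position: {text!r}")
    generation, index_hex, offset_hex = match.groups()
    return Pos(generation, int(index_hex, 16), int(offset_hex, 16))
```

A caller would read the file's contents, pass them to `parse_pos`, and re-read on `ValueError`.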

/cc @chrismccord

@benbjohnson benbjohnson added the enhancement New feature or request label Feb 17, 2022
@benbjohnson benbjohnson added this to the v0.4.0 milestone Feb 17, 2022
@benbjohnson benbjohnson self-assigned this Feb 17, 2022
@simonw

simonw commented Feb 17, 2022

A neat trick I've seen done with this relates to helping users avoid replication lag.

After you make an update to the database, it's really important you see that update on the next GET request you make.

A common solution to this problem is to set a cookie (or similar) for the user such that for the next 5s after they perform a write all of their reads are sent to the primary database - which should ensure the replicas have caught up by the time that cookie expires.
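The time-window approach can be sketched roughly as follows. This is only an illustration: `use_primary` is a hypothetical helper, and `last_write_at` stands in for whatever timestamp the application stores in the cookie when the user performs a write.

```python
import time
from typing import Optional

# Assumption: 5 seconds, matching the window mentioned above.
PRIMARY_WINDOW_SECONDS = 5.0


def use_primary(last_write_at: Optional[float],
                now: Optional[float] = None,
                window: float = PRIMARY_WINDOW_SECONDS) -> bool:
    """Return True if reads should go to the primary database.

    last_write_at is the Unix timestamp of the user's most recent write
    (e.g. taken from a cookie), or None if they have not written.
    """
    if last_write_at is None:
        return False
    if now is None:
        now = time.time()
    return (now - last_write_at) < window
```

The scheme only holds up if replication lag stays below the window, since the replica is trusted again as soon as the window expires.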

But another trick I've seen is to make the replication position available to the replicas, and then to record somewhere the position at the time the user's last write was committed.

If the user is talking to a replica it can then make a comparison, effectively saying "this user last wrote at position 11234 - I've only replicated up to position 11221 so I should redirect them to the primary".
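A rough sketch of that comparison on the replica side is shown below. The `WalPos` type and `should_redirect_to_primary` function are hypothetical names. The sketch assumes positions within one generation are ordered by index and then offset, and it conservatively redirects to the primary when the generations differ, because positions from different generations cannot be compared directly.

```python
from typing import NamedTuple, Optional


class WalPos(NamedTuple):
    """Hypothetical position: generation ID plus WAL index and offset."""
    generation: str
    index: int
    offset: int


def should_redirect_to_primary(last_write: Optional[WalPos],
                               replicated: WalPos) -> bool:
    """Return True if this replica has not yet caught up to the user's write."""
    if last_write is None:
        # The user has no recorded write, so any replica is fine.
        return False
    if last_write.generation != replicated.generation:
        # Assumption: be conservative when positions are not comparable.
        return True
    return (replicated.index, replicated.offset) < (last_write.index, last_write.offset)
```

With the numbers from the example, a last write at 11234 and a replica at 11221 in the same generation yields a redirect.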

I think this is how Wikipedia addresses replica lag, so it definitely works at scale!

@simonw

simonw commented Feb 17, 2022

Hah, and after I typed all of that it looks like that was the impetus for adding this in the first place!

https://twitter.com/mrkurt/status/1494380016238481415

Chris McCord Today at 12:13 PM

@benbjohnson super awesome work on litestream! I'm wanting to set it up with an Elixir/Phoenix project that does something similar to what we're doing with fly_postgres, which sends writes to a primary instance and all reads to local replicas. We use the postgres LSN from the primary to block the caller while we await the LSN to be replicated on the replica. I'm completely ignorant of sqlite WAL/litestream internals atm, so forgive my ignorance, but do you have any pointers on how I might handle this kind of scenario? For example, we already have an rpc mechanism, so writes to the primary sqlite are trivial. What I'll need to solve is:

1) perform write
2) obtain WAL index/position
3) send write result to remote caller with WAL index/position
4) block on remote until replica >= index/position

benbjohnson 11 minutes ago

hey Chris! Good question. Litestream doesn't currently expose the current position but that's a good idea and not too difficult. I added a GitHub issue to track it here: https://github.com/benbjohnson/litestream/issues/323
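Step 4 of that flow (blocking on the replica until it reaches the primary's position) could look roughly like the sketch below. Everything here is an assumption for illustration: positions are simplified to `(index, offset)` tuples within a single generation, and `read_replica_pos` is a hypothetical callable that returns the replica's current position, for example by reading the proposed position file.

```python
import time
from typing import Callable, Tuple

# Simplified position within one generation: (index, offset).
Position = Tuple[int, int]


def wait_until_replicated(target: Position,
                          read_replica_pos: Callable[[], Position],
                          timeout: float = 5.0,
                          poll_interval: float = 0.01,
                          sleep: Callable[[float], None] = time.sleep,
                          clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until the replica position is >= target.

    Returns True once the replica has caught up, or False on timeout so the
    caller can fall back to reading from the primary.
    """
    deadline = clock() + timeout
    while True:
        if read_replica_pos() >= target:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

Returning `False` instead of raising leaves the fallback decision (retry, redirect to the primary, or fail the request) to the caller.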

@benbjohnson
Owner Author

@simonw I personally like the "redirect to primary for X seconds" approach because it's so simple but it does require replication lag to be below X.

One issue I realized after chatting with @chrismccord is that Litestream batches up changes every 10ms, so you can't check the position right away. You could add a sleep, although that's hacky. However, you could read the position before the transaction (once this issue is implemented), run your write transaction, and then keep re-reading the position file until it changes. That should always give you a WAL position that contains your transaction.
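The read, write, re-read loop described above might be sketched like this. The helper name `write_and_wait_for_new_pos` and its parameters are hypothetical: `read_pos` is assumed to return the current contents of the position file as a string, and `run_write` is assumed to execute and commit the write transaction.

```python
import time
from typing import Any, Callable, Tuple


def write_and_wait_for_new_pos(read_pos: Callable[[], str],
                               run_write: Callable[[], Any],
                               timeout: float = 1.0,
                               poll_interval: float = 0.005,
                               sleep: Callable[[float], None] = time.sleep,
                               clock: Callable[[], float] = time.monotonic
                               ) -> Tuple[Any, str]:
    """Run a write, then poll the position until it differs from the
    value seen before the write. Returns (write_result, new_position)."""
    before = read_pos()      # position before the transaction
    result = run_write()     # perform and commit the write
    deadline = clock() + timeout
    while True:
        after = read_pos()
        if after != before:
            return result, after
        if clock() >= deadline:
            raise TimeoutError("position did not change after write")
        sleep(poll_interval)
```

The returned position can then be sent to the remote caller along with the write result.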
