pier: epoch system #313
Labels: feature (New feature or feature request)
Comments
pkova added a commit that referenced this issue on Sep 18, 2023
This PR implements a new format for how piers store their event logs on disk. Resolves #313.

### Design

Existing format:

```
./zod/.urb/log
├── data.mdb
└── lock.mdb
```

New format:

```
./zod/.urb/log
├── 0i0             # epoch dirnames specify the last event of the previous epoch
│   ├── data.mdb    # lmdb file containing events 1-132
│   ├── epoc.txt    # disk format version (this PR starts versioning at 1)
│   ├── lock.mdb    # lmdb lock file
│   └── vere.txt    # binary version this set of events was originally run with
└── 0i132
    ├── data.mdb
    ├── epoc.txt
    ├── lock.mdb
    ├── north.bin   #
    ├── south.bin   # snapshot files (state as of event 132), strictly read-only
    └── vere.txt
```

The new format introduces *epochs*, which are simply "slices" or "chunks" of a ship's complete event log. Above, you can see the ship's event log chunked into two epochs: `0i0` and `0i132`.

New ships booted with the code in this PR instantiate their `log` directories with the new format. Existing piers are automatically migrated on boot.

Epoch "rollovers" (when the current epoch is ended and a new, empty epoch is created) occur under three conditions:

1. The pilot uses the new `roll` subcommand to manually roll over.
2. The pilot runs the `chop` subcommand.
3. We detect a different running binary version than the one pinned in the current epoch.

Both migrations and epoch rollovers ensure there's a current snapshot before running.
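To make the naming scheme concrete, here is a minimal Python sketch of how epoch directory names map to event ranges, and how a replay could find the epoch (and therefore the pinned `vere.txt`) for a given event. It is only an illustration of the layout described above, not the C code in this PR; the helper names `epoch_start`, `epoch_ranges`, and `epoch_for_event` are hypothetical, as is the `last_event` parameter.

```python
from bisect import bisect_left


def epoch_start(dirname: str) -> int:
    """Parse an epoch dirname such as '0i132' into the integer 132."""
    if not dirname.startswith("0i"):
        raise ValueError(f"not an epoch dirname: {dirname!r}")
    return int(dirname[2:])


def epoch_ranges(dirnames, last_event):
    """Return (dirname, first_event, last_event) per epoch, oldest first.

    An epoch named 0iN holds events N+1 up to the next epoch's N.
    `last_event` (hypothetical parameter) is the newest event in the log.
    An empty latest epoch shows up as a range with first > last.
    """
    starts = sorted(epoch_start(d) for d in dirnames)
    ranges = []
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else last_event
        ranges.append((f"0i{start}", start + 1, end))
    return ranges


def epoch_for_event(dirnames, event):
    """Return the dirname of the epoch whose log would contain `event`."""
    starts = sorted(epoch_start(d) for d in dirnames)
    # The owning epoch is the last one whose start is strictly below the event.
    idx = bisect_left(starts, event) - 1
    if idx < 0:
        raise ValueError(f"event {event} precedes the oldest epoch")
    return f"0i{starts[idx]}"


if __name__ == "__main__":
    names = ["0i0", "0i132"]
    for name, first, last in epoch_ranges(names, last_event=200):
        print(f"{name}: events {first}-{last}")
    print(epoch_for_event(names, 132))
```

With the two epochs shown in the tree above, event 132 resolves to `0i0` and event 133 to `0i132`, which matches the "events 1-132" comment on `0i0/data.mdb`.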
A few TODOs left:

- [x] Iron out small kink in migration behavior for previously chopped piers
- [x] Make sure correct binary version gets pinned to first epoch of migrated piers
- [x] Rollover to new epoch when a new binary version is detected
- [x] Make sure manual migration logic is idempotent
- [x] ~~Update `prep` command~~
- [x] Fix `chop` so it works when there are 3 epochs starting with `0i0`
- [x] ~~Reproduce and fix partially-deleted epoch `0i0` after `chop`~~
- [x] Pair with someone to run manual GDB testing for migration idempotency and rollover logic
- [x] Take a look at @joemfb's replay code and compare/find overlaps
- [x] Document final system design in this PR
- [x] Correct epoch naming scheme
- [x] Make `chop` leave the latest two epochs
- [x] Better error handling
- [x] Better cleanup
- [x] Test migration with real ships running on local-networking mode
- [x] Test epoch rollover idempotency
- [x] Test fresh boot
- [x] Handle case where snapshot has been deleted from `chk/`
- [x] Ensure `u3_disk_epoc_good()` is implemented and used how we want
- [x] Ensure `u3_disk_epoc_init()` is implemented and used how we want
- [x] Replay works with `urbit play` and `urbit`
- [x] Replay works in edge case where only epoch 0 and no valid snapshot exist
- [x] Move new-epoch-on-vere-version-mismatch logic to `_pier_wyrd_init()`
- [x] Make subcommands which call `u3_disk_init()` auto-migrate
  - [x] `info`
  - [x] `cram`
  - [x] `queu`
  - [x] `meld`
  - [x] `pack`
  - [x] `play`
  - [x] `chop`
  - [x] `roll`
- [x] Make replay on boot use `u3_mars_play()`
- [x] Test migration from an old pier (again)
- [x] Test that migration works for an old pier that first needs a full replay (i.e., from the beginning of its event log)
- [x] Test that `./urbit roll zod`, with an updated binary version *and* an empty latest epoch, does not roll but instead just updates the `vere.txt` file
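Two of the behaviours in the list above can be sketched as small decision functions: what to do when the running binary version differs from the one pinned in the current epoch, and which epochs `chop` would leave behind. This is a hedged Python illustration only, not the PR's C implementation; the function names, the returned action strings, and the example version strings are all hypothetical.

```python
def version_check_action(pinned: str, running: str, epoch_is_empty: bool) -> str:
    """Decide what to do when comparing vere.txt against the running binary.

    Returns one of three hypothetical action labels:
      "none"       - versions match, nothing to do
      "update-pin" - versions differ but the latest epoch has no events yet,
                     so only vere.txt is rewritten (no new epoch)
      "roll"       - versions differ and the epoch has events, so roll over
    """
    if pinned == running:
        return "none"
    if epoch_is_empty:
        return "update-pin"
    return "roll"


def epochs_to_chop(dirnames):
    """Return the epoch dirnames that would be removed, keeping the latest two.

    Sorting is numeric on the part after the '0i' prefix, so '0i1000'
    sorts after '0i132'.
    """
    ordered = sorted(dirnames, key=lambda d: int(d[2:]))
    return ordered[:-2]


if __name__ == "__main__":
    # Version strings here are made up for the example.
    print(version_check_action("3.0", "3.1", epoch_is_empty=True))
    print(version_check_action("3.0", "3.1", epoch_is_empty=False))
    print(epochs_to_chop(["0i0", "0i132", "0i1000"]))
```

The numeric sort key matters: a plain string sort would place `0i1000` before `0i132` and could select the wrong epochs.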
To ameliorate jet mismatching issues (especially during event log replay), an "epoch system" should be created. The epoch system is a new `<pier>/.urb/log` format which enables event log replays to correctly match binary versions with particular subsets of events from the pier's event log.

Related: