Storage Engine v4 — Stack 1: protos, codec, engine skeleton#868
Stack 1 of the storage-engine-v4 PR series (per RFC v4 in the
pebble-baton-sdk squire plan). All new code lives under the
`//go:build batonsdkv2` build tag so default connector binaries are
unaffected.
Protos at `proto/c1/storage/v3/`:
* `options.proto` — TableOption + IndexOption descriptor extensions
(proto2 syntax; proto3 forbids extending non-descriptor messages).
* `refs.proto` — EntitlementRef, PrincipalRef, ResourceRef.
* `records.proto` — six record types (ResourceTypeRecord,
ResourceRecord, EntitlementRecord, GrantRecord, AssetRecord,
SyncRunRecord) plus the v3-owned mirrors of v2 types
(GrantExpandableRecord, GrantSourceRecord, SyncType).
Generated Go committed under `pb/c1/storage/v3/`.
Codec layer at `pkg/dotc1z/engine/pebble/codec/`:
* `tuple.go` — FoundationDB-style tuple encoding with NUL escape
rules that work for raw bytes (not just UTF-8 strings); appenders
for string, bytes, int32, int64, uint32, uint64, bool, plus a
decoder that consumes a single component.
* `syncid.go` — KSUID 20-byte canonical binary encoding for
sync_id keys (saves ~2 GB at 100M-grant scale vs. storing the
27-char base62 string redundantly across primary + 4 indexes).
* `registry.go` — Codec interface with error-returning methods
(no panics on type mismatch); frozen-after-init map keyed by
proto FullName; lazy ReflectCodec fallback cached process-wide.
* `reflect.go` — *ReflectCodec skeleton. Value-side encode/decode
is wired (deterministic proto.Marshal); key-side requires
(storage.v3.table) walking and lands in Stack 3.
* `errors.go` — ErrCodecTypeMismatch, ErrInvalidSyncID,
ErrInvalidTuple.
Engine stub at `pkg/dotc1z/engine/pebble/`:
* `engine_stub.go` — empty Engine struct + the full sentinel-error
set from RFC v4 Appendix E. Stack 3 fills in the engine; Stack 2
consumes the sentinels for envelope errors.
Tests: the tuple-encoding prefix-free invariant is ported from the
microtest at `/tmp/baton-rfc-microtests/tuple_test.go` and runs in-tree
with `-tags=batonsdkv2`. Also included: Sync-ID roundtrip, invalid-input,
and order-preserving tests, plus registry generated-hit and
reflection-fallback tests.
Deferred to follow-up commits on this branch:
* `cmd/protoc-gen-batonstore/` codegen plugin (Appendix D). The
typed codecs the plugin emits land alongside the plugin.
* `buf.gen.yaml` entry once the plugin compiles.
* Per-record generated codecs under `pkg/dotc1z/engine/pebble/gen/`.
Dependencies: adds `github.com/cockroachdb/pebble` via a local
`replace` pointing at `/data/squire/src/pebble` for now, because the
modern Pebble APIs (IngestAndExcise, FormatValueSeparation,
DBCompressionGood) aren't in any tagged release yet. The replace will
be swapped for a real version pin before merge.
Refs: RFC v4 §3.4 (record protos), §3.6 (codec hybrid), Appendix A
(full record proto), Appendix D (codegen toolchain), Appendix E
(sentinels), micro-test results in research/import-11.md.
Summary
Stack 1 of RFC 0004 (storage engine v4). Stacks on PR #867.

New protos under `proto/c1/storage/v3/`:
* `options.proto`: proto2 file with `TableOption` + `IndexOption` descriptor extensions (field 90000 / 90001) so record types declare their primary key and index shapes in the schema.
* `refs.proto`: identity-only `EntitlementRef`, `PrincipalRef`, `ResourceRef`.
* `records.proto`: `GrantRecord`, `EntitlementRecord`, `ResourceRecord`, `ResourceTypeRecord`, `AssetRecord`, `SyncRunRecord`, plus mirror types (`GrantExpandableRecord`, `GrantSourceRecord`, `SyncType` enum). All declare `(table)` options.

New Go codec layer under `pkg/dotc1z/engine/pebble/codec/`:
* `tuple.go`: FoundationDB-style tuple encoder (`AppendTupleString`, `AppendTupleBytes`, `AppendTupleInt32/64`, `AppendTupleUint32/64`, `AppendTupleBool`, `AppendTupleSeparator`) with the NUL/escape rules from RFC §3.5. Property-tested for the prefix-free property in Stack 5's microtests.
* `syncid.go`: `EncodeSyncID(string)` → `[]byte` / `DecodeSyncID([]byte)` → `string` using KSUID's 20-byte canonical binary form (lex-equivalent to base62 but 7 bytes smaller per row × hundreds of millions of rows).
* `registry.go`: `Codec` interface + frozen-after-init map for generated codecs + `sync.Map` reflection cache. `Lookup()` returns a generated codec when registered or constructs a `ReflectCodec` lazily.
* `reflect.go`: `ReflectCodec` covers value-side encode/decode (deterministic proto.Marshal); key-side is gated by `(table)` options arriving with codegen (deferred, see §8).

Engine skeleton at `pkg/dotc1z/engine/pebble/engine_stub.go`: centralized sentinel-error declarations (Appendix E) only. The full engine struct lands in Stack 3.

Build tag `//go:build batonsdkv2` throughout: connector binaries don't link Pebble unless explicitly built with the tag.

Test plan
* `make lint` clean
* […] (`TestAppendTupleString`, `TestEncodeSyncIDRoundtrip`, etc.)
* […] `pb/c1/storage/v3/` (vendored)

Notes on deferred items (RFC §8)
* `protoc-gen-batonstore` codegen plugin DEFERRED: the `ReflectCodec` covers the MVP; codegen is a perf optimization for the hot record types and lands as a follow-up on this branch once the engine is in production use.

🤖 Generated with Claude Code