Conversation
Signed-off-by: Nick Cameron <nrc@ncameron.org>
danderson
left a comment
There was a problem hiding this comment.
Calling time on this first round of review. I didn't get all the way into the implementation of transactions and indexes, but I got through the schema and storage machinery.
My main thought is "wow that's a lot of type and lifetime parameters" 😂 I can see how they become necessary to make everything strongly typed, it just makes the logic hard for my tiny mind to follow. That said I think the worst of it ends up hidden way by the schema macros, so hopefully actually using a concrete store should be relatively pleasant.
I'll come back later for a second run at the remaining pieces.
| //! The data store has raw and transactional APIs. The raw API guarantees that each operation is | ||
| //! atomic, but has no guarantees across multiple operatons. The transactional API groups operations | ||
| //! into transactions which are atomic and serializable. Both singleton and tabular data can be part | ||
| //! of a transaction, and tables and transactions are orthogonal. Transactions may be read/write or |
There was a problem hiding this comment.
I don't understand what you mean by "tables and transactions are orthogonal".
| //! into transactions which are atomic and serializable. Both singleton and tabular data can be part | ||
| //! of a transaction, and tables and transactions are orthogonal. Transactions may be read/write or | ||
| //! read-only. Transactions don't need to be explicitly committed - the effect is 'commit on drop' | ||
| //! but the implementation is that operations are individually committed. |
There was a problem hiding this comment.
I don't understand the implementation part of this statement. My reading of it is that it says that if dropped, the transaction applies its changes but does so in a non-transactional way, where readers can observe partial applications. Hopefully I'm wrong?
There was a problem hiding this comment.
Ditto — what happens on a panic unwind or error rethrow via ? — I assume we normally don't want that state to be committed
| //! For example, consider a table `Base` which maps `u64` keys to `Foo` values where `Foo` has a | ||
| //! field `bar: Url` (and `bar` is a unique identifier for a `Foo` in `Base`). The schema would look | ||
| //! like `tables!(Base(Base => Foo; index(bar: Url)));` which will create a `Base` table and an | ||
| //! `index::Base::bar` table. The index table can be accessed directly like a normal table, but I |
There was a problem hiding this comment.
"there's no I in team", or something :)
I'd suggest leading with the "correct" way to access index tables, and only afterwards mention that index tables can also be accessed as regular tables, and what its schema looks like (in this example, Url => u64).
| //! | ||
| //! For example, consider a table `Base` which maps `u64` keys to `Foo` values where `Foo` has a | ||
| //! field `bar: Url` (and `bar` is a unique identifier for a `Foo` in `Base`). The schema would look | ||
| //! like `tables!(Base(Base => Foo; index(bar: Url)));` which will create a `Base` table and an |
There was a problem hiding this comment.
I don't think we discussed this in the design, but: if we found a valuable reason for doing so, could this macro shape be reasonable extended to support indexing by a key function? That is, a definition that would look something like (invalid syntax alert, probably):
fn node_to_flibble_key(n: &Node) -> Flibble {
// Do some semi-fancy but pure computation to calculate the key
...
}
tables!(Base(u64 => Node; index(flibble: |n| node_to_flibble_key(n)));
I suppose most obviously we could do that sort of thing by introducing a new keyword like computed_index(...) in addition to index.
Either way, not something we actively need right now, just musing because it feels like a tempting continuation of index support.
| //! | ||
| //! For example, consider a table `Base` which maps `u64` keys to `Foo` values where `Foo` has a | ||
| //! field `bar: Url` (and `bar` is a unique identifier for a `Foo` in `Base`). The schema would look | ||
| //! like `tables!(Base(Base => Foo; index(bar: Url)));` which will create a `Base` table and an |
There was a problem hiding this comment.
Where does this schema definition say that the main table is keyed by u64? Should this be tables!(Base(u64 => Foo; index(bar: Url))); ?
| type ArgValue; | ||
|
|
||
| /// Unwrap a `SinValue` into a typed value. | ||
| fn from_value(value: SinValue) -> Self::ArgValue; |
There was a problem hiding this comment.
Looking at the implementations below, all these trait method will unreachable! if the SinValue variant isn't the proper one for a particular singleton. Given that this trait is exported and documented as usable (albeit discouraged), these method docs should probably have a # Panics section? Or did I miss something in the code structure that makes those unreachable!s truly statically unreachable even if an outside caller pokes at the trait methods directly?
| #[doc(hidden)] | ||
| #[macro_export] | ||
| macro_rules! match_helper_lhs { | ||
| (U64, $value:ident) => { |
There was a problem hiding this comment.
Hmm, at first I couldn't figure out why you needed this explicit rule when the second rule seems to cover this case, but it's because U64 isn't a declared identifier anywhere in the code, it's just being used as a symbol at macro expansion time to handle the case of a naked u64. Hmm, annoying complication, but I don't see an obvious way around it.
| /// ``` | ||
| /// # Indexes | ||
| /// | ||
| /// The syntax of an indexes is `index(field: Type)` where `field` is the name of a field in the value |
| Q: ?Sized + std::hash::Hash + Eq + std::borrow::ToOwned<Owned = $key_ty> | ||
| { | ||
| $( | ||
| self.$field.data.insert(_value.$field.clone(), _key.to_owned()); |
There was a problem hiding this comment.
Does $value_ty need a Clone bound on it somewhere here?
| singleton!(Baz(u64 as Ref)); | ||
| singleton!(Qux(u64)); | ||
|
|
||
| assert_eq!(&42, Foo::from_value_ref(&Foo::to_value(42))); |
There was a problem hiding this comment.
Hmm, I was expecting singletons to be attached to a KvStore as well, but afaict in this implementation singletons end up being free-floating global identifiers, and any KvStore can store a singleton keyed by that identifier, it's up to the caller to not cross the streams if they're defining disjoint stores. Is that correct?
In practice that'll probably be fine, but as a design challenge I'm wondering if we could couple the singletons to specific stores a bit more tightly. My first thought was to allow for a variant table definition that only has a value, e.g. you could write tables!(Foo(&'static str => Node), Bar(u32 => Vec<String>), Single(Arc<Config>)) and have that turn into a pair of tables and a singleton that's strongly tied to this one store. I don't know if it's possible to express that variable geometry table schema syntax in rust macros without going mad though...
| //! into transactions which are atomic and serializable. Both singleton and tabular data can be part | ||
| //! of a transaction, and tables and transactions are orthogonal. Transactions may be read/write or | ||
| //! read-only. Transactions don't need to be explicitly committed - the effect is 'commit on drop' | ||
| //! but the implementation is that operations are individually committed. |
There was a problem hiding this comment.
Ditto — what happens on a panic unwind or error rethrow via ? — I assume we normally don't want that state to be committed
| //! ts-kvstore has a synchronous API and no async functions. It is safe to use it in an async, as | ||
| //! long as there are **no `await` points inside a transaction** (which can lead to degraded | ||
| //! performance or deadlock). |
There was a problem hiding this comment.
Are transactions !Send? That gets us enforcement of this property by the compiler for most futures we write
| #[doc(inline)] | ||
| pub use index::KvTableIndex; | ||
| #[doc(inline)] | ||
| pub use iter::{IndexIterator, TableIterator}; | ||
| #[doc(inline)] | ||
| pub use raw::KvTable; | ||
| #[doc(inline)] | ||
| pub use transactions::{KvTableRoTransactional, KvTableTransactional, RoTransaction, Transaction}; |
There was a problem hiding this comment.
Nit/question: do these need #[doc(inline)]? I was under the impression they didn't, since the mod is private, so they're automatically inlined because rustdoc doesn't have another visible item to alias to
| /// Helper type for using a reference to a [`RwLockWriteGuard`] as a generic argument | ||
| /// with a `Deref` bound. Required because checking trait bounds does not take into account | ||
| /// transitivity of `Deref`. | ||
| struct RefWriteGuard<'r, 'a, T>(&'r RwLockWriteGuard<'a, T>); |
There was a problem hiding this comment.
nit: maybe stick these in a module? crate root seems noisy
|
|
||
| /// A token indicating ownership of a KV singleton or table. See crate docs for what ownership means | ||
| /// for a store. | ||
| pub type Owner = &'static str; |
There was a problem hiding this comment.
Hmm — could we make these unique anonymous tokens / eliminate the possibility of collision? Along the lines of:
#[derive(Copy, Clone, PartialEq, Eq)]
pub struct Owner(TypeId);
impl Owner {
#[doc(hidden)]
pub const fn new(t: TypeId) {
Self(t)
}
}
#[macro_export]
macro_rules! owner {
() => {
const {
mod anon {
pub struct Anon;
}
$crate::Owner::new(TypeId::of::<anon::Anon>())
}
}
}Usage would look like:
mod mymod {
const OWNER: ts_kv_store::Owner = ts_kv_store::owner!();
}| /// A key-value store. | ||
| /// | ||
| /// See [`$crate::KvStore`] (which this type implicitly derefences to) for full docs. | ||
| pub struct KvStore($crate::KvStore<TableStorage>); |
There was a problem hiding this comment.
Ah, I see — so there's necessarily a single tables! declaration that produces the schema per-store? I.e. if multiple modules want to have their own tables, they either need to choose to have a shared tables! declaration (to use the same KvStore instance), or use completely separate KvStores? This makes sense, just verifying
| } | ||
|
|
||
| impl<D: schema::TableDesc, I: Default> Table<D, I> { | ||
| pub(crate) fn try_set_owner(&mut self, owner: Owner) -> Result<()> { |
There was a problem hiding this comment.
I'm not sure I understand why the owner is set on-first-use rather than in the tables! macro? Are we anticipating dynamic table owners?
| fn read_lock(self) -> Self::ReadLock; | ||
| } | ||
|
|
||
| pub(crate) trait SingletonOps<'a, TableStorage: schema::GeneratedStorage + 'a>: |
There was a problem hiding this comment.
Why does this take Owner if it doesn't check it?
| table.data.is_empty() | ||
| } | ||
|
|
||
| fn get<Q>(self, key: &Q, _owner: Owner) -> Option<D::Value> |
There was a problem hiding this comment.
Ditto for this and following methods: why take Owner?
| .map_singleton_value(|v| D::from_value(v)) | ||
| } | ||
|
|
||
| fn mutate<D: schema::MutSingleton, T>( |
There was a problem hiding this comment.
Nit: maybe with_mut for parity to with? (Ditto for similarly-named methods in other traits)
Adds a new crate, ts_kv_store, with the first increment of a typed, in-memory KV store. So far what is implemented:
I think this corresponds to the first two increments + indexes from the planning doc. Other than stuff discussed in meetings/on Slack, I don't think the implementation differs from the requirements or plan.
AI declaration: most tests are generated by Claude and reviewed by me. Everything else is written by human. I've had Claude review some of the code to help find bugs then addressed the bugs manually. I don't believe I actioned anything that wouldn't have been recommended by a human reviewer (but the code should absolutely still be reviewed by at least one human, sorry to that human in advance).
There are quire a few TODOs scattered around, I don't think any of them need fixing before landing. Some are things I intend to fix very soon, and some are longer term which I should turn into issues.
Closes #206, cc #169