New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
statedb: An in-memory database #24523
Conversation
2daf7c2
to
115dd57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am interested to see if this is going to live up the expectation. I have high hopes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excited to see where this goes!
// (courtesy of github.com/hashicorp/go-memdb). It adapts the go-memdb library for | ||
// use with Hive by taking the table schemas as a group values from hive and provides | ||
// typed API (Table[Obj]) for manipulating tables. As the database is based on an | ||
// immutable data structure, all objects inserted into the database MUST NOT be mutated! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty afraid of this causing us "spook action at a distance" bugs. Without really thinking about the feasability - it'd be very nice to have something akin to the deadlock detecting lock library. Mabye a mutation detecting wrapper for objects stored in the DB, enabled through some debug build trickery? Or static analysis that looks at write txns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty afraid too, though since we haven't had that many bugs with cache.Store that is similarly considered immutable I'm not super worried. I added a trivial check in the commit to verify that we're not mutating the same object. We can more fancy mutation detection with something like https://github.com/zimmski/go-mutesting or similar libraries. client-go also has similar code to check that cache.Store objects aren't mutated. I'll leave integrating something like that for a later PR.
Have any concrete users of this API been identified already? |
This allows grouping cells into a single cell, but without the scoping and naming that comes with Module(). Signed-off-by: Jussi Maki <jussi@isovalent.com>
pkg/statedb provides the Hive with an in-memory database based on immutable radix trees. It supports multiple tables, each with their own indexes. Tables are registered by providing them to the statedb module as group values. The state module provides the central sequencing and snapshotting of the database for doing cross-table atomic transactions. pkg/statedb/example/main.go provides a fairly full-featured example of the API provided by the statedb module. Signed-off-by: Jussi Maki <jussi@isovalent.com>
Add an insert and transaction benchmarks for a rough estimate on statedb performance. On my laptop I'm getting ~350k insertions per second (UUID primary key) and ~100k single-insert write transactions per second. To support dumping large databases over e.g. REST API, change the ToJSON into WriteJSON that marshals the objects one by one into the writer rather than collecting the objects into one large map. Signed-off-by: Jussi Maki <jussi@isovalent.com>
Add a simple form of mutation detection that checks (by pointer comparison) that on modification to an existing value a DeepCopy()'d object is inserted. Extend the tests to do modifications. Validated manually that the mutation detection resulted in panic. Signed-off-by: Jussi Maki <jussi@isovalent.com>
Over a fairly long period of time, I'm going to propose that this would be used for:
Immediate follow up PRs I'm planning to make this a more useful foundation:
|
115dd57
to
83d3339
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can't be adding this dependency: hashicorp/go-memdb. It is licensed under the Mozilla Public License 2.0 which isn't a license approved by the CNCF (see approved licenses). It is also not in the list of license exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed the licensing question offline. We will ask the CNCF to add go-memdb to the license exceptions list and expect it to be approved before Cilium v1.14 is released. This particular package is widely used and other popular MPL2.0 libraries from Hashicorp have already been added to the exception list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only a couple of non-blocking nits left inline. LGTM 🚀
b.Fatalf("Insert error: %s", err) | ||
} | ||
tx.Commit() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before for b.StopTimer()
} | ||
} | ||
tx.Commit() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not much of a difference, but adding a b.StopTimer()
here we'll remove the accounting of the deferred hive.Stop()
in the benchmark.
First use-case: #24677 |
pkg/statedb provides the Hive with an in-memory database based on immutable radix trees. It supports multiple tables, each with their own indexes. Tables are registered by providing them to the state module as group values. The statedb module provides the central sequencing and snapshotting of the database for doing cross-table atomic transactions.
pkg/statedb/example/main.go provides a fairly full-featured example of the API provided by the statedb module, so I won't repeat that in the description here.
The use-cases for this are any components that have state that is concurrently accessed, potentially via multiple indices. Rather than hand-writing this around Go's builtin maps and mutexes this provides the faciility to implement this in a simpler way, with better safety (no locks for readers!) and performance when the use-case is mostly reading. The table events and "query invalidation" mechanisms provide an alternative to event streams and synchronous APIs to affect changes from one sub-system to another and enable an architecture where work can be done in larger batches and asynchronously without direct coupling between sub-systems.