Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

statedb: An in-memory database #24523

Merged
merged 4 commits into from Apr 3, 2023
Merged

Conversation

joamaki
Copy link
Contributor

@joamaki joamaki commented Mar 22, 2023

pkg/statedb provides the Hive with an in-memory database based on immutable radix trees. It supports multiple tables, each with their own indexes. Tables are registered by providing them to the state module as group values. The statedb module provides the central sequencing and snapshotting of the database for doing cross-table atomic transactions.

pkg/statedb/example/main.go provides a fairly full-featured example of the API provided by the statedb module, so I won't repeat that in the description here.

The use-cases for this are any components that have state that is concurrently accessed, potentially via multiple indices. Rather than hand-writing this around Go's builtin maps and mutexes this provides the faciility to implement this in a simpler way, with better safety (no locks for readers!) and performance when the use-case is mostly reading. The table events and "query invalidation" mechanisms provide an alternative to event streams and synchronous APIs to affect changes from one sub-system to another and enable an architecture where work can be done in larger batches and asynchronously without direct coupling between sub-systems.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 22, 2023
@joamaki joamaki added the release-note/misc This PR makes changes that have no direct user impact. label Mar 22, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Mar 22, 2023
@joamaki joamaki force-pushed the pr/joamaki/state-package branch 3 times, most recently from 2daf7c2 to 115dd57 Compare March 22, 2023 16:34
@joamaki joamaki changed the title state: An in-memory database statedb: An in-memory database Mar 22, 2023
@joamaki joamaki marked this pull request as ready for review March 22, 2023 16:44
@joamaki joamaki requested review from a team as code owners March 22, 2023 16:44
@christarazi christarazi self-requested a review March 22, 2023 18:25
Copy link
Member

@dylandreimerink dylandreimerink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am interested to see if this is going to live up the expectation. I have high hopes.

Copy link
Member

@bimmlerd bimmlerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited to see where this goes!

// (courtesy of github.com/hashicorp/go-memdb). It adapts the go-memdb library for
// use with Hive by taking the table schemas as a group values from hive and provides
// typed API (Table[Obj]) for manipulating tables. As the database is based on an
// immutable data structure, all objects inserted into the database MUST NOT be mutated!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty afraid of this causing us "spook action at a distance" bugs. Without really thinking about the feasability - it'd be very nice to have something akin to the deadlock detecting lock library. Mabye a mutation detecting wrapper for objects stored in the DB, enabled through some debug build trickery? Or static analysis that looks at write txns?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty afraid too, though since we haven't had that many bugs with cache.Store that is similarly considered immutable I'm not super worried. I added a trivial check in the commit to verify that we're not mutating the same object. We can more fancy mutation detection with something like https://github.com/zimmski/go-mutesting or similar libraries. client-go also has similar code to check that cache.Store objects aren't mutated. I'll leave integrating something like that for a later PR.

pkg/statedb/db.go Outdated Show resolved Hide resolved
@pchaigno
Copy link
Member

Have any concrete users of this API been identified already?

This allows grouping cells into a single cell, but without the scoping and
naming that comes with Module().

Signed-off-by: Jussi Maki <jussi@isovalent.com>
pkg/statedb provides the Hive with an in-memory database based on immutable radix
trees. It supports multiple tables, each with their own indexes. Tables are
registered by providing them to the statedb module as group values. The state
module provides the central sequencing and snapshotting of the database for
doing cross-table atomic transactions.

pkg/statedb/example/main.go provides a fairly full-featured example of the API
provided by the statedb module.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Add an insert and transaction benchmarks for a rough estimate
on statedb performance. On my laptop I'm getting ~350k insertions
per second (UUID primary key) and ~100k single-insert write transactions
per second.

To support dumping large databases over e.g. REST API, change the ToJSON
into WriteJSON that marshals the objects one by one into the writer rather
than collecting the objects into one large map.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
Add a simple form of mutation detection that checks (by pointer comparison)
that on modification to an existing value a DeepCopy()'d object is inserted.

Extend the tests to do modifications. Validated manually that the mutation
detection resulted in panic.

Signed-off-by: Jussi Maki <jussi@isovalent.com>
@joamaki
Copy link
Contributor Author

joamaki commented Mar 28, 2023

Have any concrete users of this API been identified already?

Over a fairly long period of time, I'm going to propose that this would be used for:

  • rewrite of service load-balancing control-plane on top of this to clean up the merging of multiple sources for frontends/backends
  • rewrite of DeviceManager, the system devices and IP addresses would be exposed via a "devices" table that can be queried and watched
  • cleaner version of pkg/k8s/resource + cache.Store. Use statedb table + Reflector instead of informer and cache.Store.
  • cleaner more performant version of pkg/kvstore/store that doesn't do a full periodic sync and provides better querying capabilities
  • rewriting policy/ipcache/identity machinery to once and for all fix the deadlock issues arising from use of multiple mutexes and exposed internals.
  • rewriting pkg/endpoint and splitting it up into control-plane code that computes the desired state and datapath code that applies desired state (loading programs, updating per-endpoint maps).
  • resilient management of ephemeral kernel state: e.g. management of route tables would be done with a statedb table describing the desired state and a reconciliation loop
  • any new code that would otherwise reach for map(s) + mutexes

Immediate follow up PRs I'm planning to make this a more useful foundation:

  • REST API handlers and cilium utility support for querying, dumping and watching the state (like "kubectl get" for cilium-agent internal state)
  • Persistence support in the form of a commit hook to write specific tables' state to disk and restore on startup

Copy link
Member

@rolinh rolinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can't be adding this dependency: hashicorp/go-memdb. It is licensed under the Mozilla Public License 2.0 which isn't a license approved by the CNCF (see approved licenses). It is also not in the list of license exceptions.

Copy link
Member

@rolinh rolinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed the licensing question offline. We will ask the CNCF to add go-memdb to the license exceptions list and expect it to be approved before Cilium v1.14 is released. This particular package is widely used and other popular MPL2.0 libraries from Hashicorp have already been added to the exception list.

Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple of non-blocking nits left inline. LGTM 🚀

b.Fatalf("Insert error: %s", err)
}
tx.Commit()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as before for b.StopTimer()

}
}
tx.Commit()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not much of a difference, but adding a b.StopTimer() here we'll remove the accounting of the deferred hive.Stop() in the benchmark.

@christarazi christarazi added area/misc Impacts miscellaneous areas of the code not otherwise owned by another area. area/modularization labels Mar 31, 2023
@joamaki joamaki added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 31, 2023
@joamaki
Copy link
Contributor Author

joamaki commented Mar 31, 2023

First use-case: #24677

@squeed squeed merged commit 03bc4dd into cilium:master Apr 3, 2023
42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/misc Impacts miscellaneous areas of the code not otherwise owned by another area. area/modularization ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants