feat: introduce Module framework for persistent storage modules #407
Closed
zhanglei1949 wants to merge 2 commits into
Add the Module abstract interface, ModuleFactory singleton registry, and ModuleDescriptor metadata struct. These provide the infrastructure that the upcoming checkpoint / snapshot store work will use to persist and restore extensible storage modules in a type-erased way.

Module declares four lifecycle hooks (Open / Dump / Close / Fork) that operate against a Checkpoint instance. ModuleFactory maps short type-name strings to creator functions for deserialization, with NEUG_REGISTER_MODULE / NEUG_REGISTER_TEMPLATE_MODULE macros for static-init registration. ModuleDescriptor carries the metadata required to round-trip a module through JSON, including arbitrary key-value extras and recursive sub-module descriptors.

Also adds a small UUIDGenerator utility used to name per-Dump and per-Fork sub-directories under the checkpoint runtime dir, and a StorageTypeName trait that maps storage value types to factory key suffixes.

This PR introduces the framework only; no callers yet. The four base classes (CsrBase, VertexTimestamp, LFIndexer, ColumnBase) that inherit Module and the checkpoint / snapshot_store that drives them land in the follow-up workspace / checkpoint PR.
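To make the API shape easier to discuss, here is a minimal sketch of how the four hooks and the factory lookup could fit together. It is not the code in this PR: `Checkpoint`, `MemoryLevel` (including its enumerator names), and the three-field `ModuleDescriptor` are simplified stand-ins, and `CounterModule` and the `"counter"` key are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Simplified stand-ins (assumptions); the real types live in the PR and its follow-up.
struct Checkpoint { std::string runtime_dir; };
enum class MemoryLevel { kInMemory, kSyncToFile };  // enumerator names are hypothetical
struct ModuleDescriptor {
  std::string path;
  std::string module_type;
  std::size_t size = 0;
};

// Abstract base mirroring the four lifecycle hooks named in the description.
class Module {
 public:
  virtual ~Module() = default;
  virtual void Open(Checkpoint& ckp, ModuleDescriptor& desc, MemoryLevel level) = 0;
  virtual ModuleDescriptor Dump(Checkpoint& ckp) = 0;
  virtual void Close() = 0;
  virtual std::unique_ptr<Module> Fork(Checkpoint& ckp, MemoryLevel level) = 0;
};

// Singleton registry: module_type string -> creator function.
class ModuleFactory {
 public:
  using Creator = std::function<std::unique_ptr<Module>()>;
  static ModuleFactory& Instance() {
    static ModuleFactory factory;  // function-local static, safe during static init
    return factory;
  }
  bool Register(const std::string& type, Creator creator) {
    return creators_.emplace(type, std::move(creator)).second;
  }
  std::unique_ptr<Module> Create(const std::string& type) const {
    auto it = creators_.find(type);
    return it == creators_.end() ? nullptr : it->second();
  }

 private:
  std::unordered_map<std::string, Creator> creators_;
};

// Hypothetical toy module: its only state is a counter kept in the descriptor's size field.
class CounterModule : public Module {
 public:
  void Open(Checkpoint&, ModuleDescriptor& desc, MemoryLevel) override { value_ = desc.size; }
  ModuleDescriptor Dump(Checkpoint& ckp) override {
    ModuleDescriptor desc;
    desc.path = ckp.runtime_dir + "/counter";  // the real hooks write under a UUID sub-directory
    desc.module_type = "counter";
    desc.size = value_;
    return desc;
  }
  void Close() override { value_ = 0; }
  std::unique_ptr<Module> Fork(Checkpoint&, MemoryLevel) override {
    auto copy = std::make_unique<CounterModule>();
    copy->value_ = value_;
    return copy;
  }

 private:
  std::size_t value_ = 0;
};

// Static-init registration, standing in for what NEUG_REGISTER_MODULE is described as doing.
static const bool kCounterRegistered = ModuleFactory::Instance().Register(
    "counter", []() -> std::unique_ptr<Module> { return std::make_unique<CounterModule>(); });
```

A restore path would read `module_type` from a descriptor, ask the factory for an instance, and call `Open` on it; an unknown key yields `nullptr` in this sketch, though the real factory may report the error differently.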
Summary
Adds the Module abstract interface, ModuleFactory singleton registry, and ModuleDescriptor metadata struct: infrastructure for the upcoming checkpoint / snapshot store work to persist and restore extensible storage modules in a type-erased way.

This PR is pure addition with no caller. Reviewers are asked to look at the API shape rather than usage; the callers (the four Module-inheriting base classes and the Checkpoint / SnapshotStore that drive them) land in the follow-up workspace / checkpoint PR.

This is PR-B1 of the #370 split; it can be reviewed in parallel with #401 (PR-A: storage view layer refactor), since the two are independent.

What's in the framework
- Module (include/neug/storages/module/module.h): abstract base with four lifecycle hooks: Open(Checkpoint&, ModuleDescriptor&, MemoryLevel), Dump(Checkpoint&) -> ModuleDescriptor, Close(), Fork(Checkpoint&, MemoryLevel) -> unique_ptr<Module>. Both Dump and Fork are expected to write into a UUID sub-directory under the checkpoint runtime dir.
- ModuleFactory (include/neug/storages/module/module_factory.h, src/storages/module/module_factory.cc): singleton registry mapping module_type strings to creator functions. NEUG_REGISTER_MODULE(Class) / NEUG_REGISTER_TEMPLATE_MODULE(Tmpl, T) macros perform static-init registration; the template variant uses __COUNTER__ so types containing :: (e.g. std::string_view) still get a stable unique function identifier.
- ModuleDescriptor (include/neug/storages/module_descriptor.h, src/storages/graph/module_descriptor.cc): metadata struct round-trippable through rapidjson, carrying path / size / module_type plus arbitrary extra KV pairs and recursive sub-module descriptors. Pimpl is used for the sub-module map to avoid the incomplete-type issue of a self-referential unordered_map inside the struct definition.
- StorageTypeName<T> (include/neug/storages/module/type_name.h): trait mapping storage value types (int32_t, std::string_view, Date, ...) to short factory key suffixes.
- UUIDGenerator (include/neug/utils/uuid.h, src/utils/uuid.cc): small RFC-4122-style UUID generator used to name per-Dump / per-Fork sub-directories.

Wiring
- src/storages/module/CMakeLists.txt: new neug_storages_module OBJECT library (GLOB).
- src/storages/CMakeLists.txt: adds add_subdirectory(module) and wires the new OBJECT files into NEUG_STORAGES_OBJFILES.
- module_descriptor.cc is placed under src/storages/graph/ (and picked up by the existing GLOB there) because it instantiates the pimpl that wraps a self-referential map of ModuleDescriptors; keeping it in the same translation unit family avoids a circular dependency between neug_storages_module and neug_property_graph.
- uuid.cc is placed under src/utils/ (existing GLOB picks it up).

No changes to existing files other than the two-line src/storages/CMakeLists.txt edit.

Review focus
- Are the Module lifecycle hooks the right surface, given they will be driven by Checkpoint / SnapshotStore in the follow-up?
- ModuleDescriptor shape: extra_ (free-form KV) + sub_modules_ (recursive) + JSON round-trip is enough metadata for the downstream callers; reviewers may want to push back if a flatter shape would do.
- __attribute__((constructor))-based static-init registration is intentional (so callers don't need an explicit init step). The __COUNTER__ indirection in NEUG_REGISTER_TEMPLATE_MODULE is for ::-containing type names; please double-check it works on your toolchain.

Fixes #408
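For reviewers checking the `__COUNTER__` point on their toolchain, the sketch below shows the general technique in isolation. It is not the macro from this PR: the `SKETCH_*` macros, `RegisteredKeys`, `IsRegistered`, and the `Column` key are hypothetical names. It relies on `__attribute__((constructor))` (GCC / Clang) and on `__COUNTER__`, which is a common compiler extension and not standard C++.

```cpp
#include <algorithm>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical registry of factory keys; a function-local static avoids
// initialization-order problems when constructor functions run before main().
inline std::vector<std::string>& RegisteredKeys() {
  static std::vector<std::string> keys;
  return keys;
}

inline bool IsRegistered(std::string_view key) {
  const auto& keys = RegisteredKeys();
  return std::find(keys.begin(), keys.end(), key) != keys.end();
}

// Two-level concat so that the id argument is expanded before token pasting.
#define SKETCH_CONCAT_INNER(a, b) a##b
#define SKETCH_CONCAT(a, b) SKETCH_CONCAT_INNER(a, b)

// The function name is built from a numeric id, not from the type name, so a
// type spelled with "::" never has to become part of an identifier.
#define SKETCH_REGISTER_IMPL(key, id)                                              \
  __attribute__((constructor)) static void SKETCH_CONCAT(sketch_register_, id)() { \
    RegisteredKeys().push_back(key);                                               \
  }

// __COUNTER__ is expanded once per use of the macro, giving each registration
// a distinct function name within the translation unit.
#define SKETCH_REGISTER_TEMPLATE(Tmpl, T) \
  SKETCH_REGISTER_IMPL(#Tmpl "<" #T ">", __COUNTER__)

SKETCH_REGISTER_TEMPLATE(Column, std::string_view)
SKETCH_REGISTER_TEMPLATE(Column, int32_t)
```

Pasting `T` directly into the function name would fail for `std::string_view`, since `sketch_register_std::string_view` is not a valid identifier; the counter sidesteps that. Whether the identifiers stay the same across builds depends on how many earlier `__COUNTER__` uses precede the macro in each translation unit.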