RWDB (Read-Write Database) In-Memory Node Store #559
shortthefomo
started this conversation in
XLS Proposals
RWDB (Read-Write Database) In-Memory Node Store
1. Abstract
RWDB is an in-memory node store backend for xrpld that eliminates all persistent node-store disk I/O by operating in "null-backend" mode. In this mode, fetch() operations against the node store always return notFound and store() operations are no-ops. Ledger state is retained entirely through shared_ptr chains linking Ledgers to SHAMaps, with a configurable sliding window of recent ledgers kept resident in memory.
The feature provides a lighter, faster node configuration suitable for validators, submission nodes, pathfinding nodes, and any workload that requires fast access to recent transactions without historical data retention on disk. A synced RWDB node typically consumes 11-13GB of memory.
Index
2. Motivation
Traditional xrpld nodes require fast, expensive SSD storage for the node database (RocksDB, SQLite, or NuDB). The node store is responsible for persisting SHAMap nodes (ledger state and transaction data), which generates significant disk I/O during validation and ledger building.
Current limitations of traditional node store backends:
RWDB addresses these gaps by providing:
3. Introduction
RWDB introduces a fundamentally different approach to node storage: instead of persisting SHAMap nodes to disk, it relies entirely on in-memory shared_ptr reference chains. When a ledger is validated or received, its SHAMap structure is pinned in memory via the ledger's internal pointers. A sliding window of recent ledgers maintains these pointers, preventing garbage collection of active state.
This approach trades historical data persistence for operational simplicity and performance. On restart, an RWDB node must resync from peers, but this is acceptable for workloads that prioritize current-state validation over historical queries.
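The pinning idea above can be illustrated with a minimal C++ sketch. It is not the xrpld implementation: StateMap, LedgerSketch, RetentionWindow, and makeLedger are hypothetical stand-ins, and the only point shown is that dropping a ledger from the window releases whatever only that ledger was keeping alive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the real SHAMap and Ledger classes.
struct StateMap
{
    std::uint32_t seq;
};

struct LedgerSketch
{
    std::uint32_t seq;
    std::shared_ptr<StateMap> stateMap;  // pins the map while the ledger lives
};

// Keeps at most `capacity` ledgers alive. Older ledgers are dropped, which
// releases the last strong reference to anything only they were pinning.
class RetentionWindow
{
public:
    explicit RetentionWindow(std::size_t capacity) : capacity_(capacity) {}

    void retain(std::shared_ptr<LedgerSketch> ledger)
    {
        window_.push_back(std::move(ledger));
        while (window_.size() > capacity_)
            window_.pop_front();
    }

    std::size_t size() const { return window_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::shared_ptr<LedgerSketch>> window_;
};

// Helper for experiments: builds a ledger whose map nobody else owns.
inline std::shared_ptr<LedgerSketch> makeLedger(std::uint32_t seq)
{
    return std::make_shared<LedgerSketch>(
        LedgerSketch{seq, std::make_shared<StateMap>(StateMap{seq})});
}
```

Holding a std::weak_ptr to a ledger's map and retaining more ledgers than the window capacity is a simple way to observe the release: the weak pointer expires once the owning ledger leaves the window.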
3.1. Terminology
Core Concepts:
- [...] fetch() and store() operations. fetch() always returns NotFound and store() is a no-op. This is the default and only supported mode for RWDB.
- [...] fullyWired_ flag on the Ledger object.
- [...] visitDifferences), rather than walking the entire SHAMap tree. Used to efficiently wire new ledgers when a fully-wired base ledger exists.
- [...] ledger_history. Older ledgers are released, allowing their SHAMap nodes to be garbage collected.
Components:
- [...] NodeStore::Backend, responsible for node object storage (or short-circuiting in null mode).
- [...] RelationalDatabase, responsible for ledger header and transaction data storage.
- [...] std::vector storage.
3.2. Scope
This XLS specifies the following components and behaviors:
- [...] NodeStore::Backend with null-backend mode short-circuiting.
- [...] RelationalDatabase for ledger headers, transactions, and account-tx indexes.
- [...] Ledger objects for delta walk optimization.
- [...] clearCaches() to prevent irrecoverable data loss.
3.3. Non-Goals
- [...] ledger_history window. Archive nodes require disk-backed storage.
4. Specification
4.1. Overview
RWDB introduces two coordinated in-memory backends:
- RWDB Node Store Backend (RWDBFactory / RWDBBackend): An in-memory implementation of the NodeStore::Backend interface using std::map<uint256, shared_ptr<NodeObject>> for storage. In null-backend mode (the default and only supported mode), fetch() always returns Status::NotFound and store() is a no-op.
- RWDB Relational Database Backend (RWDBDatabase): An in-memory implementation of the RelationalDatabase interface using std::map containers for ledger headers, transactions, and account-transaction indexes.
4.2. Configuration
RWDB is activated by setting type=rwdb in the [node_db] configuration section. This automatically enables null-backend mode; no additional configuration keys or environment variables are required.
Required constraints:
- ledger_history must be greater than 0.
- online_delete is automatically defaulted to ledger_history if not explicitly set, clamped to the minimum deletion interval.
Environment Variable:
Setting type=rwdb automatically sets the XRPL_RWDB_NULL environment variable to "1", which is used by libxrpl helpers (which cannot access Config) to detect null mode. The parsing logic is:
- "0": null mode explicitly disabled
- [...] "1", "true"): null mode enabled
4.3. RWDB Node Store Backend
The RWDBBackend class implements the NodeStore::Backend interface:
Storage:
- std::map<uint256 const, std::shared_ptr<NodeObject>> for in-memory storage.
- [...] reader_preferring_shared_mutex (see Section 4.7).
Null Mode Behavior:
- fetch(): Returns Status::NotFound immediately when XRPL_RWDB_NULL is set.
- store(): Returns immediately without storing when XRPL_RWDB_NULL is set.
- fetchBatch(): Iterates hashes and calls fetch() for each, respecting null mode.
- storeBatch(): Calls store() for each object in the batch, respecting null mode.
Lifecycle:
- open(): Sets the internal isOpen_ flag; throws if already open.
- close(): Swaps the internal map to a local variable (O(1)), then destroys it outside the lock, preventing fetch() calls from being blocked by map destruction.
4.4. RWDB Relational Database
The RWDBDatabase class implements the RelationalDatabase interface with in-memory storage:
Data Structures:
- ledgers_ (std::map<LedgerIndex, LedgerData>): stores ledger headers and transaction data per sequence.
- ledgerHashToSeq_ (std::map<uint256, LedgerIndex>): maps ledger hash to sequence number.
- transactionMap_ (std::map<uint256, AccountTx>): global transaction lookup by hash.
- accountTxMap_ (std::map<AccountID, AccountTxData>): per-account transaction index, keyed by ledger sequence.
Operations:
- getMinLedgerSeq() / getMaxLedgerSeq(): Return first/last ledger sequences from ledgers_.
- getTransactionsMinLedgerSeq(): Returns minimum sequence with non-empty transactions.
- getAccountTransactionsMinLedgerSeq(): Returns minimum sequence across all account transaction maps.
- deleteBeforeLedgerSeq(): Removes ledgers and associated transaction data below a threshold.
- deleteTransactionByLedgerSeq(): Purges transactions for a specific ledger sequence.
- saveValidatedLedger(): Stores ledger header and transactions in memory.
- getLedgerInfoByIndex(): Retrieves ledger header by sequence number.
4.5. In-Memory Peer Finder Store
The InMemoryStore class replaces the SQLite-based peer finder store for RWDB mode:
Storage:
- entries_ (std::vector<Entry>): holds peer endpoint and valence data.
Operations:
- load(): Iterates stored entries and invokes the callback for each.
- save(): Replaces the entire entry list with the provided vector.
4.6. SHAMap Store Integration
When type=rwdb is detected in SHAMapStoreImp:
- Non-rotating Database: A plain (non-rotating) NodeStore::Database is created via makeNodeStore(). No DatabaseRotatingImp is instantiated, and dbRotating_ remains nullptr.
- Rotation Thread Behavior: The rotation cycle skips clearCaches(), makeBackendRotating(), and rotate() entirely. Only SQLite cleanup (clearPrior) runs during the rotation cycle. This is critical because the TreeNodeCache IS the node store in null mode, and evicting it would cause irrecoverable SHAMapMissingNode errors.
- online_delete Default: When not explicitly configured, online_delete defaults to max(ledger_history, minimum_deletion_interval).
4.7. Reader-Preference Shared Mutex
A new reader_preferring_shared_mutex utility provides cross-platform reader-preferring semantics:
- [...] pthread_rwlock_t initialized with PTHREAD_RWLOCK_PREFER_READER_NP, ensuring readers are not starved by frequent writer contention (which occurs with the default PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP).
- [...] std::shared_mutex, which is already reader-preferring on those platforms.
The interface is identical to std::shared_mutex, supporting std::shared_lock and std::unique_lock.
A possible future extension not explored here is to prioritize writes for a validator configuration.
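A rough sketch of such a wrapper is shown below. It is an illustration under stated assumptions, not the proposal's code: the class name reader_preferring_sketch is hypothetical, error codes from the pthread calls are ignored for brevity, and the glibc-specific pthread_rwlockattr_setkind_np call is used to request the reader-preferring kind explicitly instead of relying on a platform default. On other platforms the sketch simply aliases std::shared_mutex.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>

#if defined(__linux__) && defined(__GLIBC__)
#include <pthread.h>

// Hypothetical wrapper exposing the std::shared_mutex member functions on
// top of a pthread_rwlock_t created with the reader-preferring kind.
class reader_preferring_sketch
{
public:
    reader_preferring_sketch()
    {
        pthread_rwlockattr_t attr;
        pthread_rwlockattr_init(&attr);
        // Non-portable glibc extension: select the lock kind explicitly.
        pthread_rwlockattr_setkind_np(&attr, PTHREAD_RWLOCK_PREFER_READER_NP);
        pthread_rwlock_init(&lock_, &attr);
        pthread_rwlockattr_destroy(&attr);
    }

    ~reader_preferring_sketch() { pthread_rwlock_destroy(&lock_); }

    reader_preferring_sketch(reader_preferring_sketch const&) = delete;
    reader_preferring_sketch& operator=(reader_preferring_sketch const&) = delete;

    // Exclusive (writer) side, as used by std::unique_lock.
    void lock() { pthread_rwlock_wrlock(&lock_); }
    bool try_lock() { return pthread_rwlock_trywrlock(&lock_) == 0; }
    void unlock() { pthread_rwlock_unlock(&lock_); }

    // Shared (reader) side, as used by std::shared_lock.
    void lock_shared() { pthread_rwlock_rdlock(&lock_); }
    bool try_lock_shared() { return pthread_rwlock_tryrdlock(&lock_) == 0; }
    void unlock_shared() { pthread_rwlock_unlock(&lock_); }

private:
    pthread_rwlock_t lock_;
};
#else
// Fallback for platforms without the glibc extension.
using reader_preferring_sketch = std::shared_mutex;
#endif
```

Because the member functions mirror std::shared_mutex, std::shared_lock<reader_preferring_sketch> and std::unique_lock<reader_preferring_sketch> work without further adaptation.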
4.8. Ledger Fully-Wired Tracking
Ledgers now track a fullyWired_ atomic boolean flag:
- isFullyWired(): Returns whether all SHAMap nodes have been pinned in memory.
- setFullyWired(): Marks the ledger as fully wired (idempotent).
- fullWireForUse(): Forces a complete walk of state and tx maps to pin all nodes, then sets the wired flag. Returns false if a SHAMapMissingNode is encountered.
The genesis ledger is marked as fully wired by default. In null-backend mode, this flag is used to determine if a delta walk can be used for wiring (faster) versus a full tree walk (expensive on mainnet with 70M+ leaves).
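The three operations can be sketched as follows. This is a simplified illustration, not the Ledger class itself: WiredLedgerSketch and MissingNodeSketch are hypothetical names, the tree walk is injected as a callback, and skipping the walk when the flag is already set is an assumption made for the sketch.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <stdexcept>
#include <utility>

// Hypothetical stand-in for SHAMapMissingNode.
struct MissingNodeSketch : std::runtime_error
{
    MissingNodeSketch() : std::runtime_error("missing node") {}
};

class WiredLedgerSketch
{
public:
    // `walk` stands in for a full traversal of the state and tx maps and is
    // expected to throw MissingNodeSketch when a node cannot be reached.
    explicit WiredLedgerSketch(std::function<void()> walk)
        : walk_(std::move(walk))
    {
    }

    bool isFullyWired() const
    {
        return fullyWired_.load(std::memory_order_acquire);
    }

    // Idempotent: storing true repeatedly leaves the flag set.
    void setFullyWired()
    {
        fullyWired_.store(true, std::memory_order_release);
    }

    bool fullWireForUse()
    {
        if (isFullyWired())
            return true;  // assumption: nothing left to walk
        try
        {
            walk_();
        }
        catch (MissingNodeSketch const&)
        {
            return false;  // flag stays unset on failure
        }
        setFullyWired();
        return true;
    }

private:
    std::function<void()> walk_;
    std::atomic<bool> fullyWired_{false};
};
```

An atomic flag keeps the check cheap for callers that only need to decide between a delta walk and a full walk, without taking a lock on the ledger.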
4.9. Ledger Retention and Delta Walk Wiring
LedgerMaster Sliding Window:
- mRetainedLedgers maintains a sliding window of N recent ledgers (configurable via ledger_history).
- getClosestFullyWiredLedger() efficiently finds the closest fully-wired ledger on the same network as a target, used as a base for delta walks.
InboundLedger Delta Walk:
When an inbound ledger needs to be wired in null-backend mode:
- [...] visitDifferences) against the base, which only visits changed state nodes.
- [...] canonicalizeChild, and only wire the small tx map.
- [...] fullWireForUse() always returns true immediately (existing behavior unchanged).
InboundLedgers Cache:
- recentHistoryLedgers_ caches recently fetched inbound ledgers for retention.
- onLedgerFetched() passes the inbound ledger for retention in the sliding window.
4.10. SHAMapSync FullBelowCache
In null-backend mode, the shared FullBelowCache is disabled to prevent cross-SHAMap subtree-skipping (which could cause incorrect results when entries are irrecoverable). Local isFullBelow() checks are preserved for per-map efficiency.
The useFullBelowCache() helper checks XRPL_RWDB_NULL and returns false when null mode is active.
4.11. Peer Fallback
When the node store returns empty results, PeerImp falls back to:
- TreeNodeCache lookup
This ensures peers can still serve node requests even without persistent storage.
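The fallback pattern can be sketched generically as an ordered list of lookup sources where the first hit wins. Everything named here (Lookup, lookupWithFallback, nullNodeStore, cacheLookup) is hypothetical and only illustrates the ordering; the real PeerImp logic and the TreeNodeCache interface are not reproduced.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using NodeBlob = std::string;
using Lookup =
    std::function<std::optional<NodeBlob>(std::string const& hash)>;

// Tries each source in order and returns the first hit, if any.
inline std::optional<NodeBlob>
lookupWithFallback(std::vector<Lookup> const& sources, std::string const& hash)
{
    for (auto const& source : sources)
    {
        if (auto found = source(hash))
            return found;
    }
    return std::nullopt;
}

// Stand-in for a null-backend node store: it never has anything.
inline Lookup nullNodeStore()
{
    return [](std::string const&) -> std::optional<NodeBlob> {
        return std::nullopt;
    };
}

// Stand-in for an in-memory cache keyed by node hash.
inline Lookup cacheLookup(std::map<std::string, NodeBlob> cache)
{
    return [cache = std::move(cache)](
               std::string const& hash) -> std::optional<NodeBlob> {
        auto const it = cache.find(hash);
        if (it == cache.end())
            return std::nullopt;
        return it->second;
    };
}
```

Placing the node store first keeps the request path unchanged for disk-backed configurations, since the later sources are only consulted on a miss.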
4.12. Cache Eviction Prevention
clearCaches() returns early in null-backend mode. The TreeNodeCache and FullBelowCache are not evicted because entries are irrecoverable without a real backend.
5. Rationale
5.1. Why Null-Backend Mode?
The original Xahau implementation designed RWDB as a memory-only backend from the outset. The null-backend mode provides the most significant performance improvement for the target use cases by eliminating all node-store disk I/O. A persistent in-memory database would add operational complexity and external dependencies while providing minimal benefit over existing disk-backed options like RocksDB. Therefore, the RWDB type always implies null-backend mode.
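To make the short-circuit concrete, here is a minimal sketch of null-mode detection and a backend that honours it. The names parseNullMode, nullModeFromEnv, and BackendSketch are hypothetical. Two things are assumptions rather than facts from this proposal: unset, empty, and unrecognised values are treated as "off", and a non-null path is included purely for contrast even though the proposal supports only null mode.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <memory>
#include <string>
#include <utility>

enum class Status { ok, notFound };

// "1" and "true" enable null mode; "0" disables it.
// Assumption: anything else (including unset or empty) counts as disabled.
inline bool parseNullMode(char const* value)
{
    if (value == nullptr)
        return false;
    std::string const v(value);
    return v == "1" || v == "true";
}

inline bool nullModeFromEnv()
{
    return parseNullMode(std::getenv("XRPL_RWDB_NULL"));
}

class BackendSketch
{
public:
    explicit BackendSketch(bool nullMode) : nullMode_(nullMode) {}

    void store(std::string const& key, std::shared_ptr<std::string> object)
    {
        if (nullMode_)
            return;  // no-op: nothing is ever written
        table_[key] = std::move(object);
    }

    Status fetch(std::string const& key, std::shared_ptr<std::string>* out) const
    {
        if (nullMode_)
            return Status::notFound;  // short-circuit before any lookup
        auto const it = table_.find(key);
        if (it == table_.end())
            return Status::notFound;
        *out = it->second;
        return Status::ok;
    }

private:
    bool nullMode_;
    std::map<std::string, std::shared_ptr<std::string>> table_;
};
```

In null mode the map is never touched, so the backend costs neither memory nor lock time beyond the flag check.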
5.2. Why Delta Walk Instead of Full Tree Walk?
On mainnet, a full SHAMap tree walk touches 70M+ leaves, which takes longer than a consensus round. Delta walks against a recent fully-wired base touch only the changed nodes (typically thousands), making wiring fast enough to keep up with consensus.
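The difference in work can be illustrated with a toy hash tree. This is a sketch of the general idea only, not SHAMap or visitDifferences: TreeNode, fullWalk, deltaWalk, leaf, and inner are hypothetical, and the hash strings are plain labels standing in for real content hashes. The delta walk skips any subtree whose hash matches the node at the same position in the base.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct TreeNode
{
    std::string hash;  // label standing in for the node's content hash
    std::vector<std::shared_ptr<TreeNode>> children;
};
using NodePtr = std::shared_ptr<TreeNode>;

// Visits every node: the cost of a full walk.
inline std::size_t fullWalk(NodePtr const& node)
{
    if (!node)
        return 0;
    std::size_t visited = 1;
    for (auto const& child : node->children)
        visited += fullWalk(child);
    return visited;
}

// Visits only nodes of `target` whose hash differs from the node at the
// same position in `base`; an equal hash means the subtree is unchanged.
inline std::size_t deltaWalk(NodePtr const& base, NodePtr const& target)
{
    if (!target)
        return 0;
    if (base && base->hash == target->hash)
        return 0;
    std::size_t visited = 1;
    for (std::size_t i = 0; i < target->children.size(); ++i)
    {
        NodePtr baseChild;
        if (base && i < base->children.size())
            baseChild = base->children[i];
        visited += deltaWalk(baseChild, target->children[i]);
    }
    return visited;
}

inline NodePtr leaf(std::string hash)
{
    return std::make_shared<TreeNode>(TreeNode{std::move(hash), {}});
}

inline NodePtr inner(std::string hash, std::vector<NodePtr> kids)
{
    return std::make_shared<TreeNode>(
        TreeNode{std::move(hash), std::move(kids)});
}
```

For a tree where one leaf changed, the delta walk touches only the path from the root to that leaf, while the full walk touches every node regardless of how little changed.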
5.3. Why Disable Shared FullBelowCache?
The shared FullBelowCache allows different SHAMap syncs to share "full below" information. In null-backend mode, where entries cannot be recovered from disk, incorrect cache state could cause permanent data loss. Disabling the shared cache while keeping per-map checks provides safety without significant performance impact.
5.4. Why Reader-Preference Mutex?
Linux's default pthread_rwlock_t prefers writers, causing reader starvation under write contention. The RWDB backend has frequent reads (fetch operations) and periodic writes (store operations), making reader preference the correct choice for throughput.
As noted in Section 4.7, this preference could later be made adjustable for validators; that is not currently implemented.
5.5. Alternate Designs Rejected
6. Backwards Compatibility
This feature introduces no backwards incompatibilities:
- [...] type=rwdb).
- The XRPL_RWDB_NULL environment variable is internal and does not affect network behavior.
7. Test Plan
This feature is accompanied by comprehensive unit tests across the rippled codebase. The test strategy validates both correctness of the in-memory backends and proper integration with the SHAMap store, ledger management, and peer protocols.
7.1. Backend Correctness
- [...] XRPL_RWDB_NULL environment variable parsing (unset, "1", "true", "0", empty string). Verifies null-mode short-circuiting of fetch/store operations. Tests cleanup regression to ensure environment variable state is not polluted between test runs.
7.2. Relational Database Correctness
- [...] RWDBDatabase implements the RelationalDatabase interface correctly.
- [...] account_tx RPC with RWDB relational backend. Verifies transaction pagination and filtering work correctly with in-memory storage.
- [...] tx RPC with RWDB relational backend. Verifies range requests and CTID (Concise Transaction Identifier) resolution.
7.3. SHAMap Store Integration
- [...] online_delete configured. Verifies that rotation skips cache clearing and backend rotation in null mode. Tests ledger retrieval via RPC after the rotation cycle completes.
7.4. Ledger Wiring and Retention
- [...] setFullyWired() idempotency (calling multiple times produces the same result). Tests fullWireForUse() success path and thread-safe concurrent access to the wired flag.
- [...] ledger_history > 0 requirement is enforced.
7.5. Concurrency and Infrastructure
- [...] try_lock_shared() and try_lock() behavior. Validates reader preference under contention (readers are not starved by frequent writer contention).
8. Operational Considerations
8.1. Deployment Recommendations
8.2. Configuration Guidance
- ledger_history tuning: Set ledger_history based on available memory and query requirements. Higher values provide a larger retention window but increase memory usage proportionally. At the typical mainnet close interval of 3-5 seconds, a value of 512 (default) covers roughly 25-45 minutes of history.
- online_delete behavior: When not explicitly configured, online_delete defaults to max(ledger_history, minimum_deletion_interval). This ensures old ledger data is cleaned up from the relational database in sync with the sliding window.
- advisory_delete: Recommended to set to 0 for RWDB, as there is no persistent node store data to advisory-delete.
8.3. Monitoring and Alerting
- [...] ledger_history before OOM kills occur.
8.4. Recovery Procedures
- [...] ledger_history, or adjusting cgroup limits before restarting.
- [...] type=rwdb to type=rocksdb (or desired backend) in [node_db], then restart. A full resync will be required as no persistent node store data exists to migrate.
- [...] type=rocksdb to type=rwdb in [node_db], then restart. The existing RocksDB data files will be ignored; the node will resync from peers.
9. Performance Analysis
9.1. Memory Footprint
RWDB memory consumption.
Memory usage scales with:
- ledger_history value (higher = more retained ledgers = more memory).
9.2. Sync Timing
RWDB nodes must resync from peers after every restart, as no persistent node store data survives process termination:
Factors affecting sync speed:
9.3. Throughput and Latency
Compared to disk-backed node stores (RocksDB, SQLite):
- [...] store() is a no-op.
- [...] ledger_history)
These are representative estimates based on the architectural differences between in-memory and disk-backed storage. Actual measurements will vary by hardware, workload, and configuration.
10. Security Considerations
10.1. Data Loss on Restart
RWDB nodes store all node data in memory. On restart, the node must resync from peers. This is a design trade-off, not a vulnerability, but operators should understand the implications:
Impact:
- [...] ledger_history window is unavailable.
Mitigations:
10.2. Memory Exhaustion
The in-memory storage grows with the number of retained ledgers and SHAMap nodes pinned in memory.
Attack Scenarios:
Mitigations:
- [...] ledger_history appropriately for available memory; lower values reduce memory footprint.
10.3. No Attack Surface Expansion
RWDB does not introduce new network protocols, RPC methods, or transaction types. The feature is purely an internal storage backend change, visible only through configuration. Peer interactions remain identical to other node types.
10.4. Cache Integrity
Disabling the shared FullBelowCache in null mode prevents potential cache poisoning across SHAMap instances. The early return in clearCaches() prevents accidental eviction of irrecoverable data.
Why this matters:
- [...] SHAMapMissingNode errors that cannot be recovered.
- [...] clearCaches() to prevent this scenario.
10.5. Environment Variable Isolation
The XRPL_RWDB_NULL environment variable is set automatically by SHAMapStoreImp when type=rwdb is configured. Tests properly save and restore the environment variable state to prevent test pollution (verified by RWDBBackend_test::testNullModeCleanupRegression).
11. Appendix A: FAQ
A.1: Can I use RWDB for an archive node?
No. RWDB is designed for nodes that do not require historical data persistence beyond the ledger_history window. Archive nodes should use RocksDB or SQLite with appropriate retention settings.
A.2: What happens if the node runs out of memory?
The node will be terminated by the OS OOM killer (or container orchestrator). On restart, it will resync from peers. Operators should monitor RSS memory usage and configure ledger_history to stay within available memory bounds. Consider setting cgroup memory limits with appropriate swap behavior.
A.3: Can I switch between RWDB and RocksDB/NuDB?
Yes, by changing the type value in the [node_db] section and restarting xrpld. Note that switching to RWDB will require a full resync from peers, as RWDB does not read RocksDB/NuDB data files. Switching from RWDB to RocksDB will also require a resync, as no persistent data exists to migrate.
A.4: Does RWDB affect consensus or validation?
No. RWDB nodes participate in consensus and validation identically to other node types. The protocol behavior, ledger signing, and validation logic are unchanged. The difference is only in how node data is stored (memory vs. disk).
A.5: Can RWDB nodes serve historical ledger queries?
Only for ledgers within the ledger_history retention window. Queries for older ledgers will return errors (e.g., lgrNotFound), similar to a non-archive node configured with limited history.
A.6: Is RWDB suitable for public RPC servers?
RWDB is suitable for RPC servers that primarily serve recent data (e.g., transaction submission, account balances, recent ledgers, pathfinding). It is not suitable for servers that need to serve historical data queries (e.g., account_tx for old ledgers, historical ledger state lookups).
A.7: How long does resync take after a restart?
Resync time depends on network conditions and peer availability. After a quick restart, NuDB/RocksDB nodes typically rejoin the network faster because they still hold valid state on disk, whereas an in-memory node starts with no prior history. On mainnet, an RWDB node typically syncs at around ledger index ~125 (its count starts at 0 because it has no state).
A.8: Can I run multiple RWDB nodes in a cluster?
Yes, but each node maintains its own independent in-memory state. There is no built-in replication or state sharing between RWDB nodes. For high availability, deploy multiple independent RWDB nodes behind a load balancer.
A.9: Does RWDB work with Clio?
Yes. RWDB is an excellent backend for xrpld instances that are paired with Clio. Clio handles historical data queries and persistence, while the RWDB-backed xrpld focuses on validation and recent ledger processing.
A.10: What is the minimum recommended memory for an RWDB node?
For mainnet, a synced RWDB node typically consumes 11-13GB of RAM. We recommend deploying with at least 16GB to provide headroom for peak usage during sync and ledger processing. For testnet/devnet, significantly less memory is required.
A.11: Can I use RWDB on testnet or devnet?
Yes. RWDB works on any XRPL network. The memory footprint will be proportional to the network's ledger size. Testnet and devnet have much smaller ledgers, requiring less memory.
A.12: Can I use RWDB for local development?
Yes. RWDB works on any XRPL network. The absence of node-store disk load can also help SSD lifetimes, since SSDs tolerate only a limited number of write cycles.
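For local experiments, a minimal configuration fragment might look like the following. It uses only the type=rwdb key from Section 4.2 and the ledger_history value cited in Section 8.2, and assumes the usual xrpld layout in which ledger_history is its own stanza. All other stanzas a working config needs (ports, peers, validators, and so on) are omitted.

```ini
[node_db]
type=rwdb

[ledger_history]
512
```

Lowering the ledger_history value is the simplest way to reduce memory use on a development machine, as long as it stays greater than 0.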
A.13: Does RWDB affect peer-to-peer node propagation?
No. RWDB nodes participate in the peer overlay network identically to other node types. Node requests that cannot be satisfied from memory will fall back to the TreeNodeCache or in-memory ledgers, then propagate to peers if unavailable locally.
A.14: What happens during a network amendment with RWDB?
RWDB nodes process amendments identically to other node types. The amendment process is independent of the node store backend. If an amendment requires ledger state that is outside the retention window, the node will need to fetch it from peers (same as any non-archive node).
A.15: Does RWDB persist its state to disk on shutdown?
No. The current implementation does not persist state on node shutdown, although doing so could speed up a node's restart. This is left to future work.
Acknowledgements
I would like to thank the Xahau project for their pioneering implementation of in-memory RWDB backends, which served as the foundation for this feature. Additionally, I thank the XRPL community members and node operators who provided feedback on the design and use cases during development.
Additional Links