
Address Tracker for subscriptions to changed UTXOs notifications #427

Merged · 135 commits · Apr 9, 2024

Conversation

@tiram88 (Collaborator) commented Feb 28, 2024

Subscriptions to UtxosChanged notifications get a deep revamp, optimizing memory consumption (significantly) and execution speed (moderately).

Address Tracker

The subscription sub-system uses a new ubiquitous SubscriptionContext, containing essentially an address Tracker whose role is to track the addresses involved in UtxosChangedSubscriptions, indexing them and counting references. All subscriptions now use address indexes instead of ScriptPublicKeys, significantly reducing memory consumption.

The Tracker features an optional maximum address count. When provided, it makes it possible to (see the sketch after this list):

  • pre-allocate memory, effectively reducing memory fragmentation
  • limit and protect the node resources allocated to this feature
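
A rough sketch of the tracker's shape, assuming an indexmap-backed, reference-counted map; the field names, `register`/`unregister` helpers and the `u32` refcount type are illustrative, not the actual implementation:

```rust
use indexmap::IndexMap;

/// Stand-in for the kaspa-consensus ScriptPublicKey type.
type ScriptPublicKey = Vec<u8>;

/// Illustrative sketch of the Tracker described above.
pub struct Tracker {
    /// Each entry's position in the map is the address index that
    /// subscriptions store instead of the full ScriptPublicKey.
    script_pub_keys: IndexMap<ScriptPublicKey, u32>,
    /// Optional upper bound on the number of tracked addresses.
    max_addresses: Option<usize>,
}

impl Tracker {
    pub fn new(max_addresses: Option<usize>) -> Self {
        Self {
            // Pre-allocating up front avoids incremental re-allocations
            // and the resulting memory fragmentation.
            script_pub_keys: IndexMap::with_capacity(max_addresses.unwrap_or(0)),
            max_addresses,
        }
    }

    /// Registers an address, bumping its refcount and returning its index.
    pub fn register(&mut self, spk: ScriptPublicKey) -> usize {
        let entry = self.script_pub_keys.entry(spk);
        let index = entry.index();
        *entry.or_insert(0) += 1;
        index
    }

    /// Releases one reference held on an address index.
    pub fn unregister(&mut self, index: usize) {
        if let Some((_, count)) = self.script_pub_keys.get_index_mut(index) {
            *count = count.saturating_sub(1);
        }
    }
}
```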

Kaspad Argument

The maximum number of tracked addresses can be set with the kaspad argument --max-tracked-addresses, whose default value is 1,835,007.
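
For example, targeting 1M tracked addresses (the flag name comes from this PR; the exact invocation style may vary):

```
kaspad --max-tracked-addresses=1000000
```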

The component used internally (an IndexMap) allocates maps with a capacity computed with the following formula:

```rust
((max_addresses + 1) * 8 / 7).next_power_of_two() * 7 / 8
```

The value passed as the argument is always expanded to the actual usable allocated capacity. For a target of 1M addresses the expanded capacity is 1,835,008. The tracker reserves the last entry for address recycling, so the actual maximum is 1,835,007.
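
A quick way to verify these numbers; the helper name `expanded_capacity` is ours, only the formula itself is quoted from the implementation:

```rust
/// Illustrative helper reproducing the capacity formula quoted above.
fn expanded_capacity(max_addresses: usize) -> usize {
    // IndexMap's hash table keeps a maximum load factor of 7/8, so an
    // allocation of 2^n buckets holds at most 2^n * 7 / 8 entries; the
    // requested count is rounded up to the matching power of two.
    ((max_addresses + 1) * 8 / 7).next_power_of_two() * 7 / 8
}

fn main() {
    let capacity = expanded_capacity(1_000_000);
    assert_eq!(capacity, 1_835_008); // expanded capacity for a 1M target
    assert_eq!(capacity - 1, 1_835_007); // one entry reserved for recycling
}
```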

Subscription propagation and memory optimizations

Every Notifier is provided a new UtxosChangedMutationPolicy that defines how incoming UtxosChangedScope mutations are propagated to its parents. The policy supports two modes: fine-grained AddressSet or all-or-nothing Wildcard.
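
A sketch of what the policy amounts to, assuming a plain two-variant enum (the actual definition in the notify crate may differ in detail):

```rust
/// The two propagation modes described above, sketched as an enum.
pub enum UtxosChangedMutationPolicy {
    /// Propagate the exact address set of each mutation to the parent,
    /// keeping upstream subscriptions fine-grained.
    AddressSet,
    /// Collapse every mutation into a single all-or-nothing subscription.
    Wildcard,
}
```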

The notification system is configured to propagate address sets upstream from all RPC servers up to the Index Processor. This appears to be the optimal balance of memory consumption and processing effort between upstream subscriptions and downstream notifications.

The processing of an incoming UtxosChangedScope mutation with its embedded address vector is refactored so that the same vector is retained all the way from the RPC server up to the index processor, helping reduce memory fragmentation.

A Notifier and its Broadcasters now share the same subscription instances, significantly reducing the memory allocated for large address sets.

UtxosChangedSubscription

The struct is refactored so that its data field (containing the state and the address index set) has interior mutability, while its listener ID (used to identify the instance, notably for equality) is immutable.

A subscription has 3 states: None, Selected and All. The last two indicate an active subscription; Selected carries an explicit address set.
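
A sketch of the resulting shape; field names, the `u64` listener id type and the choice of `RwLock` are assumptions, not the actual definitions:

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Inner-mutable data behind a lock, immutable listener id.
pub struct UtxosChangedSubscription {
    /// State and address index set, mutable without replacing the instance.
    data: RwLock<UtxosChangedSubscriptionData>,
    /// Immutable id identifying the listener, notably used for equality.
    listener_id: u64,
}

pub struct UtxosChangedSubscriptionData {
    state: UtxosChangedState,
    /// Address indexes into the Tracker (rather than full ScriptPublicKeys).
    indexes: HashSet<u32>,
}

/// The three states described above; `Selected` and `All` are active.
#[derive(Clone, Copy, PartialEq, Eq)]
pub enum UtxosChangedState {
    None,
    Selected,
    All,
}

impl PartialEq for UtxosChangedSubscription {
    fn eq(&self, other: &Self) -> bool {
        // Equality relies on the immutable listener id, not the address set.
        self.listener_id == other.listener_id
    }
}
impl Eq for UtxosChangedSubscription {}
```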

In a Broadcaster, subscriptions in the Selected state are used as-is in the broadcasting Plan, meaning that two listeners in this state are always considered distinct. The rationale for not comparing their respective address sets for possible equality is that equality has a probability close to zero while the computational cost of the comparison is high.

In a Broadcaster, all subscriptions in the All state are grouped under a unique ubiquitous instance provided by the SubscriptionContext, which allows mutating the state of a subscription without affecting the broadcasting Plan.
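
Continuing the sketch above, the grouping could look like this; `utxos_changed_all` and `plan_subscription` are hypothetical names, not the actual API:

```rust
use std::sync::Arc;

/// Hypothetical context holding the shared "All" subscription instance.
pub struct SubscriptionContext {
    pub utxos_changed_all: Arc<UtxosChangedSubscription>,
}

impl SubscriptionContext {
    /// Choose the instance a listener contributes to the broadcasting Plan:
    /// `All` listeners collapse onto the one shared instance, `Selected`
    /// listeners stay distinct, `None` listeners are absent from the Plan.
    pub fn plan_subscription(
        &self,
        sub: &Arc<UtxosChangedSubscription>,
    ) -> Option<Arc<UtxosChangedSubscription>> {
        match sub.data.read().unwrap().state {
            UtxosChangedState::None => None,
            UtxosChangedState::Selected => Some(sub.clone()),
            UtxosChangedState::All => Some(self.utxos_changed_all.clone()),
        }
    }
}
```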

Benchmarking

New framework

A new test framework allows the quick and idiomatic setup of an integration test featuring, e.g., a node, a miner and any additional tasks the test needs, such as submitting large sets of transactions (mempool benchmarking) or running cycles of subscriptions/unsubscriptions for many clients at once.

As an example, the mempool benchmark setup is refactored as:

```rust
// Run a daemon plus a miner group and a transaction-sender group,
// then wait for the benchmark to complete.
let client_manager = Arc::new(ClientManager::new(args));
let mut tasks = TasksRunner::new(Some(DaemonTask::build(client_manager.clone())))
    .launch()
    .await
    .task(
        // Group of clients mining and submitting blocks
        MinerGroupTask::build(
            network,
            client_manager.clone(),
            SUBMIT_BLOCK_CLIENTS,
            params.bps(),
            BLOCK_COUNT,
            Stopper::Signal,
        )
        .await,
    )
    .task(
        // Group of clients submitting transactions under mempool pressure
        TxSenderGroupTask::build(
            client_manager.clone(),
            SUBMIT_TX_CLIENTS,
            false,
            txs,
            TPS_PRESSURE,
            MEMPOOL_TARGET,
            Stopper::Signal,
        )
        .await,
    );
tasks.run().await;
tasks.join().await;
```

Memory benchmarking of UTXOs changed subscriptions

A set of benchmarks is added, dedicated to measuring memory consumption in various UTXOs changed subscription setups. In all cases, 500 clients connect to a mining node processing ~100 TPS. 495 clients are considered "wallets" and each subscribe to 800 unique addresses. The remaining 5 clients are "monitoring services" that subscribe to address sets ranging from 200K to 1M addresses in 200K increments, mostly overlapping with the wallet addresses.

Clients run subscription cycles of a predefined duration: during 2/3 of the cycle the client is subscribed to its addresses, and during the remaining 1/3 it is fully unsubscribed.
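
An illustrative loop for one benchmark client; `subscribe`/`unsubscribe` are placeholders standing in for the actual gRPC client calls:

```rust
use std::time::Duration;
use tokio::time::sleep;

/// Run one client's subscription cycle indefinitely: subscribed for 2/3 of
/// the cycle, fully unsubscribed for the remaining 1/3.
async fn subscription_cycle(
    mut subscribe: impl FnMut(),
    mut unsubscribe: impl FnMut(),
    cycle: Duration,
) {
    loop {
        subscribe();
        sleep(cycle.mul_f64(2.0 / 3.0)).await; // subscribed phase
        unsubscribe();
        sleep(cycle.mul_f64(1.0 / 3.0)).await; // unsubscribed phase
    }
}
```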

The benchmark runs until 1.5M transactions are mined. The cycle duration appears to have a moderate impact on the overall benchmark duration. Sending and processing all 500 subscriptions in a row slows down the node TPS a bit for ~30 seconds, probably because the network bandwidth and the gRPC server get briefly saturated.

The benchmark is structured as a client process spawning a child server process. The parent process runs all client activities: mining, submitting transactions and the 500 gRPC clients subscribing to UtxosChanged notifications cyclically. The server process runs a node. This architecture has the desired property that the client and the server each run in isolation, in particular each managing its own memory.

Memory consumption of the server process is sampled every 5 seconds throughout the benchmark run. An average is then computed over the data, excluding the first 60 minutes, which are considered warmup.

The various benchmark setups are:

| # | Subscription cycle | Average memory consumption (GB) | Delta vs A (GB) |
|---|--------------------|---------------------------------|-----------------|
| A | No subscriptions at all (control sample) | 2.7 | |
| B | Single cycle of infinite duration (no unsubscribing) | 3.1 | 0.5 |
| C | 2-hour cycles | 3.8 | 1.1 |
| D | 30-minute cycles | 4.0 | 1.3 |
| E | 3-minute cycles | 5.4 | 2.8 |

@tiram88 tiram88 marked this pull request as ready for review March 2, 2024 19:36
@coderofstuff (Collaborator) left a comment:
Still reviewing but I thought I should get some of these out.

@michaelsutton (Contributor) left a comment:

Just a few initial comments on inner tracker logic. Still reviewing otherwise.

Resolved review threads:
  • notify/src/address/tracker.rs (3 threads)
@coderofstuff (Collaborator) left a comment:

Lol, I had comments here but it looks like I never submitted them.

Resolved review threads:
  • testing/integration/src/subscribe_benchmarks.rs (outdated)
  • kaspad/src/daemon.rs
  • notify/src/address/tracker.rs (outdated)
  • notify/src/subscription/compounded.rs (2 threads)
  • notify/src/subscription/single.rs (2 threads)
  • rpc/grpc/client/src/lib.rs (2 threads, outdated)
@michaelsutton (Contributor) left a comment:

Approving this, but there are a few technical debts to address in follow-up PRs:

  1. Change UtxosChangedScope to an enum and allow a variant carrying a vector of indexes. This way internal mutations can avoid translating back and forth to an address. This is both an optimization and a way to avoid the hacked usage of the mainnet address prefix in the translation.
  2. In GrpcClient, obtain the network from the server on start-up and use it to maintain the correct address prefix (important for re-registration on client re-connection). This too will avoid the usage of the mainnet address prefix during re-connection.
  3. In AddressTracker, use a Guard-style mechanism to manage registrations. The guard will allow modifying the set of registered indexes but, most importantly, will unregister on Drop, which makes it safer and will avoid future errors (a possible shape is sketched below).
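
A purely illustrative shape for the guard in item 3, reusing the Tracker sketch from earlier in this page; the `Mutex` wrapping and field names are assumptions:

```rust
use std::sync::{Arc, Mutex};

/// Sketch of a Guard-style registration: the guard owns the registered
/// indexes and releases them automatically on Drop.
pub struct TrackerGuard {
    tracker: Arc<Mutex<Tracker>>,
    indexes: Vec<usize>,
}

impl Drop for TrackerGuard {
    fn drop(&mut self) {
        let mut tracker = self.tracker.lock().unwrap();
        for &index in &self.indexes {
            // Forgetting to unregister can no longer leak tracker entries.
            tracker.unregister(index);
        }
    }
}
```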

@michaelsutton michaelsutton merged commit 5c437cb into kaspanet:master Apr 9, 2024
6 checks passed