RPC getPerformanceSamples: Add `numNonVoteTransaction` #29353

ilya-bobyr · 2022-12-21T05:26:48Z

Problem

Allow interested parties to see both total and non-vote transaction counts in each performance sample.

Summary of Changes

As new stats need to be stored in the blockstore database, but as they have a different binary representation they are recorded into a new column: perf_samples_v2.

An alternative would have been to make PerfSample deserialization code smart enough to allow for deserialization of both old and new sample records. But it is a non-trivial problem. serde is designed to allow for deserialization of streams, and thus, it does not provide a mechanism for backtracking. bincode does not encode enough information to know if the data being read is the v1 version of the PerfSample record or the v2 version. And rocksdb access layer is written with a pretty deep assumption that types stored in the columns can be deseraialized using serde.

In order to support deserialization of bincode encoded records based on their size, one needs to either extend bincode itself, to allow for some kind of backtracking, or the rocksdb interaction layer would need to be changed to allow for additional flexibility. Both options seems to be pretty complex, considering that they would allow an implementation that is not very extendable, and feels like a hack.

On the other hand, putting new PerfSample version into a separate column seems more robust, and will scale with no problem, allowing any number of future additions to PerfSample, should we be adding more stats.

Fixes #29159

explorer/src/providers/stats/solanaPerformanceInfo.tsx

Allow interested parties to see both total and non-vote transaction counts in each performance sample. As new stats need to be stored in the blockstore database, but as they have a different binary representation they are recorded into a new column: `perf_samples_v2`. An alternative would have been to make `PerfSample` deserialization code smart enough to allow for deserialization of both old and new sample records. But it is a non-trivial problem. `serde` is designed to allow for deserialization of streams, and thus, it does not provide a mechanism for backtracking. `bincode` does not encode enough information to know if the data being read is the v1 version of the `PerfSample` record or the v2 version. And `rocksdb` access layer is written with a pretty deep assumption that types stored in the columns can be deseraialized using `serde`. In order to support deserialization of `bincode` encoded records based on their size, one needs to either extend `bincode` itself, to allow for some kind of backtracking, or the `rocksdb` interaction layer would need to be changed to allow for additional flexibility. Both options seems to be pretty complex, considering that they would allow an implementation that is not very extendable, and feels like a hack. On the other hand, putting new `PerfSample` version into a separate column seems more robust, and will scale with no problem, allowing any number of future additions to `PerfSample`, should we be adding more stats.

CriesofCarrots · 2022-12-22T05:54:58Z

Thanks for working on this, @ilya-bobyr. I can see a lot of this is pretty close (and thanks for all the tests!), but I think the changes will be easier to review and merge if we split it into a few different, smaller PRs. I would recommend at least 3 PRs:

The bank changes, which add a new field to struct Bank, and increment it during transaction processing.
The blockstore column (serde, writing/reading) changes
The SamplePerformanceService changes, which query the new bank api and write it to blockstore, plus the RPC changes, which return the new data

Do you mind splitting?

A couple comments that you may want to start addressing when you do the split:

Regarding Bank changes: We will need to add non_vote_transaction_count to the BankFieldsToDeserialize and BankFieldsToSerialize structs, so that the value will be persisted in and recoverable from snapshots. I think this will require a new snapshot version, as there aren't any unused fields to steal. Sorry in advance that this may be titchy and tedious. The plus side is that there is a different new field (or two) that I need to add to Bank and snapshots, so we can do it all in one new snapshot version. I will start working on this tomorrow, so let's touch base to save work :)
Regarding Blockstore changes: We can actually store 2 different formats of data in once rocksdb column; we'll need more complex code in blockstore_db.rs to handle serde of the 2 types, but then we only have to check one column when reading. This would be my preference.

steviez · 2022-12-22T07:44:49Z

We will need to add non_vote_transaction_count to the BankFieldsToDeserialize and BankFieldsToSerialize structs, so that the value will be persisted in and recoverable from snapshots.

@CriesofCarrots - Not to derail this PR too much, but can you please confirm why we need to persist / recover the value from snapshots? The value is not directly included in bank hash, but I guess the idea is that we want consistency for this field regardless of whether we actually replayed the slot and froze the bank vs. if we created the bank from the snapshot ?

steviez · 2022-12-22T07:58:46Z

Regarding Blockstore changes: We can actually store 2 different formats of data in once rocksdb column; we'll need more complex code in blockstore_db.rs to handle serde of the 2 types, but then we only have to check one column when reading. This would be my preference.

To build on Tyera's note here, adding a new column introduces a compatibility break.

In order to open the RocksDB database, all columns must be present
We set a RocksDB flag to automatically create missing columns
In addition to above requirement, all columns must be opened. That is, there can't be any extra columns

Suppose we have some release R, and that the new column shows in release R+1. If a validator upgrades from R to R+1, the new column will be identified as missing and automatically created per 2. to satisfy 1.. So, the upgrade case works just fine. However, suppose this validator begins to observe some poor behavior with version R+1, and they want to revert back to R to get things working again. Due to 3., this extra column will now make the existing ledger unable to be opened with R without some manual intervention.

We do have a precedent to work around the situation above (ie we add new columns from time to time), but there is extra process to ensure we have a safe upgrade/downgrade path. So, reusing the same column if possible is nice to avoid that process

CriesofCarrots · 2022-12-22T08:06:43Z

the idea is that we want consistency for this field regardless of whether we actually replayed the slot and froze the bank vs. if we created the bank from the snapshot

Yes, this is exactly what I was thinking.

But pondering this more... we don't actually have a way to seed this value correctly (without reprocessing all the blocks since slot 0), so this value isn't ever going to be "correct". It can only be correct from whenever we start counting onward. That's unfortunate, but I don't see any way around it. So there might be no need to snapshot this new field after all. If that's the case, we will need to comment liberally to make sure futures users of the api know that it is only valuable for comparison between Banks.

I'll think more on whether there's any value to validators agreeing on that "wrong" value. In the meantime, no need to start on that Bank serde/snapshot stuff I mentioned yet, ilya-bobyr.

ilya-bobyr · 2022-12-23T09:30:04Z

I made a type when naming a branch for this change.
As I can not rename the source branch, and as I'm splitting this change anyway, closing it in favor of #29383 and #29388.

ilya-bobyr · 2022-12-23T09:44:46Z

@CriesofCarrots take a look please.
bank changes are now in #29383

I've split the bankstore changes into a separate commit in #29388.
I'll create a PR after the bank changes are reviewed, if that is OK.
If you want, you can leave comments in #29388 or here?
bankstore changes depend on the bank changes, as SamplePerformanceService reads from the bank.
And making bankstore changes without a matching update in the SamplePerformanceService seems to be more work.

Turns out, I missed that put_bytes/iter API in Database actually operates on bytes.
So, it was relatively easy to use different deserializers depending on the binary representation size.
I've also changed the Rust side representation for different versions PerfSample, compared to what you reviewed.
An enum seems like a better representation for two different versions of PerfSample.

ilya-bobyr · 2022-12-23T21:23:24Z

@CriesofCarrots bankstore changes are now in a separate PR: #29404
Also updated the "combined stack" as I found a bug in the count computation.

ilya-bobyr requested a review from CriesofCarrots December 21, 2022 05:26

github-actions bot added the explorer Related to the Solana Explorer webapp label Dec 21, 2022

ilya-bobyr mentioned this pull request Dec 21, 2022

Make getRecentPerformanceSamples() return the number of vote and non-vote transactions #29159

Closed

ilya-bobyr force-pushed the getRecentPerformanceSamples-spit-votes branch 4 times, most recently from e162c8c to e5f2b14 Compare December 21, 2022 06:14

steviez reviewed Dec 21, 2022

View reviewed changes

explorer/src/providers/stats/solanaPerformanceInfo.tsx Outdated Show resolved Hide resolved

steviez self-requested a review December 21, 2022 06:34

ilya-bobyr force-pushed the getRecentPerformanceSamples-spit-votes branch from e5f2b14 to 87792ba Compare December 21, 2022 08:37

github-actions bot removed the explorer Related to the Solana Explorer webapp label Dec 21, 2022

ilya-bobyr closed this Dec 23, 2022

ilya-bobyr deleted the getRecentPerformanceSamples-spit-votes branch December 23, 2022 09:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RPC getPerformanceSamples: Add `numNonVoteTransaction` #29353

RPC getPerformanceSamples: Add `numNonVoteTransaction` #29353

ilya-bobyr commented Dec 21, 2022

CriesofCarrots commented Dec 22, 2022

steviez commented Dec 22, 2022

steviez commented Dec 22, 2022

CriesofCarrots commented Dec 22, 2022 •

edited

ilya-bobyr commented Dec 23, 2022

ilya-bobyr commented Dec 23, 2022

ilya-bobyr commented Dec 23, 2022

RPC getPerformanceSamples: Add numNonVoteTransaction #29353

RPC getPerformanceSamples: Add numNonVoteTransaction #29353

Conversation

ilya-bobyr commented Dec 21, 2022

Problem

Summary of Changes

CriesofCarrots commented Dec 22, 2022

steviez commented Dec 22, 2022

steviez commented Dec 22, 2022

CriesofCarrots commented Dec 22, 2022 • edited

ilya-bobyr commented Dec 23, 2022

ilya-bobyr commented Dec 23, 2022

ilya-bobyr commented Dec 23, 2022

RPC getPerformanceSamples: Add `numNonVoteTransaction` #29353

RPC getPerformanceSamples: Add `numNonVoteTransaction` #29353

CriesofCarrots commented Dec 22, 2022 •

edited