RPC getPerformanceSamples: Add numNonVoteTransaction
#29353
Allow interested parties to see both total and non-vote transaction counts in each performance sample.

The new stats need to be stored in the blockstore database, but they have a different binary representation, so they are recorded into a new column: `perf_samples_v2`.

An alternative would have been to make the `PerfSample` deserialization code smart enough to handle both old and new sample records, but that is a non-trivial problem. `serde` is designed for deserializing streams, and thus does not provide a mechanism for backtracking. `bincode` does not encode enough information to tell whether the data being read is the v1 or the v2 version of the `PerfSample` record. And the `rocksdb` access layer is written with a pretty deep assumption that types stored in the columns can be deserialized using `serde`.

To support deserialization of `bincode`-encoded records based on their size, one would need either to extend `bincode` itself to allow some kind of backtracking, or to change the `rocksdb` interaction layer to allow additional flexibility. Both options seem pretty complex, considering that the resulting implementation would not be very extensible and would feel like a hack.

On the other hand, putting the new `PerfSample` version into a separate column seems more robust and scales without problems, allowing any number of future additions to `PerfSample`, should we add more stats.
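To illustrate the ambiguity described above, here is a minimal sketch that uses only the standard library. It hand-rolls a fixed-width little-endian layout with no version tag, which is an assumption about how an untagged encoding such as `bincode`'s would lay these records out. The struct names, field names, `encode_*` helpers and `decode_by_len` are all hypothetical and are not taken from the PR. The sketch shows that the only thing distinguishing a v1 record from a v2 record is the byte length, which is exactly what a streaming deserializer does not get to inspect.

```rust
// Hypothetical stand-ins for the two record versions; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PerfSampleV1 {
    pub num_transactions: u64,
    pub num_slots: u64,
    pub sample_period_secs: u16,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PerfSampleV2 {
    pub num_transactions: u64,
    pub num_slots: u64,
    pub sample_period_secs: u16,
    pub num_non_vote_transactions: u64,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PerfSample {
    V1(PerfSampleV1),
    V2(PerfSampleV2),
}

// Encoded sizes under the assumed untagged, fixed-width layout.
const V1_LEN: usize = 8 + 8 + 2;
const V2_LEN: usize = V1_LEN + 8;

/// Fields are written back to back; nothing in the bytes names the version.
pub fn encode_v1(s: &PerfSampleV1) -> Vec<u8> {
    let mut out = Vec::with_capacity(V1_LEN);
    out.extend_from_slice(&s.num_transactions.to_le_bytes());
    out.extend_from_slice(&s.num_slots.to_le_bytes());
    out.extend_from_slice(&s.sample_period_secs.to_le_bytes());
    out
}

pub fn encode_v2(s: &PerfSampleV2) -> Vec<u8> {
    let mut out = Vec::with_capacity(V2_LEN);
    out.extend_from_slice(&s.num_transactions.to_le_bytes());
    out.extend_from_slice(&s.num_slots.to_le_bytes());
    out.extend_from_slice(&s.sample_period_secs.to_le_bytes());
    out.extend_from_slice(&s.num_non_vote_transactions.to_le_bytes());
    out
}

fn read_u64(bytes: &[u8], at: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[at..at + 8]);
    u64::from_le_bytes(buf)
}

fn read_u16(bytes: &[u8], at: usize) -> u16 {
    let mut buf = [0u8; 2];
    buf.copy_from_slice(&bytes[at..at + 2]);
    u16::from_le_bytes(buf)
}

/// Picks the version from the total record length, the only signal available.
pub fn decode_by_len(bytes: &[u8]) -> Option<PerfSample> {
    match bytes.len() {
        V1_LEN => Some(PerfSample::V1(PerfSampleV1 {
            num_transactions: read_u64(bytes, 0),
            num_slots: read_u64(bytes, 8),
            sample_period_secs: read_u16(bytes, 16),
        })),
        V2_LEN => Some(PerfSample::V2(PerfSampleV2 {
            num_transactions: read_u64(bytes, 0),
            num_slots: read_u64(bytes, 8),
            sample_period_secs: read_u16(bytes, 16),
            num_non_vote_transactions: read_u64(bytes, 18),
        })),
        _ => None, // Unknown length: refuse to guess.
    }
}
```

Note that the v1 bytes are a strict prefix of the v2 bytes for matching leading fields, so a reader that does not know the record length up front cannot tell the two apart. Dispatching on length works only while every version has a distinct fixed size, which is part of why this approach was considered hacky.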
Thanks for working on this, @ilya-bobyr. I can see a lot of this is pretty close (and thanks for all the tests!), but I think the changes will be easier to review and merge if we split it into a few different, smaller PRs. I would recommend at least 3 PRs:
Do you mind splitting? A couple comments that you may want to start addressing when you do the split:
@CriesofCarrots - Not to derail this PR too much, but can you please confirm why we need to persist / recover the value from snapshots? The value is not directly included in bank hash, but I guess the idea is that we want consistency for this field regardless of whether we actually replayed the slot and froze the bank vs. if we created the bank from the snapshot?
To build on Tyera's note here, adding a new column introduces a compatibility break.
Suppose we have some release We do have a precedent to work around the situation above (i.e. we add new columns from time to time), but there is extra process to ensure we have a safe upgrade/downgrade path. So, reusing the same column if possible is nice to avoid that process.
Yes, this is exactly what I was thinking. But pondering this more... we don't actually have a way to seed this value correctly (without reprocessing all the blocks since slot 0), so this value isn't ever going to be "correct". It can only be correct from whenever we start counting onward. That's unfortunate, but I don't see any way around it. So there might be no need to snapshot this new field after all. If that's the case, we will need to comment liberally to make sure future users of the API know that it is only valuable for comparison between Banks. I'll think more on whether there's any value to validators agreeing on that "wrong" value. In the meantime, no need to start on that Bank serde/snapshot stuff I mentioned yet, ilya-bobyr.
@CriesofCarrots take a look please. I've split the Turns out, I missed that
@CriesofCarrots
Problem
Allow interested parties to see both total and non-vote transaction counts in each performance sample.
Summary of Changes
The new stats need to be stored in the blockstore database, but they have a different binary representation, so they are recorded into a new column: `perf_samples_v2`.

An alternative would have been to make the `PerfSample` deserialization code smart enough to handle both old and new sample records, but that is a non-trivial problem. `serde` is designed for deserializing streams, and thus does not provide a mechanism for backtracking. `bincode` does not encode enough information to tell whether the data being read is the v1 or the v2 version of the `PerfSample` record. And the `rocksdb` access layer is written with a pretty deep assumption that types stored in the columns can be deserialized using `serde`.

To support deserialization of `bincode`-encoded records based on their size, one would need either to extend `bincode` itself to allow some kind of backtracking, or to change the `rocksdb` interaction layer to allow additional flexibility. Both options seem pretty complex, considering that the resulting implementation would not be very extensible and would feel like a hack.

On the other hand, putting the new `PerfSample` version into a separate column seems more robust and scales without problems, allowing any number of future additions to `PerfSample`, should we add more stats.

Fixes #29159
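As a rough illustration of the separate-column approach from the summary, the sketch below models the two columns as in-memory `BTreeMap`s keyed by slot and merges them on read. This is not the blockstore code: `Store`, `SampleView`, `recent_samples` and all field names are hypothetical, and a real implementation would read from `rocksdb` columns instead.

```rust
use std::collections::BTreeMap;

pub type Slot = u64;

// Hypothetical record shapes; field names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PerfSampleV1 {
    pub num_transactions: u64,
    pub num_slots: u64,
    pub sample_period_secs: u16,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PerfSampleV2 {
    pub num_transactions: u64,
    pub num_slots: u64,
    pub sample_period_secs: u16,
    pub num_non_vote_transactions: u64,
}

/// What a reader sees: old samples simply lack the non-vote count.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SampleView {
    pub slot: Slot,
    pub num_transactions: u64,
    pub num_non_vote_transactions: Option<u64>,
    pub num_slots: u64,
    pub sample_period_secs: u16,
}

/// Stand-in for the database: one map per column.
#[derive(Default)]
pub struct Store {
    pub perf_samples: BTreeMap<Slot, PerfSampleV1>,
    pub perf_samples_v2: BTreeMap<Slot, PerfSampleV2>,
}

impl Store {
    /// Returns up to `num` samples, newest slot first, drawing from both columns.
    /// If a slot appears in both, the v2 record wins.
    pub fn recent_samples(&self, num: usize) -> Vec<SampleView> {
        let mut merged: BTreeMap<Slot, SampleView> = BTreeMap::new();
        for (&slot, s) in &self.perf_samples {
            merged.insert(slot, SampleView {
                slot,
                num_transactions: s.num_transactions,
                num_non_vote_transactions: None,
                num_slots: s.num_slots,
                sample_period_secs: s.sample_period_secs,
            });
        }
        for (&slot, s) in &self.perf_samples_v2 {
            merged.insert(slot, SampleView {
                slot,
                num_transactions: s.num_transactions,
                num_non_vote_transactions: Some(s.num_non_vote_transactions),
                num_slots: s.num_slots,
                sample_period_secs: s.sample_period_secs,
            });
        }
        merged.values().rev().take(num).copied().collect()
    }
}
```

Each column keeps a single, unambiguous record type, so the deserialization code never has to guess the version. The cost is that readers must know about every column, which is the compatibility concern raised in the review comments.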