Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC: fix numNonVoteTransactions in getPerformanceSamples #1189

Merged
merged 1 commit into from
May 7, 2024

Conversation

jstarry
Copy link

@jstarry jstarry commented May 5, 2024

Problem

The value numNonVoteTransactions returned in the RPC getPerformanceSamples endpoint was actually reporting the number of successful non vote transactions that were executed. That field is documented in the RPC docs with the following statement:

To get a number of voting transactions compute numTransactions - numNonVoteTransaction"

But before this fix, numTransactions - numNonVoteTransaction is actually computing the number of vote transactions + errored non-vote transactions.

Summary of Changes

  • Increment executed_non_vote_transactions_count whether the non-vote tx failed or succeeded. This value will get fixed on restart since we don't persist the value in snapshots

Fixes #

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.1%. Comparing base (7daa369) to head (542b784).
Report is 13 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1189   +/-   ##
=======================================
  Coverage    82.1%    82.1%           
=======================================
  Files         886      886           
  Lines      236397   236397           
=======================================
+ Hits       194227   194250   +23     
+ Misses      42170    42147   -23     

@jstarry jstarry requested a review from ilya-bobyr May 5, 2024 12:17
@ilya-bobyr
Copy link

Summary of Changes

* Increment `executed_non_vote_transactions_count` whether the non-vote tx failed or succeeded. This value will get fixed on restart since we don't persist the value in snapshots

I'm actually not completely familiar with how the snapshots are generated, but we do store these values in RocksDB.
Here is the struct that holds then:

/// Version of the [`PerfSample`] introduced in 1.15.x.
#[derive(Clone, Debug, PartialEq, Eq, Deserialize, Serialize)]
pub struct PerfSampleV2 {
// `PerfSampleV1` part
pub num_transactions: u64,
pub num_slots: u64,
pub sample_period_secs: u16,
// New fields.
pub num_non_vote_transactions: u64,
}

@jstarry
Copy link
Author

jstarry commented May 6, 2024

Yeah good point, there will be a period of time where the perf samples returned by rpc have a mix of old incorrect values and new correct values. The SAMPLE_INTERVAL is one minute and the PERFORMANCE_SAMPLES_LIMIT is 720 so a restarted node will return old values for 12 hours. And then if for some reason a node rolls back its software version, it could start reporting incorrect values again.

I think the alternative is to purge old incorrect values from blockstore but that feels like a worse end user experience that the proposed change. @CriesofCarrots do you have any preference here?

@jstarry jstarry added the v1.18 label May 6, 2024
Copy link

mergify bot commented May 6, 2024

Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.

@CriesofCarrots
Copy link

CriesofCarrots commented May 6, 2024

I think the alternative is to purge old incorrect values from blockstore but that feels like a worse end user experience that the proposed change. @CriesofCarrots do you have any preference here?

I think serving the old, incorrect values for ~12 hours after restart is acceptable (and agreed, better ux than the purge alternative).

@ilya-bobyr
Copy link

I think the other discrepancy would be that different node might be returning different counts, depending on if there were upgraded or not.
Not sure how big of an issue is it.

It seems that a feature flag might be a cleaner approach to fix this?
Though, maybe, it is too much work for this case.

@CriesofCarrots
Copy link

different node might be returning different counts, depending on if there were upgraded or not.

getRecentPerformanceSamples is already a node-specific view (and could be totally wonky when a node is booting or on a fork), so I personally don't think it's worth doing any gymnastics to coordinate activation of this fix between nodes.

@jstarry jstarry merged commit 3f6c567 into anza-xyz:master May 7, 2024
39 checks passed
mergify bot pushed a commit that referenced this pull request May 7, 2024
jstarry added a commit that referenced this pull request May 9, 2024
…port of #1189) (#1208)

RPC: fix numNonVoteTransactions in getPerformanceSamples (#1189)

(cherry picked from commit 3f6c567)

Co-authored-by: Justin Starry <justin@anza.xyz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants