Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update queried metric to determine tps in system tests #31531

Merged
merged 1 commit into from
May 8, 2023

Conversation

steviez
Copy link
Contributor

@steviez steviez commented May 8, 2023

Problem

The test report is querying a metric that has been removed for being too noisy.

Summary of Changes

So, update the query to use a metric that we report once per slot. The metric bank-process_transactions was removed in #31398.

As for the two metrics ...

  • bank-process_transactions was a counter that was updated for any transactions from any slot
  • replay-slot-stats is submitted when a slot is completed

So, replay-slot-stats is inherently excluding transactions from discarded blocks (ie one that are abandoned and result in TooFewTicks). Looking at a previous test run before the metric was pulled out, I see the following values when I set dashboard time to 2023-05-03 06:23 ==> 2023-05-03 06:33:

    // New mean query: 21,122
    SELECT ROUND(MEAN("median_sum")) as "mean_tps" FROM (
      SELECT MEDIAN(sum_total_transactions) AS "median_sum" FROM (
        SELECT SUM("total_transactions") AS "sum_total_transactions"
          FROM "bare-perf-cpu-only"."autogen"."replay-slot-stats"
          WHERE time > :dashboardTime: AND time < :upperDashboardTime:
          GROUP BY time(1s), host_id)
      GROUP BY time(1s)
    )

    // Old mean query: 21,191
    SELECT ROUND(MEAN("median_sum")) as "mean_tps" FROM (
      SELECT MEDIAN(sum_count) AS "median_sum" FROM (
        SELECT SUM("count") AS "sum_count"
          FROM "bare-perf-cpu-only"."autogen"."bank-process_transactions"
          WHERE time > :dashboardTime: AND time < :upperDashboardTime: AND count > 0
          GROUP BY time(1s), host_id)
      GROUP BY time(1s)
    )

    // New max query: 80,262
    SELECT MAX("median_sum") as "max_tps" FROM (
      SELECT MEDIAN(sum_total_transactions) AS "median_sum" FROM (
        SELECT SUM("total_transactions") AS "sum_total_transactions"
          FROM "bare-perf-cpu-only"."autogen"."replay-slot-stats"
          WHERE time > :dashboardTime: AND time < :upperDashboardTime: AND total_transactions > 0
          GROUP BY time(1s), host_id)
      GROUP BY time(1s)
    )

    // Old max query: 71,359
    SELECT MAX("median_sum") as "max_tps" FROM (
      SELECT MEDIAN(sum_count) AS "median_sum" FROM (
        SELECT SUM("count") AS "sum_count"
          FROM "bare-perf-cpu-only"."autogen"."bank-process_transactions"
          WHERE time > :dashboardTime: AND time < :upperDashboardTime: AND count > 0
          GROUP BY time(1s), host_id)
      GROUP BY time(1s)
    )  

  • The difference in means looks negligible
    • Also as mentioned, the per-slot metric doesn't count abandoned slots, but it looks like there was only one abandoned slot in this particular run.
  • The difference in maxes is more significant, but I don't think this is problematic
    • We're grouping things by 1s, so we seemingly had multiple blocks finish being replayed within the 1 second window. The per-slot metric includes all transactions for that slot, even though some of the tx's might have technically been replayed outside of the 1 second window.

The test report is querying a metric that has been removed for being too
noisy. So, update the query to use a metric that we report once per
slot.
@steviez steviez requested review from yihau and apfitzge May 8, 2023 06:56
Copy link
Member

@yihau yihau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just tested this PR on local-cluster and I think the result looks good

Test Configuration:

CLOUD_PROVIDER = bare
NUMBER_OF_VALIDATOR_NODES = 5
ENABLE_GPU = false
NUMBER_OF_CLIENT_NODES = 1
CLIENT_OPTIONS = bench-tps=1=--tx_count 10000 --thread-batch-sleep-ms 250
TEST_DURATION_SECONDS = 600
USE_PUBLIC_IP_ADDRESSES = false
ALLOW_BOOT_FAILURES = false
TEST_TYPE = fixed_duration

Result Details:

mean_tps: 24533
max_tps: 67469
mean_confirmation_ms: 804
max_confirmation_ms: 4286
99th_percentile_confirmation_ms: 2669
max_tower_distance: 43
last_tower_distance: 31
slots_per_second: 2.651

Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants