Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Gossip Loop metrics #26195

Merged
merged 18 commits into from
Jun 29, 2022
Merged

Conversation

gregcusack
Copy link
Contributor

Problem

There is little insight into how well the gossip protocol is running. We need gossip timing metrics for:

  • Gossip loop containing push/pull request run_gossip()
  • Gossip listen loop run_listen()
  • Gossip Loop processing push/pull requests from other users process_packets()

Summary of Changes

Added two timers the measure the loop time of the run_gossip() and run_listen() methods respectively.

  • gossip_transmit_loop_time
  • gossip_listen_loop_time
    Note a timer metric called: process_gossip_packets_time already exists for process_packets().

Added three loop counters that count the total number of loop iterations since the last metric report for all the of the above loops/

  • gossip_transmit_loop_itrs_since_last_report
  • gossip_listen_loop_itrs_since_last_report
  • process_gossip_packets_itrs_since_last_report

Fixes #

gregcusack and others added 14 commits June 23, 2022 12:58
- fix nav link
- add bounty split policy for duplicate reports
…a-labs#25688)

* Define shuffle to prep using same shuffle for multiple slices

* Determine transaction indexes and plumb to execute_batch

* Pair transaction_index with transaction in TransactionStatusService

* Add new ReplicaTransactionInfoVersion

* Plumb transaction_indexes through BankingStage

* Prepare BankingStage to receive transaction indexes from PohRecorder

* Determine transaction indexes in PohRecorder; add field to WorkingBank

* Add PohRecorder::record unit test

* Only pass starting_transaction_index around PohRecorder

* Add helper structs to simplify test DashMap

* Pass entry and starting-index into process_entries_with_callback together

* Add tx-index checks to test_rebatch_transactions

* Revert shuffle definition and use zip/unzip

* Only zip/unzip if randomize

* Add confirm_slot_entries test

* Review nits

* Add type alias to make sender docs more clear
finish filling out the table....
@mergify mergify bot added the community Community contribution label Jun 23, 2022
@mergify mergify bot requested a review from a team June 23, 2022 23:57
@gregcusack gregcusack closed this Jun 23, 2022
@gregcusack gregcusack reopened this Jun 24, 2022
@@ -113,6 +113,10 @@ pub struct GossipStats {
pub(crate) gossip_pull_request_verify_fail: Counter,
pub(crate) gossip_pull_response_verify_fail: Counter,
pub(crate) gossip_push_msg_verify_fail: Counter,
pub(crate) gossip_transmit_loop_time: Counter,
pub(crate) gossip_transmit_loop_itrs_since_last_report: Counter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: itrs -> iterations we usually spell out all words without abbreviations

behzadnouri
behzadnouri previously approved these changes Jun 24, 2022
Comment on lines 245 to 249
(
"process_gossip_packets_itrs_since_last_report",
stats.process_gossip_packets_itrs_since_last_report.clear(),
i64
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for easier queries, can you please report this in the same table as the other ones?

@@ -138,6 +142,7 @@ pub struct GossipStats {
pub(crate) packets_sent_pull_responses_count: Counter,
pub(crate) packets_sent_push_messages_count: Counter,
pub(crate) process_gossip_packets_time: Counter,
pub(crate) process_gossip_packets_itrs_since_last_report: Counter,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please keep the fields in this struct alphabetically sorted ?
that makes it easier to resolve merge conflicts or skim through the list of metrics.

@mergify mergify bot dismissed behzadnouri’s stale review June 24, 2022 15:20

Pull request has been modified.

@gregcusack gregcusack merged commit 032bee1 into solana-labs:master Jun 29, 2022
@gregcusack gregcusack deleted the ftr-gossip-loop-metrics branch July 20, 2022 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Community contribution
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants