extend cluster topic metrics cleanup to cover all per-topic metric families #8564

Merged
vishalchangrani merged 2 commits into leo/cleanup-stale-cluster-metrics from vishal/leo/cleanup-stale-cluster-metrics
May 20, 2026

@vishalchangrani
Contributor

Extends #8562 to address remaining per-topic metrics that also accumulate stale cluster-ID-based label series across epoch transitions.

Problem

PR #8562 introduced OnClusterTopicMetricsCleanup but only covered a subset of the per-topic metrics tracked by NetworkCollector. The following metric families were missed and continue to accumulate stale time series when a collection node transitions to a new cluster each epoch:

| Metric | Type | Collector (missed in #8562) |
| --- | --- | --- |
| gossipsub_time_in_mesh_quantum_count | HistogramVec | GossipSubScoreMetrics |
| gossipsub_mesh_message_delivery | HistogramVec | GossipSubScoreMetrics |
| gossipsub_first_message_delivery_count | HistogramVec | GossipSubScoreMetrics |
| gossipsub_invalid_message_delivery_count | HistogramVec | GossipSubScoreMetrics |
| current_messages_processing | GaugeVec | NetworkCollector |
| direct_messages_in_progress | GaugeVec | NetworkCollector |
| engine_message_processing_time_seconds | CounterVec | NetworkCollector |
| unauthorized_messages_count | CounterVec | NetworkCollector |
| rate_limited_unicast_messages_count | CounterVec | NetworkCollector |
| reported_misbehavior_total | CounterVec | AlspMetrics |

Changes

  • gossipsub_score.go: adds OnClusterTopicMetricsCleanup to GossipSubScoreMetrics covering the 4 per-topic scoring histograms
  • network.go: extends NetworkCollector.OnClusterTopicMetricsCleanup to also delegate to GossipSubScoreMetrics and delete the remaining per-channel metrics on NetworkCollector and AlspMetrics

No new interfaces or mock changes are needed: GossipSubScoreMetrics.OnClusterTopicMetricsCleanup is called internally by NetworkCollector and is not part of any exported interface.

🤖 Generated with Claude Code

…milies

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vishalchangrani vishalchangrani requested a review from a team as a code owner May 20, 2026 18:33
@coderabbitai
Contributor

coderabbitai Bot commented May 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b1750899-bd54-4f0b-9296-1d8481ca1a79

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Comment thread module/metrics/network.go Outdated
nc.rateLimitedUnicastMessagesCount.DeletePartialMatch(prometheus.Labels{LabelChannel: topic})

// ALSP misbehavior counter (multi-label: channel + misbehavior)
nc.AlspMetrics.reportedMisbehaviorCount.DeletePartialMatch(prometheus.Labels{LabelChannel: topic})
Member


Instead of reaching into AlspMetrics' private field from here, it's better to add a public method on AlspMetrics and call that.
// In alsp.go
func (a *AlspMetrics) OnClusterTopicMetricsCleanup(topic string) {
	a.reportedMisbehaviorCount.DeletePartialMatch(prometheus.Labels{LabelChannel: topic})
}

Contributor Author


made the change (through Claude).

…essing private field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vishalchangrani vishalchangrani merged commit d027fde into leo/cleanup-stale-cluster-metrics May 20, 2026
@vishalchangrani vishalchangrani deleted the vishal/leo/cleanup-stale-cluster-metrics branch May 20, 2026 19:21
2 participants