Skip to content

feat(monitor): add Prometheus Remote Write receiver monitoring template#4104

Closed
abhyudayareddy wants to merge 1 commit into
apache:masterfrom
abhyudayareddy:feat/prometheus-remote-write-template
Closed

feat(monitor): add Prometheus Remote Write receiver monitoring template#4104
abhyudayareddy wants to merge 1 commit into
apache:masterfrom
abhyudayareddy:feat/prometheus-remote-write-template

Conversation

@abhyudayareddy
Copy link
Copy Markdown
Contributor

Summary

Adds a new HertzBeat monitoring template for Prometheus Remote Write receiver endpoints (e.g. Thanos Receive, Cortex, Mimir, VictoriaMetrics).

Closes/relates to #1945"Historical data storage supports Prometheus remote write and remote read protocols"

What's changed?

  • New template app-prometheus-remote-write.yml under hertzbeat-manager/src/main/resources/define/
  • Uses parseType: prometheus (HTTP collection) consistent with HertzBeat's existing Prometheus integration
  • 4 metric groups:
    • availability — scrape up/down check
    • throughput — samples appended, remote write requests received/failed
    • latency — histogram quantile P50/P95/P99 for remote write duration
    • wal_health — WAL corruptions and truncation failures
  • Configurable params: host, port, metrics path (/metrics), SSL, basic auth
  • Bilingual labels (zh-CN + en-US) following existing template conventions

Checklist

  • I have read the Contributing Guide
  • I have written the necessary doc or comment.
  • I have added tests or this PR does not require test coverage.
  • The template follows existing YAML structure conventions.

This template monitors the health, throughput, and error rate of a Prometheus Remote Write receiver endpoint. It includes parameters for host, port, metrics path, and SSL configuration.

Signed-off-by: abhyudayareddy <54602866+abhyudayareddy@users.noreply.github.com>
@Duansg
Copy link
Copy Markdown
Member

Duansg commented Mar 30, 2026

This issue appears to involve implementing a protocol adapter that supports Prometheus remote write and remote read via hertzbeat. pls confirm its relevance to this PR.

@abhyudayareddy
Copy link
Copy Markdown
Contributor Author

Hi @Duansg, thanks for reviewing!

To clarify the distinction: issue #1945 is about HertzBeat itself acting as a Prometheus remote write storage backend — i.e., other systems pushing metrics into HertzBeat via the remote write protocol. That is a different feature entirely.

This PR adds a monitoring template so that HertzBeat can monitor external services that expose a Prometheus Remote Write receiver endpoint (e.g., Thanos Receive, Cortex, Mimir, VictoriaMetrics). HertzBeat scrapes their /metrics endpoint using its existing parseType: prometheus HTTP collection — no protocol adapter is needed.

In short:

@Duansg
Copy link
Copy Markdown
Member

Duansg commented Mar 31, 2026

@abhyudayareddy Hi, overall, this PR has the following issues:

  1. This PR has not advanced [Task] Historical data storage supports promethus remote write and remote read protocols #1945
  2. There are numerous errors in the current template.
  3. If the goal is simply to collect data from the metrics endpoint, this is already supported. I don’t think there’s a need to add an additional template.

Given this, I cannot approve this PR. @zqr10159 could you please review it?

@zqr10159
Copy link
Copy Markdown
Member

Sorry, these are clearly two different things, and I cannot pass this PR.

@zqr10159 zqr10159 closed this Mar 31, 2026
@abhyudayareddy
Copy link
Copy Markdown
Contributor Author

Thank you both @Duansg and @zqr10159 for taking the time to review!

I want to address point 3 specifically, as I think there may be a misunderstanding: Prometheus Remote Write receiver monitoring is fundamentally different from scraping a /metrics endpoint.

  • A Prometheus scrape target exposes a /metrics endpoint that HertzBeat polls — the existing Prometheus template covers this.
  • A Prometheus Remote Write receiver (Thanos Receive, Cortex, Mimir, VictoriaMetrics) accepts time-series data pushed via HTTP POST. The monitoring goal here is to check the health/readiness of that receiver — e.g., /-/healthy, /ready, and write-path availability. This is a distinct topology and is not covered by existing templates.

On point 2: I'd genuinely appreciate if you could point out the specific errors in the template so I can fix them. I want to make sure the YAML is correct.

On point 1: I understand this is separate from issue #1945 (HertzBeat ingesting remote write data itself). This PR's scope is narrower — monitoring external remote write receivers. I can update the PR title/description to make that distinction clearer if that would help reconsider.

@abhyudayareddy
Copy link
Copy Markdown
Contributor Author

@Duansg @zqr10159 — following up on my previous comment (no response since Apr 14).

I want to make sure the distinction is clear: this PR monitors external Prometheus Remote Write receiver services (Thanos Receive, Cortex, Mimir, VictoriaMetrics) by checking their health/readiness endpoints (/-/healthy, /ready). It does not overlap with scraping a /metrics endpoint — that is a fundamentally different operation. @zqr10159's comment that "these are clearly two different things" is exactly correct and actually supports the use case.

On the YAML errors: @Duansg mentioned "numerous errors in the current template" but did not enumerate them. I am unable to fix errors I cannot see. Could you please:

  1. List the specific fields or sections that are incorrect
    1. Point to an existing template that follows the correct pattern
      I am happy to rewrite the template from scratch to match HertzBeat conventions if given specific guidance. If the project does not want this type of monitoring template at all (policy decision), I will respect that — but if it is fixable, I would like to fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants