xeniape
Member

@xeniape xeniape commented Oct 15, 2025

Description

Part of stackabletech/issues#747
This PR adds a metrics service with the additional Prometheus annotations. It also adds documentation on monitoring in the TLS case because, similar to NiFi, HBase exposes its metrics on a port that is secured by TLS.

Follow-up PR for the monitoring stack, needed because the port name changed with the move to the metrics service: stackabletech/demos#316

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  • Changes are OpenShift compatible
  • CRD changes approved
  • CRD documentation for all fields, following the style guide.
  • Helm chart can be installed and deployed operator works
  • Integration tests passed (for non trivial changes)
  • Changes need to be "offline" compatible
  • Links to generated (nightly) docs added
  • Release note snippet added

Reviewer

  • Code contains useful comments
  • Code contains useful logging statements
  • (Integration-)Test cases added
  • Documentation added or updated. Follows the style guide.
  • Changelog updated
  • Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  • Feature Tracker has been updated
  • Proper release label has been added
  • Links to generated (nightly) docs added
  • Release note snippet added
  • Add type/deprecation label & add to the deprecation schedule
  • Add type/experimental label & add to the experimental features tracker

@xeniape xeniape moved this to Development: Waiting for Review in Stackable Engineering Oct 15, 2025
Member

@Techassi Techassi left a comment

Looks mostly good to me, just some minor suggestions and a few questions.

I only looked at the Rust code - let me know if I should look at the docs as well.

Comment on lines +551 to +565
pub fn metrics_ports(&self, role: &HbaseRole) -> Vec<(String, u16)> {
    match role {
        HbaseRole::Master => vec![(
            HBASE_METRICS_PORT_NAME.to_string(),
            HBASE_MASTER_METRICS_PORT,
        )],
        HbaseRole::RegionServer => vec![(
            HBASE_METRICS_PORT_NAME.to_string(),
            HBASE_REGIONSERVER_METRICS_PORT,
        )],
        HbaseRole::RestServer => {
            vec![(HBASE_METRICS_PORT_NAME.to_string(), HBASE_REST_METRICS_PORT)]
        }
    }
}

note: The mapping of role to metrics port exists in two places: here and below at line 575. This inevitably carries the risk that the two drift apart. I think it makes sense to keep this mapping in a single place instead.

One possible solution is to drop the associated metrics_port function and use .map() to extract only the port numbers when needed.
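
A minimal sketch of that idea, not the PR's actual code: metrics_ports stays the only place that maps a role to a port, and callers that need bare port numbers derive them with .map(). The HbaseCluster stand-in, the helper name port_numbers, the port name "metrics", and all port values are assumptions made for illustration.

```rust
// Placeholder values for illustration only; they are not taken from the operator.
const HBASE_METRICS_PORT_NAME: &str = "metrics";
const HBASE_MASTER_METRICS_PORT: u16 = 9100;
const HBASE_REGIONSERVER_METRICS_PORT: u16 = 9101;
const HBASE_REST_METRICS_PORT: u16 = 9102;

// Simplified stand-ins for the operator's role enum and cluster type.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HbaseRole {
    Master,
    RegionServer,
    RestServer,
}

pub struct HbaseCluster;

impl HbaseCluster {
    /// The single place where a role is mapped to its metrics port.
    pub fn metrics_ports(&self, role: &HbaseRole) -> Vec<(String, u16)> {
        let port = match role {
            HbaseRole::Master => HBASE_MASTER_METRICS_PORT,
            HbaseRole::RegionServer => HBASE_REGIONSERVER_METRICS_PORT,
            HbaseRole::RestServer => HBASE_REST_METRICS_PORT,
        };
        vec![(HBASE_METRICS_PORT_NAME.to_owned(), port)]
    }
}

/// Hypothetical call-site helper: keeps only the port numbers,
/// so no second role-to-port mapping is needed.
pub fn port_numbers(hbase: &HbaseCluster, role: &HbaseRole) -> Vec<u16> {
    hbase
        .metrics_ports(role)
        .into_iter()
        .map(|(_name, port)| port)
        .collect()
}
```

With this shape, adding or changing a role's port touches only the match inside metrics_ports.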

Comment on lines +554 to +562
        HBASE_METRICS_PORT_NAME.to_string(),
        HBASE_MASTER_METRICS_PORT,
    )],
    HbaseRole::RegionServer => vec![(
        HBASE_METRICS_PORT_NAME.to_string(),
        HBASE_REGIONSERVER_METRICS_PORT,
    )],
    HbaseRole::RestServer => {
        vec![(HBASE_METRICS_PORT_NAME.to_string(), HBASE_REST_METRICS_PORT)]

suggestion: Use to_owned instead.

Suggested change
-         HBASE_METRICS_PORT_NAME.to_string(),
-         HBASE_MASTER_METRICS_PORT,
-     )],
-     HbaseRole::RegionServer => vec![(
-         HBASE_METRICS_PORT_NAME.to_string(),
-         HBASE_REGIONSERVER_METRICS_PORT,
-     )],
-     HbaseRole::RestServer => {
-         vec![(HBASE_METRICS_PORT_NAME.to_string(), HBASE_REST_METRICS_PORT)]
+         HBASE_METRICS_PORT_NAME.to_owned(),
+         HBASE_MASTER_METRICS_PORT,
+     )],
+     HbaseRole::RegionServer => vec![(
+         HBASE_METRICS_PORT_NAME.to_owned(),
+         HBASE_REGIONSERVER_METRICS_PORT,
+     )],
+     HbaseRole::RestServer => {
+         vec![(HBASE_METRICS_PORT_NAME.to_owned(), HBASE_REST_METRICS_PORT)]

.map(|(name, value)| ServicePort {
name: Some(name),
port: i32::from(value),
protocol: Some("TCP".to_string()),

suggestion: Use to_owned instead.

Suggested change
- protocol: Some("TCP".to_string()),
+ protocol: Some("TCP".to_owned()),

&rolegroup.role_group,
))
.context(ObjectMetaSnafu)?
.with_label(Label::try_from(("prometheus.io/scrape", "true")).context(LabelBuildSnafu)?)

question: Does it make sense to pull this key out into a constant? Or make it an associated function on Label, like Label::prometheus_scrape()?
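
To make the two options concrete, here is a rough sketch using a local stand-in type. The Label struct below is hypothetical and is not the type from the operator framework; the constant name PROMETHEUS_SCRAPE_LABEL_KEY and the function prometheus_scrape are likewise only illustrative.

```rust
/// Option 1: the key lives in one named constant (hypothetical name).
pub const PROMETHEUS_SCRAPE_LABEL_KEY: &str = "prometheus.io/scrape";

/// Hypothetical stand-in for a key/value label type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Label {
    pub key: String,
    pub value: String,
}

impl Label {
    /// Option 2: an associated function that hides both key and value,
    /// so call sites cannot misspell either of them.
    pub fn prometheus_scrape() -> Self {
        Label {
            key: PROMETHEUS_SCRAPE_LABEL_KEY.to_owned(),
            value: "true".to_owned(),
        }
    }
}
```

Because the key and value are fixed strings, such a constructor has no failure case of its own in this sketch, so a call site would not need error handling for it.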

Comment on lines +831 to +832
type_: Some("ClusterIP".to_string()),
cluster_ip: Some("None".to_string()),

suggestion: Use to_owned instead.

Suggested change
- type_: Some("ClusterIP".to_string()),
- cluster_ip: Some("None".to_string()),
+ type_: Some("ClusterIP".to_owned()),
+ cluster_ip: Some("None".to_owned()),

Comment on lines +847 to +865
fn prometheus_annotations(hbase: &v1alpha1::HbaseCluster, hbase_role: &HbaseRole) -> Annotations {
    Annotations::try_from([
        ("prometheus.io/path".to_owned(), "/prometheus".to_owned()),
        (
            "prometheus.io/port".to_owned(),
            hbase.metrics_port(hbase_role).to_string(),
        ),
        (
            "prometheus.io/scheme".to_owned(),
            if hbase.has_https_enabled() {
                "https".to_owned()
            } else {
                "http".to_owned()
            },
        ),
        ("prometheus.io/scrape".to_owned(), "true".to_owned()),
    ])
    .expect("should be valid annotations")
}

question: Similar question to above - does it make sense to pull these out into constants or associated functions on Annotation?
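
One way the constants variant could look, sketched with a plain BTreeMap instead of the Annotations type so that it stands alone. The constant names and the simplified function signature (a port number and a flag instead of the cluster and role) are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

// Hypothetical constant names; the key strings are the ones used in the PR.
pub const PROMETHEUS_PATH_KEY: &str = "prometheus.io/path";
pub const PROMETHEUS_PORT_KEY: &str = "prometheus.io/port";
pub const PROMETHEUS_SCHEME_KEY: &str = "prometheus.io/scheme";
pub const PROMETHEUS_SCRAPE_KEY: &str = "prometheus.io/scrape";

/// Builds the four scrape annotations from a port and a TLS flag.
pub fn prometheus_annotations(metrics_port: u16, https_enabled: bool) -> BTreeMap<String, String> {
    // The scheme follows the TLS setting, as in the PR's function.
    let scheme = if https_enabled { "https" } else { "http" };

    BTreeMap::from([
        (PROMETHEUS_PATH_KEY.to_owned(), "/prometheus".to_owned()),
        (PROMETHEUS_PORT_KEY.to_owned(), metrics_port.to_string()),
        (PROMETHEUS_SCHEME_KEY.to_owned(), scheme.to_owned()),
        (PROMETHEUS_SCRAPE_KEY.to_owned(), "true".to_owned()),
    ])
}
```

Other code that sets or reads these annotations could then refer to the same constants instead of repeating the string literals.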

@Techassi Techassi moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Oct 17, 2025
@sbernauer sbernauer assigned maltesander and unassigned xeniape Oct 20, 2025