Skip to content

jmx metrics: require explicit unit in yaml #13796

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
May 20, 2025

Conversation

SylvainJuge
Copy link
Contributor

@SylvainJuge SylvainJuge commented Apr 29, 2025

Related discussion: #13777
Resolves: #13812

With JMX metrics, the default metric unit is currently null, which means metrics are not properly exported unless a metric is explicitly provided in the YAML. Unfortunately this is quite easy to miss and it's probably better to report with an unexpected metric rather than not report anything.

The Java SDK currently explicitly tests for empty units, which means it should be something supported and thus reported.

As an alternative, we could maybe treat a missing metric as an error and throw an exception during parsing.

Bonus: take that opportunity to refactor and simplify a bit yaml parsing.

Update: as discussed in the comments, we switched the approach to throw an error when the unit is not explicitly set, this PR has thus been updated to reflect this new approach.

@SylvainJuge SylvainJuge requested a review from a team as a code owner April 29, 2025 13:23
@SylvainJuge SylvainJuge self-assigned this Apr 29, 2025
@SylvainJuge SylvainJuge added the bug Something isn't working label Apr 29, 2025
@trask
Copy link
Member

trask commented Apr 29, 2025

I added to this week's SIG agenda to discuss

  • require unit in yaml, vs
  • default to empty unit

…etry/instrumentation/jmx/engine/RuleParserTest.java

Co-authored-by: Jay DeLuca <jaydeluca4@gmail.com>
@SylvainJuge
Copy link
Contributor Author

I added to this week's SIG agenda to discuss

  • require unit in yaml, vs
  • default to empty unit

Thanks, I won't be able to attend both May 1st and May 8th SIG meetings though, my take on this would be to:

  • if we provide a default, we need to make sure it's the same default as the SDK, if there is any.
  • if we decide to throw an exception, then it makes mis-configuration early and and prevents unit-less metrics capture which might create other complications downstream.

@jack-berg
Copy link
Member

As an alternative, we could maybe treat a missing metric as an error and throw an exception during parsing.

Maybe you meant throw an exception if the unit isn't provided?

We discussed in the 5/1/2025 java SIG and like this approach:

  • Make unit a required part of the schema, and throw if the key is not present or the value is null
  • Allow user to set the unit to an empty string "", since this is allowed in the SDK.
  • The effect is to strongly encourage users to set the unit since its an important part of the metrics data model, but still allow them to set an empty unit in cases where the unit is unknown

@trask
Copy link
Member

trask commented May 1, 2025

@SylvainJuge @PeterF778 one open question is how disruptive this change might be for existing users, and whether we want to wait to make the change in 3.0 or not?

@PeterF778
Copy link
Contributor

PeterF778 commented May 1, 2025

@SylvainJuge @PeterF778 one open question is how disruptive this change might be for existing users, and whether we want to wait to make the change in 3.0 or not?

I think that all JMX metrics do have well-defined units, so it should be fine to demand that the units are specified in the yaml file. Also, as I understand, even today the users are forced to specify units in order to get the metrics exported (see the original motivation for the PR), even if this seems to be incorrect behavior on the SDK part.

I'm all behind the solution as summarized by @jack-berg .

@SylvainJuge
Copy link
Contributor Author

Thanks for discussing this during last SIG meeting, I also think it's unlikely to impact users as they currently don't have metrics without units being exported.

Implementation-wise, while the parsing will throw an exception, the end result of this parsing error would be a warning in the agent log so we should not have to wait for 3.0 here.

For existing users with unspecified metrics they would still get more value with a warning rather than a silent failure.

If you all agree here I will update this PR accordingly.

@github-actions github-actions bot added the test native This label can be applied to PRs to trigger them to run native tests label May 6, 2025
@SylvainJuge SylvainJuge changed the title jmx metrics: use empty string as unit by default jmx metrics: require explicit unit in yaml May 6, 2025
@@ -34,10 +33,6 @@ private BeanGroup(@Nullable QueryExp queryExp, List<ObjectName> namePatterns) {
this.namePatterns = namePatterns;
}

public static BeanGroup forSingleBean(String bean) throws MalformedObjectNameException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] removed as not used anymore.

Comment on lines +53 to +58
public void addBean(String bean) {
if (beans == null) {
beans = new ArrayList<>();
}
beans.add(validateBean(bean));
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] simplify a bit by allowing to add any number of beans using either bean (single) or beans (list)

Comment on lines +209 to +210
null,
"",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] using an empty string as unit seems an appropriate choice for state metrics.

metricType = jmxRule.getMetricType();
}
if (metricType == null) {
metricType = MetricInfo.Type.GAUGE;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] The same default is applied in MetricInfo constructor, moving it here makes it happen a bit earlier. I expect a later refactor of this would bring the complete building and validation logic into this method.

Comment on lines +63 to +67
try {
this.metricType = MetricInfo.Type.valueOf(t);
} catch (IllegalArgumentException e) {
throw new IllegalArgumentException("Invalid metric type: " + t, e);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] this allows to make the error message nicer for the end user when type is invalid.

@@ -167,7 +174,7 @@ static MetricAttribute buildMetricAttribute(String key, String target) {
// an optional modifier may wrap the target
// - lowercase(param(STRING))
boolean lowercase = false;
String lowerCaseExpr = tryParseFunction("lowercase", targetExpr, target);
String lowerCaseExpr = tryParseFunction("lowercase", targetExpr, errorMsg);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] this fixes a typo that made the error message harder to read, same for others below.

() -> {
JmxConfig config = parseConf(yaml);
private static void runNegativeTest(String yaml, String message) {
assertThatThrownBy(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] using assertj to make testing for exception message easier.

try {
rule.buildMetricDef();
} catch (Exception e) {
throw new IllegalStateException(e.getMessage(), e);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] here throwing the original message allows to keep the assertion on message content the same.

@@ -616,7 +641,7 @@ void testNoBeans() {
+ " - mapping: # still no beans\n"
+ " A:\n"
+ " metric: METRIC_NAME\n";
runNegativeTest(yaml);
runNegativeTest(yaml, "No ObjectName specified");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for reviewer] previously any parsing error would make the test pass, adding the expected exception message allows to ensure that we trigger the validation error we expect.

@SylvainJuge
Copy link
Contributor Author

I've updated the PR to make the unit mandatory in YAML, this allowed to uncover that state metrics need to have a unit too, so I've picked empty string as unit for now.

@trask
Copy link
Member

trask commented May 6, 2025

I've picked empty string as unit for now.

maybe "1" would be a better default unit?

https://ucum.org/ucum#section-Semantics:~:text=rather%20than%20the-,default%20unit%201,-.%20%C2%A0%C2%A0%E2%96%A03%20Curly

cc @jack-berg

@PeterF778
Copy link
Contributor

I've picked empty string as unit for now.

maybe "1" would be a better default unit?

https://ucum.org/ucum#section-Semantics:~:text=rather%20than%20the-,default%20unit%201,-.%20%C2%A0%C2%A0%E2%96%A03%20Curly

cc @jack-berg

I have been thinking about it as well, and it was my first thought. However, the unit of 1 suggests cardinality which is not really here. With state metrics we really deal with boolean values, which, by necessity, are mapped into 0 or 1.

@trask
Copy link
Member

trask commented May 6, 2025

With state metrics we really deal with boolean values, which, by necessity, are mapped into 0 or 1.

yeah, though I think we're reporting them as updowncounters, which implies they can be aggregated?

@PeterF778
Copy link
Contributor

yeah, though I think we're reporting them as updowncounters, which implies they can be aggregated?

Good point. @SylvainJuge , do you see a use case for aggregation of state metrics? If not, perhaps we should revisit the type of the metric.
But the question of the proper metric unit remains.

@SylvainJuge
Copy link
Contributor Author

SylvainJuge commented May 7, 2025

For state metrics, the aggregation can be used to provide a "count" of the total number of observed items. Because the value is always 0 or 1 I think aggregation should always produce something useful as we are "counting things that have a state".

For example, with the Tomcat connector example:

  • the tomcat.connector aggregated (no attribute) metric provides the total number of connectors for which the state is known
  • the tomcat.connector metric provides breakdown per network port and connector state
  • the tomcat.connector aggregated metric could be used to measure the relative coverage of the connector states compared to the total number of tomcat instances. For example if you expect 2 connectors per instance and you have 5 instances, then when tomcat.connector is less than 10 it means something is missing on at least once instance.

I think we can get similar usages with the hw.state metric.

So, if the meaning attached to the 1 unit is to convey that it is something that be counted, then it should probably make sense to use it over empty string, however reading the semconv metrics recommendations we might as well interpret this as a "counting states" metric, in which case we could also use {state} (sorry for introducing another option here).

So we could have at least 4 possible choices for the unit of state metrics:

I don't have any strong opinion here, but the empty string seems the simplest path for now:

  • it aligns with the current hw.state metric that has no unit
  • if we want to change the unit, it requires to work with HW semconv to ensure we are moving in the same direction, so likely quite a bit of effort to change things and deal with the downstream changes.
  • as a secondary quest, we could generalize the concept of "state metrics" into semantic conventions, so both the HW semconv and the JMX metrics documentations could refer to it, this would also cover the common definition of state metrics that are updowncounters with a value of 0 or 1 and whatever we decide for the metric unit.

if (beans == null) {
beans = new ArrayList<>();
}
beans.add(validateBean(bean));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] beans could actually be a Set to prevent duplicates

@trask trask merged commit 1bc15f0 into open-telemetry:main May 20, 2025
87 checks passed
@SylvainJuge SylvainJuge deleted the jmx-use-empty-unit-as-default branch May 28, 2025 14:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working test native This label can be applied to PRs to trigger them to run native tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

JMX Rule unit is required but not documented or validated
10 participants