Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Publish metrics to MC and JMX #15560

Merged
merged 7 commits into from Sep 25, 2019

Conversation

@blazember
Copy link
Contributor

blazember commented Sep 17, 2019

Metrics are recorded to a ringbuffer that Management Center can read out
by using ReadMetricsMessageTask. The task reads the metrics recorded
since the last seen sequence number. The next sequence number that MC
can continue reading the Metrics with is returned in the response. The
ringbuffer, the ReadMetricsOperation and the task are almost identical
copies of the similar Jet classes.

Besides the described mechanism used for recording and publishing,
configuring the metrics system has also changed. Before this commit,
metrics could be configured with JVM arguments, now it can be done with
MetricsConfig both on the member and on the client side. This
configuration is shared with the diagnostic's MetricsPlugin.

The client protocol changes required for the ReadMetricsMessageTask is
also included in this commit.

@blazember blazember added this to the 4.0 milestone Sep 17, 2019
@blazember blazember requested a review from hazelcast/clients as a code owner Sep 17, 2019
@blazember blazember self-assigned this Sep 17, 2019
@cangencer

This comment has been minimized.

Copy link
Contributor

cangencer commented Sep 18, 2019

@blazember we've made since some additions to the metrics, maybe you want to have a look at the latest state? https://github.com/hazelcast/hazelcast-jet/tree/master/hazelcast-jet-core/src/main/java/com/hazelcast/jet/impl/metrics In general, the compressor and so on have been made into separate classes. We'd also need some hooks to register additional publishers. (see JobMetricsPublisher) for example.

@blazember blazember force-pushed the blazember:4.0/mc-metrics/mc-renderer2 branch 2 times, most recently from a34c29a to e519b3e Sep 18, 2019
@blazember blazember changed the title Publish metrics to MC and JMX [WIP] Publish metrics to MC and JMX Sep 18, 2019
@blazember blazember force-pushed the blazember:4.0/mc-metrics/mc-renderer2 branch 2 times, most recently from c450d7a to 55bcebc Sep 18, 2019
@blazember blazember changed the title [WIP] Publish metrics to MC and JMX Publish metrics to MC and JMX Sep 19, 2019
publishers.add(publisher);
}

if (config.isJmxEnabled()) {

This comment has been minimized.

Copy link
@cangencer

cangencer Sep 19, 2019

Contributor

we also changed the logic here, so the isEnabled is like a master switch, I don't know how you want to make the configuration. so if isEnabled is false, you can't enable just JMX.

This comment has been minimized.

Copy link
@blazember

blazember Sep 19, 2019

Author Contributor

That means you can't disable publishing to MC (buffers) if you want JMX only. How if we have this in the config?

public boolean isMetricsEnabled() {
  return isMcEnabled() || isJmxEnabled();
}

And we could still switch MC or JMX independently.

This comment has been minimized.

Copy link
@eminn

eminn Sep 19, 2019

Collaborator

How about adding an access to enabled/disabled information via MetricsService ? This way we don't need to go and fetch the Config object from the NodeEngine and access MetricsConfig from there. We already have MetricsService and it has the MetricsConfig.

This comment has been minimized.

Copy link
@puzpuzpuz

puzpuzpuz Sep 19, 2019

Contributor

2 cents from me. DS stats are enabled with the following check:
https://github.com/hazelcast/hazelcast/pull/15560/files#diff-b3e262104ca2692463bc0f125cb5fd22R217

If I'm getting everything right, it means that users won't be able to enable JMX and DS stats only, without enabling MC metrics. If it's correct, then it looks a bit weird to me.

This comment has been minimized.

Copy link
@blazember

blazember Sep 19, 2019

Author Contributor

Yes, that's indeed better than doing it in the config.

This comment has been minimized.

Copy link
@blazember

blazember Sep 19, 2019

Author Contributor

@puzpuzpuz this is a valid point, especially if we consider registering custom publishers. I think

  • enabling DS metrics should be independent from enabling MC or JMX publishers
  • enabling MC and JMX publishers should also be independent from each other

This comment has been minimized.

Copy link
@cangencer

cangencer Sep 19, 2019

Contributor

Well MC collection doesn't do anything by itself except to store the latest blob. If metrics will be rendered anyway I think you can always enable MC collection? MC "publisher" actually doesn't send anything to MC. In Jet we didn't make them independent from each other, simply because the MC publisher does almost nothing.

If you want them independent then you would need to add yet another config option to enable MC metrics.

This comment has been minimized.

Copy link
@cangencer

cangencer Sep 19, 2019

Contributor

How about adding an access to enabled/disabled information via MetricsService ? This way we don't need to go and fetch the Config object from the NodeEngine and access MetricsConfig from there. We already have MetricsService and it has the MetricsConfig.

Well you pass a function to register publisher and if metrics are disabled, the function shouldn't be called?

This comment has been minimized.

Copy link
@puzpuzpuz

puzpuzpuz Sep 20, 2019

Contributor

If metrics will be rendered anyway I think you can always enable MC collection?

In this case, rendering and storing those blobs will consume some CPU time and memory even if blobs aren't used at all. I'd prefer to pay for it only when MC metrics collection is explicitly enabled in config.

This comment has been minimized.

Copy link
@blazember

blazember Sep 20, 2019

Author Contributor

Finally, introduced a master switch along with mc-enabled flag, so populating metrics to the ringbuffer can be turned off. The master switch is needed because we support programmatic publisher registration that is expected to ignore the registration if the metrics system is entirely switched off. It is a valid case that both MC and JMX publishers are disabled, but we want to allow programmatic registration of publishers. It can't be done without a master switch.

try {
writeName(name);
// convert to long with specified precision
dos.writeLong(Math.round(value * DOUBLE_TO_LONG));

This comment has been minimized.

Copy link
@puzpuzpuz

puzpuzpuz Sep 19, 2019

Contributor

A really minor consideration. Looks like we're loosing information about the type of metric value here. It's fine, but it creates additional coupling: MC has to know which metrics have double type. Maybe, it could be avoided by adding a special tag for such metrics (say, type=double) or an additional byte header with the type id. Then decompressingIterator could be restoring those doubles (with some precision loss) and saving them into Metric objects.

In the future, those data types may be expanded with, say, histogram metrics (int[] or long[]).

This comment has been minimized.

Copy link
@cangencer

cangencer Sep 19, 2019

Contributor

It only supports long type of metric, and the only double metrics are things like cpu usage where you don't care about precision to that degree.

This comment has been minimized.

Copy link
@puzpuzpuz

puzpuzpuz Sep 20, 2019

Contributor

It's not about precision loss, it's about lost value type. It's absolutely fine that the compression is lossy. But it would be nice if the iterator could recover types when data points are uncompressed. In this case, it'll be easier to support additional data point types in the future (see my considerations above).

And as I mentioned above, MC will work fine with those lost types, as it's aware of metrics and can convert metrics with double values based on their names.

This comment has been minimized.

Copy link
@blazember

blazember Sep 20, 2019

Author Contributor

I agree we need to carry the types. I'd address this is in a separate PR so that integration with MC side gets unblocked.

@blazember blazember force-pushed the blazember:4.0/mc-metrics/mc-renderer2 branch from 55bcebc to 7d07d67 Sep 20, 2019
@puzpuzpuz puzpuzpuz self-requested a review Sep 20, 2019
@asimarslan asimarslan self-requested a review Sep 23, 2019
@blazember blazember force-pushed the blazember:4.0/mc-metrics/mc-renderer2 branch from 7d07d67 to 63e2cf8 Sep 23, 2019
@blazember blazember force-pushed the blazember:4.0/mc-metrics/mc-renderer2 branch from 67f0e48 to c7e4fee Sep 25, 2019
blazember added 3 commits Sep 17, 2019
Metrics are recorded to a ringbuffer that Management Center can read out
by using ReadMetricsMessageTask. The task reads the metrics recorded
since the last seen sequence number. The next sequence number that MC
can continue reading the Metrics with is returned in the response. The
ringbuffer, the ReadMetricsOperation and the task are almost identical
copies of the similar Jet classes.

Besides the described mechanism used for recording and publishing,
configuring the metrics system has also changed. Before this commit,
metrics could be configured with JVM arguments, now it can be done with
MetricsConfig both on the member and on the client side. This
configuration is shared with the diagnostic's MetricsPlugin.

The client protocol changes required for the ReadMetricsMessageTask is
also include in this commit.
- Typos
- Make ConcurrentArrayRingbuffer#getCapacity() not synchronized
- Introducing master-switch and mc-enabled flag to turn off collecting
to MC buffers
- Along with the master-switch, register custom publishers only if the
master-switch is enabled
@blazember blazember force-pushed the blazember:4.0/mc-metrics/mc-renderer2 branch from c7e4fee to fb39a99 Sep 25, 2019
@blazember

This comment has been minimized.

Copy link
Contributor Author

blazember commented Sep 25, 2019

Things noticed in this PR and will be addressed in a separate PR:

  • Add type information to the metrics for MC
  • Client configuration should be reduced to enable, jmx-enable and minimum-level. The rest is relevant for member only.
@Override
public MetricsResultSet decodeClientMessage(ClientMessage clientMessage) {

MetricsReadMetricsCodec.ResponseParameters response = serializationService

This comment has been minimized.

Copy link
@asimarslan

asimarslan Sep 25, 2019

Member

The serialization service not needed here

This comment has been minimized.

Copy link
@blazember

blazember Sep 25, 2019

Author Contributor

👍, removed

public MetricsResultSet(long nextSequence, List<Map.Entry<java.lang.Long, byte[]>> collections) {
// this.nextSequence = slice.nextSequence();
this.nextSequence = nextSequence;
this.collections = collections.stream()

This comment has been minimized.

Copy link
@asimarslan

asimarslan Sep 25, 2019

Member

A minor improvement can be removing this convertion.
We can just keep List<Map.Entry<java.lang.Long, byte[]>> instead of a List<MetricsCollection>
MetricCollection just provide an iterator which delegates to decompressingIterator(bytes)
MC codebase can directly call this static method.

This comment has been minimized.

Copy link
@blazember

blazember Sep 25, 2019

Author Contributor

Removed the conversion

blazember added 2 commits Sep 25, 2019
Copy link
Member

asimarslan left a comment

Client only

@blazember blazember merged commit 73b3a2c into hazelcast:master Sep 25, 2019
1 check passed
1 check passed
default Test PASSed.
Details
@blazember blazember deleted the blazember:4.0/mc-metrics/mc-renderer2 branch Sep 25, 2019
@blazember

This comment has been minimized.

Copy link
Contributor Author

blazember commented Sep 25, 2019

Thank you a lot for the review rounds and for your help @asimarslan @cangencer @eminn @puzpuzpuz

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.