Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEODE-6295: Add Micrometer-based metrics collector #3153

Closed

Conversation

demery-pivotal
Copy link
Contributor

@demery-pivotal demery-pivotal commented Feb 2, 2019

This PR is superseded by #3180.

This PR adds a Micrometer-based metrics system to Geode.

On startup, Geode configures a "primary" meter registry. Developers use
the primary registry to create meters (gauges, counters, timers, and
others) to instrument Geode and client code.

Internally, the meter registry is a "composite" registry, which allows
attaching any number of "downstream" registries. Each downstream
registry will typically publish the metrics to an external monitoring
system, or provide an endpoint where an external monitoring system can
"scrape" the metrics.

In Geode, when a meter is added or removed from the primary registry,
the primary registry adds or removes a corresponding meter in each
downstream registry. When a meter in the primary registry is updated
(e.g. incremented), the primary registry forwards each operation to the
corresponding meter in each downstream registry.

A MetricsCollector provides access to the primary registry, and includes
methods for adding and removing downstream registries.

The MetricsCollector itself is currently available via a method on
InternalDistributedSystem. Eventually we plan to make the
MetricsCollector available through DistributedSystem, or some other
non-internal class or interface.

The key components of this change are:

  • MetricsCollector (interface) allows access to the primary registry,
    and allows adding and removing downstream registries.
  • CompositeMetricsCollector is the concrete implementation of
    MetricsCollector, and creates the internal composite registry.
  • InternalDistributedSystem creates the MetricsCollector and makes it
    available via a new getMetricsCollector() method.
  • The Micrometer system for instrumenting and maintaining metrics (added
    to Geode as a dependency). See http://micrometer.io/ for details.

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

The Micrometer system includes a rich suite of Meter types for
instrumenting code, and several styles of meter registries for
maintaining the meters.  It also includes (as separate jars) additional
registry types for publishing metrics to a variety of external
monitoring systems (e.g. Prometheus, JMX, and Influx).

See http://micrometer.io/ for details about Micrometer,

This PR adds a Micrometer-based metrics system to Geode.

On startup, Geode configures a "primary" meter registry. Developers use
the primary registry to create meters (gauges, counters, timers, and
others) to instrument Geode and client code.

Internally, the meter registry is a "composite" registry, which allows
attaching any number of "downstream" registries. Each downstream
registry will typically publish the metrics to an external monitoring
system, or provide an endpoint where an external monitoring system can
"scrape" the metrics.

In Geode, when a meter is added or removed from the primary registry,
the primary registry adds or removes a corresponding meter in each
downstream registry.  When a meter in the primary registry is updated
(e.g. incremented), the primary registry forwards each operation to the
corresponding meter in each downstream registry.

A MetricsCollector provides access to the primary registry, and includes
methods for adding and removing downstream registries.

The MetricsCollector itself is currently available via a method on
InternalDistributedSystem.  Eventually we plan to  make the
MetricsCollector available through DistributedSystem, or some other
non-internal class or interface.

The key components of this change are:
- MetricsCollector (interface) allows access to the primary registry,
  and allows adding and removing downstream registries.
- CompositeMetricsCollector is the concrete implementation of
  MetricsCollector, and creates the internal composite registry.
- InternalDistributedSystem creates the MetricsCollector and makes it
  available via a new getMetricsCollector() method.
- The Micrometer system for instrumenting and maintaining metrics (added
  to Geode as a dependency).
Copy link
Contributor

@upthewaterspout upthewaterspout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dale,

I'm a little bit confused by this PR. The classes you've added appear to be an internal API - how does the user actually configure and enable micrometer metrics? Also, I don't see any actual metrics being collected in this PR, unless I'm missing something? Just the registry?

What does CompositeMetricsCollector and MetricsCollector add on top of micrometer's existing CompositeMeterRegistry? I could see creating these interfaces if you were trying to completely hide the micrometer API, but these classes have hard dependencies on micrometer anyway.

Based on this PR, I don't really see any need to add micrometer to geode-core. If all you want to do have something that is started when a InternalDistributedSystem is created, have a look at CacheService and DistributedSystemService.

Maybe in later changes you intend to add some direct calls to micrometer meters inside the core? Or will micrometer just be sampling existing stats that geode is already collecting? If the latter, I think you could make micrometer a separate sub-module that just implements a CacheService. The geode-lucene module is a good example of how this can work.

@demery-pivotal
Copy link
Contributor Author

Hi Dan,

This PR does not add meters. We have identified a small number of meters add to geode-core very soon. The goal of this PR is to get the system in place to do that.

The API is temporarily internal in case we want to tweak it before moving it to a non-internal place. Specifically, we intend to make the MeterCollector interface and the method to obtain one externally available. We considered putting getMetricsCollector() in DistributedSystem and marking it experimental, but one of the questions yet to be decided is whether DistributedSystem is the right place to expose the metrics system.

We intend to expose the CompositeMeterRegistry only as a MeterRegistry, to allow us to substitute other implementations later if we choose. The MeterRegistry (abstract class) is all that’s needed to create meters.

The CompositeMetricsCollector allows us to move the configuration of the CompositeMeterRegistry out of InternalDistributedSystem, which doesn’t need yet more responsibilities. (Though, uh, in the code I submitted, the configuration is done in InternalDistributedSystem. That should be moved into CompositeMetricsCollector.)

MetricsCollector (the primary hub for accessing the metrics system) is an interface in order to allow us to substitute other implementations later, and make it easier to test things that don’t need the concrete implementation.

Thank you for your review. Have I left you more puzzled? Less puzzled? Differently puzzled?

@upthewaterspout
Copy link
Contributor

Hi Dale,

That helps. I missed the part where you explicitly said you were going to make this external at a later point in time. So the intent is to keep this feature hidden on develop until it is more fully baked - that makes sense.

Regarding MetricsCollector vs. MeterRegistry - I see. So you are not trying to hide micrometer from the user, but trying to hide CompositeMeterRegistry.

@demery-pivotal
Copy link
Contributor Author

Yes. The intention is for the MetricsCollector interface to expose everything needed by two kinds of users. Developers wanting to instrument code will use it to get Geode's meter registry, so they can add meters. Developers wanting to publish metrics will use it to add/remove downstream registries.

Copy link
Contributor

@kirklund kirklund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 I think this looks great for initial addition of metrics. Later, we might want to move MetricsCollector (or a similar partial interface) to User API. I like deferring the User API until after this feature takes more solid form and more devs get the chance to review/discuss.

return internalDistributedSystem;
}

private void configureMeterRegistry() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this non-static is in the middle of the "static methods" section in this class. Maybe it should be moved lower?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of moving the tag-adding stuff into the CompositeMetricsCollector. If we do that, what's left in configureMeterRegistry() might be small enough to inline. Otherwise, I'll move it out of the static methods section. And rename it configureMetricsCollector().

Copy link
Contributor

@galen-pivotal galen-pivotal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a chat with @kohlmu-pivotal and he made the point that what we're really doing is holding onto a CompositeMeterRegistry and exposing two aspects of it: adding new meters / configuring the root registry, and adding downstream meters. Adding downstream registries is configuration that should happen somewhere that is basically orthogonal to use. For using the registry, we should just pass the composite registry around as a MeterRegistry (since its users don't care what's at the other end). What do you think?

@galen-pivotal
Copy link
Contributor

Also, he made the point that the best place for users to get a hold of the meter registry would probably be the Cache, since that's our best holds-the-entire-world object right now.

@upthewaterspout
Copy link
Contributor

One other concern with this API - users can only add MeterRegistries after a distributed system is already created. But we will likely end up creating an updating a number of meters when the distributed system is created. If a user can't add MeterRegistries until after the distributed system creation, they will miss events.

Maybe the user's downstream MeterRegistries should be passed in to the CacheFactory?

@demery-pivotal
Copy link
Contributor Author

@galen-pivotal Yeah, those two kinds of uses (instrumenting and publishing) are distinct, so the methods for them don't belong in a single interface.

So we may as well expose getMeterRegistry() directly on whatever class ends up owning the registry.

Cache may indeed be the right place for the registry.

@kohlmu-pivotal
Copy link
Contributor

So.. in a nutshell, what @galen-pivotal and I spoke about:

  • The notion of a MetricsCollector as it stands and is currently used, makes no sense. @upthewaterspout has already pointed out that it abstracts the usage of the micrometer.io CompositeMeterRegistry and at the same time it does not. To me, the simplest approach is to pass in the MeterRegistry into the InternalDistributedSystem from "outside" (maybe as @upthewaterspout has indicated the CacheFactory.
  • IMO, some thought has to be put into the current management layer, to enable the creation, amendment of a constructed CompositeMeterRegistry. If there would be a MetricsConfigurationService, it could potentially hold onto all configurations for downstream registries. With the correct amount of design/abstraction and implementation, this *ConfigurationService could "discovered" using the extensions framework. With a concrete MetricsConfigurationService all configurations can be distributed/replicated to other servers with the cluster. (also include persistence)
  • The adding and removing of MeterRegistry from the CompositeMeterRegistry is a great feature! e.g Adding or removing a JMXRegistry at runtime is nice. BUT in lieu of a properly defined lifecycle and management service, I would not make that the only reason to have a MetricController interface.

@demery-pivotal
Copy link
Contributor Author

@upthewaterspout I'll take a look at injecting the downstream registry (and maybe multiple registries) via the cache factory.

@demery-pivotal
Copy link
Contributor Author

  • The notion of a MetricsCollector as it stands and is currently used, makes no sense. @upthewaterspout has already pointed out that it abstracts the usage of the micrometer.io CompositeMeterRegistry and at the same time it does not. To me, the simplest approach is to pass in the MeterRegistry into the InternalDistributedSystem from "outside" (maybe as @upthewaterspout has indicated the CacheFactory.

By "the" do you mean that the user would pass in the one-and-only registry for the system to use? Or a downstream registry to be added to the internally constructed/configured composite? Or something else?

  • IMO, some thought has to be put into the current management layer, to enable the creation, amendment of a constructed CompositeMeterRegistry. If there would be a MetricsConfigurationService, it could potentially hold onto all configurations for downstream registries. With the correct amount of design/abstraction and implementation, this *ConfigurationService could "discovered" using the extensions framework. With a concrete MetricsConfigurationService all configurations can be distributed/replicated to other servers with the cluster. (also include persistence)

Yes, the life cycle and configuration are the big puzzles that make us want to keep this internal, but to get it into develop so we can get some experience.

Do you have any scenarios in mind that would help flesh out what the life cycle might look like?

  • The adding and removing of MeterRegistry from the CompositeMeterRegistry is a great feature! e.g Adding or removing a JMXRegistry at runtime is nice. BUT in lieu of a properly defined lifecycle and management service, I would not make that the only reason to have a MetricController interface.

MetricsController is doomed.

@demery-pivotal
Copy link
Contributor Author

Superseded by #3180.

@demery-pivotal demery-pivotal deleted the feature/GEODE-6295 branch March 8, 2019 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants