Add MetricsFormatter to allow developers to collect metrics in their own way #788
Conversation
@tjiuming has a test, for fun... an example initial version of a Collector …
While I like the … Update: branch name …
@dhoard Do you mean that this PR would be hard to get adopted?
This is just my opinion, given the amount of work that I feel would be required to implement it correctly. Implement your solution and provide benchmarks that prove your solution increases performance / decreases memory usage (GCs), etc. (which it may well do).
@dhoard This PR is providing a …
@tjiuming just pushed some changes to my branch. Based on my profiling with YourKit... the performance isn't as good using a "TextFormat" class similar to the current one. Update: branch name …
@dhoard you can try #782 (the code is too messy and needs to be refactored); there are some optimization methods. Each of the above methods may give only a small performance improvement, but it's very useful for systems which have many meters.
@tjiuming you need to close one of the PRs. It's not clear which branch of code you are actually working with.
I closed that PR. Please help approve this PR.
@tjiuming In my opinion (not an official maintainer)... you need to provide reproducible evidence that the PR provides value. In my preliminary tests of the PR (ignoring the actual code to write the metrics... since this is what you have done in …)
The stated goal of the original PR was to decrease the creation of … The stated goal of this PR is to "allow developers to collect metrics in their own way." Given that the …
@dhoard @fstab I pushed some commits to the branch https://github.com/tjiuming/client_java/tree/dev/collect_performance, you can run …
Hi @dhoard, first, thanks for taking the time to read and experiment with this PR. I think I can contribute a bit about the motivation for this PR.

When an application that uses the Prometheus client library for defining and exposing metrics has a very large number of time series (metric + labels) - and by that I mean millions - it means every time you scrape the … Applications which only generate thousands of samples per scrape wouldn't really care/notice. Those at the other end of the spectrum are heavily affected.

One of those examples is Apache Pulsar - it is a messaging system similar to Apache Kafka. As opposed to Kafka, a broker can support up to 1 million topics, each with its own producers and consumers. Each topic has its own metrics. You can easily reach millions of samples per scrape.

This PR attempts to take a stab at reducing the memory consumption and memory allocations by introducing an additional mechanism to collect the samples using the visitor pattern. A Prometheus Collector will have an additional method for obtaining the samples:
and SamplesVisitor would have a different method for visiting:
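The two snippets were lost in this thread's rendering. A minimal illustrative sketch of the idea - names such as `SamplesVisitor`, `visit(...)`, and `VisitableCollector` are assumptions for illustration, not the PR's actual API - might look like:

```java
import java.util.List;

// Hypothetical sketch of the visitor-based collection API discussed above.
interface SamplesVisitor {
    // Called once per sample; passing primitives and reused references
    // avoids allocating a Sample object per time series.
    void visit(String metricName, List<String> labelNames,
               List<String> labelValues, double value, long timestampMs);
}

abstract class VisitableCollector {
    // Instead of materializing List<MetricFamilySamples>, the collector
    // pushes each sample into the visitor.
    abstract void collect(SamplesVisitor visitor);
}
```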
This allows passing the samples information without any memory allocations. Apache Pulsar in this example would implement its own SamplesVisitor. In our case we write to multiple ByteBufs (from Netty) which are located off-heap (thus no GC), and flush those in a certain order to the HTTP OutputStream. That's the gist of the idea.

I do believe we can create a DevNullSamplesObserver, run it on 1M and 5M sample scrapes, and compare the amount of garbage-collectible memory allocated, and also CPU time, but I'm not sure it's required for this case. I haven't used the exact implementation names and design of @tjiuming, because I wanted to convey the idea first. What do you think?
@asafm Based on the initial description provided by @tjiuming, as well as your description, I am in alignment on the purpose of the change. My branch https://github.com/dhoard/client_java/tree/WIP-collector-registry-visitor-support contains my initial implementation of the change (as you described). I personally like the approach.

The 2 PRs (as presented), in my opinion, are disjoint (not clear/messy) and provide no basis (profiling, proof, testing, etc.) that it's an improvement. My basic profiling of the change using YourKit (ignoring the actual …)

I'm not a maintainer, so other than my own interest/curiosity, my opinions are just that - one person's opinions. @fstab @tomwilkie @brian-brazil and others would need to weigh in on this alternate approach to collecting metrics, after @tjiuming and you provide the necessary proof that it's a good, solid change. This may/may not be a direction they want to go with the library - I can't speak for them.
Java GC is pretty smart when it comes to very short-lived garbage, so I'm not surprised. Even then, we're probably only talking a few hundred MB, which isn't much given the use case. Considering how often this is called, this sort of micro-optimisation is unlikely to make sense in broader terms when weighed against the additional API complexity. The API we have already provides a generic way to allow for other output formats.

More generally, a single target exposing millions of samples is far beyond what is typical, and will likely cause cardinality problems on the Prometheus end, if you can even manage to consistently scrape it successfully once per scrape interval. I would suggest you look at solutions that don't involve trying to expose per-topic information in systems with millions of topics.
@brian-brazil Thanks for taking the time to review this. On the other hand, in systems that are super sensitive to latency, such as Apache Pulsar, or databases in general, a 3rd-party library, especially an observability one, should aim to inflict as little impact as possible - be a "fly on the wall" in the "room", so to speak. We're definitely looking at applicative ways to overcome the 1M unique time series issue. I understand if you decide to leave things as is.
We have to buffer the output at least; otherwise we don't know if an error occurred during collection, which requires an HTTP 500 to be returned.
I'm not familiar enough with the client - what can cause an error when iterating over existing collectors? It's all under client control, no?
A collector could throw an exception, or in the future there may be additional checking that the output is valid.
@brian-brazil I've read the code a few times now, and I can't find where the HTTP server catches an exception during collection and returns 500. I couldn't find any buffering. I saw that each time we get a collector, we ask for its samples and iterate over them, writing them one by one to the stream. Could you please guide me to that location?
It's not code that we explicitly have; the HTTP servers do it for us. You can see output buffering in https://github.com/prometheus/client_java/blob/master/simpleclient_httpserver/src/main/java/io/prometheus/client/exporter/HTTPServer.java#L86 for example.
I searched through the Sun HTTP server and `HTTPServer` and couldn't find anything that catches an exception and sends 500.

```java
@Test
public void testSimpleRequest() throws IOException {
    HTTPServer httpServer = new HTTPServer(new InetSocketAddress(0), registry);
    Collector collector = new Collector() {
        @Override
        public List<MetricFamilySamples> collect() {
            throw new RuntimeException("Something went wrong during collect");
        }
    };
    registry.register(collector);
    try {
        HttpResponse response = createHttpRequestBuilder(httpServer, "/metrics").build().execute();
        System.out.println("Status code = " + response.getResponseCode());
    } finally {
        httpServer.close();
    }
}
```

The result is that the connection is dead in:

```java
} catch (Exception e4) {
    logger.log(Level.TRACE, "ServerImpl.Exchange (4)", e4);
    closeConnection(connection);
} catch (Throwable t) {
    logger.log(Level.TRACE, "ServerImpl.Exchange (5)", t);
    throw t;
}
```

in `ServerImpl`.

Regarding the buffering - thanks for the reference, I somehow missed that.
@asafm You are correct - a … The … For a failed collection, there are two approaches to handle the …
The code currently implements approach 1. There are arguments for both approaches... but ideally, in my opinion, we should at least catch/log the …
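A minimal sketch of the buffer-then-commit alternative (collect into a buffer first, and only choose the HTTP status once collection has succeeded or failed). `ScrapeHandler`, `Exchange`, and `MetricsWriter` are hypothetical stand-ins for illustration, not the library's actual classes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ScrapeHandler {
    // Minimal stand-in for com.sun.net.httpserver.HttpExchange.
    interface Exchange {
        void sendResponseHeaders(int status, long length) throws IOException;
        OutputStream getResponseBody();
    }

    interface MetricsWriter {
        void writeMetrics(OutputStream out) throws IOException;
    }

    static void handle(Exchange exchange, MetricsWriter writer) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writer.writeMetrics(buffer); // all collection happens here
        } catch (RuntimeException e) {
            // Collection failed before any bytes were sent, so we can
            // still report HTTP 500 instead of killing the connection.
            byte[] msg = "collection failed".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(500, msg.length);
            exchange.getResponseBody().write(msg);
            return;
        }
        exchange.sendResponseHeaders(200, buffer.size());
        buffer.writeTo(exchange.getResponseBody());
    }
}
```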
Thanks for taking the time to verify, @dhoard. I do agree that if the library already has buffering in place, it makes sense to try/catch and return a 500 in case the collection failed.

One idea that can be used, in tandem with what I suggested or not, is a ByteBuffer-backed OutputStream. That …

Then, using the new visitor pattern, the default HTTP server can use it and write directly off-heap, without any additional objects allocated, thereby minimizing the impact on the application this library is running in.
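A minimal sketch of such a ByteBuffer-backed OutputStream, assuming a fixed-capacity direct buffer (a real implementation would grow or chain buffers on overflow; this class is illustrative, not part of client_java):

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

class ByteBufferOutputStream extends OutputStream {
    private final ByteBuffer buffer;

    ByteBufferOutputStream(int capacity) {
        // Direct allocation keeps the scrape payload off-heap, so writing
        // it produces no garbage-collectible objects.
        this.buffer = ByteBuffer.allocateDirect(capacity);
    }

    @Override
    public void write(int b) {
        buffer.put((byte) b);
    }

    @Override
    public void write(byte[] b, int off, int len) {
        buffer.put(b, off, len);
    }

    // Flip the buffer so the written bytes can be drained to the socket.
    ByteBuffer drain() {
        buffer.flip();
        return buffer;
    }
}
```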
@asafm the code uses a thread-local … (see client_java/simpleclient_httpserver/src/main/java/io/prometheus/client/exporter/HTTPServer.java, lines 89 to 90 at 4ddd60c).
This eliminates the GC issue around the buffer (conversation on the …). Moving to a …
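The reuse pattern referenced above can be sketched as follows (an illustrative reconstruction, not the repository's exact code; the `Scraper` class and the 1 MiB initial capacity are assumptions):

```java
import java.io.ByteArrayOutputStream;

class Scraper {
    // One buffer per worker thread, reused across scrapes, so steady-state
    // scraping allocates (almost) nothing for the buffer itself.
    private final ThreadLocal<ByteArrayOutputStream> response =
            ThreadLocal.withInitial(() -> new ByteArrayOutputStream(1 << 20));

    byte[] scrape(byte[] payload) {
        ByteArrayOutputStream buf = response.get();
        buf.reset(); // drop previous contents but keep the grown capacity
        buf.write(payload, 0, payload.length);
        return buf.toByteArray();
    }
}
```

The trade-off dhoard describes follows directly: each thread that ever serves a scrape pins its own buffer, so up to 5 worker threads means up to 5 copies of the buffer even though only one is needed at a time.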
Interesting. I didn't think of those implications - having the same buffer 5 times when you only need one doesn't seem so good.

Circling back to the original issue - the visitor pattern can reduce the burden of memory allocations, with a bit of added complexity. As I said previously, I believe an observability library should aim to minimize its impact on a running application as much as possible. I think the price to pay is quite small to get one more footing towards that goal. Combine the visitor pattern with moving to a ByteBufferOutputStream, and you reduce both heap memory consumption and garbage collection work. Both serve to minimize the impact on the hosting application.
@asafm As I pointed out earlier, I like the visitor pattern, which is why I am in the conversation/provided feedback. Adoption/inclusion is up to the maintainers. (I'm not a maintainer.)
Ok, I'll wait to see what @brian-brazil thinks about all of this conversation.
Hi, first of all, thanks a lot for the discussion. I really appreciate everyone taking the time to look into this and to find a good solution. A few remarks on points that were mentioned above but aren't directly related to the PR:

We start only 1 thread by default, so there is only one thread-local buffer (see executor.setCorePoolSize(1)). The number of threads will increase up to 5 if there are multiple scrapes in parallel, but as long as scrapes are sequential there should be only a single thread and only a single buffer.

Yes, your analysis is correct. I just want to add that closing the connection without sending an HTTP response will make the Prometheus server assume a failed scrape, and the …

Technically it is caught and logged in the …

Anyway, here are some comments on the actual PR: …
One thought that crossed my mind today: Apache Pulsar can easily accommodate 100k topics per broker and plans to support 1M topics per broker. Each topic has roughly 70 unique metrics (metric names). Given 1M topics times 70 unique metrics, we get 70M garbage-collected objects per collection cycle. For a system that targets less than 5 ms latency per message, this makes the GC choke and increases latency. Of course, we are fully aware that no human can consume so many metrics, so we planned to have a "filter" that aggregates them into reasonable groups of topics, limiting the final amount going out.

Regarding JMH: in general, in my opinion, it's hard to measure the impact on the garbage collector in a stand-alone test. Those 70M objects to be garbage-collected could be exactly the added GC work that pushes latency well beyond the required 5 ms. That's something you can't see in a lab JMH test.
Add MetricsFormatter to allow developers to collect metrics in their own way.
For more context, you can see #782