HBASE-20904 Prometheus /metrics http endpoint for monitoring #1814
mmpataki wants to merge 9 commits into apache:master
|
Hi! Can you comment on the jira issue so I can assign it to you? |
|
@busbey I have commented on it. |
|
@busbey I don't understand the Apache pre-commit builds: why does the build fail on the first attempt and succeed on the second? |
|
@mmpataki We have flaky tests. |
saintstack left a comment
Looks good, but do we export Prometheus metrics even when they are of no use, i.e. when nothing is accessing them?
I am sorry, I ran the tests from the IDE and not from mvn, which is why this build issue came up.
Thank you for the review. I will add a configuration to enable it (disabled by default). |
|
Added a missing update to hbase-default.xml. |
|
Updated the conf key name in HttpServer.java. |
|
🎊 +1 overall
This message was automatically generated. |
    } else {
      synchronized (waitingRoom) {
        try {
          waitingRoom.wait(5000L);
It's not worth waiting more than X seconds to get the metrics (X should be 1s). If we can't produce them within 1s, we can return the metrics produced by the earlier producer. (Are 1s-old metrics acceptable?)
It's guaranteed that the old metrics will be there when we come out of this wait.
Once the thread comes out of the wait, it tries to reacquire the lock, which is unnecessary. Is there a way to avoid it?
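The bounded wait suggested above could look roughly like the following. This is only a sketch under assumptions, not the code in this PR: the class name `BoundedWaitMetricsCache`, its fields, and the `Supplier<String>` producer are all hypothetical. One thread collects the metrics while concurrent callers wait at most `maxWaitMillis` and then fall back to the previous snapshot.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical helper: one caller produces metrics, the others wait a bounded time. */
class BoundedWaitMetricsCache {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition refreshed = lock.newCondition();
  private final Supplier<String> producer;   // assumed: renders the metrics text
  private final long maxWaitMillis;
  private String cached = "";                // last successfully produced snapshot
  private boolean producing = false;

  BoundedWaitMetricsCache(Supplier<String> producer, long maxWaitMillis) {
    this.producer = producer;
    this.maxWaitMillis = maxWaitMillis;
  }

  String get() {
    lock.lock();
    try {
      if (producing) {
        // Someone else is collecting: wait for them, but never longer than the limit.
        long nanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        try {
          while (producing && nanos > 0) {
            nanos = refreshed.awaitNanos(nanos);
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        // Fresh if the producer finished in time, otherwise the older snapshot.
        // Before the very first collection completes this is still the empty string.
        return cached;
      }
      producing = true;
    } finally {
      lock.unlock();
    }
    // We are the producer: collect outside the lock so waiters can time out.
    String fresh = null;
    try {
      fresh = producer.get();
      return fresh;
    } finally {
      lock.lock();
      try {
        if (fresh != null) {
          cached = fresh;
        }
        producing = false;
        refreshed.signalAll();
      } finally {
        lock.unlock();
      }
    }
  }
}
```

Note that `awaitNanos`, like `Object.wait`, reacquires the lock before returning, so this sketch bounds the wait but does not remove the reacquisition cost raised above.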
I solved this by adding a Helper class to the hadoop-metrics2 package (in the hadoop compat project). |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
This same question popped into my head. 😸 Glad you found a workaround, but I think a good argument could be made for just doing the native metrics. That said, I don't have the cycles right now to do a metrics review to make sure we have a good set to start with. |
|
A glance shows the unit test failures are all in the backup/restore system. Is that right? |
Yes, the test failures don't seem to be related. I will try to check them tomorrow. |
Yes, the failures are in the backup-restore component. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
Since not all the metric sources have moved to the new metrics API, I added two servlets:
a. /prom (PrometheusServlet) uses the HBase native metrics API.
b. /prom-old (PrometheusHadoop2Servlet) uses the hadoop2 metrics API.
PrometheusHadoop2Servlet can be removed once all the metric sources use the HBase native metrics collection API.
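For readers unfamiliar with what either servlet has to emit, here is a minimal sketch of rendering values in the Prometheus text exposition format (a `# TYPE` line followed by `name value`). It is not the code in this PR: the class `PrometheusTextSketch`, the camelCase-to-snake_case conversion, and the choice to report everything as a gauge are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: render name/value pairs as Prometheus text exposition lines. */
class PrometheusTextSketch {

  /** Assumed naming rule: "readRequestCount" becomes "read_request_count". */
  static String toPrometheusName(String name) {
    String snake = name
        .replaceAll("([a-z0-9])([A-Z])", "$1_$2")   // split camelCase boundaries
        .replaceAll("[^A-Za-z0-9_]", "_");           // replace characters Prometheus rejects
    return snake.toLowerCase(Locale.ROOT);
  }

  /** Renders every entry as a gauge; a real exporter would distinguish counters etc. */
  static String render(Map<String, Long> metrics) {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Long> e : metrics.entrySet()) {
      String name = toPrometheusName(e.getKey());
      sb.append("# TYPE ").append(name).append(" gauge\n");
      sb.append(name).append(' ').append(e.getValue()).append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, Long> metrics = new LinkedHashMap<>();
    metrics.put("readRequestCount", 42L);
    System.out.print(render(metrics));
  }
}
```

A servlet along the lines of /prom or /prom-old would write such text to the HTTP response; how the metric values are gathered from each API is outside this sketch.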
Side note