Confusion in the documentation: How to store labels with many different values? #1015
Comments
http://prometheus.io/docs/practices/instrumentation/#do-not-overuse-labels
You should use a general-purpose processing system, such as Hadoop or Spark. Prometheus is intended as a real-time monitoring system that you can easily run and depend on in an emergency. High-cardinality metrics tend only to grow, and in such a system they require work disproportionate to the benefit they bring you. Dealing with them in a less real-time fashion is more appropriate.
@xiegeo How large is the cardinality of your intended label values? What matters most in the end is the total cardinality of what you're selecting in queries...
For my current project the values are going to be small. I am mostly concerned about the correct way to do things, so that the tools can present and analyze my data without too much modification, and as a learning experience.

If I understand correctly, Prometheus is for someone who already has a Hadoop cluster or similar to do the number crunching for the business side, and who uses Prometheus to monitor such clusters. My problem is different, but the solution can be similar, so I want to give Prometheus a try.

I have 2 servers for redundancy and locality. I will put the Prometheus client on them, since anything interesting the users do will go through them. The two servers run independently of each other. I want to build a dashboard that shows what is going on on both servers using Prometheus. I only have around 10 users (associates in a highly specialized field, so it will never get large). I also need to know where they are connecting from (IP addresses in the 100s, preferably also allowing me to follow each TCP connection) and what projects (10s per user) they are working on. Other events (errors, commands of different types, bandwidth usage, network delay) should be easy to add when needed.
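To make the cardinality question concrete, here is a rough sketch of the worst-case number of time series a single metric could produce with these labels. The label names and the exact counts are assumptions chosen for illustration ("in the 100s" is taken as 300 and "10s per user" as 20); the real numbers would need to be measured.

```python
from math import prod

# Assumed distinct values per label. These are illustrative guesses,
# not measurements from the setup described above.
label_values = {
    "server": 2,       # two independent servers
    "user": 10,        # around 10 users
    "project": 20,     # "10s per user", taken here as 20
    "source_ip": 300,  # "in the 100s", taken here as 300
}


def worst_case_series(values_per_label):
    """Upper bound on series for one metric: the product of the
    number of distinct values of each label."""
    return prod(values_per_label.values())


total = worst_case_series(label_values)
without_ip = worst_case_series(
    {name: count for name, count in label_values.items() if name != "source_ip"}
)

print("all labels:", total)
print("without source_ip:", without_ip)
```

Under these assumptions, dropping the IP label shrinks the bound by a factor of 300. A per-connection label would be worse still, since its set of values has no upper bound and keeps growing over time.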
I think you're going to be near the edge of what a single Prometheus server can handle, and I wouldn't track individual TCP connections. If you're looking for network flow tracking, there are tools out there designed for that purpose. More generally, I'd recommend something like the ELK stack, as this sounds more like an event logging use case than a time series monitoring use case.
So there is no way to put unique metadata per event in Prometheus? I am open to writing something myself that can take advantage of the Go process monitoring, data collection, and some of the graphing features of Prometheus, without the weight of a large-mammal-sized stack.
No, Prometheus isn't an event logging database. Have you considered InfluxDB?
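The distinction between the two models can be sketched in a few lines. This is not the Prometheus or InfluxDB API. It is a plain-Python illustration with hypothetical names (`requests_total`, `event_log`, `record`), assuming a metric is an aggregate keyed by a small, bounded set of labels while an event log keeps one record per event with arbitrary metadata.

```python
from collections import Counter

# Metric style: one running count per bounded label combination.
# Storage grows with the number of distinct (server, command) pairs.
requests_total = Counter()

# Event style: one record per event, carrying unique metadata.
# Storage grows with the number of events.
event_log = []


def record(server, command, user, source_ip, connection_id):
    """Record one user action in both styles (hypothetical helper)."""
    requests_total[(server, command)] += 1
    event_log.append(
        {
            "server": server,
            "command": command,
            "user": user,
            "source_ip": source_ip,
            "connection_id": connection_id,
        }
    )


record("server-a", "open", "alice", "10.0.0.1", 1)
record("server-a", "open", "bob", "10.0.0.2", 2)
record("server-b", "save", "alice", "10.0.0.1", 3)

print(len(requests_total), "metric series;", len(event_log), "event records")
```

The counter can answer "how many open commands has server-a handled?" cheaply, but it cannot say which connection issued them. That per-event detail only survives in the event log, which is the kind of data an event store is built for.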
No, I did not. In my preliminary searches I dismissed InfluxDB as a cloud database provider. Now that you have prompted me to take a better look, it does look like what I need.
Great, glad that we could help you.
brian-brazil closed this Aug 23, 2015
lock bot commented Mar 24, 2019
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
xiegeo commented Aug 20, 2015
In http://prometheus.io/docs/practices/naming/, there is a large warning:
But if not labels, what should I use?