Prometheus persistent storage settings for PVCs always get deleted. Node disks will be flooded. #1454
Comments
@jomeier can you elaborate on "will get flooded"?
Hi, when Prometheus is using emptyDir, it will just write its database to the container's filesystem, which will impact the I/O performance of all other containers running on that node. The OpenShift docs even mention this and recommend using local storage for Prometheus because of I/O.

Since you're speaking about problems with AZs in Azure, would that mean that using Azure block storage is generally not recommended? It is even preconfigured in the ARO cluster. Please clarify that.

https://kubernetes.io/docs/concepts/storage/storage-classes/#allowed-topologies

kind regards
It's not only Azure. To my knowledge, no cloud provider moves block storage across availability zones. This has been a known issue for ages. It is why people run all "block storage applications" in quorums (etcd, Cassandra, Redis, etc.) spanned over 3 availability zones. Applications relying on a single data point in block storage and bound to only one availability zone will go down in case of a region outage (Hacker News has a number of discussions around this; the problem is as old as the cloud itself). All this is just application architecture. Not much we can do here.

Now, related to the original issue: we noticed that cluster upgrades get stuck when Prometheus is persisted to block storage and an attempt is made to move it to another node in a different zone during the upgrade. This is why we removed persistence. This is the system Prometheus, and we forward all the data we need to support the service on ingestion, so we can afford data loss (which will happen on each upgrade). If you need Prometheus, you should create an application-workload Prometheus on your own.

Is this need to add a disk coming from performance issues you have already seen, or is it just a precaution?
OK, so what is the recommendation about using Azure Disk in ARO in general? Wouldn't it help to have Prometheus deployed in only one AZ? https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/ We are not having any performance issues right now; we just try to follow best practices.
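For reference, pinning a workload to a single zone is usually done with a node selector on the well-known `topology.kubernetes.io/zone` label (spread constraints distribute pods across zones rather than pin them). A minimal sketch, assuming a Deployment named `zone-pinned-app`; the zone value and image are made-up examples:

```yaml
# Hypothetical sketch: pin a Deployment's pods to one availability zone
# via the well-known topology label. Zone name and image are examples only.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: zone-pinned-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: zone-pinned-app
  template:
    metadata:
      labels:
        app: zone-pinned-app
    spec:
      nodeSelector:
        topology.kubernetes.io/zone: westeurope-1   # example zone value
      containers:
      - name: app
        image: registry.example.com/app:latest      # placeholder image
```

Note that a pod pinned this way becomes unschedulable if the zone runs out of capacity, which mirrors the upgrade problem described above.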
Same as in any other cloud: use it as long as it fits your architecture :) There is no such general recommendation.

And overall, if you see performance issues due to emptyDir, please raise a support ticket with details so we can investigate. Deploying to only one AZ is possible, but the same caveat applies: a region outage will cause it to go down.
Okay, thanks for the detailed answer. I think we will just live with this then. It would be nice if this were mentioned in the docs somewhere, though.
The problem we have is that our current OpenShift 3 cluster in Azure is spread across only a single availability zone, so we have to find a solution for migrating those applications to a cluster that uses several AZs.
Okay, we just made a test with the storage class parameter "zoned".

When this is set on the Azure Disk storage class, the Kubernetes scheduler makes sure that a pod is only assigned to a node in the same zone as the disk. When there is no node available in that zone anymore, you get the following error:

0/9 nodes are available: 2 node(s) were unschedulable, 3 node(s) had taint {node-role.kubernetes.io/master: }, that the pod didn't tolerate, 4 node(s) had volume node affinity conflict.

This seems like a viable solution for us: we could tell Prometheus to use a "zoned" storage class, which would prevent its pods from getting stuck. Now the only thing that prevents us from doing this is the ARO operator, which makes this kind of setup impossible.
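The setup described above might look like the following StorageClass. This is a sketch based on the in-tree Azure Disk provisioner the comment refers to; the class name and SKU are illustrative, and the `zoned` parameter is taken from the comment's description rather than verified against a cluster:

```yaml
# Sketch of a "zoned" Azure Disk StorageClass as described above.
# Tying each provisioned disk to a zone means the scheduler only
# places consuming pods on nodes in that disk's zone.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-premium-zoned   # example name
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Premium_LRS          # example SKU
  kind: Managed
  zoned: "true"                 # parameter as described in the comment
# Delaying binding until a pod is scheduled avoids provisioning a disk
# in a zone where the pod can never run.
volumeBindingMode: WaitForFirstConsumer
```

With `WaitForFirstConsumer`, the "volume node affinity conflict" error above should only occur if the zone later loses capacity, not at initial scheduling.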
For now this will not change. As said, this Prometheus is the system Prometheus managed by the SRE team. We will have to come back to this later.
Could you elaborate on what "managed by SRE team" means? Is there someone actively looking into our cluster's health?
Step back: you are talking about Azure Red Hat OpenShift (managed OpenShift), which you created using
Yes, we are talking about ARO clusters. On-premises we don't have that problem, because there is no ARO operator that cleanses our openshift-monitoring-config configmap. So don't you think we will run into problems when we have Prometheus running on emptyDir? On our on-premises cluster, Prometheus collects 21 GB of data per day.
You should not. If it happens and you suspect this to be the cause, raise a support ticket so we can look into it.
Hi,
we tried to set persistence for Prometheus (the default is emptyDir) with PVCs and a block storageClass, because that is the proposed setup for OpenShift in production environments.
If we try to add that to the cluster-monitoring-config configmap, the storage (and retention) settings constantly get overwritten by this operator at this code line:
ARO-RP/pkg/operator/controllers/monitoring/monitoring_controller.go
Line 114 in c60dc9c
If we can't set persistent storage with PVCs, or set the retention time / retention size, the nodes will get flooded and run into stability problems.
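For context, the kind of configuration we are trying to apply follows the documented OpenShift cluster-monitoring layout; the storage class name, retention, and size below are illustrative examples, not our exact values:

```yaml
# Example of the persistence settings we tried to set in the
# cluster-monitoring-config configmap (values are illustrative;
# the ARO operator reverts these changes).
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    prometheusK8s:
      retention: 7d                           # example retention window
      volumeClaimTemplate:
        spec:
          storageClassName: managed-premium   # example storage class
          resources:
            requests:
              storage: 40Gi                   # example size
```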
Don't touch the settings in this configmap, please.
@mjudeikis
Thanks and greetings,
Josef