This repository has been archived by the owner on Feb 22, 2022. It is now read-only.

[stable/prometheus-operator] mkdir /prometheus/wal: permission denied #12176

Closed
sunyl527 opened this issue Mar 13, 2019 · 13 comments

Comments

@sunyl527

Is this a request for help?
Yes.

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
helm version:

Client: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}

The issue is the same with helm v2.12.3.
kubernetes version:

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:08:12Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:00:57Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
stable/prometheus-operator

What happened:
The prometheus container keeps crashing:
prometheus-test-prometheus-operator-prometheus-0 2/3 CrashLoopBackOff 6
The logs:

kubectl logs -f prometheus-test-prometheus-operator-prometheus-0 -c prometheus
level=warn ts=2019-03-13T11:06:40.410423052Z caller=main.go:295 deprecation_notice="\"storage.tsdb.retention\" flag is deprecated use \"storage.tsdb.retention.time\" instead."
level=info ts=2019-03-13T11:06:40.410620036Z caller=main.go:302 msg="Starting Prometheus" version="(version=2.7.1, branch=HEAD, revision=62e591f928ddf6b3468308b7ac1de1c63aa7fcf3)"
level=info ts=2019-03-13T11:06:40.410660121Z caller=main.go:303 build_context="(go=go1.11.5, user=root@f9f82868fc43, date=20190131-11:16:59)"
level=info ts=2019-03-13T11:06:40.410706994Z caller=main.go:304 host_details="(Linux 3.10.0-957.1.3.el7.x86_64 #1 SMP Thu Nov 29 14:49:43 UTC 2018 x86_64 prometheus-test-prometheus-operator-prometheus-0 (none))"
level=info ts=2019-03-13T11:06:40.410747325Z caller=main.go:305 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2019-03-13T11:06:40.410781187Z caller=main.go:306 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2019-03-13T11:06:40.413968368Z caller=main.go:620 msg="Starting TSDB ..."
level=info ts=2019-03-13T11:06:40.414104205Z caller=web.go:416 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2019-03-13T11:06:40.414561626Z caller=main.go:489 msg="Stopping scrape discovery manager..."
level=info ts=2019-03-13T11:06:40.414595497Z caller=main.go:503 msg="Stopping notify discovery manager..."
level=info ts=2019-03-13T11:06:40.414606866Z caller=main.go:525 msg="Stopping scrape manager..."
level=info ts=2019-03-13T11:06:40.414617696Z caller=main.go:499 msg="Notify discovery manager stopped"
level=info ts=2019-03-13T11:06:40.414657126Z caller=main.go:485 msg="Scrape discovery manager stopped"
level=info ts=2019-03-13T11:06:40.41467679Z caller=main.go:519 msg="Scrape manager stopped"
level=info ts=2019-03-13T11:06:40.414699787Z caller=manager.go:736 component="rule manager" msg="Stopping rule manager..."
level=info ts=2019-03-13T11:06:40.414719773Z caller=manager.go:742 component="rule manager" msg="Rule manager stopped"
level=info ts=2019-03-13T11:06:40.414748959Z caller=notifier.go:521 component=notifier msg="Stopping notification manager..."
level=info ts=2019-03-13T11:06:40.414769872Z caller=main.go:679 msg="Notifier manager stopped"
level=error ts=2019-03-13T11:06:40.415381657Z caller=main.go:688 err="opening storage failed: create dir: mkdir /prometheus/wal: permission denied"

What you expected to happen:
The pod should be running.

How to reproduce it (as minimally and precisely as possible):

helm install stable/prometheus-operator --name test  \
    --set prometheusOperator.securityContext.fsGroup=2000 \
    --set prometheusOperator.securityContext.runAsNonRoot=true \
    --set prometheusOperator.securityContext.runAsUser=1000 \
    --set prometheus.prometheusSpec.securityContext.fsGroup=2000 \
    --set prometheus.prometheusSpec.securityContext.runAsNonRoot=true \
    --set prometheus.prometheusSpec.securityContext.runAsUser=1000 \
    --set alertmanager.alertmanagerSpec.securityContext.securityContext.fsGroup=2000 \
    --set alertmanager.alertmanagerSpec.securityContext.runAsNonRoot=true \
    --set alertmanager.alertmanagerSpec.securityContext.runAsUser=1000 \
    --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=1Gi \
    --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.selector.matchLabels.app="prometheus" \
    --set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.spec.resources.requests.storage=1Gi \
    --set alertmanager.alertmanagerSpec.storage.volumeClaimTemplate.selector.matchLabels.app="alertmanager" 

I also created two PersistentVolumes. I tried both nfs and hostPath volumes, and neither worked.
Anything else we need to know:
The same issue was reported here: prometheus-operator/prometheus-operator#966.
Has it been fixed?

@sunyl527 changed the title from "[stable/prometheus-operator]" to "[stable/prometheus-operator] mkdir /prometheus/wal: permission denied" Mar 13, 2019
@sunyl527
Author

@justlaputa @vsliouniaev Could anyone help me? I'm sorry to bother you.

@vsliouniaev
Collaborator

This doesn't seem to be specific to the helm chart. The closest thing I've come across was an issue related to the volume provisioner: prometheus-operator/prometheus-operator#2182

@sunyl527
Author

sunyl527 commented Mar 21, 2019

@vsliouniaev Thanks for your reply! It helped! I edited the statefulset to use this:

  securityContext:
    fsGroup: 0
    runAsNonRoot: false
    runAsUser: 0

It works!

By the way, it seems that --set XXX.securityContext.runAsUser=0 does not work through helm, for some reason.

@nobody4t

Is it a good idea to run it as root in a prod env?

@vsliouniaev
Collaborator

@dongwangdw it is not recommended to do this. From a security point of view this is a bad idea. There are some other setups where even this wouldn't help.

@nobody4t

@vsliouniaev
Yes, nowadays many platforms run containers without root access.
I don't think this is a good solution.

I found several issues about this, but none of them gave the root cause or a solution.
I saw you ran into the same problem in prometheus-operator/prometheus-operator#2182.

Did you get it resolved?

@nobody4t

I got it working by adding a securityContext without setting the user to root.

I don't understand why a securityContext has to be set.
What is the root cause?

@vsliouniaev
Collaborator

This hasn't been required for the last 5 months. Are you having the same issue?

@nobody4t

Yes, I resolved it by setting a securityContext.
But I ran into another issue because of the added securityContext.

@maresja1

I added this to my Prometheus definition to avoid running as root:

  ...
  securityContext:
    fsGroup: 489
    runAsUser: 489
  ...
  initContainers:
  - command:
    - chown
    - -R
    - 489:489
    - /prometheus
    image: busybox:1.30
    imagePullPolicy: IfNotPresent
    name: init-chown-data
    resources: {}
    securityContext:
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /prometheus
      name: prometheus-service-prometheus-db
      subPath: prometheus-db

@jdomag

jdomag commented Feb 5, 2020

@maresja1
How did you add it? Do you use the stable/prometheus-operator helm chart? I see that it's possible to set securityContext via values.yaml, but I can't see any option for init containers.

@jrockway

@jdomag I just had this problem and maresja's advice works perfectly. The config he pasted goes in your "prometheus" object. I don't know where the helm chart puts that, but you should have one, as that's the resource that the operator uses to create your prometheus instance. You may need to adjust the name of the volume mount.
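
To make the placement concrete, here is a minimal sketch of what a "prometheus" object carrying that config might look like. It assumes an operator version whose Prometheus resource accepts `initContainers`. The resource name `service-prometheus` and the namespace `monitoring` are hypothetical. The name is chosen only so that it lines up with the volume name in the snippet above, on the assumption that the volume name is derived from the resource name. Check the volume name on your own StatefulSet before copying this.

```yaml
# Sketch only: metadata values are placeholders, not taken from the chart.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: service-prometheus      # hypothetical name
  namespace: monitoring         # hypothetical namespace
spec:
  # Pod-level security context: Prometheus itself runs as a non-root UID.
  securityContext:
    fsGroup: 489
    runAsUser: 489
  # The init container runs as root once, only to fix ownership of the data directory.
  initContainers:
  - name: init-chown-data
    image: busybox:1.30
    command: ["chown", "-R", "489:489", "/prometheus"]
    securityContext:
      runAsUser: 0
    volumeMounts:
    - name: prometheus-service-prometheus-db   # adjust to your volume name
      mountPath: /prometheus
      subPath: prometheus-db
```

If the pod-level securityContext also sets `runAsNonRoot: true`, as in the original report, the init container would probably need `runAsNonRoot: false` in its own securityContext, since it runs as UID 0.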

@csh995426531

@sunyl527 Thanks! This is useful.


7 participants