Description

After a 9-day test run of the monitoring cluster deployments, the prometheus-k8s-0 pod went into CrashLoopBackOff.

Details

Prometheus crashed abruptly trying to write to invalid memory; the panic fires during WAL replay (Head.loadWAL). Full output from pod describe:
christiansargusingh 20:21:33 k3s/monitoring/manifests > kubectl describe pod prometheus-k8s-0 -n monitoring
Name: prometheus-k8s-0
Namespace: monitoring
Priority: 0
Node: master/192.168.2.111
Start Time: Thu, 27 Apr 2023 03:17:09 -0400
Labels: app=prometheus
controller-revision-hash=prometheus-k8s-749f5b9588
prometheus=k8s
statefulset.kubernetes.io/pod-name=prometheus-k8s-0
Annotations: <none>
Status: Running
IP: 10.42.0.14
IPs:
IP: 10.42.0.14
Controlled By: StatefulSet/prometheus-k8s
Containers:
prometheus:
Container ID: containerd://57c7b5fc9b085d75062e5638e0d54c94ae4f41928ffef688ff4038f1d2d39969
Image: prom/prometheus:v2.19.1
Image ID: docker.io/prom/prometheus@sha256:efe62fa8804e9fd2612a945b70c630cc27e21b5fb8233ccc8be4cfbe06d26b04
Port: 9090/TCP
Host Port: 0/TCP
Args:
--web.console.templates=/etc/prometheus/consoles
--web.console.libraries=/etc/prometheus/console_libraries
--config.file=/etc/prometheus/config_out/prometheus.env.yaml
--storage.tsdb.path=/prometheus
--storage.tsdb.retention.time=15d
--web.enable-lifecycle
--storage.tsdb.no-lockfile
--web.external-url=http://prometheus.192.168.2.104.nip.io
--web.route-prefix=/
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Message: rrentHeadChunk(0x8fb2840, 0x51de1b0)
/app/tsdb/head.go:1991 +0x22c
github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk(0x8fb2840, 0xe1104a28, 0x187, 0x51de1b0, 0x1)
/app/tsdb/head.go:1962 +0x24
github.com/prometheus/prometheus/tsdb.(*memSeries).append(0x8fb2840, 0xe1104a28, 0x187, 0x0, 0x0, 0x0, 0x0, 0x51de1b0, 0x1)
/app/tsdb/head.go:2118 +0x3a4
github.com/prometheus/prometheus/tsdb.(*Head).processWALSamples(0x5d00000, 0xdfc45a00, 0x187, 0xdcca880, 0xdcca840, 0x0, 0x0)
/app/tsdb/head.go:365 +0x284
github.com/prometheus/prometheus/tsdb.(*Head).loadWAL.func5(0x5d00000, 0x14a86038, 0x14a86040, 0xdcca880, 0xdcca840)
/app/tsdb/head.go:459 +0x3c
created by github.com/prometheus/prometheus/tsdb.(*Head).loadWAL
/app/tsdb/head.go:458 +0x268
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0xc pc=0x1532d88]
goroutine 297 [running]:
bufio.(*Writer).Available(...)
/usr/local/go/src/bufio/bufio.go:608
github.com/prometheus/prometheus/tsdb/chunks.(*ChunkDiskMapper).WriteChunk(0x51de1b0, 0x71c0, 0x0, 0xe0a02338, 0x187, 0xe10d8b08, 0x187, 0x240cee0, 0x5e5ce20, 0x0, ...)
/app/tsdb/chunks/head_chunks.go:252 +0x500
github.com/prometheus/prometheus/tsdb.(*memSeries).mmapCurrentHeadChunk(0x8fb1970, 0x51de1b0)
/app/tsdb/head.go:1988 +0x6c
github.com/prometheus/prometheus/tsdb.(*memSeries).cutNewHeadChunk(0x8fb1970, 0xe1104a28, 0x187, 0x51de1b0, 0x1)
/app/tsdb/head.go:1962 +0x24
github.com/prometheus/prometheus/tsdb.(*memSeries).append(0x8fb1970, 0xe1104a28, 0x187, 0x8cab4ba2, 0x3fded782, 0x0, 0x0, 0x51de1b0, 0x1)
/app/tsdb/head.go:2118 +0x3a4
github.com/prometheus/prometheus/tsdb.(*Head).processWALSamples(0x5d00000, 0xdfc45a00, 0x187, 0xdcca780, 0xdcca740, 0x0, 0x0)
/app/tsdb/head.go:365 +0x284
github.com/prometheus/prometheus/tsdb.(*Head).loadWAL.func5(0x5d00000, 0x14a86038, 0x14a86040, 0xdcca780, 0xdcca740)
/app/tsdb/head.go:459 +0x3c
created by github.com/prometheus/prometheus/tsdb.(*Head).loadWAL
/app/tsdb/head.go:458 +0x268
Exit Code: 2
Started: Sat, 06 May 2023 20:18:05 -0400
Finished: Sat, 06 May 2023 20:18:37 -0400
Ready: False
Restart Count: 958
Requests:
memory: 400Mi
Liveness: http-get http://:web/-/healthy delay=0s timeout=3s period=5s #success=1 #failure=6
Readiness: http-get http://:web/-/ready delay=0s timeout=3s period=5s #success=1 #failure=120
Environment: <none>
Mounts:
/etc/prometheus/certs from tls-assets (ro)
/etc/prometheus/config_out from config-out (ro)
/etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
/prometheus from prometheus-k8s-db (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zhmwj (ro)
prometheus-config-reloader:
Container ID: containerd://7e43f5fa9323d87cf0d3597bbda806087f609c869d9e27e25b1ea68531852916
Image: carlosedp/prometheus-config-reloader:v0.40.0
Image ID: docker.io/carlosedp/prometheus-config-reloader@sha256:218f9f49a51a072af66ac67696c092a4962fd5108cd5525dbbcea5c239fe3862
Port: <none>
Host Port: <none>
Command:
/bin/prometheus-config-reloader
Args:
--log-format=logfmt
--reload-url=http://localhost:9090/-/reload
--config-file=/etc/prometheus/config/prometheus.yaml.gz
--config-envsubst-file=/etc/prometheus/config_out/prometheus.env.yaml
State: Running
Started: Thu, 27 Apr 2023 03:18:00 -0400
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 25Mi
Requests:
cpu: 100m
memory: 25Mi
Environment:
POD_NAME: prometheus-k8s-0 (v1:metadata.name)
Mounts:
/etc/prometheus/config from config (rw)
/etc/prometheus/config_out from config-out (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zhmwj (ro)
rules-configmap-reloader:
Container ID: containerd://0f378a4b49b78d9aad618dba27139e4132d379ebe5aa42e6ca788b5aa8d96706
Image: carlosedp/configmap-reload:latest
Image ID: docker.io/carlosedp/configmap-reload@sha256:cd9f05743ab6024e445ea6e0da4416122eae5e1d0149dd33232be0601096c8d4
Port: <none>
Host Port: <none>
Args:
--webhook-url=http://localhost:9090/-/reload
--volume-dir=/etc/prometheus/rules/prometheus-k8s-rulefiles-0
State: Running
Started: Thu, 27 Apr 2023 03:18:01 -0400
Ready: True
Restart Count: 0
Limits:
cpu: 100m
memory: 25Mi
Requests:
cpu: 100m
memory: 25Mi
Environment: <none>
Mounts:
/etc/prometheus/rules/prometheus-k8s-rulefiles-0 from prometheus-k8s-rulefiles-0 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zhmwj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s
Optional: false
tls-assets:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-k8s-tls-assets
Optional: false
config-out:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
prometheus-k8s-rulefiles-0:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-k8s-rulefiles-0
Optional: false
prometheus-k8s-db:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
kube-api-access-zhmwj:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 32m (x7497 over 9d) kubelet Readiness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 118s (x22276 over 4d8h) kubelet Back-off restarting failed container prometheus in pod prometheus-k8s-0_monitoring(2994d714-16dc-4b85-92b4-2232b1a9d8c6)
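Note that the Message field above is truncated by kubectl. The full panic from the crashed container instance can be pulled with a standard kubectl call (not part of the original output):

kubectl -n monitoring logs prometheus-k8s-0 -c prometheus --previous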
It's possible that, with the retention policy set to 15 days, Prometheus overloaded some internal memory buffer. When I checked the disk space on the nodes, they looked pretty clean.
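For reference, a sketch of how the on-node usage could be checked (hypothetical paths: prometheus-k8s-db is an EmptyDir, so it lives under the kubelet pod directory, by default /var/lib/kubelet on k3s; the pod UID is taken from the BackOff event above):

# Overall free space on the node, then the size of the Prometheus EmptyDir
df -h /var/lib
du -sh /var/lib/kubelet/pods/2994d714-16dc-4b85-92b4-2232b1a9d8c6/volumes/kubernetes.io~empty-dir/prometheus-k8s-db

The stack trace points at ChunkDiskMapper.WriteChunk inside Head.loadWAL, i.e. replay of the on-disk WAL and memory-mapped head chunks, so one possible workaround (a sketch, not a confirmed fix) is to discard that replay state. Since the TSDB sits on an EmptyDir, simply deleting the pod drops the corrupted WAL along with all stored metrics, and the StatefulSet recreates the pod fresh:

kubectl -n monitoring delete pod prometheus-k8s-0

If the data must be kept, only the replay state could be removed instead; /prometheus is the --storage.tsdb.path from the container args, and this assumes a shell can still be obtained in the crash-looping container, which may not be possible:

kubectl -n monitoring exec prometheus-k8s-0 -c prometheus -- rm -rf /prometheus/wal /prometheus/chunks_head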