Promtail pod CrashLoopBackoff - error creating promtail #1457
Comments
This is interesting. It appears to be a yaml issue, but it's unusual that you're only seeing this on one pod and that the problem persisted across restarts. If your configuration is not sensitive, would you mind posting it?
client:
  backoff_config:
    maxbackoff: 5s
    maxretries: 20
    minbackoff: 100ms
  batchsize: 102400
  batchwait: 1s
  external_labels: {}
  timeout: 10s
positions:
  filename: /run/promtail/positions.yaml
server:
  http_listen_port: 3101
target_config:
  sync_period: 10s
scrape_configs:
- job_name: kubernetes-pods-name
  pipeline_stages:
  - docker: {}
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels:
    - __meta_kubernetes_pod_label_name
    target_label: __service__
  - source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: __host__
  - action: drop
    regex: ''
    source_labels:
    - __service__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    replacement: $1
    separator: /
    source_labels:
    - __meta_kubernetes_namespace
    - __service__
    target_label: job
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: instance
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container_name
  - replacement: /var/log/pods/*$1/*.log
    separator: /
    source_labels:
    - __meta_kubernetes_pod_uid
    - __meta_kubernetes_pod_container_name
    target_label: __path__
- job_name: kubernetes-pods-app
  pipeline_stages:
  - docker: {}
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: drop
    regex: .+
    source_labels:
    - __meta_kubernetes_pod_label_name
  - source_labels:
    - __meta_kubernetes_pod_label_app
    target_label: __service__
  - source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: __host__
  - action: drop
    regex: ''
    source_labels:
    - __service__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    replacement: $1
    separator: /
    source_labels:
    - __meta_kubernetes_namespace
    - __service__
    target_label: job
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: instance
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container_name
  - replacement: /var/log/pods/*$1/*.log
    separator: /
    source_labels:
    - __meta_kubernetes_pod_uid
    - __meta_kubernetes_pod_container_name
    target_label: __path__
- job_name: kubernetes-pods-direct-controllers
  pipeline_stages:
  - docker: {}
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: drop
    regex: .+
    separator: ''
    source_labels:
    - __meta_kubernetes_pod_label_name
    - __meta_kubernetes_pod_label_app
  - action: drop
    regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
    source_labels:
    - __meta_kubernetes_pod_controller_name
  - source_labels:
    - __meta_kubernetes_pod_controller_name
    target_label: __service__
  - source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: __host__
  - action: drop
    regex: ''
    source_labels:
    - __service__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    replacement: $1
    separator: /
    source_labels:
    - __meta_kubernetes_namespace
    - __service__
    target_label: job
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: instance
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container_name
  - replacement: /var/log/pods/*$1/*.log
    separator: /
    source_labels:
    - __meta_kubernetes_pod_uid
    - __meta_kubernetes_pod_container_name
    target_label: __path__
- job_name: kubernetes-pods-indirect-controller
  pipeline_stages:
  - docker: {}
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: drop
    regex: .+
    separator: ''
    source_labels:
    - __meta_kubernetes_pod_label_name
    - __meta_kubernetes_pod_label_app
  - action: keep
    regex: '[0-9a-z-.]+-[0-9a-f]{8,10}'
    source_labels:
    - __meta_kubernetes_pod_controller_name
  - action: replace
    regex: '([0-9a-z-.]+)-[0-9a-f]{8,10}'
    source_labels:
    - __meta_kubernetes_pod_controller_name
    target_label: __service__
  - source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: __host__
  - action: drop
    regex: ''
    source_labels:
    - __service__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    replacement: $1
    separator: /
    source_labels:
    - __meta_kubernetes_namespace
    - __service__
    target_label: job
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: instance
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container_name
  - replacement: /var/log/pods/*$1/*.log
    separator: /
    source_labels:
    - __meta_kubernetes_pod_uid
    - __meta_kubernetes_pod_container_name
    target_label: __path__
- job_name: kubernetes-pods-static
  pipeline_stages:
  - docker: {}
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: drop
    regex: ''
    source_labels:
    - __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_label_component
    target_label: __service__
  - source_labels:
    - __meta_kubernetes_pod_node_name
    target_label: __host__
  - action: drop
    regex: ''
    source_labels:
    - __service__
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    replacement: $1
    separator: /
    source_labels:
    - __meta_kubernetes_namespace
    - __service__
    target_label: job
  - action: replace
    source_labels:
    - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_name
    target_label: instance
  - action: replace
    source_labels:
    - __meta_kubernetes_pod_container_name
    target_label: container_name
  - replacement: /var/log/pods/*$1/*.log
    separator: /
    source_labels:
    - __meta_kubernetes_pod_annotation_kubernetes_io_config_mirror
    - __meta_kubernetes_pod_container_name
    target_label: __path__
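For anyone checking a config like this themselves, one quick way to rule out a plain YAML syntax problem is to run the file through a parser before handing it to promtail. A minimal sketch, assuming the config above has been saved locally as promtail-config.yaml (the filename is illustrative):

# Sanity-check that the promtail config parses as YAML and contains the
# top-level sections used above. The filename is an assumption.
import sys
import yaml  # PyYAML

try:
    with open("promtail-config.yaml") as f:
        cfg = yaml.safe_load(f)
except yaml.YAMLError as err:
    sys.exit(f"YAML parse error: {err}")

for section in ("client", "positions", "server", "scrape_configs"):
    if section not in cfg:
        print(f"warning: missing top-level section: {section}")

print(f"parsed OK: {len(cfg.get('scrape_configs', []))} scrape_configs defined")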
Ok, that appears to be valid yaml, and after taking a look at the error message, it looks like it arises after parsing the config. My next guess would be to look at the positions file, which, according to your configuration, can be found at /run/promtail/positions.yaml. This is likely mounted via a hostPath, which would help explain why it keeps failing even after pod deletion -- the restarted pod mounts the same host path (and the same positions file). Can you take a look at that file? It should look like:

positions:
  fileA: "1234"
  fileB: "5678"
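Since the positions file sits on a hostPath, it can be read from the node itself or from inside the promtail pod. A minimal sketch of one way to inspect it, assuming the path from the config above:

# Inspect the promtail positions file: print the last line (where truncation
# would show up) and check whether the whole file still parses as YAML.
# Path taken from the config above; run on the node or inside the pod.
import yaml

POSITIONS_PATH = "/run/promtail/positions.yaml"

with open(POSITIONS_PATH) as f:
    raw = f.read()

lines = raw.splitlines()
print("last line:", lines[-1] if lines else "<empty file>")

try:
    data = yaml.safe_load(raw) or {}
    print(f"parsed OK: {len(data.get('positions', {}))} tracked files")
except yaml.YAMLError as err:
    print(f"positions file is not valid YAML: {err}")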
Looks like that's the problem! The last line of the positions file is truncated.
Ha, good to know we found it! Honestly, I'd delete the last line that was truncated and let it restart from there. You may see some out-of-order errors for a bit from tailing pods on that node which you've tailed before, but that can't really be helped. If you're still seeing problems with promtail not catching up due to the out-of-order errors, I'd suggest deleting the pods on that node and letting them be rescheduled so that promtail can start tailing them anew. Since they'll be new pods, you shouldn't see out-of-order issues (they'll be assigned to new log streams). Hopefully there's nothing delicate like stateful sets affected there. I'm curious what caused this truncation -- it's an interesting failure condition. Feel free to close the issue if you get everything resolved. I'll open an issue for discussion about this.
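If hand-editing the file on the node is awkward, the same "drop the damaged tail" idea can be scripted. A minimal sketch, assuming the positions path from the config and that only trailing lines are damaged (it keeps a backup first):

# Strip damaged trailing lines from the positions file until what remains
# parses as YAML, so promtail can start cleanly. Keeps a .bak copy first.
import shutil
import yaml

POSITIONS_PATH = "/run/promtail/positions.yaml"

shutil.copy(POSITIONS_PATH, POSITIONS_PATH + ".bak")

with open(POSITIONS_PATH) as f:
    lines = f.read().splitlines()

while lines:
    try:
        yaml.safe_load("\n".join(lines))
        break  # remaining content is valid YAML
    except yaml.YAMLError:
        lines.pop()  # drop the damaged last line and try again

with open(POSITIONS_PATH, "w") as f:
    if lines:
        f.write("\n".join(lines) + "\n")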
Thank you for the help!
I'm guessing the fix was introduced in fd25e6d, which was not included in the version we're running.
Describe the bug
A pod in the promtail DaemonSet is in CrashLoopBackOff.
We've had the loki-stack chart deployed. As we've added nodes to our Kubernetes cluster, new pods that run correctly have been added to the DaemonSet. However, one of the oldest pods entered this CrashLoopBackOff state some time after the initial deployment. I've attempted to delete the pod, but the restarted pod produces the same error.
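For completeness, a hedged sketch of one way to see which promtail pods in the DaemonSet are crash-looping and how often they've restarted, using the official Kubernetes Python client. The namespace and the app=promtail label selector are assumptions; adjust them to match your loki-stack release.

# List promtail pods with their node, restart count, and waiting reason
# (e.g. CrashLoopBackOff). Namespace and label selector are assumptions.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() in-cluster
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod("default", label_selector="app=promtail")
for pod in pods.items:
    for cs in pod.status.container_statuses or []:
        reason = cs.state.waiting.reason if cs.state.waiting else "Running"
        print(pod.spec.node_name, pod.metadata.name, cs.restart_count, reason)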
To Reproduce
Deploy the loki-stack Helm chart version 0.18.1 with defaults.
Expected behavior
All promtail pods run properly without crashing.
Environment:
Screenshots, Promtail config, or terminal output