This repository has been archived by the owner on Oct 23, 2024. It is now read-only.

Fluentd not able to tail logs #72

Closed
bjornm82 opened this issue Mar 16, 2018 · 9 comments

@bjornm82

Create new cluster:
Mesosphere DC/OS Version 1.11.0

Apply Kubernetes cloud provider support (don't think there is relevance though):
https://aws.amazon.com/blogs/opensource/cloud-provider-support-kubernetes-dcos/
Kubernetes version v1.9.4

Starting Fluentd daemonset:
https://github.com/fluent/fluentd-kubernetes-daemonset/blob/master/fluentd-daemonset-elasticsearch.yaml

Logs fluentd pod output:

2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [warn]: #0 'type' is deprecated parameter name. use '@type' instead.
2018-03-16 19:26:34 +0000 [info]: adding source type="tail"
2018-03-16 19:26:34 +0000 [info]: #0 starting fluentd worker pid=15 ppid=1 worker=0
2018-03-16 19:26:34 +0000 [warn]: #0 /var/log/containers/kube-dns-754f9cd4f5-pnfnk_kube-system_dnsmasq-e75753d29a5e4879f712525b268575a6c9fbaf55a17d677037980d0e2bd75495.log unreadable. It is excluded and would be examined next time.
2018-03-16 19:26:34 +0000 [warn]: #0 /var/log/containers/metrics-server-54974fd587-cdjc7_kube-system_metrics-server-270577277bc1213532129c467f07c6bf6413c7f25b7d219717e9422cc437117d.log unreadable. It is excluded and would be examined next time.
2018-03-16 19:26:34 +0000 [warn]: #0 /var/log/containers/fluentd-cxkfr_kube-system_fluentd-2b138775674b9b090d1f2492bec0732346a89bf8b2a3758c6fd4d8d9abcb19a9.log unreadable. It is excluded and would be examined next time.
2018-03-16 19:26:34 +0000 [warn]: #0 /var/log/containers/kubernetes-dashboard-5cfddd7d5b-cj8c7_kube-system_kubernetes-dashboard-171e2225ecb06d685a38cd61fe04f380cabf91c4bea77c4739117155a3dbe674.log unreadable. It is excluded and would be examined next time.
2018-03-16 19:26:34 +0000 [warn]: #0 /var/log/containers/kube-dns-754f9cd4f5-pnfnk_kube-system_kubedns-eb34aa08266b0e7b317fb7e21f1857ac727ee98354559d291ec425d08601d3ad.log unreadable. It is excluded and would be examined next time.
2018-03-16 19:26:34 +0000 [warn]: #0 /var/log/containers/kube-dns-754f9cd4f5-pnfnk_kube-system_sidecar-c6c1709f49fe1c16630531ed199a027a1b48fab093ead6f339015d6936b6aec2.log unreadable. It is excluded and would be examined next time.
2018-03-16 19:26:34 +0000 [info]: #0 fluentd worker is now running worker=0

As earlier today, the logs end with Fluentd being unable to follow a symlink. This looks like the same issue as kubernetes/kubernetes#39225, although that thread is rather old.

@bjornm82
Author

Some additional information since yesterday:

When sshing into the master node:

Last login: Sat Mar 17 07:08:21 UTC 2018 from 10.0.6.72 on pts/0
Container Linux by CoreOS stable (1235.12.0)
Update Strategy: No Reboots
Failed Units: 2
  format-var-lib-ephemeral.service
  update-engine.service

$ systemctl status format-var-lib-ephemeral.service


● format-var-lib-ephemeral.service - AWS Setup: Formats the /var/lib ephemeral drive
   Loaded: loaded (/etc/systemd/system/format-var-lib-ephemeral.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2018-03-17 07:45:10 UTC; 16s ago
  Process: 28019 ExecStart=/bin/bash -c (blkid -t TYPE=ext4 | grep xvdb) || (/usr/sbin/mkfs.ext4 -F /dev/xvdb) (code=exited, status=1/F
 Main PID: 28019 (code=exited, status=1/FAILURE)

Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal systemd[1]: Starting AWS Setup: Formats the /var/lib ephemeral drive...
Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal bash[28019]: mke2fs 1.42.13 (17-May-2015)
Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal bash[28019]: The file /dev/xvdb does not exist and no size was specified.
Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal systemd[1]: format-var-lib-ephemeral.service: Main process exited, code=exited
Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal systemd[1]: Failed to start AWS Setup: Formats the /var/lib ephemeral drive.
Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal systemd[1]: format-var-lib-ephemeral.service: Unit entered failed state.
Mar 17 07:45:10 ip-10-0-0-206.us-west-2.compute.internal systemd[1]: format-var-lib-ephemeral.service: Failed with result 'exit-code'.

Searching for all .log files

find / -name '*.log' -ls

This returns a list of places where the logs should be located, but the location below contains only dangling symlinks (their targets do not exist):
cd /var/lib/mesos/slave/volumes/roles/kubernetes-role/82e9efc0-aeb3-457a-8174-57490b4615c1/new/log/containers/

ls -alh

[Screenshot of the ls -alh output, 2018-03-17 8:54:40 am]

Does the above mean that /var/lib should be created by AWS at the direction of Kubernetes? If so, the problem might lie in the setup and in the Kubernetes permissions.
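
To narrow down why individual files are reported as unreadable, a small script can classify each entry in the log directory. This is only a minimal sketch: the function names `classify_log_path` and `scan_log_dir` are made up for illustration, and it assumes that the Fluentd warning corresponds to a failed readability check on the path, which is not confirmed anywhere in this thread.

```python
import os


def classify_log_path(path):
    """Return a short reason why a log path may not be tailable."""
    if os.path.islink(path) and not os.path.exists(path):
        # The link itself exists, but the file it points to does not.
        return "dangling symlink"
    if not os.path.lexists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory"
    if not os.access(path, os.R_OK):
        return "not readable"
    return "ok"


def scan_log_dir(log_dir):
    """Map each *.log entry in log_dir to its classification."""
    results = {}
    for name in sorted(os.listdir(log_dir)):
        if name.endswith(".log"):
            results[name] = classify_log_path(os.path.join(log_dir, name))
    return results
```

Running `scan_log_dir("/var/log/containers")` from inside the fluentd container, and again on the host, should show whether the links are broken everywhere or only from the pod's point of view.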

@pires
Contributor

pires commented Mar 24, 2018

This is a limitation on our end and, at the moment, there's no quick, safe resolution we can think of. We will come back to this in the future.

@pires pires added the bug label Apr 5, 2018
@pires
Contributor

pires commented Jun 7, 2018

@bjornm82 this is being worked on as we speak. We expect it to be released with DC/OS 1.12.

@bjornm82
Author

bjornm82 commented Jun 7, 2018

Thanks @pires !

@blublinsky

There are several problems here. The most fundamental one is how to configure fluentd to access the Kubernetes logs.
On straight kubernetes the configuration looks as follows:

volumes:
- name: varlog
  hostPath:
    path: /var/log
- name: varlibdockercontainers
  hostPath:
    path: /var/lib/docker/containers

This assumes a fixed location for everything on the host. In the case of DC/OS, the location of the Kubernetes files is selected by Marathon at run time. Unless this location can be symlinked to something stable, the fluentd approach will not work.
Is there any planned resolution for this?
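
To see which host directories a pod would actually need mounted, one can walk the symlink chain of every entry in the log directory and collect the directories it passes through. The sketch below is illustrative only: `resolve_link_chain` and `required_host_dirs` are hypothetical helper names, and the script assumes it runs on the host, where the link targets are visible. A link that is dangling from inside the pod is still followed one hop, so its intended target directory is reported.

```python
import os


def resolve_link_chain(path, max_hops=10):
    """Follow symlinks hop by hop and return every path visited.

    Unlike os.path.realpath, the intermediate hops are kept, and the
    final hop is recorded even when its target does not exist.
    """
    chain = [path]
    current = path
    for _ in range(max_hops):
        if not os.path.islink(current):
            break
        target = os.readlink(current)
        # Relative targets are resolved against the link's own directory.
        current = os.path.normpath(
            os.path.join(os.path.dirname(current), target)
        )
        chain.append(current)
    return chain


def required_host_dirs(log_dir):
    """Return the sorted set of directories touched by the link chains."""
    dirs = set()
    for name in os.listdir(log_dir):
        for hop in resolve_link_chain(os.path.join(log_dir, name)):
            dirs.add(os.path.dirname(hop))
    return sorted(dirs)
```

If the result for the DC/OS log directory includes a run-specific path under /var/lib/mesos/slave/volumes, that is the location a fixed hostPath cannot express.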

@pires
Contributor

pires commented Jul 24, 2018

@blublinsky

What does straight kubernetes mean?

Anyway, regarding a planned resolution, yes, as mentioned above it's being worked on and we expect to release it as part of DC/OS 1.12.

@blublinsky

I meant native Kubernetes deployed on bare metal.

@pires
Contributor

pires commented Aug 29, 2018

As mentioned before, this will be released as part of our DC/OS 1.12 release and Kubernetes package 2.x.

@hectorj2f

This issue should be fixed in 2.0.0-1.12.1: https://docs.mesosphere.com/services/kubernetes/2.0.0-1.12.1/
