Debugging Kata in an environment like K8s increases the complexity of finding good initial data to identify what is broken. Today the Kata community is working on solutions to improve observability:
- Tracing: from runtime/shim to agent.
- Logging: Kata today provides different log levels that are sent to syslog. Components covered: runtime/shim/hypervisor.
While tracing may be helpful in some cases, especially to identify where the time goes in Kata, logs are usually the first data needed to identify what is broken.
I would like to get a new script, similar to the kata collect script but for a Pod; that is, given a K8s pod, it gathers the information for it.

`./get-pod-logs.sh`

- Find the associated Kata container IDs for that pod
- Get the chronologically ordered logs of the pod (running on the host)
The general idea would be something like:

```
journalctl -t kata -t kata-vmm -t virtiofsd | grep kata-pod-id
```

so that the output is the chronologically ordered logs
- from the initial container
- vmm logs
- vmm tty logs (so agent logs as well)
- virtiofsd logs
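As a rough illustration of what such a script could do (the use of crictl to map a pod name to its sandbox and container IDs is an assumption, as is the exact set of journald identifiers):

```shell
#!/bin/sh
# Hypothetical sketch of get-pod-logs.sh: merge the host-side Kata logs for
# one K8s pod. Assumes crictl is available and that kata, kata-vmm and
# virtiofsd are the journald identifiers in use.
set -eu

# Join one or more IDs into an alternation pattern for grep -E,
# e.g. "abc def" -> "abc|def".
build_filter() {
    printf '%s|' "$@" | sed 's/|$//'
}

if [ $# -ge 1 ]; then
    pod_name=$1
    # Map pod name -> pod sandbox ID -> container IDs through the CRI.
    pod_id=$(crictl pods --name "$pod_name" -q | head -n1)
    container_ids=$(crictl ps -a --pod "$pod_id" -q)
    # journald already interleaves the identifiers in chronological order;
    # we only keep the lines that mention this pod or its containers.
    journalctl -t kata -t kata-vmm -t virtiofsd \
        | grep -E "$(build_filter "$pod_id" $container_ids)"
fi
```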
Solutions:

In all the solutions, the script first tracks from the K8s pod ID to the Kata container IDs, in order to get the Kata-specific information.

1) The script searches for the PIDs of the processes

Once it has the Kata container IDs, it just checks in the journal the output logs for the containers and the assets associated with them:

```
journalctl -t kata -t virtiofsd -t kata-vmm | grep -E "kata\[$shim_pid|virtiofsd\[$virtiofsd_pid|$virtiofsd_child_pid"
```
Pros:

- Not a lot of modifications in the Kata stack

Cons:

- If the container process or some of the components are gone, it may be more difficult to find the PIDs
- In the case of virtiofsd, the main process is not the one that emits the logs; a child of that process does. This is implementation-specific, and there is no guarantee it will not change in the future.
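Solution 1 could be sketched as below; the pgrep patterns assume the component command lines embed the sandbox ID (true for the shim's -id flag, but an assumption for the others), and the "identifier[pid]" journal prefix format is also an assumption:

```shell
#!/bin/sh
# Sketch of solution 1: recover the component PIDs by scanning live process
# command lines for the sandbox ID (fragile once the processes are gone).
set -eu

# Build a grep -E pattern matching journald's "identifier[pid]" prefix for
# any of the given PIDs, e.g. "11 22" -> "\[(11|22)\]".
build_pid_pattern() {
    printf '\\[(%s)\\]' "$(printf '%s|' "$@" | sed 's/|$//')"
}

if [ $# -ge 1 ]; then
    sandbox_id=$1
    # pgrep -f matches the full command line; the patterns are illustrative.
    shim_pid=$(pgrep -f "containerd-shim-kata-v2.*$sandbox_id" | head -n1)
    vmm_pid=$(pgrep -f "cloud-hypervisor.*$sandbox_id" | head -n1)
    # virtiofsd logs come from the forked child, hence tail -n1 (an
    # implementation detail of virtiofsd that may change in the future).
    virtiofsd_pid=$(pgrep -f "virtiofsd.*$sandbox_id" | tail -n1)
    journalctl -t kata -t kata-vmm -t virtiofsd \
        | grep -E "$(build_pid_pattern "$shim_pid" "$vmm_pid" "$virtiofsd_pid")"
fi
```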
2) The Kata runtime provides information about the PIDs of: shim, vmm, virtiofsd (main process and fork)

Pros:

- A cleaner interface to query Kata-specific information
- If some of the processes associated with the container are gone, we still have the PIDs and we can still filter the logs by PID

Cons:

- Filtering virtiofsd logs by PID is still not enough, as the real logs come from a forked process.
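If the runtime exposed those PIDs, the script side could shrink to something like the following; note that the kata-runtime pids subcommand and the JSON layout are purely hypothetical, used only to illustrate the proposed interface:

```shell
#!/bin/sh
# Sketch of solution 2: ask the runtime for all component PIDs instead of
# guessing them from the process table. The subcommand and output format
# shown here do not exist today; they only illustrate the idea.
set -eu

# Extract every number from a JSON-ish blob and join them with "|",
# e.g. '{"shim":12,"vmm":34}' -> "12|34".
pids_to_pattern() {
    printf '%s' "$1" | grep -oE '[0-9]+' | paste -sd'|' -
}

if [ $# -ge 1 ]; then
    sandbox_id=$1
    # Imagined output: {"shim": 1234, "vmm": 1240, "virtiofsd": [1250, 1251]}
    pids_json=$(kata-runtime pids "$sandbox_id")
    # The virtiofsd fork is included, so PID filtering now catches its logs.
    journalctl -t kata -t kata-vmm -t virtiofsd \
        | grep -E "\[($(pids_to_pattern "$pids_json"))\]"
fi
```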
Other changes for external components

For some external components, it would be nice if they added extra metadata provided by Kata.
### virtiofsd
`virtiofsd -o debug_prefix "container-id"`, so that regardless of the PID or the internal implementation of virtiofsd, we can just filter the data with:

```
journalctl -t virtiofsd | grep container-id
```
Or, instead of asking virtiofsd to log to syslog, we can use a systemd-cat redirection and save the PID of systemd-cat (as we do with the VMMs). @dagrh
For other components it may be a nice-to-have, but today log collection for components like cloud-hypervisor is good enough, as their output is sent to syslog using systemd-cat, so the PID to filter by is the systemd-cat PID.
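The systemd-cat approach could even carry the container ID in the identifier itself, making PID bookkeeping unnecessary; the identifier naming scheme and the virtiofsd flags below are illustrative assumptions, not what Kata does today:

```shell
#!/bin/sh
# Sketch: run virtiofsd under systemd-cat with a per-sandbox identifier, so
# journal filtering no longer depends on which (forked) process writes logs.
set -eu

# Per-sandbox journald identifier; the naming scheme is an assumption.
make_identifier() {
    printf 'virtiofsd-%s' "$1"
}

if [ $# -ge 1 ]; then
    sandbox_id=$1
    # systemd-cat forwards the command's stdout/stderr to the journal under
    # the given identifier (the virtiofsd flag here is illustrative only).
    systemd-cat --identifier="$(make_identifier "$sandbox_id")" -- \
        virtiofsd --socket-path="/run/kata/$sandbox_id/vhost-fs.sock" &

    # Later, retrieving the logs needs no PID at all:
    journalctl -t "$(make_identifier "$sandbox_id")"
fi
```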
\cc @egernst, who has some interest in getting better debug/observability.
\cc @jodh-intel, who is working on tracing and wrote the kata collect script.
\cc @cmaf, who is working on tracing.
\cc @fidencio @chavafg @GabyCT, who may have faced some of these issues in integrations in the past.

Please ping anyone else who may be interested in getting a nice sequence of events to debug specific Kata issues using K8s (which is the main way Kata is used today). @bergwolf, in case you have a nice way to track this when debugging complex Kata issues.
There we check whether CRI-O's logLevel is set to debug and, if so, we pass -debug to the containerd-shim-kata-v2 process.

Isn't this weird? In my mind, we'd always pass "-debug" to the containerd-shim-kata-v2 process, and then set the logLevel we need in Kata Containers' configuration file. Does this sound reasonable?

Then we'd need to change Kata Containers' configuration file to actually take such information, and to allow us to distinguish between all the different log levels, as nowadays we have "true|false" and nothing else, which ends up being waaaaay too verbose (or not verbose at all).

Dunno, this is what @marcel-apf started working on some time ago, and we didn't reach an agreement. I guess this is really worth revisiting, to try to get some things moving in order to improve the debuggability of the project.
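To make that last point concrete, the configuration change being discussed could look roughly like this; log_level is a hypothetical option name, since today only the boolean debug switch exists:

```toml
# Hypothetical sketch of a per-component log level in Kata's
# configuration.toml; today only a true|false debug switch is available.
[runtime]
# enable_debug = true          # current all-or-nothing switch
log_level = "info"             # proposed: "error" | "warn" | "info" | "debug"
```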
Sorry if I deviated too much here, @jcvenegas. /o\