sysdump: run bugtool even when cilium agent is not running #970

Merged
tklauser merged 2 commits into cilium:master on Jul 14, 2022

Conversation

squeed
Contributor

@squeed squeed commented Jul 13, 2022

When we try to run cilium-bugtool in the agent container, it fails if that container is not running (e.g. because it's in CrashLoopBackOff). So, if the agent container is not running, fall back to an EphemeralContainer or, if that fails, create another pod for exec purposes.

Fixes: #871
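
The fallback order described above (agent container, then EphemeralContainer, then a separate pod) can be sketched roughly as follows. This is only an illustration of the control flow, not the code from this PR: `execTarget`, `strategy`, `pickExecTarget`, `succeeding`, and `failing` are hypothetical names, and the strategies are stubs that do not talk to a cluster.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// execTarget is a hypothetical description of where cilium-bugtool would run.
type execTarget struct{ Kind string }

// strategy is one way of obtaining a place to exec into (hypothetical type).
type strategy struct {
	name    string
	prepare func() (execTarget, error)
}

// succeeding returns a stub strategy function that always succeeds.
func succeeding(kind string) func() (execTarget, error) {
	return func() (execTarget, error) { return execTarget{Kind: kind}, nil }
}

// failing returns a stub strategy function that always fails with reason.
func failing(reason string) func() (execTarget, error) {
	return func() (execTarget, error) { return execTarget{}, errors.New(reason) }
}

// pickExecTarget tries each strategy in order and returns the first target
// that could be prepared. If none succeeds, every failure is reported.
func pickExecTarget(strategies []strategy) (execTarget, error) {
	var failures []string
	for _, s := range strategies {
		target, err := s.prepare()
		if err == nil {
			return target, nil
		}
		failures = append(failures, s.name+": "+err.Error())
	}
	if len(failures) == 0 {
		return execTarget{}, errors.New("no strategies configured")
	}
	return execTarget{}, errors.New("all strategies failed: " + strings.Join(failures, "; "))
}

func main() {
	// Simulated cluster state: the agent is crash-looping and ephemeral
	// containers are unavailable, so only the helper pod works.
	strategies := []strategy{
		{name: "agent-container", prepare: failing("container is in CrashLoopBackOff")},
		{name: "ephemeral-container", prepare: failing("feature not enabled")},
		{name: "helper-pod", prepare: succeeding("helper-pod")},
	}
	target, err := pickExecTarget(strategies)
	if err != nil {
		fmt.Println("cannot run cilium-bugtool:", err)
		return
	}
	fmt.Println("running cilium-bugtool via", target.Kind) // prints "running cilium-bugtool via helper-pod"
}
```

Trying the cheapest option first keeps the common case (a healthy agent) unchanged, and collecting every failure makes it easier to see why a sysdump ended up without a bugtool report.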

squeed added 2 commits July 13, 2022 13:00
Now that bugtool supports directly generating tar.gz files, do that and
remove the explicit extra step.

Signed-off-by: Casey Callendrello <cdc@isovalent.com>
When we try and run cilium-bugtool in the agent container, it fails if
that container is not running (e.g. because it's in CrashLoopBackoff).
So, if that's not running, fall back to an EphemeralContainer or, if
that fails, create another pod for exec purposes.

Fixes: cilium#871
Signed-off-by: Casey Callendrello <cdc@isovalent.com>
@squeed squeed requested a review from a team July 13, 2022 15:52
@squeed squeed requested a review from a team as a code owner July 13, 2022 15:52
@squeed squeed requested a review from nathanjsweet July 13, 2022 15:52
@squeed squeed temporarily deployed to ci July 13, 2022 15:52 Inactive
@tklauser tklauser requested review from tklauser and sayboras and removed request for nathanjsweet July 13, 2022 15:59
Member

@tklauser tklauser left a comment

Looks great, thank you!

Member

@sayboras sayboras left a comment

LGTM 💯


// CreateEphemeralContainer creates an EphemeralContainer (debug container) in the specified pod.
// EphemeralContainers are special containers that can be added to running pods after the fact. They're
// useful for debugging, either when the target container image doesn't have necessary tools, or because
Member

> when the target container image doesn't have necessary tools

This could be a good use for us going forward: a few customers are asking whether the Cilium agent can use a scratch/distroless/minimal base image. Right now, we rely on installing new tools (e.g. tcpdump) in some cases during troubleshooting.

Contributor Author

Indeed, it can be a big space saving, but it will be a long time before we can rely on that feature existing. Right now it's in beta: enabled by default, but behind a feature gate that clusters can turn off. It is scheduled to go GA in Kubernetes 1.25.
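
For readers unfamiliar with the feature discussed in this thread, here is a rough sketch of what adding a debug container looks like at the API level: a patch body listing `spec.ephemeralContainers`, of the kind sent to a pod's `ephemeralcontainers` subresource. The structs below are local, hypothetical mirrors of a small subset of the Kubernetes fields (`name`, `image`, `command`, `targetContainerName`), the image reference and container names are placeholders, and `buildDebugPatch` is an illustrative helper rather than anything from this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ephemeralContainer mirrors a small subset of the Kubernetes
// EphemeralContainer fields (local, simplified type for illustration).
type ephemeralContainer struct {
	Name                string   `json:"name"`
	Image               string   `json:"image"`
	Command             []string `json:"command,omitempty"`
	TargetContainerName string   `json:"targetContainerName,omitempty"`
}

type podSpecPatch struct {
	EphemeralContainers []ephemeralContainer `json:"ephemeralContainers"`
}

type podPatch struct {
	Spec podSpecPatch `json:"spec"`
}

// buildDebugPatch returns a JSON patch body that adds one debug container.
func buildDebugPatch(name, image, target string, command []string) ([]byte, error) {
	return json.Marshal(podPatch{
		Spec: podSpecPatch{
			EphemeralContainers: []ephemeralContainer{{
				Name:                name,
				Image:               image,
				Command:             command,
				TargetContainerName: target,
			}},
		},
	})
}

func main() {
	// Placeholder values: the image and container names are assumptions.
	body, err := buildDebugPatch("bugtool", "example.invalid/debug-tools:latest",
		"cilium-agent", []string{"sleep", "3600"})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))
}
```

Because the debug container carries its own image, tools such as tcpdump would not need to be present in (or installed into) the agent image, which is the space saving mentioned above.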

@tklauser tklauser merged commit 35987b0 into cilium:master Jul 14, 2022
Member

@pchaigno pchaigno left a comment

This is great! Thanks @squeed!

One small concern below on the first commit.

sysdump/sysdump.go
Development

Successfully merging this pull request may close these issues.

sysdump: Collect bugtool report if agent is crashing
4 participants