Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add debug by copy support for kubectl alpha debug command #90094

Merged
merged 1 commit into from
Jul 9, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions staging/src/k8s.io/kubectl/pkg/cmd/debug/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ go_library(
"//vendor/github.com/docker/distribution/reference:go_default_library",
"//vendor/github.com/spf13/cobra:go_default_library",
"//vendor/k8s.io/klog/v2:go_default_library",
"//vendor/k8s.io/utils/pointer:go_default_library",
],
)

Expand All @@ -55,7 +56,9 @@ go_test(
embed = [":go_default_library"],
deps = [
"//staging/src/k8s.io/api/core/v1:go_default_library",
"//staging/src/k8s.io/apimachinery/pkg/apis/meta/v1:go_default_library",
"//staging/src/k8s.io/cli-runtime/pkg/genericclioptions:go_default_library",
"//vendor/github.com/google/go-cmp/cmp:go_default_library",
"//vendor/k8s.io/utils/pointer:go_default_library",
],
)
230 changes: 169 additions & 61 deletions staging/src/k8s.io/kubectl/pkg/cmd/debug/debug.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import (

"github.com/docker/distribution/reference"
"github.com/spf13/cobra"
"k8s.io/klog/v2"

corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/api/errors"
Expand All @@ -37,7 +38,6 @@ import (
corev1client "k8s.io/client-go/kubernetes/typed/core/v1"
"k8s.io/client-go/tools/cache"
watchtools "k8s.io/client-go/tools/watch"
"k8s.io/klog/v2"
"k8s.io/kubectl/pkg/cmd/attach"
"k8s.io/kubectl/pkg/cmd/exec"
"k8s.io/kubectl/pkg/cmd/logs"
Expand All @@ -47,6 +47,7 @@ import (
"k8s.io/kubectl/pkg/util/i18n"
"k8s.io/kubectl/pkg/util/interrupt"
"k8s.io/kubectl/pkg/util/templates"
"k8s.io/utils/pointer"
)

var (
Expand All @@ -59,26 +60,39 @@ var (

# Create a debug container named debugger using a custom automated debugging image.
# (requires the EphemeralContainers feature to be enabled in the cluster)
kubectl alpha debug --image=myproj/debug-tools -c debugger mypod`))
kubectl alpha debug --image=myproj/debug-tools -c debugger mypod

# Create a debug container as a copy of the original Pod and attach to it
kubectl alpha debug mypod -it --image=busybox --copy-to=my-debugger

# Create a copy of mypod named my-debugger with my-container's image changed to busybox
kubectl alpha debug mypod --image=busybox --container=my-container --copy-to=my-debugger -- sleep 1d
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I'm looking at this I found it weird and hard to decipher. Have a look at how we do something similar in kubectl set image:

# Update image of all containers of daemonset abc to 'nginx:1.9.1'
kubectl set image daemonset abc *=nginx:1.9.1

How about supporting something similar here as well. Iow. when I pass --image=busybox it applies to all, similarly --image=*=busybox. But --image=my-container=busybox would apply that image only for that. @aylei @verb wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, current syntax works better for debug. A common use case of debug is to add a "debug" container, either by making a copy or adding an ephemeral container, to the pod so that user can diagnose their business containers with the tools installed in the debug container. So usually debug does't have to and also shouldn't mutate the image of other business containers. And --image, in semantic, is always the container image to use for the debug container.

--container is only used to specify the name of the debug container before debug by copy is added. When debug by copy, debug will change the image of an existing container if --container is set to an existing on deliberately, I think it is a not a common use case of debug and is only implemented for completeness.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two debugging scenarios I hoped to address with debug by copy:

  1. The same as addressed in debug by ephemeral container, but ephemeral containers are unavailable or the operator prefers to operate on a copy.
  2. A primary container in the pod is crashlooping and I want to replace the container image with (for example) busybox so I can examine the environment with a shell.

I suspect it would be uncommon to change multiple or all container images at once. Rather than addressing that directly maybe it'd be better to have an --edit that opens an editor prior to create, similar to kubectl edit, which would be a useful catch-all for arbitrary changes.

The mutate specification of kubectl set image seems more generically useful, but I'm inclined to keep kubectl debug focused on specific debugging journeys. It would be easy to add --image=*=busybox functionality in the future while remaining backwards compatible with current "only modify a single container" behavior, so I'd omit it for now.

It is a bit hard to decipher, though, and I think it's because it's unclear which flags operate on pod names and which operate on containers. Should we rename --copy-to to --copy-pod-to?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO, --copy-to is simpler but still accurate enough, since the object of debug by copy is exactly a Pod.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The form I'm suggesting is addressing that particular use-case you're talking about @verb. If you have a pod with multiple containers how do you specify which one you replace with the above syntax? With the one I'm proposing it's explicit. I'd prefer not to have --edit option, that seems like an overkill. @aylei I agree a Pod's copy, but you are mutating the copy with your own image, and you need a way to say which image and where. As long as you have just one container, you're fine. With multiple it becomes a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, SGTM. I don't feel strongly, and being able to mutate multiple images is undoubtedly useful. It's nice to use the same mutation language as other parts of kubectl.

Should we allow --container && --image mutation syntax at the same time? That is, can we add a new container and change the image of the existing ones with a command line:

kubectl alpha debug mypod --copy-to=my-debugger --container=new-container --image=new-container=busybox,old-container=debug-image

Since debug sometimes adds containers and sometimes changes existing containers, it might be less error prone to require the mutation syntax like --image=*=new-image for changes to existing images rather than have a default mutate action for --image=new-image.

`))
)

var nameSuffixFunc = utilrand.String

// DebugOptions holds the options for an invocation of kubectl debug.
type DebugOptions struct {
Args []string
ArgsOnly bool
Attach bool
Container string
Env []corev1.EnvVar
Image string
Interactive bool
Namespace string
TargetNames []string
PullPolicy corev1.PullPolicy
Quiet bool
Target string
TTY bool
Args []string
ArgsOnly bool
Attach bool
Container string
CopyTo string
Replace bool
Env []corev1.EnvVar
Image string
Interactive bool
Namespace string
TargetNames []string
PullPolicy corev1.PullPolicy
Quiet bool
SameNode bool
ShareProcesses bool
Target string
TTY bool

shareProcessedChanged bool

builder *resource.Builder
podClient corev1client.PodsGetter
Expand All @@ -89,9 +103,10 @@ type DebugOptions struct {
// NewDebugOptions returns a DebugOptions initialized with default values.
func NewDebugOptions(streams genericclioptions.IOStreams) *DebugOptions {
return &DebugOptions{
Args: []string{},
IOStreams: streams,
TargetNames: []string{},
Args: []string{},
IOStreams: streams,
TargetNames: []string{},
ShareProcesses: true,
}
}

Expand Down Expand Up @@ -120,12 +135,16 @@ func addDebugFlags(cmd *cobra.Command, opt *DebugOptions) {
cmd.Flags().BoolVar(&opt.ArgsOnly, "arguments-only", opt.ArgsOnly, i18n.T("If specified, everything after -- will be passed to the new container as Args instead of Command."))
cmd.Flags().BoolVar(&opt.Attach, "attach", opt.Attach, i18n.T("If true, wait for the Pod to start running, and then attach to the Pod as if 'kubectl attach ...' were called. Default false, unless '-i/--stdin' is set, in which case the default is true."))
cmd.Flags().StringVarP(&opt.Container, "container", "c", opt.Container, i18n.T("Container name to use for debug container."))
cmd.Flags().StringVar(&opt.CopyTo, "copy-to", opt.CopyTo, i18n.T("Create a copy of the target Pod with this name."))
cmd.Flags().BoolVar(&opt.Replace, "replace", opt.Replace, i18n.T("When used with '--copy-to', delete the original Pod"))
cmd.Flags().StringToString("env", nil, i18n.T("Environment variables to set in the container."))
cmd.Flags().StringVar(&opt.Image, "image", opt.Image, i18n.T("Container image to use for debug container."))
cmd.MarkFlagRequired("image")
cmd.Flags().String("image-pull-policy", string(corev1.PullIfNotPresent), i18n.T("The image pull policy for the container."))
cmd.Flags().BoolVarP(&opt.Interactive, "stdin", "i", opt.Interactive, i18n.T("Keep stdin open on the container(s) in the pod, even if nothing is attached."))
cmd.Flags().BoolVar(&opt.Quiet, "quiet", opt.Quiet, i18n.T("If true, suppress prompt messages."))
cmd.Flags().BoolVar(&opt.SameNode, "same-node", opt.SameNode, i18n.T("Schedule the copy of target Pod on the same node."))
cmd.Flags().BoolVar(&opt.ShareProcesses, "share-processes", opt.ShareProcesses, i18n.T("When used with '--copy-to', enable process namespace sharing in the copy."))
cmd.Flags().StringVar(&opt.Target, "target", "", i18n.T("Target processes in this container name."))
cmd.Flags().BoolVarP(&opt.TTY, "tty", "t", opt.TTY, i18n.T("Allocated a TTY for each container in the pod."))
}
Expand Down Expand Up @@ -173,6 +192,9 @@ func (o *DebugOptions) Complete(f cmdutil.Factory, cmd *cobra.Command, args []st
}
o.podClient = clientset.CoreV1()

// Share processes
o.shareProcessedChanged = cmd.Flags().Changed("share-processes")

return nil
}

Expand Down Expand Up @@ -273,9 +295,17 @@ func (o *DebugOptions) Run(f cmdutil.Factory, cmd *cobra.Command) error {

// visitPod handles debugging for pod targets by (depending on options):
// 1. Creating an ephemeral debug container in an existing pod, OR
// 2. Making a copy of pod with certain attributes changed (NOT YET IMPLEMENTED)
// 2. Making a copy of pod with certain attributes changed
// visitPod returns a pod and debug container name for subsequent attach, if applicable.
func (o *DebugOptions) visitPod(ctx context.Context, pod *corev1.Pod) (*corev1.Pod, string, error) {
if len(o.CopyTo) > 0 {
return o.debugByCopy(ctx, pod)
}
return o.debugByEphemeralContainer(ctx, pod)
}

// debugByEphemeralContainer runs an EphemeralContainer in the target Pod for use as a debug container
func (o *DebugOptions) debugByEphemeralContainer(ctx context.Context, pod *corev1.Pod) (*corev1.Pod, string, error) {
pods := o.podClient.Pods(pod.Namespace)
ec, err := pods.GetEphemeralContainers(ctx, pod.Name, metav1.GetOptions{})
if err != nil {
Expand All @@ -298,34 +328,26 @@ func (o *DebugOptions) visitPod(ctx context.Context, pod *corev1.Pod) (*corev1.P
return pod, debugContainer.Name, nil
}

func containerNames(pod *corev1.Pod) map[string]bool {
names := map[string]bool{}
for _, c := range pod.Spec.Containers {
names[c.Name] = true
}
for _, c := range pod.Spec.InitContainers {
names[c.Name] = true
// debugByCopy runs a copy of the target Pod with a debug container added or an original container modified
func (o *DebugOptions) debugByCopy(ctx context.Context, pod *corev1.Pod) (*corev1.Pod, string, error) {
copied, dc := o.generatePodCopyWithDebugContainer(pod)
copied, err := o.podClient.Pods(copied.Namespace).Create(ctx, copied, metav1.CreateOptions{})
if err != nil {
return nil, "", err
}
for _, c := range pod.Spec.EphemeralContainers {
names[c.Name] = true
if o.Replace {
err := o.podClient.Pods(pod.Namespace).Delete(ctx, pod.Name, *metav1.NewDeleteOptions(0))
if err != nil {
return nil, "", err
}
}
return names
return copied, dc, nil
}

// generateDebugContainer returns an EphemeralContainer suitable for use as a debug container
// in the given pod.
func (o *DebugOptions) generateDebugContainer(pod *corev1.Pod) *corev1.EphemeralContainer {
name := o.Container
if len(name) == 0 {
cn, existing := "", containerNames(pod)
for len(cn) == 0 || existing[cn] {
cn = fmt.Sprintf("debugger-%s", nameSuffixFunc(5))
}
if !o.Quiet {
fmt.Fprintf(o.ErrOut, "Defaulting debug container name to %s.\n", cn)
}
name = cn
}
name := o.computeDebugContainerName(pod)

ec := &corev1.EphemeralContainer{
EphemeralContainerCommon: corev1.EphemeralContainerCommon{
Expand All @@ -349,8 +371,87 @@ func (o *DebugOptions) generateDebugContainer(pod *corev1.Pod) *corev1.Ephemeral
return ec
}

// waitForEphemeralContainer watches the given pod until the ephemeralContainer is running
func waitForEphemeralContainer(ctx context.Context, podClient corev1client.PodsGetter, ns, podName, ephemeralContainerName string) (*corev1.Pod, error) {
// generatePodCopy takes a Pod and returns a copy and the debug container name of that copy
func (o *DebugOptions) generatePodCopyWithDebugContainer(pod *corev1.Pod) (*corev1.Pod, string) {
copied := &corev1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: o.CopyTo,
Namespace: pod.Namespace,
Annotations: pod.Annotations,
},
Spec: *pod.Spec.DeepCopy(),
}
// change ShareProcessNamespace configuration only when commanded explicitly
if o.shareProcessedChanged {
copied.Spec.ShareProcessNamespace = pointer.BoolPtr(o.ShareProcesses)
}
if !o.SameNode {
copied.Spec.NodeName = ""
}

containerByName := containerNameToRef(copied)
c, containerExists := containerByName[o.Container]
// Add a new container if the specified container does not exist
if !containerExists {
name := o.computeDebugContainerName(copied)
c = &corev1.Container{Name: name}
// envs are customizable when adding new container
c.Env = o.Env
}
c.Image = o.Image
c.ImagePullPolicy = o.PullPolicy
c.Stdin = o.Interactive
c.TerminationMessagePolicy = corev1.TerminationMessageReadFile
c.TTY = o.TTY
if o.ArgsOnly {
c.Args = o.Args
} else {
c.Command = o.Args
c.Args = nil
}
if !containerExists {
copied.Spec.Containers = append(copied.Spec.Containers, *c)
}
return copied, c.Name
}

func (o *DebugOptions) computeDebugContainerName(pod *corev1.Pod) string {
if len(o.Container) > 0 {
return o.Container
}
name := o.Container
if len(name) == 0 {
aylei marked this conversation as resolved.
Show resolved Hide resolved
cn, containerByName := "", containerNameToRef(pod)
for len(cn) == 0 || (containerByName[cn] != nil) {
cn = fmt.Sprintf("debugger-%s", nameSuffixFunc(5))
}
if !o.Quiet {
fmt.Fprintf(o.ErrOut, "Defaulting debug container name to %s.\n", cn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't catch it last time, but seems weird that it's on ErrOut.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the correct behavior here is for output from kubectl and the container to use separate channels. For example, kubectl debug mypod --image=busybox -- ifconfig -a > interfaces.txt shouldn't write "Defaulting debug container..." to interfaces.txt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's where you could use -v=1 instead, or --quiet rather than printing on stderr something that is not an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth printing the message by default to make it easier to reconnect (and in the case of generating a pod name for node debugging, there's no other way to tell what the name is), but it's a good point that you can accomplish the same thing with --quiet.

The Unix user in me still thinks that stderr is correct here, but it's more important that kubectl is consistent. I can update this in my pending PR.

}
name = cn
}
return name
}

func containerNameToRef(pod *corev1.Pod) map[string]*corev1.Container {
names := map[string]*corev1.Container{}
for i := range pod.Spec.Containers {
ref := &pod.Spec.Containers[i]
names[ref.Name] = ref
}
for i := range pod.Spec.InitContainers {
ref := &pod.Spec.Containers[i]
names[ref.Name] = ref
}
for i := range pod.Spec.EphemeralContainers {
ref := &pod.Spec.Containers[i]
names[ref.Name] = ref
}
return names
}

// waitForContainer watches the given pod until the container is running
func waitForContainer(ctx context.Context, podClient corev1client.PodsGetter, ns, podName, containerName string) (*corev1.Pod, error) {
// TODO: expose the timeout
ctx, cancel := watchtools.ContextWithOptionalTimeout(ctx, 0*time.Second)
defer cancel()
Expand Down Expand Up @@ -381,17 +482,14 @@ func waitForEphemeralContainer(ctx context.Context, podClient corev1client.PodsG
return false, fmt.Errorf("watch did not return a pod: %v", ev.Object)
}

for _, s := range p.Status.EphemeralContainerStatuses {
if s.Name != ephemeralContainerName {
continue
}

klog.V(2).Infof("debug container status is %v", s)
if s.State.Running != nil || s.State.Terminated != nil {
return true, nil
}
s := getContainerStatusByName(p, containerName)
if s == nil {
return false, nil
}
klog.V(2).Infof("debug container status is %v", s)
if s.State.Running != nil || s.State.Terminated != nil {
return true, nil
}

return false, nil
})
if ev != nil {
Expand All @@ -403,26 +501,24 @@ func waitForEphemeralContainer(ctx context.Context, podClient corev1client.PodsG
return result, err
}

// TODO(verb): handle other types of containers
func handleAttachPod(ctx context.Context, f cmdutil.Factory, podClient corev1client.PodsGetter, ns, podName, ephemeralContainerName string, opts *attach.AttachOptions) error {
pod, err := waitForEphemeralContainer(ctx, podClient, ns, podName, ephemeralContainerName)
func handleAttachPod(ctx context.Context, f cmdutil.Factory, podClient corev1client.PodsGetter, ns, podName, containerName string, opts *attach.AttachOptions) error {
pod, err := waitForContainer(ctx, podClient, ns, podName, containerName)
if err != nil {
return err
}

opts.Namespace = ns
opts.Pod = pod
opts.PodName = podName
opts.ContainerName = ephemeralContainerName
opts.ContainerName = containerName
if opts.AttachFunc == nil {
opts.AttachFunc = attach.DefaultAttachFunc
}

var status *corev1.ContainerStatus
for i := range pod.Status.EphemeralContainerStatuses {
if pod.Status.EphemeralContainerStatuses[i].Name == ephemeralContainerName {
status = &pod.Status.EphemeralContainerStatuses[i]
}
status := getContainerStatusByName(pod, containerName)
if status == nil {
// impossible path
return fmt.Errorf("Error get container status of %s: %+v", containerName, err)
}
if status.State.Terminated != nil {
klog.V(1).Info("Ephemeral container terminated, falling back to logs")
Expand All @@ -436,6 +532,18 @@ func handleAttachPod(ctx context.Context, f cmdutil.Factory, podClient corev1cli
return nil
}

func getContainerStatusByName(pod *corev1.Pod, containerName string) *corev1.ContainerStatus {
allContainerStatus := [][]corev1.ContainerStatus{pod.Status.InitContainerStatuses, pod.Status.ContainerStatuses, pod.Status.EphemeralContainerStatuses}
for _, statusSlice := range allContainerStatus {
for i := range statusSlice {
if statusSlice[i].Name == containerName {
return &statusSlice[i]
}
}
}
return nil
}

// logOpts logs output from opts to the pods log.
func logOpts(restClientGetter genericclioptions.RESTClientGetter, pod *corev1.Pod, opts *attach.AttachOptions) error {
ctrName, err := opts.GetContainerName(pod)
Expand Down
Loading