
FR: New kubectl command `kubectl debug` #45922

Closed
verb opened this Issue May 17, 2017 · 16 comments

@verb
Contributor

verb commented May 17, 2017

Is this a BUG REPORT or FEATURE REQUEST? (choose one): FEATURE REQUEST

SIG Node is working on new functionality (feature: kubernetes/enhancements#277 , proposal: #35584) to execute a "Debug Container" in the context of a running pod for the purposes of troubleshooting. This issue is to discuss changes needed to kubectl to surface this feature.

We're aiming for an alpha release in 1.7 that includes basic functionality. As written, the proposal in #35584 calls for a new command, kubectl debug, that resembles kubectl exec. An example run is:

% kubectl debug -it --image alpine $POD -c $DEBUG_CONTAINER_NAME -- sh
/ # ps x
PID   USER     TIME   COMMAND
    1 root       0:00 /pause
   13 root       0:00 /app
   26 root       0:00 sh
   32 root       0:00 ps x
/ #

Ideally we could have reasonable defaults so that the minimum command would be something like:

% kubectl debug -it
Defaulting container name to debug.
/ # 

One departure from kubectl exec is that kubectl debug supports reattaching to a running Debug Container with a command like kubectl debug --reattach $POD $CONTAINER_NAME.

Options for the command would resemble:

Execute a Debug Container in a Pod.

Options:
  -c, --container='': Name of container to create. If omitted, the default "debug" will be used.
  -i, --stdin=false: Pass stdin to the container
  -m, --image='': Container Image to use when creating debug container. If omitted, the cluster default is used.
  -t, --tty=false: Stdin is a TTY

Usage:
  kubectl debug POD [-m IMAGE ] [-c CONTAINER] -- COMMAND [args...] [options]

Since this is an alpha feature it might be nice to be able to hide it in kubectl unless alpha features are enabled. It will return an error on clusters with alpha features disabled (the default).

/cc @pwittrock

@pwittrock


Member

pwittrock commented May 17, 2017

How are we going to hide this? I can think of a couple ways:

  • put it under an alpha subcommand, as is done in gcloud
  • only expose it if a --alpha flag is provided

FYI, code freeze for 1.7 is a week from tomorrow, so this will need to be fully reviewed, approved and have all checks passing by then to make it into 1.7.

You probably want to send an MVP for review ASAP, so follow up later with anything that is non-trivial to implement and not part of the MVP.

@verb


Contributor

verb commented May 17, 2017

ack, thanks. Will dig into which is easier and create an MVP ASAP.

@verb


Contributor

verb commented May 20, 2017

Thinking about this a little bit more, it would be nice to integrate the same feature gates used with the rest of kubernetes into kubectl. I was going to add a flag to set feature gates, but this has a couple of downsides:

  1. As a flag the feature gates wouldn't be able to influence behavior of other flags, and kubectl has plenty of complexity prior to flag parsing.
  2. It would be nice if enabling alpha features were more persistent than a flag (persistence isn't a concern for the other, long-running daemons).

I settled on an environment variable as a decent way to do this and opened #46151. Using this, kubectl debug would be hidden and inaccessible without the alpha feature enabled by an environment variable.
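As a rough sketch of that environment-variable gate (the variable name KUBECTL_FEATURE_GATES and the gate name DebugContainers are illustrative assumptions here, not necessarily the names that were merged):

```shell
# Hypothetical sketch: hide `kubectl debug` unless the alpha feature
# gate is enabled via an environment variable, using the same
# comma-separated Key=bool shape as the --feature-gates flags.
KUBECTL_FEATURE_GATES="DebugContainers=true"

debug_enabled=false
case ",${KUBECTL_FEATURE_GATES}," in
  *,DebugContainers=true,*) debug_enabled=true ;;
esac

if [ "$debug_enabled" = true ]; then
  echo "kubectl debug is visible"
else
  echo "kubectl debug is hidden"
fi
```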

vdemeester pushed a commit to vdemeester/kubernetes that referenced this issue Jun 23, 2017

Merge pull request kubernetes#46151 from verb/kubectl-featuregate
Automatic merge from submit-queue

Add alpha command to kubectl

Also allow new commands to disable themselves by returning a nil value. This can be used to disable commands based on feature gates.

**What this PR does / why we need it**: Method of enabling alpha functionality in kubectl

**Which issue this PR fixes** *(optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged)*: ref kubernetes#45922

**Special notes for your reviewer**: Part of a discussion in kubernetes#45922 with @pwittrock

**Release note**:

```release-note
NONE
```
@k8s-ci-robot


Contributor

k8s-ci-robot commented Sep 13, 2017

@verb: GitHub didn't allow me to assign the following users: aaron-prindle.

Note that only kubernetes members can be assigned.

In response to this:

/assign @aaron-prindle

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@verb


Contributor

verb commented Sep 13, 2017

@aaron-prindle is interested in working on this.

/assign @verb

@fejta-bot


fejta-bot commented Jan 5, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@verb


Contributor

verb commented Jan 6, 2018

/remove-lifecycle stale

@t3hmrman


t3hmrman commented Feb 22, 2018

Hey, I just ran into this. I'm running cri-containerd (+containerd), so the direct docker exec hack of course doesn't work for me. I'm on k8s v1.9.2, but instead of enabling the experimental debug option I just went with adding my own little alpine container:

      - name: debug
        image: alpine
        command: ["/bin/sh"]
        args: ["-c", "sleep 100000000"]

Then kubectl apply -f pod.yaml, watch it get spun up, and go in. This helped in my case since I was trying to be "amongst" the other containers to observe them, but it obviously wouldn't help if you absolutely needed to get into one specific container as root...
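For context, the workaround above slots the extra container into the pod's existing spec; a minimal hypothetical manifest (the pod and app names here are made up) would look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app            # hypothetical pod name
spec:
  containers:
  - name: app             # the container being observed (illustrative)
    image: my-app:latest
  - name: debug           # the sidecar from the snippet above
    image: alpine
    command: ["/bin/sh"]
    args: ["-c", "sleep 100000000"]
```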

@fejta-bot


fejta-bot commented May 23, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@t3hmrman


t3hmrman commented May 23, 2018

Seems like this would be a good fit for a kubectl plugin
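For illustration, under the later binary plugin model any executable named kubectl-<name> on PATH is dispatched as kubectl <name> (the alpha plugin mechanism of that era used ~/.kube/plugins with a plugin.yaml instead). A placeholder stub, with an invented body, would look like:

```shell
# Minimal sketch of a kubectl plugin as a standalone executable.
# kubectl dispatches `kubectl debug ...` to any `kubectl-debug` on PATH.
cat > kubectl-debug <<'EOF'
#!/bin/sh
# Placeholder body; a real plugin would create the Debug Container.
echo "would start a debug container in pod: $1"
EOF
chmod +x kubectl-debug
./kubectl-debug mypod
```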

@verb


Contributor

verb commented May 23, 2018

@t3hmrman oh that's neat, I didn't know about the plugins. I don't think they were around when I first added kubectl alpha to support this sort of thing. Thanks for the heads up, I'll dig into the plugins docs.

/remove-lifecycle stale

@t3hmrman


t3hmrman commented May 24, 2018

@verb I also think they weren't -- I'm not sure when kubectl plugins became a thing, but I'm really hoping other operators will use them instead of building their own CLI tools. Not sure if it completely satisfies the needs of this ticket, but I thought it was worth a mention!

Currently I avoid operators that introduce their own CLI tools to manage resources (so for a concrete example, zalando/patroni over CrunchyData/postgres-operator)

@fejta-bot


fejta-bot commented Aug 22, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@fejta-bot


fejta-bot commented Sep 21, 2018

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@fejta-bot


fejta-bot commented Oct 21, 2018

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot


Contributor

k8s-ci-robot commented Oct 21, 2018

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
