RFC: Validate Docker against the Docker API versions #53221

Open
yujuhong opened this Issue Sep 28, 2017 · 16 comments

@yujuhong
Contributor

yujuhong commented Sep 28, 2017

Forked from discussion in #42926 (comment) and #42926 (comment).

Problem

Questions were raised that the validated Docker versions (1.13.1/17.06-CE) had already been EOL'd before Kubernetes v1.8 was released. Although some users may get longer support terms from a commercially supported edition or a vendor-provided version (the COS image on GCP, RHEL, etc.), this is not ideal for users who rely purely on the Docker community edition.

On the other hand, with the current Kubernetes release cycle, I think chasing the 4-month Docker support window is futile. Even if we had validated Docker 17.06-CE with Kubernetes 1.8, it would still be EOL'd a month after 1.8 is out. Instead, we should shift the focus from the Docker engine version to the Docker API version, as @euank suggested.

Background

With each new Docker engine version, Docker publishes a new API version (see the lookup table). Right now, kubelet/dockershim always uses the latest Docker API version that the engine supports, but in fact, each engine supports multiple API versions.

For example, with Docker 1.13.1, the latest supported API version is 1.26:

$ docker version
Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.8.1
 Git commit:   092cba3
 Built:        Wed Aug 30 20:31:05 2017
 OS/Arch:      linux/amd64
 Experimental: false

The following two requests (using two different API versions) both work against the 1.13.1 engine.

GET /v1.26/containers/json
GET /v1.23/containers/json
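
As an illustration (not part of the original proposal), the same pinning can be exercised with the Docker Go client by fixing the client's API version. client.WithVersion and ContainerList are real calls in github.com/docker/docker/client, but the program below is only a sketch, assuming the engine is reachable via the standard DOCKER_HOST environment settings:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/docker/docker/api/types"
    "github.com/docker/docker/client"
)

// listContainers issues GET /v<apiVersion>/containers/json against the local engine.
func listContainers(apiVersion string) error {
    cli, err := client.NewClientWithOpts(client.FromEnv, client.WithVersion(apiVersion))
    if err != nil {
        return err
    }
    defer cli.Close()

    containers, err := cli.ContainerList(context.Background(), types.ContainerListOptions{})
    if err != nil {
        return err
    }
    fmt.Printf("API %s: %d containers\n", apiVersion, len(containers))
    return nil
}

func main() {
    // Both API versions work against a Docker 1.13.1 engine.
    for _, v := range []string{"1.26", "1.23"} {
        if err := listContainers(v); err != nil {
            log.Fatalf("API %s: %v", v, err)
        }
    }
}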

Proposal

We should give users who want to upgrade their Docker CE version for security patches (not features) an option to do so. At the same time, it is still valuable to publish the Docker version that was used in the validation tests.

  1. Publish the MIN_ and MAX_DOCKER_API_VERSION validated against the current Kubernetes version.
    • Set the MAX_DOCKER_API_VERSION to use in the Docker client in kubelet/dockershim. The kubelet will use the minimum of (MAX_DOCKER_API_VERSION, newest_api_version_supported_by_the_engine) to talk to the Docker engine (see the sketch after this list).
    • Add a flag to kubelet/dockershim so that users can override MAX_DOCKER_API_VERSION if they desire.
  2. Publish the Docker engine versions validated with each OS distro in some form. This information is still valuable since effort has been put into each Docker validation to discover, document, and work around bugs.
  • Users who want to upgrade their Docker CE version for security patches are free to do so, because kubelet/dockershim will simply use the validated MAX_DOCKER_API_VERSION to communicate with Docker. Of course, users would need to bear the risk of hitting bugs in newer Docker versions, but this risk is well known and they can always reach out to the Docker community to get the bugs fixed.
  • Users who want to try or test a newer Docker API version can restart the kubelet and override the setting with --max-docker-api-version.
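
A minimal sketch of the version-capping logic in item 1, assuming the constant maxDockerAPIVersion stands in for the validated MAX_DOCKER_API_VERSION (or the hypothetical --max-docker-api-version override); the comparison helper is from github.com/docker/docker/api/types/versions, but the wiring into kubelet/dockershim is illustrative only:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/docker/docker/api/types/versions"
    "github.com/docker/docker/client"
)

// maxDockerAPIVersion stands in for the validated MAX_DOCKER_API_VERSION,
// which the proposal would let users override via a kubelet flag.
const maxDockerAPIVersion = "1.26"

func newCappedClient(ctx context.Context) (*client.Client, error) {
    // Probe the engine to learn the newest API version it supports.
    probe, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
    if err != nil {
        return nil, err
    }
    defer probe.Close()

    v, err := probe.ServerVersion(ctx)
    if err != nil {
        return nil, err
    }

    // Use min(MAX_DOCKER_API_VERSION, newest_api_version_supported_by_the_engine).
    chosen := maxDockerAPIVersion
    if versions.LessThan(v.APIVersion, maxDockerAPIVersion) {
        chosen = v.APIVersion
    }
    fmt.Printf("engine supports API %s; using API %s\n", v.APIVersion, chosen)

    return client.NewClientWithOpts(client.FromEnv, client.WithVersion(chosen))
}

func main() {
    cli, err := newCappedClient(context.Background())
    if err != nil {
        log.Fatal(err)
    }
    defer cli.Close()
    fmt.Println("client API version:", cli.ClientVersion())
}

With something like this in place, upgrading the engine only changes which side of the min() wins; the kubelet keeps speaking the validated API version unless the cap is explicitly raised.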

Caveats

Kubernetes may get more user-reported issues for Docker engine versions that were not actually validated against the release (through e2e tests). I think this is a reasonable tradeoff, and most such issues should be redirected to the Docker community, since the kubelet uses a fixed API version.

Note: the current Docker client version configuration has a bug; we need to re-vendor to include the fix in moby/moby#35008.

/cc @kubernetes/sig-node-proposals @dchen1107 @derekwaynecarr

@dchen1107


Member

dchen1107 commented Sep 28, 2017

Thanks for filing the proposal, @yujuhong. I included this topic in next week's sig-node agenda. cc/ @euank, could you also join the sig-node meeting next Tuesday at 10:00am?

@dchen1107


Member

dchen1107 commented Sep 28, 2017

cc/ @jdumars, the 1.8 release manager, for input on the release quality.

@jdumars


Member

jdumars commented Sep 28, 2017

I'll try to attend the SIG meeting if I can. At first glance, this seems very reasonable. It might be worth reviewing in SIG Architecture as well. @bgrant0607 and @jbeda might have some additional input.

@bgrant0607


Member

bgrant0607 commented Sep 28, 2017

@jdumars This is squarely within the domain of SIG Node.

@calebamiles


Member

calebamiles commented Oct 3, 2017

cc: @philips

@yujuhong


Contributor

yujuhong commented Oct 9, 2017

This was discussed in the sig-node meeting last week (10/03), which I unfortunately had to miss.
Here's a summary derived from the meeting notes (please correct me if I am wrong).

  1. In general, validating container runtime versions should not be SIG Node's responsibility. Owners of individual CRI integrations are responsible for the validation, and a common, portable validation framework/suite (e.g., the CRI validation suite) will be provided.
  2. SIG Node will continue the work of validating Docker (for now) to prevent regressions for users.
  3. There was no objection to switching to validating the Docker API version in Q4.

Marking this issue for 1.9 tentatively.

@k8s-merge-robot


Contributor

k8s-merge-robot commented Oct 13, 2017

[MILESTONENOTIFIER] Milestone Removed

@yujuhong

Important: This issue was missing labels required for the v1.9 milestone for more than 3 days:

kind: Must specify exactly one of kind/bug, kind/cleanup or kind/feature.

@yujuhong


Contributor

yujuhong commented Dec 21, 2017

Marking 1.10 tentatively, pending more discussion in a follow-up sig-node meeting. Volunteers to work on this are welcome :-)

@jberkus


jberkus commented Feb 21, 2018

@yujuhong is this on track for 1.10? I see no activity on it in a month.

@yujuhong yujuhong removed this from the v1.10 milestone Feb 26, 2018

@yujuhong


Contributor

yujuhong commented Feb 26, 2018

No one had spare cycles to work on this for 1.10. Removed the milestone.

@fejta-bot


fejta-bot commented May 27, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@redbaron


Contributor

redbaron commented May 27, 2018

/remove-lifecycle stale

@dims


Member

dims commented Jul 20, 2018

fyi "need to re-vender to include the fix moby/moby#35008" is being handled in #64283

@fejta-bot

fejta-bot commented Oct 21, 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
