
Validate docker v1.9.0-rc1 release #16110

Closed
dchen1107 opened this issue Oct 22, 2015 · 29 comments · Fixed by #20867
Assignees
Labels
area/docker priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@dchen1107
Member

@dchen1107 dchen1107 added area/docker sig/node Categorizes an issue or PR as relevant to SIG Node. labels Oct 22, 2015
@dchen1107
Member Author

@vishh is this your turn?

@vishh
Contributor

vishh commented Oct 22, 2015

Sure :) I will work on this!


@vishh vishh self-assigned this Nov 2, 2015
@vishh vishh added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 2, 2015
@vishh
Contributor

vishh commented Nov 2, 2015

Update: I'm testing v1.9.0-rc4. I'm running the following suites of tests (based on hack/jenkins/e2e.sh): kubernetes-e2e-gce, kubernetes-e2e-gce-slow, kubernetes-e2e-gce-parallel.

@vishh
Contributor

vishh commented Nov 2, 2015

Update: No tests failed in kubernetes-e2e-gce suite. This release has been looking good so far.

@vishh
Contributor

vishh commented Nov 3, 2015

Update: No tests failed in kubernetes-e2e-gce-slow suite. 😃

@vishh
Contributor

vishh commented Nov 3, 2015

Update: kubernetes-e2e-gce-parallel tests passed.

@brendandburns
Contributor

Note: I think we want to set the --dns-opt flag from @thockin in moby/moby#16031

@vishh
Contributor

vishh commented Nov 4, 2015

@brendandburns: What dns options have to be set?

@brendandburns
Contributor

I would ask @thockin but I assume he added the flag for a reason ;)

@thockin
Member

thockin commented Nov 4, 2015

We can use dns-opt once we know we don't have to fall back on older docker versions. We currently have a (very clever) hack that works: we set the ndots option.

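For context, the ndots hack amounts to writing an options line into the container's resolv.conf before docker grew the --dns-opt flag. A minimal sketch (the nameserver and search values below are illustrative, not taken from this thread; ndots:5 is the option in question):

```shell
# Write a resolv.conf carrying the ndots option, as the pre-1.9 hack does.
# (nameserver/search are hypothetical kube-dns-style values.)
cat > /tmp/resolv.conf <<'EOF'
nameserver 10.0.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
EOF
# With docker >= 1.9 the same option could instead be passed at run time,
# per moby/moby#16031: docker run --dns-opt ndots:5 ...
grep 'options' /tmp/resolv.conf   # -> options ndots:5
```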

@feiskyer
Member

feiskyer commented Nov 4, 2015

Docker has just released v1.9, so I think v1.9 should be validated with high priority.

@brendandburns
Contributor

@feiskyer yep, validation is underway by @vishh. See above.

@vishh
Contributor

vishh commented Nov 4, 2015

I have tested against most of the test suites. As of now there are no known issues. I'm planning on running soak tests next.

@vishh
Contributor

vishh commented Nov 5, 2015

I haven't been able to find any issues. I will update HEAD to use docker v1.9 soon.

@dchen1107
Member Author

FYI: moby/moby#17720 (Docker 1.9 serious performance issues)

@karlkfi
Contributor

karlkfi commented Nov 12, 2015

Are the new /etc/hosts changes going to be an issue?

Do not update /etc/hosts for the default bridge network, except for links (#17325)

moby/moby#17325

@vishh
Contributor

vishh commented Nov 12, 2015

@karlkfi: AFAIK, we want docker to not touch /etc/hosts once the container is created.

@thockin
Member

thockin commented Nov 12, 2015

re: /etc/hosts - we have our workaround, which should dodge it entirely.


@pikeas

pikeas commented Dec 8, 2015

+1, why is K8S still on docker 1.7.1?

@karlkfi
Contributor

karlkfi commented Dec 8, 2015

Docker v1.9.1 was released a few weeks ago.

@yujuhong
Contributor

FYI: moby/moby#17720 (Docker 1.9 serious performance issues)

@Random-Liu, it might be useful to run your microbenchmark for docker v1.9 before we try to upgrade.

@vishh
Contributor

vishh commented Dec 10, 2015

There are still some performance issues with v1.9.1. For now I'm updating HEAD to run docker v1.9.1 to help identify other potential issues.


@yujuhong
Contributor

Can we make sure v1.9.1 can pass e2e tests reliably before upgrading? I am dreading the endless test failures that we'll have to triage at the end of the day. The performance benchmark, for example, can at least give us some idea of how this is going to affect the kubelet.

@vishh
Contributor

vishh commented Dec 10, 2015

Yes. Of course!


@Random-Liu
Member

@yujuhong The results I posted before were obtained with docker 1.9:

  • Go Version: 1.5.1
  • Docker Version:
    • Client Version: 1.9.0, API version: 1.21
    • Server Version: 1.9.0, API version: 1.21

@karlkfi
Contributor

karlkfi commented Dec 10, 2015

We are now running docker 1.9.1 on the Mesosphere K8s CI. However, this is a little ambiguous, because the mesos-slave-dind containers that run the kubelets still use docker 1.8.2.

This upgrade did, however, fix a regular problem we were having in our CI scripts on docker 1.7.1 where layer pulls would block and/or stall (without any other concurrent docker use).

@dchen1107
Member Author

@vishh Besides the performance issue I brought up at #16110 (comment), there are network-related issues in docker 1.9.1 that could affect k8s: cc/ @thockin @kubernetes/goog-cluster

moby/moby#18535 & moby/moby#18145

We should wait for 1.9.2 in any case.

@vishh
Contributor

vishh commented Dec 17, 2015

xref: https://github.com/docker/docker/milestones/1.9.2


@Random-Liu
Member

Introduction
I benchmarked docker ps and docker inspect with the following 3 experiments:

  1. Periodically run docker ps and docker inspect, gradually increasing the number of dead and alive containers.
  2. Periodically run docker ps and docker inspect, gradually increasing the operation interval.
  3. Periodically run docker ps and docker inspect, gradually increasing the number of goroutines doing docker ps and docker inspect.

The metrics we measured:

  • CPU usage.
  • docker ps latency.
  • docker inspect latency.
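
All three experiments share the same timing loop; a minimal sketch in shell, with the docker calls stubbed out by `true` so the harness itself runs anywhere (a real run would substitute `docker ps -a`, `docker ps`, and `docker inspect <id>`, and poll for far more iterations):

```shell
# measure: run a command, print elapsed wall-clock time in milliseconds.
measure() {
  local start end
  start=$(date +%s%N)              # GNU date, nanosecond resolution
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

interval=0.2                       # 200ms operation interval (experiment 1)
for i in 1 2 3; do
  ms=$(measure true)               # stand-in for: measure docker ps -a
  echo "iteration $i: ${ms}ms"
  sleep "$interval"
done
```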

Benchmark Environment

  • Cloud Provider: GCE
  • VM Instance: n1-standard-2 (2 vCPUs, 7.5 GB memory)
  • OS: Debian GNU/Linux 8.3 (jessie)
  • Docker version:
lantaol@docker-perf-1-8-3:~$ docker version
Client:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:        Mon Oct 12 05:27:08 UTC 2015
 OS/Arch:      linux/amd64
Server:
 Version:      1.8.3
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   f4bf5c7
 Built:        Mon Oct 12 05:27:08 UTC 2015
 OS/Arch:      linux/amd64
lantaol@docker-perf-1-9-1:~$ docker version
Client:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 12:59:02 UTC 2015
 OS/Arch:      linux/amd64
Server:
 Version:      1.9.1
 API version:  1.21
 Go version:   go1.4.2
 Git commit:   a34a1d5
 Built:        Fri Nov 20 12:59:02 UTC 2015
 OS/Arch:      linux/amd64

Experiment 1

  • Workflow:
start -> create containers -> periodically docker ps [all=true] -> periodically docker ps [all=false] -> periodically docker inspect -> create containers -> ... -> finish
  • Environment Variable
    • Operation interval: 200ms
  • Container Number
    varies_container
  • Latency
    • Docker 1.8.3
      latency-varies_container
    • Docker 1.9.1
      latency-varies_container
  • CPU Usage
    • Docker 1.8.3
      cpu
    • Docker 1.9.1
      cpu

Experiment 2

  • Workflow:
    start -> periodically docker ps [all=true] with increasing interval -> periodically docker ps [all=false] with increasing interval -> periodically docker inspect with increasing interval -> finish
  • Environment Variable
    • Dead container number: 600
    • Alive container number: 200
  • ps Interval
    list_alive
  • ps [all=true] Latency
    • Docker 1.8.3
      latency-list_all
    • Docker 1.9.1
      latency-list_all
  • ps [all=false] Latency
    • Docker 1.8.3
      latency-list_alive
    • Docker 1.9.1
      latency-list_alive
  • inspect Interval
    inspect
  • inspect Latency
    • Docker 1.8.3
      latency-inspect
    • Docker 1.9.1
      latency-inspect
  • CPU Usage
    • Docker 1.8.3
      cpu
    • Docker 1.9.1
      cpu

Experiment 3

  • Workflow:
start -> periodically docker ps [all=true] with increasing goroutine count -> periodically docker inspect with increasing goroutine count -> finish
  • Environment Variable
    • Dead container number: 600
    • Alive container number: 200
    • docker ps interval: 1s
    • docker inspect interval: 0.5s
  • Routine Number
    list_all
The goroutine count is varied in the same way for docker ps and docker inspect.
  • ps [all=true] Latency
    • Docker 1.8.3
      latency-list_all
    • Docker 1.9.1
      latency-list_all
  • inspect Latency
    • Docker 1.8.3
      latency-inspect
    • Docker 1.9.1
      latency-inspect
  • CPU Usage
    • Docker 1.8.3
      cpu
    • Docker 1.9.1
      cpu

Conclusion
There is no significant performance difference for docker ps and docker inspect between docker 1.8.3 and docker 1.9.1.

Benchmark Tool
https://github.com/Random-Liu/docker_micro_benchmark

/cc @kubernetes/sig-node

9 participants