Flaky test: TestDockerNetworkHostModeUngracefulDaemonRestart #19368
Comments
Just had this one again with gccgo: https://jenkins.dockerproject.org/job/Docker-PRs-gccgo/748/console
Thanks @thaJeztah. I was able to get it to fail locally at least once last week, but I'll have to try again since I keep getting sidetracked and I've since lost that container.
This also fails on ARM with golang: https://jenkins.dockerproject.org/job/Docker-PRs-arm/97/console
some debug info
So apparently @tophj-ibm can recreate this a lot more easily than I can, but I put in an inaccurate debug message. It should be an "inspect error," not a "start error," in case that confuses anyone.
Hitting this again: https://jenkins.dockerproject.org/job/Docker-PRs-gccgo/1386/consoleFull
@thaJeztah I will look into it.
I've been looking into this, and I've seen it fail in two ways: 1) the container doesn't exist, or 2) the container isn't started. It's always the last container that's the problem. The Start() function only checks whether the daemon is responding to requests, not whether the containers have finished starting. So we can either add a sleep to the test, or rework Start() a bit. I have a feeling that reworking Start() might also require adding some logic in the daemon itself.
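For anyone weighing the "add a sleep" option above: a fixed sleep tends to be either too short on slow builds or wasteful on fast ones, so polling a condition with a deadline is a common alternative. Below is a minimal, self-contained sketch of that idea. The names `waitFor` and `errTimeout` are hypothetical and not part of Docker's test helpers, and the condition is a stand-in for "the last container is running".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTimeout is returned when the condition never became true in time.
// (Hypothetical name, used only in this sketch.)
var errTimeout = errors.New("condition not met before timeout")

// waitFor polls cond every interval until it returns true or the timeout
// elapses. cond is always checked at least once.
func waitFor(cond func() bool, timeout, interval time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

func main() {
	// Stand-in for "the last container is now running":
	// the condition becomes true on the third poll.
	calls := 0
	ready := func() bool {
		calls++
		return calls >= 3
	}

	if err := waitFor(ready, time.Second, 10*time.Millisecond); err != nil {
		fmt.Println("gave up waiting:", err)
		return
	}
	fmt.Println("condition met after", calls, "polls")
}
```

Compared with a bare sleep, this returns as soon as the condition holds and fails with a clear error on slow builds instead of silently racing.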
@cinperez thanks. I'm trying to understand why this is seen almost consistently on the gccgo CI but not on other CI runs.
@mavenugo Go code compiled with gccgo isn't as optimized as Go code compiled with gc (and there may be some things we can add to our builds, but I'm not sure anyone has dug into it much), so things run more slowly. I've seen that pretty consistently. That doesn't prove that this issue is a timing issue, but it could be why we see it mostly on gccgo.
Hm @tiborvass, @mavenugo, looks like this failed again on gccgo: https://jenkins.dockerproject.org/job/Docker%20Master%20%28gccgo%29/1584/consoleFull le sigh
Just starting to look into this issue again. I'm getting this error when trying to load the last container that was originally started (not necessarily the last container to be restarted):
Failed to load container 7d00772d8d3242210243177bc2142f0b406008db6fc27e6b41a3ec5e9119d555: EOF
It's possible the daemon kill is happening before the container has fully started. I'll continue to investigate.
Fixes moby#19368 by waiting until all container statuses are running before killing the daemon

Signed-off-by: Christopher Jones <tophj@linux.vnet.ibm.com>
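As a rough illustration of the approach described in the commit message above (this is not the actual patch), the sketch below checks every container's state and only reports success once all of them are running. The `inspectFunc` type is a hypothetical stand-in for however the test queries the daemon, and `notRunning` and `waitAllRunning` are made-up names for this example.

```go
package main

import (
	"fmt"
	"time"
)

// inspectFunc is a hypothetical stand-in for asking the daemon whether a
// container is running; the real test would query the daemon instead.
type inspectFunc func(id string) (running bool, err error)

// notRunning returns the IDs that are not reported as running.
// An inspect error is treated the same as "not running yet".
func notRunning(ids []string, inspect inspectFunc) []string {
	var pending []string
	for _, id := range ids {
		running, err := inspect(id)
		if err != nil || !running {
			pending = append(pending, id)
		}
	}
	return pending
}

// waitAllRunning re-checks the containers up to attempts times, sleeping
// interval between checks, and fails if any container is still not running.
func waitAllRunning(ids []string, inspect inspectFunc, attempts int, interval time.Duration) error {
	var pending []string
	for i := 0; i < attempts; i++ {
		pending = notRunning(ids, inspect)
		if len(pending) == 0 {
			return nil
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("containers not running after %d checks: %v", attempts, pending)
}

func main() {
	// Fake daemon state: c3 only reports running from its third inspect on,
	// mimicking a last container that is slow to start.
	checks := map[string]int{}
	inspect := func(id string) (bool, error) {
		checks[id]++
		return id != "c3" || checks[id] >= 3, nil
	}

	ids := []string{"c1", "c2", "c3"}
	if err := waitAllRunning(ids, inspect, 10, 10*time.Millisecond); err != nil {
		fmt.Println("not safe to kill the daemon yet:", err)
		return
	}
	fmt.Println("all containers running; the test can now kill the daemon")
}
```

Checking every container rather than only the last one started keeps the test independent of start order, which matters given that the failing container was not always the last one restarted.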
Description of problem:
This one has failed a few times in the past couple of days with gccgo:
https://jenkins.dockerproject.org/job/Docker%20Master%20%28gccgo%29/1285/consoleFull
https://jenkins.dockerproject.org/job/Docker%20Master%20%28gccgo%29/1287/consoleFull
https://jenkins.dockerproject.org/job/Docker%20Master%20%28gccgo%29/1294/consoleFull
docker version: 1.10.10-dev (latest upstream)
docker info: ❗ This is to make Gordon happy, and should be ignored since it's from my laptop (running in a container), not one of Docker's test nodes.
./docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: 1.10.0-dev
Storage Driver: devicemapper
Pool Name: docker-253:2-5914896-pool
Pool Blocksize: 65.54 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file: /dev/loop2
Metadata file: /dev/loop3
Data Space Used: 11.8 MB
Data Space Total: 107.4 GB
Data Space Available: 84.95 GB
Metadata Space Used: 581.6 kB
Metadata Space Total: 2.147 GB
Metadata Space Available: 2.147 GB
Udev Sync Supported: false
Deferred Removal Enabled: false
Deferred Deletion Enabled: false
Deferred Deleted Device Count: 0
Data loop file: /var/lib/docker/devicemapper/devicemapper/data
WARNING: Usage of loopback devices is strongly discouraged for production use. Either use --storage-opt dm.thinpooldev or use --storage-opt dm.no_warn_on_loop_devices=true to suppress this warning.
Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
Library Version: 1.02.82 (2013-10-04)
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
Volume: local
Network: null host bridge
Kernel Version: 4.2.8-200.fc22.x86_64
Operating System: Ubuntu 14.04.3 LTS (containerized)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.678 GiB
Name: 703f7cbf4c89
ID: 5VYE:UHVY:NVE7:FXOH:6XFB:MT4G:2WOX:BXUT:5E7S:VFRS:EL4S:6HOX
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
uname -a: ❗ Same comment as above.
4.2.8-200.fc22.x86_64
Environment details (AWS, VirtualBox, physical, etc.):
Docker's Jenkins (see above links)
How reproducible:
Flaky. Every failure I have seen was in Docker's Jenkins builds.
Steps to Reproduce:
Actual Results:
Sometimes the test fails
Expected Results:
Test always passes
Additional info:
I'll try to look into it today. I haven't tried recreating it locally but wanted to get it into an issue so others can "me too" it and so I don't forget about it.