Flaky test: TestServicePlugin, TestCreateServiceSecretFileMode #37132
Comments
Should we adjust the timeout value for this specific case, or bump the timeout value for all service tests? I don't think we should leave this test case failing on timeout as it is (or just
https://jenkins.dockerproject.org/job/Docker-PRs/49551/console in #37141
I don't think we have, currently. Having detailed metrics about tests (pass/failure rates to identify flaky ones, ones that fail on specific configurations, and time to complete) has been discussed frequently, but I don't think anyone has had time yet to dive into that.
Understood. We have now captured 2 service-related test case timeouts:
Yep, it's exactly that: we discussed it (for a while now 😝), but so far haven't had time to dive into it 😭
PowerPC on #38050 failed with this:
This is from an ARM64 failure; docker version, docker info
check-config.sh output
Full logs of that test run;
Errors from that log:
The moby/libnetwork/osl/kernel/knobs_linux.go Lines 29 to 47 in 085c6a9
Looking at those, they don't seem to be related, and are just "warnings". Why they fail in our CI perhaps needs looking into, but it doesn't look related to this failure.
Hm... ok so perhaps I'm silly, but I think I see what's happening. If I'm right, that test was broken from the start (added in 65ee7ff / #36130), but may have worked in cases where we managed to get the logs. I think what's wrong here is that the command that's used for the service will either exit immediately (if the secret is mounted correctly at moby/integration/service/create_test.go Line 333 in 085c6a9
Trying to run the test steps locally shows that this is the case;
echo 'CONFIG' | docker config create myconfig -
docker service create --config source=myconfig,target=/etc/config,mode=0777 --name myservice busybox sh -c 'ls -l /etc/config || /bin/top'
The above creates the service, but it keeps retrying, because each task exits immediately (followed by SwarmKit reconciling and starting a new task);
And, for sure; the tasks exit cleanly (no error), but they do exit, so swarm just keeps spinning up new tasks;
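The restart loop follows from how `||` works in `sh`: the right-hand command only runs if the left-hand one fails. A minimal sketch without Docker, assuming a temporary file stands in for the mounted `/etc/config` and an `echo` marker stands in for `/bin/top` (both stand-ins, and the `run_like_service` helper, are made up for illustration):

```shell
#!/bin/sh
# Stand-in for /etc/config; in the real service this is the mounted config.
config_file=$(mktemp)

# Same shape as the service command: `ls -l <file> || <fallback>`.
# The fallback (a marker here, /bin/top in the service) only runs if ls fails.
run_like_service() {
    sh -c 'ls -l "$1" >/dev/null 2>&1 || echo "fallback started"' sh "$1"
    echo "exit=$?"
}

# File exists: ls succeeds, the fallback is skipped, the shell exits right away.
present=$(run_like_service "$config_file")

# File missing: ls fails, so the fallback runs.
rm -f "$config_file"
missing=$(run_like_service "$config_file")

echo "file present: $present"
echo "file missing: $missing"
```

With the file present, the shell exits with status 0 without ever starting the fallback, which matches the "exits cleanly, but does exit" behaviour of the tasks.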
So the test is now in a race condition: it proceeds if it happens to find 1 healthy task, and then picks the first task from the list of tasks (before #42960, that means "any task"; either running or exited). And if we're able to get logs of that task, and it includes the
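To illustrate why picking the first task from the list is fragile, here is a small sketch over a hypothetical task list. The task IDs, states, and helper names are invented for illustration; it does not call Docker and is not the change made in #42960.

```shell
#!/bin/sh
# Hypothetical task list as "ID STATE" lines; exited tasks keep being replaced.
tasks='task1 complete
task2 complete
task3 running'

# Take the first task in the list, whatever its state ("any task").
first_task() {
    printf '%s\n' "$1" | head -n 1 | cut -d' ' -f1
}

# Take the first task whose state is "running".
first_running_task() {
    printf '%s\n' "$1" | awk '$2 == "running" { print $1; exit }'
}

echo "first task:         $(first_task "$tasks")"
echo "first running task: $(first_running_task "$tasks")"
```

The two selections disagree as soon as an exited task is listed before the running one. Filtering on state only narrows the choice; since every task in this scenario exits right away, the selected task can still be gone by the time it is inspected.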
This has been failing on several PRs
#37086 was increasing the timeout for swarm tests, but that was removed in favor of changing the timeout on a test-by-test basis (#37086 (comment))
Other failures observed (#42923):