ci: l4lb: gather more infos about docker-in-docker issues #32570

Merged 1 commit on May 16, 2024

@mhofstetter (Member) commented May 16, 2024

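Sometimes the L4LB tests time out while waiting for the docker-in-docker instance to become ready. In the log below, the lb-node container stops running about two seconds after it is started, and every later docker exec into it fails.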

+[06:55:18] docker run --privileged --name lb-node -d --network cilium-l4lb -v /lib/modules:/lib/modules docker:dind
ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271
+[06:55:18] docker exec -t lb-node mount bpffs /sys/fs/bpf -t bpf
+[06:55:18] docker run --name nginx -d --network cilium-l4lb nginx
544abbd0503584c582da50480d5f96eae5ecadb426d48d6fee078558b45451b2
+[06:55:18] docker exec -t lb-node docker ps
+[06:55:18] sleep 1
+[06:55:19] docker exec -t lb-node docker ps
+[06:55:19] sleep 1
+[06:55:20] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:20] sleep 1
+[06:55:21] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:21] sleep 1
+[06:55:22] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:22] sleep 1
+[06:55:23] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
+[06:55:20] docker exec -t lb-node docker ps
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running
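
The polling in the log above keeps retrying `docker exec` even after the container has stopped. A minimal sketch of a wait loop that gives up early in that case is shown here. The function name `wait_for_dind`, its timeout argument, and its return codes are hypothetical and are not taken from the Cilium workflow; only the `docker inspect`, `docker exec`, and `docker ps` invocations are standard Docker CLI.

```shell
# Hypothetical helper (not from the Cilium workflow): poll the inner Docker
# daemon, but stop as soon as the outer container is no longer running.
# Returns 0 when ready, 1 on timeout, 2 when the container is not running.
wait_for_dind() {
  local name="$1"
  local timeout="${2:-30}"
  local i=0
  local running
  while [ "$i" -lt "$timeout" ]; do
    # State.Running prints "true" or "false"; a failing inspect (for example
    # an unknown container) is treated as "not running".
    running="$(docker inspect --format '{{.State.Running}}' "$name" 2>/dev/null || echo false)"
    if [ "$running" != "true" ]; then
      echo "container $name is not running, giving up" >&2
      return 2
    fi
    # The inner daemon answers "docker ps" once it is ready.
    if docker exec "$name" docker ps >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  echo "timed out after ${timeout}s waiting for $name" >&2
  return 1
}
```

Called as `wait_for_dind lb-node 60`, this would separate "the daemon is slow to start" (return code 1) from "the container already exited" (return code 2), which the plain retry loop cannot do.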

Fetching the LB logs after the failed test does not help either: that step also goes through docker exec into lb-node, so it fails with the same error.

Run docker exec -t lb-node docker logs cilium-lb
  docker exec -t lb-node docker logs cilium-lb
...
Error response from daemon: Container ca31f2a72a098bf612d569fee976e7e17779a1546337748dc62b63c99da13271 is not running

Therefore, this commit adds an additional job step that fetches the status and logs of the docker instance itself.
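
The commands run by the added step are not shown on this page. As a rough sketch only, such a step could query the outer container from the host, because `docker ps -a`, `docker inspect`, and `docker logs` still work on a stopped container while `docker exec` does not. The function name `dump_container_debug` and the choice of inspected fields are assumptions made for illustration.

```shell
# Hypothetical helper (not the actual step from this commit): print the state
# and logs of the outer container so a stopped docker-in-docker instance can
# be diagnosed after the test has failed.
dump_container_debug() {
  local name="$1"
  echo "== status of $name =="
  # -a includes stopped containers, so an exited instance still shows up.
  docker ps -a --filter "name=$name" || true
  echo "== exit state of $name =="
  docker inspect --format 'exit={{.State.ExitCode}} oom={{.State.OOMKilled}} error={{.State.Error}}' "$name" || true
  echo "== logs of $name =="
  # Unlike docker exec, docker logs does not need the container to be running.
  docker logs "$name" 2>&1 || true
}
```

Each command is followed by `|| true` so that one failing query does not hide the output of the others; `dump_container_debug lb-node` would be run in a step that executes even when the test step has failed.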

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
@mhofstetter added the area/CI and release-note/ci labels May 16, 2024
@mhofstetter (Member, Author) commented:

/test

@mhofstetter marked this pull request as ready for review May 16, 2024 09:44
@mhofstetter requested review from a team as code owners May 16, 2024 09:44
@mhofstetter requested review from aanm and tklauser May 16, 2024 09:44
@mhofstetter added the needs-backport/1.13, needs-backport/1.14, and needs-backport/1.15 labels May 16, 2024
@maintainer-s-little-helper bot added this to "Needs backport from main" in 1.15.6 May 16, 2024
@maintainer-s-little-helper bot added this to "Needs backport from main" in 1.13.17 May 16, 2024
@maintainer-s-little-helper bot added this to "Needs backport from main" in 1.14.12 May 16, 2024
@aanm added this pull request to the merge queue May 16, 2024
@maintainer-s-little-helper bot added the ready-to-merge label May 16, 2024
Merged via the queue into main with commit 9392745 May 16, 2024
70 checks passed
@aanm deleted the pr/mhofstetter/test-l4lb-gather-more-info branch May 16, 2024 11:00
@YutaroHayakawa mentioned this pull request May 23, 2024
15 tasks
@YutaroHayakawa added the backport-pending/1.15 label and removed the needs-backport/1.15 label May 23, 2024
@YutaroHayakawa mentioned this pull request May 24, 2024
12 tasks
@YutaroHayakawa added the backport-pending/1.14 label and removed the needs-backport/1.14 label May 24, 2024
@maintainer-s-little-helper bot moved this from "Needs backport from main" to "Backport pending to v1.15" in 1.15.6 May 24, 2024
@maintainer-s-little-helper bot moved this from "Needs backport from main" to "Backport pending to v1.14" in 1.14.12 May 24, 2024
@YutaroHayakawa mentioned this pull request May 24, 2024
10 tasks
@YutaroHayakawa added the backport-pending/1.13 label and removed the needs-backport/1.13 label May 24, 2024
@maintainer-s-little-helper bot moved this from "Needs backport from main" to "Backport pending to v1.13" in 1.13.17 May 24, 2024
@github-actions bot added the backport-done/1.15 label and removed the backport-pending/1.15 label May 25, 2024
@maintainer-s-little-helper bot removed this from "Backport pending to v1.15" in 1.15.6 May 25, 2024
@github-actions bot added the backport-done/1.14 label and removed the backport-pending/1.14 label May 27, 2024
@maintainer-s-little-helper bot added this to "Backport done to v1.15" in 1.15.6 May 27, 2024
@maintainer-s-little-helper bot removed this from "Backport pending to v1.14" in 1.14.12 May 27, 2024
@github-actions bot added the backport-done/1.13 label and removed the backport-pending/1.13 label May 28, 2024
@maintainer-s-little-helper bot added this to "Backport done to v1.14" in 1.14.12 May 28, 2024
@maintainer-s-little-helper bot removed this from "Backport pending to v1.13" in 1.13.17 May 28, 2024
Labels
area/CI: Continuous Integration testing issue or flake
backport-done/1.13: The backport for Cilium 1.13.x for this PR is done.
backport-done/1.14: The backport for Cilium 1.14.x for this PR is done.
backport-done/1.15: The backport for Cilium 1.15.x for this PR is done.
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
release-note/ci: This PR makes changes to the CI.