Conversation
Please sign your commits following these rules: $ git clone -b "debug-ci-intermittent-fail" git@github.com:ijc/docker-app.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354610128
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f Amending updates the existing PR. You DO NOT need to open a new one. |
cddf5f3
to
95b5a31
Compare
Codecov Report
@@ Coverage Diff @@
## master #534 +/- ##
==========================================
+ Coverage 69.69% 69.79% +0.09%
==========================================
Files 55 55
Lines 3013 3013
==========================================
+ Hits 2100 2103 +3
+ Misses 633 631 -2
+ Partials 280 279 -1
Continue to review full report at Codecov.
|
Thanks @chris-crone but I intended this one just for debugging not merging, I've tagged the title now. |
e3b581b
to
ca537cc
Compare
ci.docker.com seems to not be exhibiting this issue any more. But it looks like https://jenkins.dockerproject.org/job/docker/job/app/job/PR-534/8/consoleFull caught it in the act with a reasonably full set of debug:
Which shows that the stdout of
Is repeated twice within a single |
32bc9cb
to
3eedae3
Compare
Codecov Report
@@ Coverage Diff @@
## master #534 +/- ##
=========================================
Coverage ? 69.79%
=========================================
Files ? 55
Lines ? 3013
Branches ? 0
=========================================
Hits ? 2103
Misses ? 631
Partials ? 279 Continue to review full report at Codecov.
|
b5dc87e
to
bed1beb
Compare
From the logging in https://ci.docker.com/teams-dsg/job/dsg/job/app/job/PR-534/85/consoleFull I am reasonably convinced that the invocation is not printing the rendered output twice, but rather it is being duplicated somewhere further up the stack. This is because in the block:
(where the The repeated output is:
Unfortunately while the tweaks I've made to make this happen more reliably in CI (but still not totally reliably) have helped there I still cannot reproduce this locally at all (otherwise |
a304016
to
ee8eee3
Compare
Some more results from https://ci.docker.com/teams-dsg/job/dsg/job/app/job/PR-534/91/ where the
fd 5 is a socket to the daemon and the attach protocol is described at https://github.com/moby/moby/blob/master/client/container_attach.go#L10...L33, essentially So we can see that For comparison a successful run (done locally) looks like:
So it seems to get the "all one line" variant which the bad case saw second. So the info is already duplicated before |
Separate the "file" and "stdio" case since they both error with "rendering mismatch". Signed-off-by: Ian Campbell <ijc@docker.com>
Improves output on failure. Signed-off-by: Ian Campbell <ijc@docker.com>
... also in master. Signed-off-by: Ian Campbell <ijc@docker.com>
Since it includes t.Log output etc. Signed-off-by: Ian Campbell <ijc@docker.com>
Signed-off-by: Ian Campbell <ijc@docker.com>
So I can pickup and test known-bad binaries locally. Signed-off-by: Ian Campbell <ijc@docker.com>
Signed-off-by: Ian Campbell <ijc@docker.com>
Trying to reproduce more reliably Signed-off-by: Ian Campbell <ijc@docker.com>
Signed-off-by: Ian Campbell <ijc@docker.com>
Signed-off-by: Ian Campbell <ijc@docker.com>
Trying to see if it is printed twice vs printed once then duplicated by something... Signed-off-by: Ian Campbell <ijc@docker.com>
Signed-off-by: Ian Campbell <ijc@docker.com>
Signed-off-by: Ian Campbell <ijc@docker.com>
c79ce35
to
e00c2fa
Compare
Managed to capture From the invocation image traces:
and: $ grep write\( cnab-run.render.*
cnab-run.render.13:write(2, "version: \"3.4\"\nservices:\n test:\n command:\n - bash\n - -c\n - cat bar bam\n image: alpine:latest\n\n", 110) = 110
cnab-run.render.13:write(2, "2019-05-10 10:15:31.975416038 +0000 UTC m=+0.052860297: END FORMATTED OUTPUT\n", 77) = 77
cnab-run.render.13:write(1, "version: \"3.4\"\nservices:\n test:\n command:\n - bash\n - -c\n - cat bar bam\n image: alpine:latest\n", 109) = 109
cnab-run.render.7:write(2, "2019-05-10 10:15:31.968134272 +0000 UTC m=+0.045578545: cnab-run render on instance \"custom-action\" with formatter \"yaml\"\n", 122) = 122
cnab-run.render.7:write(2, "2019-05-10 10:15:31.974418925 +0000 UTC m=+0.051863204: BEGIN FORMATTED OUTPUT\n", 79) = 79
$ grep cat.bar.bam cnab-run.render.*
cnab-run.render.13:write(2, "version: \"3.4\"\nservices:\n test:\n command:\n - bash\n - -c\n - cat bar bam\n image: alpine:latest\n\n", 110) = 110
cnab-run.render.13:write(1, "version: \"3.4\"\nservices:\n test:\n command:\n - bash\n - -c\n - cat bar bam\n image: alpine:latest\n", 109) = 109 Which validates that it is only printing its output to stdout a single time (and that the stderr output is only produced once too). However reconstructing the trace from
Which is clearly reading the stdout content from the connection to the daemon (which is on fd 6) twice (first time it is split on BTW, fd 5 is the output file:
So it seems that something between the container's stdout/stderr and the daemon rest endpoint has not only duplicated both of them but has also apparently for the first one split it on From earlier tracing of the happy case with no duplication we know that the expected one is the unsplit one, so it is the first of the two copies which is the "extra". The actual test output for this case:
|
Signed-off-by: Ian Campbell <ijc@docker.com>
The issue seems to be fixed on the CI. We don't have flaky builds anymore. |
This is investigating the intermittent CI failure originally spotted in #529.