Richgo hangs after the test finish with a failure #30

nmiculinic · 2020-06-16T09:57:38Z

In the https://github.com/kubermatic/kubecarrier project we're using richgo for parsing the test output:

https://github.com/kubermatic/kubecarrier/blob/e2e-explorations/hack/.e2e-test.sh

kubectl kubecarrier e2e-test run --test.v --test.failfast --test-id=${TEST_ID} | richgo testfilter

Sometimes, after a test fails, richgo hangs. Here's the output from stdout/stderr:

...
     |     --- FAIL: Integration/apiserver (65.18s)
     |         --- PASS: Integration/apiserver/account-service (1.39s)
     |         --- PASS: Integration/apiserver/region-service (2.20s)
     |         --- PASS: Integration/apiserver/provider-service (2.58s)
     |         --- FAIL: Integration/apiserver/offering-service (60.02s)
     |         --- FAIL: Integration/apiserver/instance-service (137.03s)
FAIL

After running ps axf, I see that only richgo is still running, so my own test binary (the one producing the output) has already exited.

486497 pts/6    Ss     0:06              \_ /usr/bin/zsh -i
 558511 pts/6    S+     0:00              |   \_ make e2e-test
 569591 pts/6    S+     0:00              |       \_ /bin/bash ./hack/.e2e-test.sh
 569593 pts/6    Sl+    0:00              |           \_ richgo testfilter
 569602 pts/6    S+     0:00              |               \_ cat -

After stracing it:

 ▲ ~/Desktop/kubecarrier sudo strace -fp 569593
strace: Process 569593 attached with 5 threads
[pid 569601] futex(0xc000074148, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 569600] futex(0xc00004e848, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid 569599] epoll_pwait(5,  <unfinished ...>
[pid 569598] restart_syscall(<... resuming interrupted read ...> <unfinished ...>
[pid 569593] waitid(P_PID, 569602,

Now I have two questions:

Why does it hang?
Why is it executing the cat - command?

Is there anything more I could do to debug this issue?

kyoh86 · 2020-06-28T08:24:10Z

I cannot run your tests (kubecarrier).
Please send me the raw output so that I can look for the bug:

kubectl kubecarrier e2e-test run --test.v --test.failfast --test-id=${TEST_ID}  > raw.txt

kyoh86 · 2020-06-28T08:26:10Z

Why it's executing cat - command

It executes cat - in order to go through the factoryFunc and editor.Editor interfaces.
It could probably be refactored away, though I don't have a concrete idea how yet.
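
For readers following along, here is a minimal Go sketch of how a parent process that pipes data through a `cat -` child can end up blocked in a wait, which would be consistent with the `waitid(P_PID, 569602, ...` line in the strace output above. This is an illustration under assumptions, not richgo's actual code: `filterThroughCat` is a hypothetical helper, and the thread does not confirm that a missing EOF on the child's stdin is the cause of the hang.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os/exec"
)

// filterThroughCat pipes input through a `cat -` child process and returns
// what the child wrote to stdout. Hypothetical helper, not part of richgo.
func filterThroughCat(input string) (string, error) {
	cmd := exec.Command("cat", "-")
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return "", err
	}
	var out bytes.Buffer
	cmd.Stdout = &out
	if err := cmd.Start(); err != nil {
		return "", err
	}
	if _, err := io.WriteString(stdin, input); err != nil {
		return "", err
	}
	// cat only exits once it reads EOF on stdin. If this Close were skipped,
	// the cmd.Wait call below would block indefinitely.
	if err := stdin.Close(); err != nil {
		return "", err
	}
	if err := cmd.Wait(); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	out, err := filterThroughCat("--- FAIL: Integration/apiserver (65.18s)\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

If the write end of the child's stdin is never closed on some code path (for example, an early return after a failure), the child keeps waiting for input and the parent keeps waiting for the child.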

nmiculinic · 2020-06-30T10:24:28Z

Currently, the situation is as follows:

This is highly non-deterministic and hard to reproduce.
Our e2e tests have become more stable recently, with the 0.2.0 release. Even so, they take 5+ minutes on average.
We're not using richgo anymore, since we only used it for coloring and prow (CI) doesn't support coloring. We also implemented a small post-processing step that aggregates and sorts test output lines, since we run a lot of parallel tests and their outputs are interleaved. (P.S. go tool test2json is buggy with parallel tests.)

So I no longer have the time to reproduce the issue. If you need help running the e2e tests, I'll gladly help. To get started, try running them in a privileged Docker container, quay.io/kubecarrier/test, the same way the .prow.yaml file specifies for CI.

EDIT: I've checked the exact commit hash: 99fde20f836c1a9c20bfafadd636941cb6deb762 is where the e2e-exploring branch currently points locally (it doesn't appear to be present in the upstream repo).

https://github.com/kubermatic/kubecarrier/blob/99fde20f836c1a9c20bfafadd636941cb6deb762/hack/.e2e-test.sh

kyoh86 · 2020-07-01T01:31:01Z

I see.
Since nobody is worried about this issue anymore, nobody is going to look into it.

kyoh86 self-assigned this Jun 17, 2020

kyoh86 closed this as completed Jul 1, 2020
