Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testground reports failure for instances when they appear to succeed #1382

Closed
lalexgap opened this issue Jul 11, 2022 · 4 comments · Fixed by #1407
Closed

Testground reports failure for instances when they appear to succeed #1382

lalexgap opened this issue Jul 11, 2022 · 4 comments · Fixed by #1407

Comments

@lalexgap
Copy link
Contributor

Version information:

❯ git log | head -n1
commit 42b4355f7760d29100d971df1c94739d1b796450

Macbook pro (M1)

Description:

Testground reports that instances have failed run finished with outcome = failure (single:0/2) even though it looks like they have succeeded:

Jul 11 19:20:04.210869	INFO	2.0365s         OK << single[001] (95aa87) >>
Jul 11 19:20:04.214827	INFO	2.0369s         OK << single[000] (acfc2b) >>

I've noticed this when running my own tests but I'm also able to reproduce it by running the ping-pong sample test case. It doesn't happen all the time but does happen quite frequently.

testground run single --plan=network --testcase=ping-pong \
                        --builder=docker:go --runner=local:docker \
                        --instances=2 --wait

Testground collect output:
cb67fo7semvi2emk8gog.tgz

Full console output:

❯ ❯ testground run single --plan=network --testcase=ping-pong \
                        --builder=docker:go --runner=local:docker \
                        --instances=2 --wait
Jul 11 19:20:00.744322	INFO	created a synthetic composition file for this job; all instances will run under singleton group "single"
Jul 11 19:20:00.744352	INFO	using home directory: /Users/lagap/testground
Jul 11 19:20:00.744406	INFO	no .env.toml found at /Users/lagap/testground/.env.toml; running with defaults
Jul 11 19:20:00.744419	INFO	testground client initialized	{"addr": "http://localhost:8042"}

>>> Result:

Jul 11 19:20:00.750255	INFO	run is queued with ID: cb67fo7semvi2emk8gog

>>> Server output:

Jul 11 19:20:01.189398	INFO	performing build for groups	{"plan": "network", "groups": ["single"], "builder": "docker:go"}
Jul 11 19:20:01.222146	INFO	found existing volume	{"volume_name": "testground-goproxy-vol"}
Jul 11 19:20:01.256548	INFO	container is already running	{"container_name": "testground-goproxy"}
Step 1/29 : ARG BUILD_BASE_IMAGE
Step 2/29 : ARG RUNTIME_IMAGE=busybox:1.31.1-glibc
Step 3/29 : FROM ${BUILD_BASE_IMAGE} AS builder
 ---> 67f476ddb57c
Step 4/29 : ARG PLAN_PATH
 ---> Using cache
 ---> 13dddb20f9f1
Step 5/29 : ENV PLAN_DIR /plan/${PLAN_PATH}
 ---> Using cache
 ---> 6dab5c1d9517
Step 6/29 : ENV SDK_DIR /sdk
 ---> Using cache
 ---> 74754c372718
Step 7/29 : RUN rm -rf ${PLAN_DIR} ${SDK_DIR} /testground_dep_list
 ---> Using cache
 ---> e4f6c0469032
Step 8/29 : ARG TESTPLAN_EXEC_PKG="."
 ---> Using cache
 ---> 9f1ba0c47bb5
Step 9/29 : ARG GO_PROXY=direct
 ---> Using cache
 ---> 12f58cc317fb
Step 10/29 : ARG BUILD_TAGS
 ---> Using cache
 ---> 7f415f0f0ea0
Step 11/29 : ENV TESTPLAN_EXEC_PKG ${TESTPLAN_EXEC_PKG}
 ---> Using cache
 ---> 5ae8f99821dd
Step 12/29 : ENV GOCACHE /go/cache
 ---> Using cache
 ---> 41d1ec2b732f
Step 13/29 : ARG MODFILE="go.mod"
 ---> Using cache
 ---> 38b3afb756fc
Step 14/29 : ARG MODFILE_SUM="go.sum"
 ---> Using cache
 ---> 2f39c890b96a
Step 15/29 : COPY /plan/${PLAN_PATH}/${MODFILE} ${PLAN_DIR}/go.mod
 ---> Using cache
 ---> 906606a1c3fc
Step 16/29 : COPY /plan/${PLAN_PATH}/${MODFILE_SUM} ${PLAN_DIR}/go.sum
 ---> Using cache
 ---> 9cff138018de
Step 17/29 : RUN echo "Using go proxy: ${GO_PROXY}"     && cd ${PLAN_DIR}     && go env -w GOPROXY="${GO_PROXY}"     && go mod download
 ---> Using cache
 ---> e87e79715920
Step 18/29 : RUN cp ${PLAN_DIR}/go.mod ${PLAN_DIR}/go.sum /tmp/
 ---> Using cache
 ---> 9761406fa33e
Step 19/29 : COPY . /
 ---> Using cache
 ---> f696440ae7a1
Step 20/29 : RUN cp /tmp/go.mod /tmp/go.sum ${PLAN_DIR}/
 ---> Using cache
 ---> eb93e5ebdf80
Step 21/29 : RUN cd ${PLAN_DIR}     && go env -w GOPROXY="${GO_PROXY}"     && CGO_ENABLED=${CgoEnabled} GOOS=linux go build -o ${PLAN_DIR}/testplan.bin ${BUILD_TAGS} ${TESTPLAN_EXEC_PKG}
 ---> Using cache
 ---> da3aa98fe30f
Step 22/29 : RUN cd ${PLAN_DIR}   && go list -m all > /testground_dep_list
 ---> Using cache
 ---> 4c6ad595f08f
 ---> 4c6ad595f08f
Step 23/29 : FROM ${RUNTIME_IMAGE} AS runtime
 ---> 7ddc309cad80
Step 24/29 : ARG PLAN_PATH
 ---> Using cache
 ---> d2eb843d11a4
Step 25/29 : ENV PLAN_DIR /plan/${PLAN_PATH}
 ---> Using cache
 ---> 42af4709d6b7
Step 26/29 : COPY --from=builder /testground_dep_list /
 ---> Using cache
 ---> 3e0c31e54e8f
Step 27/29 : COPY --from=builder ${PLAN_DIR}/testplan.bin /testplan
 ---> Using cache
 ---> 607127c11616
Step 28/29 : EXPOSE 6060
 ---> Using cache
 ---> 940cac4c26cf
Step 29/29 : ENTRYPOINT [ "/testplan"]
 ---> Using cache
 ---> a7788ac76265
 ---> a7788ac76265
Successfully built a7788ac76265
Successfully tagged 3800533d9fc7:latest
Jul 11 19:20:01.498759	INFO	build completed	{"default_tag": "3800533d9fc7:latest", "took": "0s"}
Jul 11 19:20:01.713522	INFO	got docker image id	{"image_id": "a7788ac76265"}
Jul 11 19:20:01.800200	INFO	tagging image	{"image_id": "a7788ac76265", "tag": "tg-plan-network:a7788ac76265"}
Jul 11 19:20:01.806458	INFO	build succeeded	{"plan": "network", "groups": ["single"], "builder": "docker:go", "artifact": "a7788ac76265"}
Jul 11 19:20:01.806643	INFO	performing healthcheck on runner
Jul 11 19:20:01.950728	INFO	healthcheck: ok
Jul 11 19:20:01.950893	INFO	starting run	{"run_id": "cb67fo7semvi2emk8gog", "plan": "network", "case": "ping-pong", "runner": "local:docker", "instances": 2}
Jul 11 19:20:02.002126	INFO	creating container	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "name": "tg-network-ping-pong-cb67fo7semvi2emk8gog-single-0"}
Jul 11 19:20:02.099224	INFO	creating container	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "name": "tg-network-ping-pong-cb67fo7semvi2emk8gog-single-1"}
Jul 11 19:20:02.174607	INFO	starting containers	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "count": 2}
Jul 11 19:20:02.174745	INFO	starting container	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "id": "95aa87fe30d201322f2e76e194ed61db4ff481023cc1eaeab3b6154605405437", "group": "single", "group_index": 1}
Jul 11 19:20:02.174950	INFO	starting container	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "id": "acfc2bf0efd7cd367c8ba4625be3c7f5d4bde458b117af6482214cc1493ddf59", "group": "single", "group_index": 0}
Jul 11 19:20:02.886507	INFO	0.7117s      OTHER << sidecar      >> Jul 11 19:20:02.884640	INFO	resolved route to host	{"host": "testground-redis", "ip": "192.18.0.3"}
Jul 11 19:20:02.890078	INFO	0.7154s      OTHER << sidecar      >> Jul 11 19:20:02.885958	INFO	resolved route to host	{"host": "testground-sync-service", "ip": "192.18.0.4"}
Jul 11 19:20:02.890301	INFO	0.7156s      OTHER << sidecar      >> Jul 11 19:20:02.886941	INFO	resolved route to host	{"host": "testground-influxdb", "ip": "192.18.0.5"}
Jul 11 19:20:02.890677	INFO	0.7138s    MESSAGE << sidecar      >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 11 19:20:02.899091	INFO	0.7210s    MESSAGE << single[001] (95aa87) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 11 19:20:02.900446	INFO	0.7241s      START << single[001] (95aa87) >> {"plan":"network","case":"ping-pong","run":"cb67fo7semvi2emk8gog","outputs_path":"/outputs","temp_path":"/temp","instances":2,"group":"single","group_instances":2,"network":"16.1.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 11 19:20:02.901577	INFO	0.7269s      OTHER << sidecar      >> Jul 11 19:20:02.901421	INFO	successfully resolved route to host	{"container_id": "95aa87fe30d201322f2e76e194ed61db4ff481023cc1eaeab3b6154605405437"}
Jul 11 19:20:02.903856	INFO	0.7291s      OTHER << sidecar      >> Jul 11 19:20:02.903514	INFO	external routing disabled
Jul 11 19:20:02.903972	INFO	0.7293s      OTHER << sidecar      >> Jul 11 19:20:02.903608	INFO	waiting for all networks to be ready	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog"}
Jul 11 19:20:02.931455	INFO	started containers	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "count": 2}
Jul 11 19:20:02.941068	INFO	0.7658s    MESSAGE << single[001] (95aa87) >> waiting for network to initialize
Jul 11 19:20:02.951665	INFO	0.7759s    MESSAGE << sidecar      >> InfluxDB unavailable; no metrics will be dispatched: no InfluxDB URL in $INFLUXDB_URL env var
Jul 11 19:20:02.954654	INFO	0.7788s    MESSAGE << single[000] (acfc2b) >> registering default http handler at: http://[::]:6060/ (pprof: http://[::]:6060/debug/pprof/)
Jul 11 19:20:02.956550	INFO	0.7804s      START << single[000] (acfc2b) >> {"plan":"network","case":"ping-pong","run":"cb67fo7semvi2emk8gog","outputs_path":"/outputs","temp_path":"/temp","instances":2,"group":"single","group_instances":2,"network":"16.1.0.0/16","start_time":"0001-01-01T00:00:00Z"}
Jul 11 19:20:02.958239	INFO	0.7835s      OTHER << sidecar      >> Jul 11 19:20:02.956361	INFO	successfully resolved route to host	{"container_id": "acfc2bf0efd7cd367c8ba4625be3c7f5d4bde458b117af6482214cc1493ddf59"}
Jul 11 19:20:02.959949	INFO	0.7852s      OTHER << sidecar      >> Jul 11 19:20:02.959238	INFO	external routing disabled
Jul 11 19:20:02.960054	INFO	0.7854s      OTHER << sidecar      >> Jul 11 19:20:02.959778	INFO	waiting for all networks to be ready	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog"}
Jul 11 19:20:02.966366	INFO	0.7904s    MESSAGE << single[001] (95aa87) >> network initialisation successful
Jul 11 19:20:02.966675	INFO	0.7920s      OTHER << sidecar      >> Jul 11 19:20:02.965111	INFO	all networks ready	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog"}
Jul 11 19:20:02.966801	INFO	0.7921s      OTHER << sidecar      >> Jul 11 19:20:02.965196	INFO	all networks ready	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog"}
Jul 11 19:20:02.971192	INFO	0.7958s    MESSAGE << single[001] (95aa87) >> network initialization complete
Jul 11 19:20:02.975859	INFO	0.8015s    MESSAGE << single[001] (95aa87) >> claimed sequence numbers; global=1, group(single)=1
Jul 11 19:20:02.977538	INFO	0.8025s    MESSAGE << single[001] (95aa87) >> before sync.MustBoundClient
Jul 11 19:20:02.977855	INFO	0.8024s    MESSAGE << single[000] (acfc2b) >> waiting for network to initialize
Jul 11 19:20:02.979108	INFO	0.8045s    MESSAGE << single[001] (95aa87) >> before netclient.MustConfigureNetwork
Jul 11 19:20:02.981361	INFO	0.8069s    MESSAGE << single[000] (acfc2b) >> network initialisation successful
Jul 11 19:20:02.983672	INFO	0.8091s    MESSAGE << single[000] (acfc2b) >> network initialization complete
Jul 11 19:20:02.985407	INFO	0.8109s    MESSAGE << single[000] (acfc2b) >> claimed sequence numbers; global=2, group(single)=2
Jul 11 19:20:02.985959	INFO	0.8118s    MESSAGE << single[000] (acfc2b) >> before sync.MustBoundClient
Jul 11 19:20:02.987656	INFO	0.8131s    MESSAGE << single[000] (acfc2b) >> before netclient.MustConfigureNetwork
Jul 11 19:20:03.070869	INFO	0.8961s      OTHER << sidecar      >> Jul 11 19:20:03.070078	INFO	applying network change	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog", "network": {"network":"default","IPv4":null,"IPv6":null,"enable":true,"default":{"latency":100000000,"jitter":0,"bandwidth":1048576,"filter":0,"loss":0,"corrupt":0,"corrupt_corr":0,"reorder":0,"reorder_corr":0,"duplicate":0,"duplicate_corr":0},"rules":null,"callback_state":"network-configured","routing_policy":"deny_all"}}
Jul 11 19:20:03.070998	INFO	0.8963s      OTHER << sidecar      >> Jul 11 19:20:03.070147	INFO	external routing already disabled
Jul 11 19:20:03.071029	INFO	0.8963s      OTHER << sidecar      >> Jul 11 19:20:03.070084	INFO	applying network change	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog", "network": {"network":"default","IPv4":null,"IPv6":null,"enable":true,"default":{"latency":100000000,"jitter":0,"bandwidth":1048576,"filter":0,"loss":0,"corrupt":0,"corrupt_corr":0,"reorder":0,"reorder_corr":0,"duplicate":0,"duplicate_corr":0},"rules":null,"callback_state":"network-configured","routing_policy":"deny_all"}}
Jul 11 19:20:03.071084	INFO	0.8964s      OTHER << sidecar      >> Jul 11 19:20:03.070349	INFO	external routing already disabled
Jul 11 19:20:03.073575	INFO	0.8985s    MESSAGE << single[000] (acfc2b) >> I am 1
Jul 11 19:20:03.074482	INFO	0.8997s    MESSAGE << single[001] (95aa87) >> I am 2
Jul 11 19:20:03.074725	INFO	0.8999s    MESSAGE << single[000] (acfc2b) >> before reconfiguring network
Jul 11 19:20:03.075903	INFO	0.9011s    MESSAGE << single[001] (95aa87) >> before reconfiguring network
Jul 11 19:20:03.170087	INFO	0.9954s      OTHER << sidecar      >> Jul 11 19:20:03.169848	INFO	applying network change	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog", "network": {"network":"default","IPv4":"16.1.1.1/24","IPv6":null,"enable":true,"default":{"latency":100000000,"jitter":0,"bandwidth":1048576,"filter":0,"loss":0,"corrupt":0,"corrupt_corr":0,"reorder":0,"reorder_corr":0,"duplicate":0,"duplicate_corr":0},"rules":null,"callback_state":"ip-changed","routing_policy":"deny_all"}}
Jul 11 19:20:03.170208	INFO	0.9955s      OTHER << sidecar      >> Jul 11 19:20:03.169901	INFO	external routing already disabled
Jul 11 19:20:03.170241	INFO	0.9955s      OTHER << sidecar      >> Jul 11 19:20:03.169907	INFO	applying network change	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog", "network": {"network":"default","IPv4":"16.1.1.2/24","IPv6":null,"enable":true,"default":{"latency":100000000,"jitter":0,"bandwidth":1048576,"filter":0,"loss":0,"corrupt":0,"corrupt_corr":0,"reorder":0,"reorder_corr":0,"duplicate":0,"duplicate_corr":0},"rules":null,"callback_state":"ip-changed","routing_policy":"deny_all"}}
Jul 11 19:20:03.170296	INFO	0.9956s      OTHER << sidecar      >> Jul 11 19:20:03.169924	INFO	external routing already disabled
Jul 11 19:20:03.749333	INFO	1.5743s    MESSAGE << single[001] (95aa87) >> waiting until ready
Jul 11 19:20:03.850216	INFO	1.6753s    MESSAGE << single[000] (acfc2b) >> waiting until ready
Jul 11 19:20:03.852117	INFO	1.6771s    MESSAGE << single[000] (acfc2b) >> writing my id
Jul 11 19:20:03.853218	INFO	1.6786s    MESSAGE << single[000] (acfc2b) >> reading their id
Jul 11 19:20:03.952061	INFO	1.7772s    MESSAGE << single[001] (95aa87) >> writing my id
Jul 11 19:20:03.953151	INFO	1.7788s    MESSAGE << single[001] (95aa87) >> reading their id
Jul 11 19:20:03.955457	INFO	1.7803s    MESSAGE << single[001] (95aa87) >> returning their id
Jul 11 19:20:03.956096	INFO	1.7815s    MESSAGE << single[001] (95aa87) >> reading my id
Jul 11 19:20:04.053884	INFO	1.8789s    MESSAGE << single[000] (acfc2b) >> returning their id
Jul 11 19:20:04.054647	INFO	1.8802s    MESSAGE << single[000] (acfc2b) >> reading my id
Jul 11 19:20:04.056852	INFO	1.8817s    MESSAGE << single[000] (acfc2b) >> done
Jul 11 19:20:04.057049	INFO	1.8828s    MESSAGE << single[000] (acfc2b) >> ping RTT was 205.625125ms [200ms, 215ms]
Jul 11 19:20:04.155268	INFO	1.9803s    MESSAGE << single[001] (95aa87) >> done
Jul 11 19:20:04.156458	INFO	1.9817s    MESSAGE << single[001] (95aa87) >> ping RTT was 204.443084ms [200ms, 215ms]
Jul 11 19:20:04.170206	INFO	1.9955s      OTHER << sidecar      >> Jul 11 19:20:04.170131	INFO	applying network change	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog", "network": {"network":"default","IPv4":"16.1.1.2/24","IPv6":null,"enable":true,"default":{"latency":10000000,"jitter":0,"bandwidth":1048576,"filter":0,"loss":0,"corrupt":0,"corrupt_corr":0,"reorder":0,"reorder_corr":0,"duplicate":0,"duplicate_corr":0},"rules":null,"callback_state":"latency-reduced","routing_policy":"deny_all"}}
Jul 11 19:20:04.170308	INFO	1.9956s      OTHER << sidecar      >> Jul 11 19:20:04.170183	INFO	external routing already disabled
Jul 11 19:20:04.170408	INFO	1.9957s      OTHER << sidecar      >> Jul 11 19:20:04.170263	INFO	applying network change	{"sidecar": true, "run_id": "cb67fo7semvi2emk8gog", "network": {"network":"default","IPv4":"16.1.1.1/24","IPv6":null,"enable":true,"default":{"latency":10000000,"jitter":0,"bandwidth":1048576,"filter":0,"loss":0,"corrupt":0,"corrupt_corr":0,"reorder":0,"reorder_corr":0,"duplicate":0,"duplicate_corr":0},"rules":null,"callback_state":"latency-reduced","routing_policy":"deny_all"}}
Jul 11 19:20:04.170477	INFO	1.9958s      OTHER << sidecar      >> Jul 11 19:20:04.170301	INFO	external routing already disabled
Jul 11 19:20:04.170815	INFO	1.9962s    MESSAGE << single[001] (95aa87) >> ping pong
Jul 11 19:20:04.171273	INFO	1.9967s    MESSAGE << single[000] (acfc2b) >> ping pong
Jul 11 19:20:04.171716	INFO	1.9971s    MESSAGE << single[001] (95aa87) >> waiting until ready
Jul 11 19:20:04.172228	INFO	1.9980s    MESSAGE << single[000] (acfc2b) >> waiting until ready
Jul 11 19:20:04.184123	INFO	2.0091s    MESSAGE << single[001] (95aa87) >> writing my id
Jul 11 19:20:04.185588	INFO	2.0110s    MESSAGE << single[001] (95aa87) >> reading their id
Jul 11 19:20:04.185601	INFO	2.0082s    MESSAGE << single[000] (acfc2b) >> writing my id
Jul 11 19:20:04.186735	INFO	2.0107s    MESSAGE << single[000] (acfc2b) >> reading their id
Jul 11 19:20:04.195557	INFO	2.0208s    MESSAGE << single[001] (95aa87) >> returning their id
Jul 11 19:20:04.195740	INFO	2.0211s    MESSAGE << single[000] (acfc2b) >> returning their id
Jul 11 19:20:04.197099	INFO	2.0224s    MESSAGE << single[000] (acfc2b) >> reading my id
Jul 11 19:20:04.197101	INFO	2.0222s    MESSAGE << single[001] (95aa87) >> reading my id
Jul 11 19:20:04.207852	INFO	2.0333s    MESSAGE << single[000] (acfc2b) >> done
Jul 11 19:20:04.208178	INFO	2.0334s    MESSAGE << single[001] (95aa87) >> done
Jul 11 19:20:04.208825	INFO	2.0345s    MESSAGE << single[000] (acfc2b) >> ping RTT was 26.292042ms [20ms, 35ms]
Jul 11 19:20:04.208828	INFO	2.0344s    MESSAGE << single[001] (95aa87) >> ping RTT was 25.283208ms [20ms, 35ms]
Jul 11 19:20:04.210869	INFO	2.0365s         OK << single[001] (95aa87) >>
Jul 11 19:20:04.214827	INFO	2.0369s         OK << single[000] (acfc2b) >>
Jul 11 19:20:04.265942	INFO	deleting containers	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "ids": ["acfc2bf0efd7cd367c8ba4625be3c7f5d4bde458b117af6482214cc1493ddf59", "95aa87fe30d201322f2e76e194ed61db4ff481023cc1eaeab3b6154605405437"]}
Jul 11 19:20:04.266026	INFO	deleting container	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "id": "95aa87fe30d201322f2e76e194ed61db4ff481023cc1eaeab3b6154605405437"}
Jul 11 19:20:04.266035	INFO	deleting container	{"runner": "local:docker", "run_id": "cb67fo7semvi2emk8gog", "id": "acfc2bf0efd7cd367c8ba4625be3c7f5d4bde458b117af6482214cc1493ddf59"}
Jul 11 19:20:04.681560	INFO	run finished with outcome = failure (single:0/2)	{"run_id": "cb67fo7semvi2emk8gog", "plan": "network", "case": "ping-pong", "runner": "local:docker", "instances": 2}

>>> Result:

Jul 11 19:20:08.259833	INFO	finished run with ID: cb67fo7semvi2emk8gog
run outcome: failure
@Bidon15
Copy link
Member

Bidon15 commented Jul 12, 2022

I've bumped into the same issue during a run of my composition (it is successful in the logic itself, but the outcome says failure as mentioned in the bug description)

@laurentsenta
Copy link
Contributor

Thanks for raising this, I couldn't reproduce this, but that's an issue I've seen a couple of time:
Recently we clarified Testground's outcome so that the CLI output matches the dashboard's (localhost:8042).
OK means that the container was run without an error. But the container itself needs to record_success. See #1345

  • Is your test calling the SDK's record_success? (go-sdk does it by default if you use InitializedTestCaseFn).
  • Could you share your test?
  • Do you have a Linux / Ubuntu machine nearby you could use to compare the outcomes?
  • Could you try to restart the testground containers before re-running the ping example?

I use the following command:

docker stop $(docker container ls -q --filter name='testground*') # restart containers
# then restart the `testground daemon`

@lalexgap
Copy link
Contributor Author

Thanks for raising this, I couldn't reproduce this, but that's an issue I've seen a couple of time: Recently we clarified Testground's outcome so that the CLI output matches the dashboard's (localhost:8042). OK means that the container was run without an error. But the container itself needs to record_success. See #1345

  • Is your test calling the SDK's record_success? (go-sdk does it by default if you use InitializedTestCaseFn).

I was not doing this! After using InitializedTestCaseFn my tests now report success 🎉 This was most likely the main issue I've been running into with my own tests, but I'm still able to see this behaviour sporadically when running pingpong test locally.

  • Do you have a Linux / Ubuntu machine nearby you could use to compare the outcomes?
  • Could you try to restart the testground containers before re-running the ping example?

I've still seen this very sporadically when running pingpong test locally (even after restarting the testground containers)

I noticed that there is always a warning from each instance in the logs when it does report failures:

Jul 24 22:13:03.725586	WARN	no handler available for response: 1
Jul 24 22:13:03.726929	WARN	no handler available for response: 1
Jul 24 22:13:04.031441	INFO	run finished with outcome = failure (single:0/2)

I tried to reproduce the pingpong failure on a debian cloud vm and was unable to. I suspect that this is some kind of timing/ performance issue caused by the fact that I'm using a M1 mac?

Let me know if there's any more info I can provide to help!

@laurentsenta
Copy link
Contributor

Thanks for the details @lalexgap, your note helped me uncover a race condition in the local docker runner.

When #1407 is merged, I'll close this ticket.
If you see the bug AFTER it was merged (and you updated testground), please let us know here or in a new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants