fix(test-tooling): failing fabric AIO container launch #320

petermetz · 2020-10-19T16:26:26Z

Describe the bug

CI tests are failing in the test tooling package that verifies that the Fabric all-in-one (AIO) image works as expected.

To Reproduce

Appears to be flaky.
Run the CI script to attempt to reproduce: npm run run-ci

Expected behavior

The test should consistently pass or consistently fail, not alternate between the two.

Logs/Stack traces

https://travis-ci.org/github/hyperledger/cactus/jobs/735204836

Cloud provider or hardware configuration:

Travis CI virtual machine

Operating system name, version, build:

Details are in the linked logs above.

Hyperledger Cactus release version or commit (git rev-parse --short HEAD):

a74a7ed

Hyperledger Cactus Plugins/Connectors Used

Fabric

Additional context

The issue appears to be caused by ports that are already allocated. The Fabric AIO image has the limitation that its published ports are not assigned randomly: that would require docker-in-docker (DinD) support for the peer containers, which we have not finished yet. Until then we must bind to specific host ports instead of random ones, and for now that's just how we do it.


petermetz · 2020-10-21T21:21:10Z

After a little more investigation, I'm thinking this is most likely due to the fixed host ports we use. The CI VM runs the CI script twice, once against NodeJS 12 and once against NodeJS 14, so one of the runs gets knocked out when it tries to launch the AIO container while the other instance of the CI script is doing the same thing and is already sitting on the host port they both want.

This means the issue will most likely be fixed by #279 or anything else that gives us PublishAllPorts: true capabilities for the Fabric AIO images.
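To make the collision concrete, here is a rough sketch of the two ways the container could be created. The field names (ExposedPorts, HostConfig.PortBindings, HostConfig.PublishAllPorts) follow the Docker Engine API container-create payload; the helper functions, the image name and the port numbers are hypothetical and are not taken from the Cactus test tooling.

```typescript
// Sketch only. Helper names, image name and ports are assumptions made for
// illustration; only the Docker Engine API field names are real.

interface PortBinding {
  HostPort: string;
}

interface ContainerCreateOptions {
  Image: string;
  ExposedPorts: Record<string, object>;
  HostConfig: {
    PublishAllPorts: boolean;
    PortBindings?: Record<string, PortBinding[]>;
  };
}

// Builds create options either with fixed host ports (today's behavior as
// described above) or with PublishAllPorts so Docker picks free host ports.
function buildAioCreateOptions(
  image: string,
  containerPorts: number[],
  useFixedHostPorts: boolean,
): ContainerCreateOptions {
  const exposedPorts: Record<string, object> = {};
  const portBindings: Record<string, PortBinding[]> = {};
  for (const port of containerPorts) {
    const key = `${port}/tcp`;
    exposedPorts[key] = {};
    portBindings[key] = [{ HostPort: String(port) }];
  }
  if (useFixedHostPorts) {
    return {
      Image: image,
      ExposedPorts: exposedPorts,
      HostConfig: { PublishAllPorts: false, PortBindings: portBindings },
    };
  }
  return {
    Image: image,
    ExposedPorts: exposedPorts,
    HostConfig: { PublishAllPorts: true },
  };
}

// Returns the host ports that two container configs would both try to bind.
function hostPortsInConflict(
  a: ContainerCreateOptions,
  b: ContainerCreateOptions,
): string[] {
  const hostPortsOf = (o: ContainerCreateOptions): string[] => {
    const result: string[] = [];
    const bindings = o.HostConfig.PortBindings || {};
    for (const key in bindings) {
      for (const binding of bindings[key]) {
        result.push(binding.HostPort);
      }
    }
    return result;
  };
  const portsOfB = hostPortsOf(b);
  return hostPortsOf(a).filter((p) => portsOfB.indexOf(p) !== -1);
}
```

With fixed bindings, two CI runs on the same VM request identical host ports, so hostPortsInConflict returns every port and the second container fails to start. With PublishAllPorts: true there are no explicit host bindings to clash, and the test would instead have to look up the assigned host ports after the container starts.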

petermetz · 2020-10-21T22:04:17Z

Related: MiniFabric might provide a solution idea that we can re-use or just flat out make our own Fabric AIO image inherit from the MiniFabric image... Not sure yet, but I've made some inquiries here: hyperledger-labs/minifabric#105

petermetz · 2020-10-30T18:16:29Z


Unfortunately, MiniFabric does not support pulling up multiple ledgers, only multiple channels within the same ledger, and that is not a good fit for our tests. It is still worth evaluating as some kind of workaround if we cannot make DinD work within a reasonable time.

petermetz · 2021-02-09T01:02:21Z

This is pretty unbounded in complexity because we are missing a feature from the Fabric NodeJS SDK to support discovery on non-standard/customizable ports. One option would be to monkey patch the Fabric SDK; if that works, then this would actually be pretty easy to solve. We haven't done the exploration to check this yet, though.

Epic facepalm once again. Turns out the default restart try count of supervisord is too low which leads to race conditions.

Increasing the retry count from 4 to 20 should do it, this way the fabric-network process (see supervisord.conf file) should be 5 times as "patient" waiting for the docker daemon to launch within the AIO container.

What was happening before is that the fabric-network script tried launching itself in parallel with the docker daemon, but it would time out before the docker daemon could come online.

Published these images as
ghcr.io/hyperledger/cactus-fabric2-all-in-one:2021-09-02--fix-876-supervisord-retries
and
ghcr.io/hyperledger/cactus-fabric-all-in-one:2021-09-02--fix-876-supervisord-retries

Fixes hyperledger#718
Fixes hyperledger#876
Fixes hyperledger#320
Fixes hyperledger#319

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
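The race described in the commit message can be illustrated with a small simulation. This is a simplified model, not supervisord's actual scheduling: it assumes each failed start is followed by a wait that grows by one second per attempt, and that the docker daemon needs a hypothetical 10 seconds to come online. Only the retry counts 4 and 20 come from the commit message; the function name and timings are made up for the example.

```typescript
// Simplified model of a supervised process that can only start once its
// dependency (here: the docker daemon) is up. The linear backoff and the
// timings are assumptions for illustration.
function startsBeforeGivingUp(
  maxAttempts: number,
  dependencyReadyAtSec: number,
): boolean {
  let elapsedSec = 0;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (elapsedSec >= dependencyReadyAtSec) {
      return true; // dependency is up, this start attempt succeeds
    }
    elapsedSec += attempt; // assumed backoff: wait 1s, then 2s, then 3s, ...
  }
  return false; // attempts exhausted, the supervisor gives up
}

// Hypothetical: the docker daemon needs 10 seconds to come online.
const bootsWithFourAttempts = startsBeforeGivingUp(4, 10); // false
const bootsWithTwentyAttempts = startsBeforeGivingUp(20, 10); // true
```

Under these assumptions, four attempts happen at 0, 1, 3 and 6 seconds and all of them miss a daemon that is ready at 10 seconds, while a budget of 20 attempts succeeds on the fifth. Whether a real container boots in time depends on how long the docker daemon actually takes to start on the CI VM.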


petermetz added bug Something isn't working Fabric labels Oct 19, 2020

petermetz added this to the v1.0.0 milestone Oct 19, 2020

petermetz mentioned this issue Nov 5, 2020

docs(examples): extend supply chain app with fabric specific elements #362

Closed

petermetz added dependencies Pull requests that update a dependency file Developer_Experience labels Feb 9, 2021

petermetz self-assigned this Sep 2, 2021

petermetz mentioned this issue Sep 3, 2021

fix(test): flaky fabric AIO container boot #876 #1300

Merged

petermetz closed this as completed in #1300 Sep 7, 2021

ryjones pushed a commit that referenced this issue Feb 1, 2023

Merge pull request #320 from VRamakrishna/main

4bcac88

Minor Version Changes
