feat(test-tooling): containers pull image retries exp. back-off #656
Potentially fixes #656. This definitely improves the situation, but it
is impossible to tell in advance whether it will make all the otherwise
non-reproducible issues go away. Fingers crossed.

This change makes the pullImage(...) method of the Containers utility
class retry - by default - 6 times if pulling the docker image has
failed. The interval between retries increases exponentially (powers
of two): with p-retry's default back-off settings the delay starts at
one second and doubles before each further retry, reaching 2^5 = 32
seconds before the sixth and final retry. If that final retry also
fails, the promise returned by the underlying pRetry library (which
powers the retry mechanism) rejects with the pull error. pRetry's
AbortError is not thrown here; it exists for callers who want to stop
retrying early.
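
To make the back-off arithmetic concrete, here is a minimal sketch of the
delay schedule. It assumes each delay is minTimeout * factor^n with a
factor of 2, a minTimeout of 1000 ms and no random jitter, which to the
best of our knowledge are the defaults p-retry inherits from the retry
package. backoffDelaysMs is a hypothetical helper written for this
illustration and is not part of the commit.

```typescript
// Hypothetical helper (not part of the commit): lists the wait before each
// retry, assuming delay = minTimeoutMs * factor^n and no random jitter.
function backoffDelaysMs(
  retries: number,
  factor = 2,
  minTimeoutMs = 1000,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    delays.push(minTimeoutMs * Math.pow(factor, attempt));
  }
  return delays;
}

const delays = backoffDelaysMs(6);
const totalMs = delays.reduce((sum, d) => sum + d, 0);
console.log(delays); // [ 1000, 2000, 4000, 8000, 16000, 32000 ]
console.log(totalMs); // 63000
```

Under these assumptions a pull that never succeeds spends roughly 63
seconds waiting between attempts, on top of the time each attempt takes.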

For reference, here is a CI test execution that failed at random; its
logs show that DockerHub is intermittently inaccessible over the
network, which is another thing that makes our tests flaky, hence
this commit.

https://github.com/hyperledger/cactus/runs/2178802580?check_suite_focus=true#step:8:2448

In case that link goes dead in the future, here are the actual logs as well:

not ok 60 - packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts # time=25389.665ms
  ---
  env:
    TS_NODE_COMPILER_OPTIONS: '{"jsx":"react"}'
  file: packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts
  timeout: 1800000
  command: /opt/hostedtoolcache/node/12.13.0/x64/bin/node
  args:
    - -r
    - /home/runner/work/cactus/cactus/node_modules/ts-node/register/index.js
    - --max-old-space-size=4096
    - packages/cactus-test-cmd-api-server/src/test/typescript/integration/remote-plugin-imports.test.ts
  stdio:
    - 0
    - pipe
    - 2
  cwd: /home/runner/work/cactus/cactus
  exitCode: 1
  ...
{
    # NodeJS API server + Rust plugin work together
    [2021-03-23T20:45:51.458Z] INFO (VaultTestServer): Created VaultTestServer OK. Image FQN: vault:1.6.1
    not ok 1 Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
      ---
        operator: error
        at: bound (/home/runner/work/cactus/cactus/node_modules/onetime/index.js:30:12)
        stack: |-
          Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
              at /home/runner/work/cactus/cactus/packages/cactus-test-tooling/node_modules/docker-modem/lib/modem.js:301:17
              at IncomingMessage.<anonymous> (/home/runner/work/cactus/cactus/packages/cactus-test-tooling/node_modules/docker-modem/lib/modem.js:328:9)
              at IncomingMessage.emit (events.js:215:7)
              at endReadableNT (_stream_readable.js:1183:12)
              at processTicksAndRejections (internal/process/task_queues.js:80:21)
      ...

    Bail out! Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
}
Bail out! Error: (HTTP code 500) server error - Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Signed-off-by: Peter Somogyvari <peter.somogyvari@accenture.com>
petermetz committed Mar 25, 2021
1 parent 1a84b57 commit 2735ec2
Showing 3 changed files with 85 additions and 27 deletions.
19 changes: 19 additions & 0 deletions packages/cactus-test-tooling/package-lock.json


8 changes: 6 additions & 2 deletions packages/cactus-test-tooling/package.json
@@ -31,7 +31,10 @@
"ignore": [
"src/**/generated/*"
],
"extensions": ["ts", "json"],
"extensions": [
"ts",
"json"
],
"quiet": true,
"verbose": false,
"runOnChangeOnly": true
@@ -91,6 +94,7 @@
"loglevel": "1.6.7",
"loglevel-plugin-prefix": "0.8.4",
"node-ssh": "11.1.1",
"p-retry": "4.4.0",
"tar-stream": "2.1.2",
"typescript-optional": "2.0.1",
"web3": "1.2.7"
@@ -100,8 +104,8 @@
"@types/extract-zip": "1.6.2",
"@types/fs-extra": "8.1.0",
"@types/joi": "14.3.4",
"@types/ssh2": "0.5.44",
"@types/node-ssh": "7.0.0",
"@types/ssh2": "0.5.44",
"@types/tar-stream": "2.1.0"
}
}
@@ -5,9 +5,15 @@ import { Container, ContainerInfo } from "dockerode";
import Dockerode from "dockerode";
import tar from "tar-stream";
import fs from "fs-extra";
import pRetry from "p-retry";
import { Streams } from "../common/streams";
import { Checks, LoggerProvider, Strings } from "@hyperledger/cactus-common";
import { LogLevelDesc } from "loglevel";
import {
Checks,
LogLevelDesc,
LoggerProvider,
Strings,
ILoggerOptions,
} from "@hyperledger/cactus-common";

export interface IPruneDockerResourcesRequest {
logLevel?: LogLevelDesc;
@@ -297,33 +303,62 @@ export class Containers {
return NetworkSettings.Networks[networkNames[0]].IPAddress;
}
}

public static pullImage(
containerNameAndTag: string,
pullOptions: any = {},
imageFqn: string,
options: any = {},
logLevel?: LogLevelDesc,
): Promise<any[]> {
const defaultLoggerOptions: ILoggerOptions = {
label: "containers#pullImage()",
level: logLevel || "INFO",
};
const log = LoggerProvider.getOrCreate(defaultLoggerOptions);
const task = () => Containers.tryPullImage(imageFqn, options, logLevel);
const retryOptions: pRetry.Options = {
retries: 6,
onFailedAttempt: async (ex) => {
log.debug(`Failed attempt at pulling container image ${imageFqn}`, ex);
},
};
return pRetry(task, retryOptions);
}

public static tryPullImage(
imageFqn: string,
options: any = {},
logLevel?: LogLevelDesc,
): Promise<any[]> {
return new Promise((resolve, reject) => {
const loggerOptions: ILoggerOptions = {
label: "containers#tryPullImage()",
level: logLevel || "INFO",
};
const log = LoggerProvider.getOrCreate(loggerOptions);

const docker = new Dockerode();
docker.pull(
containerNameAndTag,
pullOptions,
(pullError: any, stream: any) => {
if (pullError) {
reject(pullError);
} else {
docker.modem.followProgress(
stream,
(progressError: any, output: any[]) => {
if (progressError) {
reject(progressError);
} else {
resolve(output);
}
},
);
}
},
);

const pullStreamStartedHandler = (pullError: any, stream: any) => {
if (pullError) {
log.error(`Could not even start ${imageFqn} pull:`, pullError);
reject(pullError);
} else {
log.debug(`Started ${imageFqn} pull progress stream OK`);
docker.modem.followProgress(
stream,
(progressError: any, output: any[]) => {
if (progressError) {
log.error(`Failed to finish ${imageFqn} pull:`, progressError);
reject(progressError);
} else {
log.debug(`Finished ${imageFqn} pull completely OK`);
resolve(output);
}
},
);
}
};

docker.pull(imageFqn, options, pullStreamStartedHandler);
});
}
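
The retry behaviour that pullImage() delegates to pRetry can be sketched
without the library. The snippet below is a simplified stand-in written
for illustration, not p-retry's actual code: retryWithBackoff,
RetrySketchOptions and flakyPull are hypothetical names, and the
back-off formula (minTimeoutMs * factor^n, no jitter) is an assumption.
It shows the two properties that matter here: failed attempts are
reported through a callback and retried after a growing delay, and once
the retries are exhausted the caller receives the last error.

```typescript
// Hypothetical stand-in for pRetry (illustration only, not the library's code).
type Task<T> = () => Promise<T>;

interface RetrySketchOptions {
  retries: number; // number of retries after the first attempt
  minTimeoutMs?: number; // delay before the first retry
  factor?: number; // multiplier applied to the delay after each failure
  onFailedAttempt?: (ex: unknown, attemptNumber: number) => void;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff<T>(
  task: Task<T>,
  opts: RetrySketchOptions,
): Promise<T> {
  const { retries, minTimeoutMs = 1000, factor = 2 } = opts;
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task();
    } catch (ex) {
      lastError = ex;
      if (opts.onFailedAttempt) {
        opts.onFailedAttempt(ex, attempt + 1);
      }
      if (attempt < retries) {
        // Wait 1x, 2x, 4x, ... the minimum timeout before trying again.
        await sleep(minTimeoutMs * Math.pow(factor, attempt));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage: a simulated pull that fails twice before succeeding.
let calls = 0;
const flakyPull: Task<string> = async () => {
  calls++;
  if (calls < 3) {
    throw new Error("registry timeout (simulated)");
  }
  return "pulled";
};

retryWithBackoff(flakyPull, { retries: 6, minTimeoutMs: 1 }).then((out) =>
  console.log(out, calls), // prints: pulled 3
);
```

The tiny minTimeoutMs in the usage example only keeps the demonstration
fast; the commit itself does not override the delay settings.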

