feat: refactor Jenkinsfile.e2e to Declarative Pipeline with Makefile integration #607
floatingman wants to merge 3 commits into rancher:main
…integration

Rewrite validation/Jenkinsfile.e2e as a full-stack Declarative Pipeline that provisions infrastructure and runs Go validation tests, following the airgap pipeline patterns established in PR 595.

Key changes:
- Declarative Pipeline syntax with parameterized configuration
- Dual-repo checkout via airgap.standardCheckout (tests + qa-infra-automation)
- Infrastructure provisioning via Tofu shared library functions
- Inventory generation via generate_inventory.py bridge script (PR 82)
- Cluster/registry/Rancher deployment via make.runTarget (qa-infra-automation Makefile targets: cluster, registry, rancher)
- Go test execution via gotestsum inside Docker (Dockerfile.airgap-go-tests)
- Admin token generation and injection into cattle-config
- Automatic teardown in post blocks (DESTROY_ON_FAILURE + DESTROY_AFTER_TESTS)
- Optional Qase reporting gated on QASE_TEST_RUN_ID parameter

The original e2e pipeline expected pre-existing infrastructure. This version provisions AWS infrastructure, deploys RKE2 + Rancher, runs tests, and tears down automatically.
…build and run steps
Pull request overview
Refactors validation/Jenkinsfile.e2e from a legacy scripted pipeline into a Declarative Pipeline intended to provision AWS airgap infrastructure (via OpenTofu + qa-infra-automation) and then execute Go validation tests with optional Qase reporting.
Changes:
- Replaced the previous scripted Jenkins pipeline with a full Declarative Pipeline including checkout, infra provisioning, deployment, testing, and teardown stages.
- Added OpenTofu workspace lifecycle + inventory generation + Ansible variable/config steps to support automated environment creation.
- Added gotestsum-based Go test execution with JUnit/JSON artifacts and optional Qase reporting.
| ssh_key = "/root/.ssh/jenkins-elliptic-validation.pem" | ||
| ssh_key_name = "jenkins-elliptic-validation" |
The default TERRAFORM_CONFIG hardcodes ssh_key and ssh_key_name to jenkins-elliptic-validation values, but the pipeline writes the SSH key using AWS_SSH_PEM_KEY_NAME. With the current defaults, Tofu may reference a key path/name that was never written. Update the default tfvars template to use placeholders for the written key path/name (or write the key to the hardcoded location/name).
| ssh_key = "/root/.ssh/jenkins-elliptic-validation.pem" | |
| ssh_key_name = "jenkins-elliptic-validation" | |
| ssh_key = "/root/.ssh/${AWS_SSH_PEM_KEY_NAME}.pem" | |
| ssh_key_name = "${AWS_SSH_PEM_KEY_NAME}" |
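A quick pre-flight check can catch this kind of mismatch before Tofu runs. The following is a minimal sketch, assuming the tfvars file contains a line of the form `ssh_key = "/path/to/key.pem"`; the function names `tfvars_ssh_key_path` and `check_ssh_key_written` are hypothetical and not part of the pipeline or the shared library.

```shell
#!/bin/sh
# Hypothetical helpers: confirm that the ssh_key path named in a tfvars file
# points at a file that was actually written.

# Print the value of the ssh_key entry (not ssh_key_name) from a tfvars file.
tfvars_ssh_key_path() {
    sed -n 's/^[[:space:]]*ssh_key[[:space:]]*=[[:space:]]*"\(.*\)"[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Return 0 if the referenced key file exists, 1 if it is missing,
# 2 if the tfvars file has no ssh_key entry at all.
check_ssh_key_written() {
    key_path=$(tfvars_ssh_key_path "$1")
    if [ -z "$key_path" ]; then
        echo "no ssh_key entry found in $1" >&2
        return 2
    fi
    if [ ! -f "$key_path" ]; then
        echo "ssh_key points at a missing file: $key_path" >&2
        return 1
    fi
    echo "ssh_key OK: $key_path"
}
```

Calling `check_ssh_key_written terraform.tfvars` right after the key is written would fail fast if the template and AWS_SSH_PEM_KEY_NAME disagree.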
| archiveArtifacts artifacts: 'rancher-token.json', fingerprint: true | ||
| echo 'Archived token metadata: rancher-token.json' |
Archiving rancher-token.json as a build artifact can leak an admin token (the earlier airgap pipeline explicitly warns this file contains sensitive credentials). Consider not archiving it by default, redacting the token before archiving, or gating it behind an explicit opt-in parameter with restricted artifact permissions.
| archiveArtifacts artifacts: 'rancher-token.json', fingerprint: true | |
| echo 'Archived token metadata: rancher-token.json' | |
| def tokenData = new groovy.json.JsonSlurperClassic().parseText(readFile('rancher-token.json')) | |
| if (tokenData instanceof Map) { | |
| [ | |
| 'token', | |
| 'bearerToken', | |
| 'accessKey', | |
| 'secretKey', | |
| 'value' | |
| ].each { secretField -> | |
| tokenData.remove(secretField) | |
| } | |
| } | |
| writeFile( | |
| file: 'rancher-token-metadata.json', | |
| text: groovy.json.JsonOutput.prettyPrint(groovy.json.JsonOutput.toJson(tokenData)) + '\n' | |
| ) | |
| archiveArtifacts artifacts: 'rancher-token-metadata.json', fingerprint: true | |
| echo 'Archived redacted token metadata: rancher-token-metadata.json' |
| def testCmd = "/root/go/bin/gotestsum --format standard-verbose --packages=${params.GO_TEST_PACKAGE} --junitfile junit.xml --jsonfile gotestsum.json -- -tags=${params.GO_TAGS} ${params.GO_TEST_CASE} -timeout=${params.GO_TIMEOUT} -v" | ||
|
|
GO_TEST_CASE is described as a test regex for the "-run" flag, but the command appends params.GO_TEST_CASE without adding "-run". If users provide a bare regex (e.g. "TestFoo"), go test will treat it as an extra argument and fail. Either require GO_TEST_CASE to include the full "-run" flag (update description) or automatically prefix it with "-run" when non-empty.
| def testCmd = "/root/go/bin/gotestsum --format standard-verbose --packages=${params.GO_TEST_PACKAGE} --junitfile junit.xml --jsonfile gotestsum.json -- -tags=${params.GO_TAGS} ${params.GO_TEST_CASE} -timeout=${params.GO_TIMEOUT} -v" | |
| def goTestCase = params.GO_TEST_CASE?.trim() | |
| def goTestRunArg = '' | |
| if (goTestCase) { | |
| goTestRunArg = goTestCase.startsWith('-run') ? goTestCase : "-run ${goTestCase}" | |
| } | |
| def testCmd = "/root/go/bin/gotestsum --format standard-verbose --packages=${params.GO_TEST_PACKAGE} --junitfile junit.xml --jsonfile gotestsum.json -- -tags=${params.GO_TAGS} ${goTestRunArg} -timeout=${params.GO_TIMEOUT} -v" |
That's intentional: the parameter is appended verbatim so that we can also pass other go test flags, such as "-p 1".
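One possible way to reconcile both positions is to pass anything that already looks like a flag through unchanged and to prefix only bare regexes with `-run`. This is a sketch of that idea in shell, not what the PR implements; the function name `go_test_extra_args` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: turn a GO_TEST_CASE-style value into extra `go test` arguments.
#   empty value              -> no extra arguments
#   value starting with "-"  -> passed through unchanged (e.g. "-p 1", "-run TestFoo")
#   anything else            -> treated as a bare regex and prefixed with "-run"
go_test_extra_args() {
    # Trim leading and trailing whitespace first.
    value=$(printf '%s' "$1" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    case "$value" in
        "") printf '' ;;
        -*) printf '%s' "$value" ;;
        *)  printf '%s' "-run $value" ;;
    esac
}
```

With this shape, `TestFoo` becomes `-run TestFoo`, while `-p 1` stays `-p 1`.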
| sh "docker volume rm -f ${validationVolume}" | ||
| error 'Report had failures.' | ||
| } | ||
|
|
The pipeline writes sensitive artifacts to the workspace (e.g., terraform.tfvars with AWS keys, generated SSH key in .ssh/, .go_test_env) but the post/always cleanup only removes the Docker image. Add workspace cleanup (delete or shred key/tfvars/env files, or call a shared cleanup helper) to reduce the risk of credential leakage on reused Jenkins agents.
| // Workspace cleanup for sensitive artifacts created during the run. | |
| sh ''' | |
| set +e | |
| cleanup_file() { | |
| file="$1" | |
| if [ -f "$file" ]; then | |
| if command -v shred >/dev/null 2>&1; then | |
| shred -u "$file" || rm -f "$file" | |
| else | |
| rm -f "$file" | |
| fi | |
| fi | |
| } | |
| cleanup_dir() { | |
| dir="$1" | |
| if [ -d "$dir" ]; then | |
| find "$dir" -type f -exec sh -c ' | |
| for file do | |
| if command -v shred >/dev/null 2>&1; then | |
| shred -u "$file" || rm -f "$file" | |
| else | |
| rm -f "$file" | |
| fi | |
| done | |
| ' sh {} + | |
| rm -rf "$dir" | |
| fi | |
| } | |
| cleanup_file ".go_test_env" | |
| cleanup_file "workspace_name.txt" | |
| if [ -n "${INFRA_DIR}" ] && [ -d "${INFRA_DIR}" ]; then | |
| find "${INFRA_DIR}" -type f \\( -name 'terraform.tfvars' -o -name '*.pem' -o -name 'id_rsa' -o -name 'id_ed25519' \\) -exec sh -c ' | |
| for file do | |
| if command -v shred >/dev/null 2>&1; then | |
| shred -u "$file" || rm -f "$file" | |
| else | |
| rm -f "$file" | |
| fi | |
| done | |
| ' sh {} + | |
| fi | |
| cleanup_dir ".ssh" | |
| ''' |
| stage('Build Docker Images') { | ||
| steps { | ||
| sh "docker build --no-cache --platform linux/amd64 -t rancher-go-test:latest -f ${env.TESTS_DIR}/validation/pipeline/Dockerfile.airgap-go-tests ." | ||
| } |
The PR description mentions using Dockerfile.infra for infrastructure operations and Dockerfile.airgap-go-tests for Go tests, but this pipeline only builds and uses rancher-go-test:latest from Dockerfile.airgap-go-tests. Either update the description or build/use the infra-tools image (and ensure make.runTarget/ansible/tofu steps use the intended image) to match the stated design.
| python3 ./qa-infra-automation/scripts/generate_inventory.py \ | ||
| --input /tmp/airgap.json \ | ||
| --schema ./qa-infra-automation/ansible/_inventory-schema.yaml \ |
Inventory generation uses hardcoded paths (./qa-infra-automation/...) even though the checkout stage records env.INFRA_DIR. If airgap.standardCheckout changes the infra checkout directory, this stage will fail. Use env.INFRA_DIR consistently (or enforce a fixed checkout dir and drop env.INFRA_DIR) for the generate_inventory.py and schema paths.
| python3 ./qa-infra-automation/scripts/generate_inventory.py \ | |
| --input /tmp/airgap.json \ | |
| --schema ./qa-infra-automation/ansible/_inventory-schema.yaml \ | |
| python3 ${env.INFRA_DIR}/scripts/generate_inventory.py \ | |
| --input /tmp/airgap.json \ | |
| --schema ${env.INFRA_DIR}/ansible/_inventory-schema.yaml \ |
| def adminPassword = params.RANCHER_ADMIN_PASSWORD ?: 'rancherrocks' | ||
| def cattleConfigPath = '/workspace/tests/cattle-config.yaml' | ||
| def workspace = pwd() | ||
| def tokenTtl = env.RANCHER_TOKEN_TTL ?: '0' | ||
| def tokenDescription = "jenkins-e2e-${env.BUILD_NUMBER}" | ||
|
|
||
| sh """ | ||
| docker run --rm --platform linux/amd64 \ | ||
| --name generate-token \ | ||
| -e RANCHER_ADMIN_PASSWORD=${adminPassword} \ | ||
| -e ANSIBLE_CONFIG=/workspace/qa-infra-automation/ansible/ansible.cfg \ | ||
| -v ${workspace}:/workspace \ | ||
| -w /workspace/${env.INFRA_DIR}/ansible/rke2/airgap \ | ||
| rancher-go-test:latest \ | ||
| ansible-playbook -i inventory/inventory.yml /workspace/qa-infra-automation/ansible/rancher/token/generate-admin-token.yml \ | ||
| -e rancher_cattle_config_file=${cattleConfigPath} \ |
The token injection step hardcodes /workspace/tests and /workspace/qa-infra-automation, but earlier stages rely on env.TESTS_DIR/env.INFRA_DIR from airgap.standardCheckout. This mismatch can break token generation and config injection if the repos are checked out into different directories. Use env.TESTS_DIR/env.INFRA_DIR to construct these container paths (and align the Dockerfile ANSIBLE_CONFIG if needed).
lscalabrini01 left a comment:
Some comments, but looks good!
| varFile: 'terraform.tfvars' | ||
| ) | ||
| env.INFRA_CLEANED = 'true' | ||
| } |
This code is duplicated at line 697. Could we move it into a function and reuse it in both places?
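The deduplication requested here amounts to a single teardown routine guarded by the INFRA_CLEANED flag, so that the failure and always post blocks cannot both destroy the same workspace. In the Jenkinsfile this would be a Groovy helper; the sketch below only illustrates the guard logic in shell, and `destroy_infrastructure_once` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical guard: run a teardown command at most once per build.
INFRA_CLEANED=${INFRA_CLEANED:-false}

destroy_infrastructure_once() {
    if [ "$INFRA_CLEANED" = "true" ]; then
        echo "infrastructure already destroyed; skipping"
        return 0
    fi
    # "$@" is the real teardown command, supplied by the caller.
    if "$@"; then
        INFRA_CLEANED=true
    else
        echo "teardown failed" >&2
        return 1
    fi
}
```

Both call sites would then invoke the same function with the same teardown command, and the second call becomes a no-op.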
| * tofu.initBackend, tofu.createWorkspace, tofu.apply, tofu.getOutputs, | ||
| * infrastructure.parseAndSubstituteVars, infrastructure.writeConfig, | ||
| * infrastructure.generateWorkspaceName, infrastructure.archiveWorkspaceName, | ||
| * infrastructure.writeSshKey, infrastructure.cleanupArtifacts, |
The infrastructure.cleanupArtifacts function is listed here but is never actually used.
| * infrastructure.writeSshKey, infrastructure.cleanupArtifacts, | |
| * infrastructure.writeSshKey, |
| private_registry_mirrors: | ||
| - registry: "docker.io" | ||
| endpoints: | ||
| - "https://privateregistry.qa.rancher.space" |
Could we use the PRIVATE_REGISTRY_URL var here?
| - "https://privateregistry.qa.rancher.space" | |
| - "${PRIVATE_REGISTRY_URL}" |
- Remove unused infrastructure.cleanupArtifacts from header doc
- Fix hardcoded SSH key paths in default TERRAFORM_CONFIG to use AWS_SSH_PEM_KEY_NAME substitution variable
- Replace hardcoded registry URL in Ansible mirrors with PRIVATE_REGISTRY_URL parameter
- Use env.INFRA_DIR in inventory generation instead of hardcoded paths
- Use env.TESTS_DIR/env.INFRA_DIR in token injection stage
- Redact sensitive fields from rancher-token.json before archiving
- Extract duplicated teardown code into destroyInfrastructure() helper
- Add sensitive artifact cleanup (shred tfvars, SSH keys, env files)
| def workspace = pwd() | ||
| sh """ | ||
| docker run --rm --platform linux/amd64 \ | ||
| --name qase-reporter \ |
The Docker container name qase-reporter is hard-coded. On shared agents or concurrent builds, this can fail due to a name collision. Use a unique name (e.g. include BUILD_TAG/BUILD_NUMBER) or omit --name since --rm is already used.
| --name qase-reporter \ |
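If keeping a name is useful for debugging, it can be derived from BUILD_TAG instead of dropping `--name` entirely. A minimal sketch, assuming BUILD_TAG is set by Jenkins; the function name `unique_container_name` and the PID fallback for runs outside Jenkins are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: derive a container name that is unique per build.
unique_container_name() {
    prefix=$1
    # BUILD_TAG comes from Jenkins; fall back to the shell PID when run locally.
    suffix=${BUILD_TAG:-local-$$}
    # Keep only characters that are safe in a container name
    # (letters, digits, '_', '.', '-'); replace everything else with '-'.
    printf '%s-%s' "$prefix" "$suffix" | tr -c 'a-zA-Z0-9_.-' '-'
}
```

It would be used as `docker run --rm --name "$(unique_container_name qase-reporter)" ...`, so two builds on the same agent no longer compete for one name.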
| sh """ | ||
| docker run --rm --platform linux/amd64 \ | ||
| --name generate-token \ | ||
| -e RANCHER_ADMIN_PASSWORD=${adminPassword} \ | ||
| -e ANSIBLE_CONFIG=/workspace/${env.INFRA_DIR}/ansible/ansible.cfg \ | ||
| -v ${workspace}:/workspace \ | ||
| -w /workspace/${env.INFRA_DIR}/ansible/rke2/airgap \ | ||
| rancher-go-test:latest \ | ||
| ansible-playbook -i inventory/inventory.yml /workspace/${env.INFRA_DIR}/ansible/rancher/token/generate-admin-token.yml \ | ||
| -e rancher_cattle_config_file=${cattleConfigPath} \ | ||
| -e rancher_token_ttl=${tokenTtl} \ | ||
| -e rancher_token_description=${tokenDescription} \ | ||
| -e rancher_token_output_format=json \ | ||
| -e rancher_token_output_file=/workspace/rancher-token.json | ||
| """ |
RANCHER_ADMIN_PASSWORD (a password parameter) is interpolated directly into the sh step (-e RANCHER_ADMIN_PASSWORD=...). Because Jenkins runs shell steps with -x, the expanded value can end up in build logs. Pass the value to the shell via an environment variable and reference it without embedding the secret in the command (e.g., set env then use -e RANCHER_ADMIN_PASSWORD or --env-file).
| sh """ | |
| docker run --rm --platform linux/amd64 \ | |
| --name generate-token \ | |
| -e RANCHER_ADMIN_PASSWORD=${adminPassword} \ | |
| -e ANSIBLE_CONFIG=/workspace/${env.INFRA_DIR}/ansible/ansible.cfg \ | |
| -v ${workspace}:/workspace \ | |
| -w /workspace/${env.INFRA_DIR}/ansible/rke2/airgap \ | |
| rancher-go-test:latest \ | |
| ansible-playbook -i inventory/inventory.yml /workspace/${env.INFRA_DIR}/ansible/rancher/token/generate-admin-token.yml \ | |
| -e rancher_cattle_config_file=${cattleConfigPath} \ | |
| -e rancher_token_ttl=${tokenTtl} \ | |
| -e rancher_token_description=${tokenDescription} \ | |
| -e rancher_token_output_format=json \ | |
| -e rancher_token_output_file=/workspace/rancher-token.json | |
| """ | |
| withEnv(["RANCHER_ADMIN_PASSWORD=${adminPassword}"]) { | |
| sh """ | |
| docker run --rm --platform linux/amd64 \ | |
| --name generate-token \ | |
| -e RANCHER_ADMIN_PASSWORD \ | |
| -e ANSIBLE_CONFIG=/workspace/${env.INFRA_DIR}/ansible/ansible.cfg \ | |
| -v ${workspace}:/workspace \ | |
| -w /workspace/${env.INFRA_DIR}/ansible/rke2/airgap \ | |
| rancher-go-test:latest \ | |
| ansible-playbook -i inventory/inventory.yml /workspace/${env.INFRA_DIR}/ansible/rancher/token/generate-admin-token.yml \ | |
| -e rancher_cattle_config_file=${cattleConfigPath} \ | |
| -e rancher_token_ttl=${tokenTtl} \ | |
| -e rancher_token_description=${tokenDescription} \ | |
| -e rancher_token_output_format=json \ | |
| -e rancher_token_output_file=/workspace/rancher-token.json | |
| """ | |
| } |
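The effect of the suggestion can be demonstrated without Docker. In this sketch a child `sh -c` process stands in for the `docker run -e RANCHER_ADMIN_PASSWORD` call: it reads the secret from the inherited environment, so a `set -x` trace of the parent shows only the variable's name. The function name `consume_secret_from_env` is hypothetical.

```shell
#!/bin/sh
# Hypothetical stand-in for `docker run -e RANCHER_ADMIN_PASSWORD ...`.
# The child process inherits the variable from the environment, so the
# parent's command line (and any xtrace output) never contains its value.
consume_secret_from_env() {
    sh -c 'if [ -n "$RANCHER_ADMIN_PASSWORD" ]; then
               echo "secret received (${#RANCHER_ADMIN_PASSWORD} chars)"
           else
               echo "secret missing" >&2
               exit 1
           fi'
}
```

By contrast, a command written as `-e RANCHER_ADMIN_PASSWORD=${adminPassword}` inside a Groovy double-quoted string already contains the literal value before the shell starts, so tracing prints it.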
|
|
||
| sh """ | ||
| docker run --rm --platform linux/amd64 \ | ||
| --name generate-token \ |
The Docker container name generate-token is hard-coded. On shared agents or when concurrent builds run on the same node, this can fail with a name-collision error. Use a unique name (e.g., include BUILD_TAG/BUILD_NUMBER) or omit --name since --rm is already used.
| --name generate-token \ |
| string( | ||
| name: 'GO_TEST_CASE', | ||
| defaultValue: '', | ||
| description: 'Specific test case regex (-run flag). Empty = all tests.' |
GO_TEST_CASE is described as a test regex for the -run flag, but it is appended verbatim into the go test args (${params.GO_TEST_CASE}). Since this is intentionally used for other flags (e.g. -p 1), update the parameter description to reflect that it accepts additional go test args (or rename it to something like GO_TEST_ARGS).
| description: 'Specific test case regex (-run flag). Empty = all tests.' | |
| description: 'Additional go test arguments (for example: -run TestName, -p 1). Empty = default test args.' |
|
|
||
| stage('Build Docker Images') { | ||
| steps { | ||
| sh "docker build --no-cache --platform linux/amd64 -t rancher-go-test:latest -f ${env.TESTS_DIR}/validation/pipeline/Dockerfile.airgap-go-tests ." |
The Docker build uses workspace root (.) as the build context. Since this job checks out multiple repos, this can make the build context very large (and there is no .dockerignore in the repo), slowing builds and increasing load on the agent. Prefer a narrower build context (e.g. ${env.TESTS_DIR}) or add a .dockerignore to exclude non-required paths like the infra repo and .git.
| sh "docker build --no-cache --platform linux/amd64 -t rancher-go-test:latest -f ${env.TESTS_DIR}/validation/pipeline/Dockerfile.airgap-go-tests ." | |
| sh "docker build --no-cache --platform linux/amd64 -t rancher-go-test:latest -f ${env.TESTS_DIR}/validation/pipeline/Dockerfile.airgap-go-tests ${env.TESTS_DIR}" |
| -w /workspace/tests \ | ||
| rancher-go-test:latest \ | ||
| sh -c 'set -e; set -o pipefail; ${testCmd} | tee go-test.log' | ||
| """ | ||
| } | ||
|
|
||
| catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') { | ||
| junit allowEmptyResults: true, testResults: "tests/junit.xml" | ||
| } | ||
|
|
||
| if (params.REPORT_ARTIFACTS) { | ||
| archiveArtifacts allowEmptyArchive: true, artifacts: 'tests/go-test.log, tests/junit.xml, tests/gotestsum.json', fingerprint: true |
The Docker runs for tests/Qase hard-code -w /workspace/tests, but other parts of this Jenkinsfile use env.TESTS_DIR for paths. This inconsistency will break if airgap.standardCheckout() ever changes the checkout directory name. Use env.TESTS_DIR consistently for the container workdir and artifact paths.
| -w /workspace/tests \ | |
| rancher-go-test:latest \ | |
| sh -c 'set -e; set -o pipefail; ${testCmd} | tee go-test.log' | |
| """ | |
| } | |
| catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') { | |
| junit allowEmptyResults: true, testResults: "tests/junit.xml" | |
| } | |
| if (params.REPORT_ARTIFACTS) { | |
| archiveArtifacts allowEmptyArchive: true, artifacts: 'tests/go-test.log, tests/junit.xml, tests/gotestsum.json', fingerprint: true | |
| -w /workspace/${env.TESTS_DIR} \ | |
| rancher-go-test:latest \ | |
| sh -c 'set -e; set -o pipefail; ${testCmd} | tee go-test.log' | |
| """ | |
| } | |
| catchError(buildResult: 'FAILURE', stageResult: 'FAILURE') { | |
| junit allowEmptyResults: true, testResults: "${env.TESTS_DIR}/junit.xml" | |
| } | |
| if (params.REPORT_ARTIFACTS) { | |
| archiveArtifacts allowEmptyArchive: true, artifacts: "${env.TESTS_DIR}/go-test.log, ${env.TESTS_DIR}/junit.xml, ${env.TESTS_DIR}/gotestsum.json", fingerprint: true |
Summary
- … validation/Jenkinsfile.e2e as a full-stack Declarative Pipeline that provisions infrastructure AND runs Go validation tests
- … (make.runTarget())
- … (cluster, registry, rancher) for cluster/Rancher deployment
- … Dockerfile.infra for infrastructure operations and Dockerfile.airgap-go-tests for Go test execution

Pipeline stages
- … airgap.standardCheckout (tests + qa-infra-automation)
- … generate_inventory.py, Ansible configuration, SSH key distribution
- … make.runTarget(target: 'cluster', dir: infraDir, makeArgs: 'ENV=airgap')
- … make.runTarget(target: 'registry', ...) (conditional)
- … make.runTarget(target: 'rancher', ...) (conditional)
- … QASE_TEST_RUN_ID parameter

Key differences from original
- … (node { ... }) … (pipeline { ... })
- … DESTROY_ON_FAILURE (post/failure) + DESTROY_AFTER_TESTS (post/always)

Test plan
- … generate_inventory.py bridge script
- … make cluster
- … make rancher