4 changes: 4 additions & 0 deletions ci-operator/step-registry/abi/OWNERS
@@ -0,0 +1,4 @@
approvers: &owners
- cspi-qe-ocp-lp
- ieng-chaos
reviewers: *owners
124 changes: 124 additions & 0 deletions ci-operator/step-registry/abi/README.md
@@ -0,0 +1,124 @@
# Agent-based Installer (ABI)

**Layout (step-registry paths):** `conf/<platform>/` holds manifest / image-input work; `install/<mechanism>/` holds boot and cluster deployment (e.g. **BMC**
virtual media today; **PXE** or other targets can be added alongside without colliding with bare-metal **conf**). See `conf/bm` and `install/bmc` below.

**Step Input Parameters (names, defaults, semantics):**
| Step | Reference (source of truth) | Registry Documentation |
|---------------------|-----------------------------------------------------------------------|-------------------------------------------------------------------------------|
| **abi-conf-bm** | [`abi-conf-bm-ref.yaml`](conf/bm/abi-conf-bm-ref.yaml) | [`abi-conf-bm`](https://steps.ci.openshift.org/reference/abi-conf-bm) |
| **abi-install-bmc** | [`abi-install-bmc-ref.yaml`](install/bmc/abi-install-bmc-ref.yaml) | [`abi-install-bmc`](https://steps.ci.openshift.org/reference/abi-install-bmc) |

**Step Execution Order:** [`abi-conf-bm-commands.sh`](conf/bm/abi-conf-bm-commands.sh) → [`abi-install-bmc-commands.sh`](install/bmc/abi-install-bmc-commands.sh)

**Official Documentation:** [Preparing to install with the Agent-based Installer](https://docs.redhat.com/en/documentation/openshift_container_platform/latest/html/installing_an_on-premise_cluster_with_the_agent-based_installer/preparing-to-install-with-the-agent-based-installer).

## Installation Phases

| Phase | Comments |
|---------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **Day-0** | Cluster Configuration.<br> Creates a bare-minimum `install-config.yaml` and generates an `agent-config.yaml` template. Then `UpdateCfg Day0` applies overrides from `OCP__ABI__CFG_FN`, followed by `OCP__ABI__DAY0_SCRIPTS_YAML`. Both configuration files must be complete before proceeding to Day-1. |
| **Day-1** | Manifest Customization.<br> Generates the full manifest tree under `openshift/` (`agent create cluster-manifests`). Then `UpdateCfg Day1` applies overrides from `OCP__ABI__CFG_FN`, followed by `OCP__ABI__DAY1_SCRIPTS_YAML`, before the ISO is built. |
| **Day-1.5** | Post-Bootstrap Operations.<br> Runs after `agent wait-for bootstrap-complete`. Applies custom actions as configured in `OCP__ABI__CFG_FN` (e.g. scale Worker MachineSets to 0 when workers are provisioned directly by ABI). Runs concurrently with `wait-for install-complete`. |
| **Day-2** | Post-Deployment Customization.<br> Runs after `agent wait-for install-complete` and `KUBECONFIG` is set. Custom post-deployment actions via `OCP__ABI__DAY2_SCRIPTS_YAML` (e.g. install operators, apply policies). |

`SHARED_DIR` holds inter-step artifacts (tarball, kubeconfig, `kubeconfig-minimal`). Logs and `ocp.tgz` are written to `ARTIFACT_DIR`.

## OCP__ABI__CFG_FN

Pre-populate the file named by `OCP__ABI__CFG_FN` (resolved under `CLUSTER_PROFILE_DIR`, e.g. `${CLUSTER_PROFILE_DIR}/ocp--abi--cfg.yaml`) with the full
`agent-config.yaml` content (host definitions, including NMState network config and BMC addresses) and any extra configuration needed:
```yaml
Day0:
  config: {}
  configFileOverride:
    yaml+:
      - ...yamlCfg...:
          ...yamlCfgContentToDeepMergeAppendArray...
    yaml-:
      - ...yamlCfg...:
          ...yamlCfgContentToDeepMergeReplaceArray...
    yaml=:
      - ...yamlCfg...:
          ...yamlCfgContentToReplace...
    json+:
      - ...jsonCfg...: |
          ...jsonCfgContentToDeepMergeAppendArray...
    json-:
      - ...jsonCfg...: |
          ...jsonCfgContentToDeepMergeReplaceArray...
    json=:
      - ...jsonCfg...: |
          ...jsonCfgContentToReplace...
Day1: # Same schema as `Day0`
  ...
Day1.5:
  config:
    - NodeProv: ...booleanNodeProvisioningStatus...
Day2: # Same schema as `Day1.5`
  ...
```

Example:
```yaml
Day0:
  configFileOverride:
    yaml-:
      - install-config.yaml:
          networking:
            machineNetwork:
              - cidr: 10.6.158.0/24
          platform:
            baremetal:
              apiVIPs:
                - 10.6.158.26
              ingressVIPs:
                - 10.6.158.27
              provisioningNetwork: Disabled
      - agent-config.yaml: # Full agent-config.yaml: Host definitions (NMState network config, BMC addresses, roles, rootDeviceHints, etc.)
          apiVersion: v1beta1
          kind: AgentConfig
          metadata:
            name: integrity-config
          rendezvousIP: 10.6.158.11
          additionalNTPSources:
            - clock.corp.redhat.com
          hosts:
            - ... # Per-host: hostname, role, rootDeviceHints, interfaces, networkConfig, bmc
Day1.5:
  config:
    - NodeProv: false
```
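
The suffix of each `configFileOverride` key (`+`, `-`, `=`) selects how the supplied content is combined with the existing file. The following is a minimal sketch of that mapping, mirroring the `case` statement in `abi-conf-bm-commands.sh`. `MergeExprFor` is a hypothetical name used only for illustration and is not part of the step; the printed strings are the `yq eval-all` expressions the step builds.

```shell
#!/bin/bash
# Hypothetical helper (not part of the step): maps an override type such as
# `yaml+` to the `yq eval-all` expression used to combine the two inputs.
function MergeExprFor () {
    typeset cfgType="${1:?}" expr=''
    case "${cfgType}" in
    (*+) expr='select(fileIndex==0) *+ ' ;;  # Deep merge, append arrays.
    (*-) expr='select(fileIndex==0) * '  ;;  # Deep merge, replace arrays.
    (*=) expr='' ;;                           # Replace the whole document.
    (*)  echo "Invalid type: ${cfgType}" 1>&2; return 1 ;;
    esac
    # File index 0 is the existing file, index 1 is the override content.
    echo "${expr}select(fileIndex==1)"
}

for t in yaml+ yaml- yaml=; do
    printf '%-6s -> %s\n' "${t}" "$(MergeExprFor "${t}")"
done
```

With `yaml=` no merge takes place: the expression selects only the second input, so the override content replaces the file.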

## Tunneling / Chisel

Refer to [WebApp Services — Chisel Tunneling Service](https://redhat.atlassian.net/wiki/display/MPEXIENG/WebApp+Services#Chisel-Tunneling-Service)
for the reference setup (which uses **NGINX** as a reverse proxy in front of **Chisel** to achieve configurable data-plane port forwarding).

Operational layout and port table (if the above reference setup is used):
[Chisel Tunneling Service](https://redhat.atlassian.net/wiki/display/MPEXIENG/WebApp+Services#Step2.1.2.2.3--Chisel_OperationalTasks).

Step Input Parameters: `OCP__ABI__TUN_SVC__*` / `OCP__ABI__TEAM_NAME`

## BMC / Redfish

**abi-conf-bm** emits `ocp--bmc--info.json`; **abi-install-bmc** drives virtual media and power via Redfish. Details live in `abi-install-bmc-commands.sh`
(maintainer-oriented).
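
According to the comments in `abi-conf-bm-commands.sh`, the entries in `ocp--bmc--info.json` are ordered so that worker nodes boot first and the rendezvous host boots last. The following is a simplified sketch of that ordering in plain bash. The host names, the addresses, and `BootOrder` are hypothetical, and the step itself performs the ordering with `jq` on `agent-config.yaml`.

```shell
#!/bin/bash
# Hypothetical host list, one entry per host: "<name> <role> <ipv4>".
typeset -a hosts=(
    'master-0 master 192.0.2.11'
    'worker-0 worker 192.0.2.21'
    'master-1 master 192.0.2.12'
    'worker-1 worker 192.0.2.22'
)
typeset rendezvousIP='192.0.2.11'

# Prints host names in boot order: workers, then the remaining hosts,
# then the rendezvous host last.
function BootOrder () {
    typeset h='' name='' role='' ip=''
    typeset -a workers=() others=() last=()
    for h in "${hosts[@]}"; do
        read -r name role ip 0<<<"${h}"
        if [[ "${ip}" == "${rendezvousIP}" ]]; then
            last+=("${name}")
        elif [[ "${role}" == 'worker' ]]; then
            workers+=("${name}")
        else
            others+=("${name}")
        fi
    done
    printf '%s\n' "${workers[@]}" "${others[@]}" "${last[@]}"
}

BootOrder
```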

## Phase Customization Scripts

The `OCP__ABI__DAY0_SCRIPTS_YAML`, `OCP__ABI__DAY1_SCRIPTS_YAML`, and `OCP__ABI__DAY2_SCRIPTS_YAML` parameters inject arbitrary shell scripts into the
corresponding installation phase. The scripts run in the order listed, within the step's shell environment. See [Installation Phases](#installation-phases)
for when each script runs relative to the phase operations.

Example (`OCP__ABI__DAY0_SCRIPTS_YAML`):
```yaml
OCP__ABI__DAY0_SCRIPTS_YAML: |
  Scripts:
    - | # Complete override of configuration files instead of using `OCP__ABI__CFG_FN` mechanism (not recommended, just serves as an example).
      mkdir -p "${OCP__ABI__CLUSTER_DIR}/openshift"
      cp -f "${CLUSTER_PROFILE_DIR}/install-config.yaml" "${OCP__ABI__CLUSTER_DIR}/install-config.yaml"
      cp -f "${CLUSTER_PROFILE_DIR}/agent-config.yaml" "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml"
```

Schema: [BuildCustomScriptsFromYAML.sh](https://github.com/RedHatQE/OpenShift-LP-QE--Tools/blob/main/libs/bash/common/BuildCustomScriptsFromYAML.sh).
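
Because the scripts run in order inside the step's own shell, a later script can rely on variables and files left by an earlier one. The following is a minimal sketch of that behaviour, assuming each entry is evaluated in the current shell. The `scripts` array is a hypothetical stand-in for the parsed `Scripts:` list, and the way `BuildCustomScriptsFromYAML` actually composes the entries is not shown here.

```shell
#!/bin/bash
# Hypothetical stand-in for the parsed `Scripts:` list.
typeset -a scripts=(
    'workDir="$(mktemp -d)"'                   # 1st script: create state.
    'echo "day0" 1> "${workDir}/phase"'        # 2nd script: uses ${workDir}.
    'phase="$(cat "${workDir}/phase")"'        # 3rd script: reads the file.
)

typeset s=''
for s in "${scripts[@]}"; do
    # Evaluated in the current shell, so variables persist across entries.
    eval "${s}"
done

echo "phase=${phase}"
rm -rf "${workDir}"
```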
1 change: 1 addition & 0 deletions ci-operator/step-registry/abi/chains/OWNERS
1 change: 1 addition & 0 deletions ci-operator/step-registry/abi/chains/bm--bmc/OWNERS
@@ -0,0 +1,13 @@
{
  "path": "abi/chains/bm--bmc/abi-chains-bm--bmc-chain.yaml",
  "owners": {
    "approvers": [
      "cspi-qe-ocp-lp",
      "ieng-chaos"
    ],
    "reviewers": [
      "cspi-qe-ocp-lp",
      "ieng-chaos"
    ]
  }
}
@@ -0,0 +1,14 @@
chain:
  as: abi-chains-bm--bmc
  env:
  - name: OCP__ABI__CLUSTER_DIR
    default: /tmp/ocpClusterDir
    documentation: |-
      The Steps use a Container Image where the `CWD` is R/O, so this is overridden to a writable location.
  steps:
  - ref: abi-conf-bm
  - ref: abi-install-bmc
  documentation: |-
    This Chain deploys OpenShift Container Platform (OCP) on Bare Metal with BMC.

    See [ABI overview](https://github.com/openshift/release/blob/main/ci-operator/step-registry/abi/README.md) for details.
1 change: 1 addition & 0 deletions ci-operator/step-registry/abi/conf/OWNERS
1 change: 1 addition & 0 deletions ci-operator/step-registry/abi/conf/bm/OWNERS
220 changes: 220 additions & 0 deletions ci-operator/step-registry/abi/conf/bm/abi-conf-bm-commands.sh
@@ -0,0 +1,220 @@
#!/bin/bash
# abi-conf-bm — Agent-based installer configuration (bare metal; **conf** phase).
#
# Logic in this Step:
# - Bare-minimum `install-config.yaml` scaffold -> OCP-version-aware defaults -> `baremetal` platform -> `agent-config.yaml` template.
# - `UpdateCfg Day0` merges, updates, or replaces config entries; `OCP__ABI__DAY0_SCRIPTS_YAML` scripts further customize `install-config.yaml` / `agent-config.yaml`.
# - Extracts BMC info to `ocp--bmc--info.json`; strips BMC credentials from `agent-config.yaml`.
# - Generates Cluster manifests.
# - `UpdateCfg Day1` + `OCP__ABI__DAY1_SCRIPTS_YAML` scripts customize manifests.
#
set -euxo pipefail
shopt -s inherit_errexit

mkdir -p "${OCP__ABI__CLUSTER_DIR}"

eval "$(
curl -fsSL "https://raw.githubusercontent.com/RedHatQE/OpenShift-LP-QE--Tools/main/libs/bash/common/BuildCustomScriptsFromYAML.sh"
)"
eval "$(
curl -fsSL "https://raw.githubusercontent.com/RedHatQE/OpenShift-LP-QE--Tools/main/libs/bash/common/EnsureReqs.sh"
)"; EnsureReqs yq

typeset ocpABIcfg="${CLUSTER_PROFILE_DIR}/${OCP__ABI__CFG_FN}"; [ -r "${ocpABIcfg}" ]

# Extract `openshift-install` from the release image.
# The `RELEASE_IMAGE_LATEST` is set by CI Operator based on `.releases.latest` in CI Conf.
oc adm release extract \
-a /var/run/secrets/registry-pull--build-farms/.dockerconfigjson \
"${RELEASE_IMAGE_LATEST}" \
--command=openshift-install \
--to="/tmp"
export PATH="/tmp:${PATH}"


function openshift-install () {
    typeset -i es=0
    {
        echo \
            "$(date -Iseconds)|${FUNCNAME[0]@Q} ${*@Q}"$'\n'"$(printf '%.0s-' {1..80})"
        command openshift-install \
            --dir "${OCP__ABI__CLUSTER_DIR}/" \
            --log-level "${OCP__ABI__INSTLR_LOG_LEVEL}" \
            "$@" 2>&1 || es=$?
        echo "$(printf '%.0s=' {1..80})"
        exit ${es}
    } | tee -a "${ARTIFACT_DIR}/ocp--installer--cluster.log"
    return ${PIPESTATUS[0]}
}
Comment on lines +35 to +48

nit: This openshift-install wrapper function is identical in both scripts (conf L35-48, install L50-62). If the logging format or argument handling needs updating, it must be changed in two places. Given that both scripts already source shared libraries from OpenShift-LP-QE--Tools, this could live there as well, or be extracted to a small helper that both scripts source from the step registry.


That is a valid point. My concern is that a different installation process might not always share or require these exact same CLI parameters. While being called more than once is a good enough reason to create a function, moving it into an external library requires a much broader use case.

Introducing an external library carries a higher cost and increases our dependency burden; it needs stronger justification if it is only going to be used by two Step scripts.

I am not opposing the idea right off the bat, as I generally prefer to do things right from the get-go, but unfortunately, time is not on our side for this particular PR. Let's keep this idea in mind for when we start developing more Conf and Install steps for other targets in the future.

Thanks for the great suggestion!
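
For reference, the pattern that both copies of the wrapper share is to log the command line, run the command inside a braced group piped to `tee`, and recover the command's exit status from `PIPESTATUS`. The following is a minimal standalone sketch of that pattern. `LoggedRun` and the temporary log file are hypothetical, and `true` / `false` stand in for `openshift-install`.

```shell
#!/bin/bash
# Hypothetical log location; the step scripts log under ${ARTIFACT_DIR}.
typeset logFile; logFile="$(mktemp)"

# Runs "$@", appends its combined output to ${logFile}, and returns the
# wrapped command's own exit status rather than that of `tee`.
function LoggedRun () {
    typeset -i es=0
    {
        echo "$(date '+%Y-%m-%dT%H:%M:%S')|$*"
        "$@" 2>&1 || es=$?
        exit ${es}          # Leaves only the pipeline's subshell.
    } | tee -a "${logFile}"
    return ${PIPESTATUS[0]}
}

LoggedRun true  && echo "true  -> $?"
LoggedRun false || echo "false -> $?"
```

Any shared helper would need to keep this status handling intact, since `tee` would otherwise mask a failed install command.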


function UpdateCfg () {
    typeset topKey="${1:?}"; (($#)) && shift
    typeset cfgType='' cfgFile='' cfgCont='' updateOp=''
    while IFS=$'\t' read -r cfgType cfgFile cfgCont; do
        [[ "${cfgFile}" == */* ]] &&
            mkdir -p "${OCP__ABI__CLUSTER_DIR}/${cfgFile%/*}"
        true 1>> "${OCP__ABI__CLUSTER_DIR}/${cfgFile}"
        exec 3< <(cat "${OCP__ABI__CLUSTER_DIR}/${cfgFile}"); wait $!
        case ${cfgType} in
        (*+)    updateOp='select(fileIndex==0) *+ ' ;;
        (*-)    updateOp='select(fileIndex==0) * ' ;;
        (*=)    updateOp='' ;;
        esac
        updateOp+='select(fileIndex==1)'
        case ${cfgType} in
        (yaml+|yaml-|yaml=)
            yq eval-all "${updateOp}" \
                - \
                <(set +x; yq -p json -o yaml eval . 0<<<"${cfgCont}") \
                0<&3 1>"${OCP__ABI__CLUSTER_DIR}/${cfgFile}"
            ;;
        (json+|json-|json=)
            yq -p json -o json eval-all "${updateOp}" \
                - \
                <(set +x; echo "${cfgCont}") \
                0<&3 1>"${OCP__ABI__CLUSTER_DIR}/${cfgFile}"
            ;;
        (*) : "Invalid Type: ${cfgType}"; false;;
        esac
        exec 3<&-
    done 0< <(
        yq -o json eval . "${ocpABIcfg}" |
            jq -r --arg k "${topKey}" '
                (.[$k].configFileOverride // empty) | to_entries[] |
                .key as $type | .value[]? | to_entries[] |
                [$type, .key, (
                    if ($type | startswith("json")) then .value
                    else (.value | tojson)
                    end
                )] | join("\t")
            '
    )
    true
}


# Create bare-minimum `install-config.yaml`.
{
    yq -p yaml -o json eval . |
        jq -c \
            --arg clsName "${OCP__ABI__BM__CLS_NAME}" \
            --arg baseDom "${OCP__ABI__BM__BASE_DOM}" \
            --rawfile pullCrd <(set +x; cat "${CLUSTER_PROFILE_DIR}/pull-secret") \
            --rawfile sshKey <(set +x; cat "${CLUSTER_PROFILE_DIR}/ssh-publickey") \
            '
                .baseDomain=$baseDom |
                .metadata.name=$clsName |
                .pullSecret=($pullCrd | rtrimstr("\n")) |
                .sshKey=$sshKey
            ' |
        yq -p json -o yaml eval .
} 0<<'fileEOF' 1> "${OCP__ABI__CLUSTER_DIR}/install-config.yaml"
apiVersion: v1
baseDomain: ''
metadata:
  name: ''
platform: {none: {}}
pullSecret: ''
sshKey: ''
fileEOF

# Enrich with OCP-version-aware defaults.
openshift-install create install-config
# Update for Bare Metal target.
yq -i eval \
'.platform={"baremetal": {}}' \
"${OCP__ABI__CLUSTER_DIR}/install-config.yaml"

# Create `agent-config.yaml` template.
openshift-install agent create agent-config-template
# Being idempotent on re-run.
[ -s "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml" ] || {
    jq -r \
        '."*agentconfig.AgentConfig".File.Data' \
        "${OCP__ABI__CLUSTER_DIR}/.openshift_install_state.json" |
        base64 -d 1> "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml"
}

# Customize `install-config.yaml` and complete `agent-config.yaml`.
UpdateCfg Day0
eval "$(BuildCustomScriptsFromYAML OCP__ABI__DAY0_SCRIPTS_YAML)"

# Retrieve BMC Information from `agent-config.yaml`.
# Currently, if all Master Nodes are ready to be installed, but
# not all Worker Nodes are registering, the
# `wait-for bootstrap-complete` will exit out with error.
# As workaround, we boot the Worker Nodes first, and the
# Rendezvous Host last.
{
yq -p yaml -o json eval . |
jq \
--rawfile usr <(set +x; cat "${CLUSTER_PROFILE_DIR}/cred--bmc--usr") \
--rawfile pwd <(set +x; cat "${CLUSTER_PROFILE_DIR}/cred--bmc--pwd") \
--argjson rIP "$(yq -o json '(select(
(.rendezvousIP | length) > 0) | .rendezvousIP
) // ([
(.hosts[] | select(.role == "master")),
(.hosts[] | select(.role == "arbiter")),
(.hosts[] | select((.role == "") or (.role == null)))
] | .[0] | [.networkConfig.interfaces[] |
select(.ipv4.enabled == true) |
.ipv4.address[0].ip
] | .[0]) // error(
"rendezvousIP could not be determined"
) ' "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml")" \
'[(
(.hosts[] | select(.role == "worker")),
((
(.hosts[] | select((.role == "") or (.role == null))),
(.hosts[] | select(.role == "auto-assign")),
(.hosts[] | select(.role == "arbiter")),
(.hosts[] | select(.role == "master"))
) | select(any((
.networkConfig.interfaces[] |
select(.ipv4.enabled == true) |
.ipv4.address[]?.ip
); . == $rIP) | not)),
(.hosts[] | select(any((
.networkConfig.interfaces[] |
select(.ipv4.enabled == true) |
.ipv4.address[]?.ip
); . == $rIP)))
) | {
url: ("https://" + (.bmc.address | split("://")[-1])),
usr: (.bmc.username // ($usr | rtrimstr("\n"))),
pwd: (.bmc.password // ($pwd | rtrimstr("\n"))),
hostIPv4: ([
.networkConfig.interfaces[] |
select(.ipv4.enabled == true) |
.ipv4.address[0]?.ip
][0] // null)
}]'
} 0< "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml" 1> "${SHARED_DIR}/ocp--bmc--info.json"

# Strip BMC Credentials from `agent-config.yaml`.
exec 3< <(cat "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml"); wait $!
{
yq -p yaml -o json eval . |
jq '.hosts[].bmc |= del(.username, .password)' |
yq -p json -o yaml eval .
} 0<&3 1> "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml"
exec 3<&-

# Set ISO Mode.
((OCP__ABI__MIN_ISO)) && (
    export __IMG__ROOT_FS="${OCP__ABI__TUN_SVC__DP_BASE_URL%%/}/${OCP__ABI__TUN_SVC__DP_PORT}/boot-artifacts"
    yq -i eval '
        .minimalISO=true |
        .bootArtifactsBaseURL=strenv(__IMG__ROOT_FS)
    ' "${OCP__ABI__CLUSTER_DIR}/agent-config.yaml"
)

# Generate full manifest tree.
openshift-install agent create cluster-manifests

# Manifest Customization.
UpdateCfg Day1
eval "$(BuildCustomScriptsFromYAML OCP__ABI__DAY1_SCRIPTS_YAML)"

# Save OCP Installation information for next Step.
tar zcf "${SHARED_DIR}/ocpClusterInf.tgz" -C "${OCP__ABI__CLUSTER_DIR}/" .
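
The tarball written above is how the cluster directory reaches the next step through `SHARED_DIR`. The following is a minimal round-trip sketch of that hand-off. The temporary directories are hypothetical stand-ins for `OCP__ABI__CLUSTER_DIR` and `SHARED_DIR`, and the extraction side is an assumption, since the install step's script is not shown here.

```shell
#!/bin/bash
# Hypothetical stand-ins for the CI-provided directories.
typeset clusterDir sharedDir restoreDir
clusterDir="$(mktemp -d)"
sharedDir="$(mktemp -d)"
restoreDir="$(mktemp -d)"

# Producer side (as at the end of the conf step): pack the cluster directory.
echo 'apiVersion: v1' 1> "${clusterDir}/install-config.yaml"
tar zcf "${sharedDir}/ocpClusterInf.tgz" -C "${clusterDir}/" .

# Consumer side (assumed): unpack into a fresh, writable cluster directory.
tar zxf "${sharedDir}/ocpClusterInf.tgz" -C "${restoreDir}/"
ls "${restoreDir}"
```

Packing with `-C <dir> .` stores paths relative to the cluster directory, so the consumer can extract into any writable location.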