Chaos Experiments Go‐Runner Base Image Migration to RedHat UBI Image

Vedant Shrotria edited this page May 3, 2024 · 3 revisions

This document walks through the changes required to migrate the Chaos Experiment (go-runner) image to a Red Hat UBI base image.

What should be validated/tested once migration is completed?

  • All experiments should execute successfully.
  • All required binaries and OS-level packages should be available.
  • Permissions should be verified for every binary used by the experiments.
  • Users should be able to run experiments on OpenShift clusters/environments.

Binaries Analysis

  • We use 15 binaries/tools in total, listed below -
sudo, sshpass, ps, tc, iptables, stress-ng, kubectl, promql, pause, dns_interceptor, toxiproxy-cli, toxiproxy-server, crictl, docker, nsutil
  • Of these, we build & push 6 binaries ourselves -
promql, pause, dns_interceptor, toxiproxy-cli, toxiproxy-server, nsutil
  • In terms of vulnerability management, we are responsible for all 15 binaries/tools.

  • Of these, 8 binaries are run with sudo -

crictl, docker, nsutil, dns_interceptor, toxiproxy-cli, toxiproxy-server, stress-ng, tc
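
The "all required binaries should be available" check can be scripted against the list above. The following is only a minimal sketch: missing_binaries is a hypothetical helper name, and it assumes every tool is resolvable on the PATH of the user running the check.

```shell
#!/bin/sh
# Hypothetical helper (not part of the experiment image): print every name
# from the argument list that cannot be resolved on PATH.
# Prints nothing when all binaries are found.
missing_binaries() {
    for bin in "$@"; do
        command -v "$bin" >/dev/null 2>&1 || printf '%s\n' "$bin"
    done
}

# Example call inside the migrated image, using the tool names listed above:
#   missing_binaries sudo sshpass ps tc iptables stress-ng kubectl promql \
#       pause dns_interceptor toxiproxy-cli toxiproxy-server crictl docker nsutil
```

An empty result means every listed tool was found; any printed name still has to be installed or moved to a directory on PATH.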

Challenges

  • UBI Base Image - We need to use ubi:9.3 instead of ubi-minimal:9.3.

    • Reasons for not using ubi-minimal

      • It is a very minimal image & doesn’t have all the required OS packages/binaries.

      • It is restrictive when installing new binaries, as it has only one package manager, microdnf. microdnf can only install the binaries that are available in the default repositories; installing from an external repo needs more work.

      • Using ubi gives us 2 package managers, yum & dnf, which have good support for installing binaries from external repos & provide much more information for debugging.

  • tc binary - This binary is not available in UBI by default, nor in the default repositories. We have to install it from external/extra repositories. It is part of the iproute-tc package, which itself depends on iproute, so both have to be installed.

RUN yum install -y https://dl.rockylinux.org/pub/rocky/9/devel/$(uname -m)/os/Packages/i/iproute-6.2.0-5.el9.$(uname -m).rpm
RUN yum install -y https://dl.rockylinux.org/pub/rocky/9/devel/$(uname -m)/os/Packages/i/iproute-tc-6.2.0-5.el9.$(uname -m).rpm 
  • stress-ng binary - This binary is not available in UBI by default, nor in the default repositories. We have to install it from external/extra repositories. It depends on libjudy (the Judy package), so both have to be installed.
RUN yum install -y https://yum.oracle.com/repo/OracleLinux/OL9/appstream/$(uname -m)/getPackage/Judy-1.0.5-28.el9.$(uname -m).rpm
RUN yum install -y https://yum.oracle.com/repo/OracleLinux/OL9/appstream/$(uname -m)/getPackage/stress-ng-0.14.00-2.el9.$(uname -m).rpm
  • UBI has a secure path & a normal path. sudo uses secure_path, not the caller’s PATH, to discover binaries, so every binary that is run with sudo has to be moved to a directory on the secure path. We are using /sbin/ for all such binaries.
The secure_path setting can be found in the sudoers config.
- Running sudo visudo opens the config; searching for secure_path shows -
Defaults    secure_path = /sbin:/bin:/usr/sbin:/usr/bin
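
To decide where a sudo-run binary may live, it can help to test a directory against the secure_path value shown above. This is a small sketch only: dir_in_secure_path is a hypothetical helper name, and the string is the value quoted above rather than one read from a live sudoers file.

```shell
#!/bin/sh
# Hypothetical helper: succeed when directory $1 is one of the
# colon-separated entries of the secure_path string $2.
dir_in_secure_path() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# secure_path value quoted from the sudoers config above.
SECURE_PATH="/sbin:/bin:/usr/sbin:/usr/bin"

dir_in_secure_path /sbin "$SECURE_PATH" && echo "/sbin is searched by sudo"
dir_in_secure_path /usr/local/bin "$SECURE_PATH" || echo "/usr/local/bin is not searched by sudo"
```

With this value, /usr/local/bin is not on the secure path, which is why only binaries that are never run with sudo can stay there.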
  • iptables - UBI doesn’t come with iptables by default. We cannot install it from the epel repository, as the iptables provided by epel is based on nf_tables, which doesn’t support TCP & REDIRECT, or may need further updates to the experiment logic. We have to install iptables (legacy) from the Fedora-hosted RPMs instead.
With iptables (nf_tables)
Warning: Extension REDIRECT revision 0 not supported, missing kernel module?
Warning: Extension REDIRECT revision 0 not supported, missing kernel module?
With iptables (legacy) - It is working fine!
# Keep this sequence, otherwise this will fail
RUN yum install -y https://dl.rockylinux.org/pub/rocky/9/devel/$(uname -m)/os/Packages/i/iptables-libs-1.8.8-6.el9_1.$(uname -m).rpm
RUN yum install -y https://dl.fedoraproject.org/pub/epel/9/Everything/$(uname -m)/Packages/i/iptables-legacy-libs-1.8.8-6.el9.2.$(uname -m).rpm
RUN yum install -y https://dl.fedoraproject.org/pub/epel/9/Everything/$(uname -m)/Packages/i/iptables-legacy-1.8.8-6.el9.2.$(uname -m).rpm
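
After the install, the backend can be confirmed from the version string, since iptables --version names it in parentheses, for example v1.8.8 (legacy). The sketch below classifies such a string; iptables_backend is a hypothetical helper name, and the sample input is built from the version recorded in the matrix on this page.

```shell
#!/bin/sh
# Hypothetical helper: classify an `iptables --version` line by the backend
# named in its trailing parentheses.
iptables_backend() {
    case "$1" in
        *"(legacy)"*)    echo legacy ;;
        *"(nf_tables)"*) echo nf_tables ;;
        *)               echo unknown ;;
    esac
}

# Inside the image this would be: iptables_backend "$(iptables --version)"
iptables_backend "iptables v1.8.8 (legacy)"
```

For the migrated image the expected answer is legacy; nf_tables would point to the variant that produced the REDIRECT warnings above.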
  • htop (Optional) - It is also not available in the default repositories. We have to install it via the epel-release RPM. It is not used by any experiment; it is only for debugging purposes.
dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
yum install -y htop
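  • Experiment code using ps has to be changed, as the ps aux output differs on UBI. In Alpine the PID is the first column, while in UBI the first column is USER & the PID is the second. In the UBI output the first matching line is also the sudo nsenter wrapper, not toxiproxy-server itself, so the second line has to be picked.

sudo kill -9 $(ps aux | grep [t]oxiproxy | awk 'FNR==1{print $1}')

had to be changed to

sudo kill -9 $(ps aux | grep [t]oxiproxy | awk 'FNR==2{print $2}')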

In Alpine
PID   USER     TIME  COMMAND
    1 root      0:04 /usr/lib/systemd/systemd noresume ,firmware cros_efi
    2 root      0:00 [kthreadd]
In UBI
[litmus@ubi-pod ~]$ sudo nsenter -t 2353 -n ps aux | grep [t]oxiproxy
root       11992  0.0  0.1  16096  7688 ?        S    10:37   0:00 sudo nsenter -t 2353 -n toxiproxy-server -host=0.0.0.0
root       11994  0.0  0.1 1233864 7684 ?        Sl   10:37   0:00 toxiproxy-server -host=0.0.0.0
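
The effect of the awk change can be reproduced offline against the two UBI lines captured above. This is a sketch only: pid_by_position mirrors the positional form used in the experiment code, while pid_by_command is a hypothetical alternative (not what the experiments use) that matches on the COMMAND column, which is field 11 in this ps aux layout.

```shell
#!/bin/sh
# Sample lines copied from the UBI output above.
ubi_ps_output='root       11992  0.0  0.1  16096  7688 ?        S    10:37   0:00 sudo nsenter -t 2353 -n toxiproxy-server -host=0.0.0.0
root       11994  0.0  0.1 1233864 7684 ?        Sl   10:37   0:00 toxiproxy-server -host=0.0.0.0'

# Positional form from the experiment code: second matching line, second column.
pid_by_position() { awk 'FNR==2{print $2}'; }

# Hypothetical alternative: select the row whose COMMAND column (field 11)
# is the server itself, independent of line order.
pid_by_command() { awk '$11 == "toxiproxy-server" {print $2}'; }

printf '%s\n' "$ubi_ps_output" | pid_by_position   # prints 11994
printf '%s\n' "$ubi_ps_output" | pid_by_command    # prints 11994
```

Both forms select PID 11994 here. The positional form relies on the sudo wrapper always being listed first, which is an assumption about process ordering.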

Binaries Versions Matrix (for binaries not maintained by us). We should try to match versions between the Alpine & UBI images to avoid issues.

Binaries     Alpine             UBI
stress-ng    0.14.00            0.14.00-2
iproute-tc   6.0.0-r1           6.2.0-5
iptables     v1.8.8 (legacy)    v1.8.8 (legacy)

Final changes on high level

  • Base Image - Hardened Alpine image (litmuschaos/experiment-alpine) will be replaced with registry.access.redhat.com/ubi9/ubi:9.3

  • The Docker binary should be moved to the root Dockerfile.

  • Maintainer should be LitmusChaos

  • User ID should be changed to the non-root user ID 2000

  • User should be changed to litmus & bound to UID 2000

  • All binaries (Experiments/Helpers/Healthchecks) are to be given 755 permissions & assigned to the litmus user.

  • All binaries which are used with sudo are to be moved to /sbin/

  • All binaries which are not used with sudo are to be moved to /usr/local/bin/

  • No separate base image is created; the Dockerfile contains all binaries, which makes the image independent of a custom base image.
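
The 755 requirement above can be verified inside the built image with a small check. This is a sketch only: has_mode_755 is a hypothetical helper name, and it assumes a find implementation that supports -maxdepth (GNU and BSD find do).

```shell
#!/bin/sh
# Hypothetical helper: succeed when path $1 has exactly mode 755.
# `find -perm 755` matches only an exact permission mode.
has_mode_755() {
    [ -n "$(find "$1" -maxdepth 0 -perm 755 2>/dev/null)" ]
}

# Example call inside the image (path taken from the list above):
#   has_mode_755 /sbin/nsutil || echo "/sbin/nsutil does not have mode 755"
```

Ownership by the litmus user would need a separate check; this sketch covers the permission bits only.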