This repository has been archived by the owner on May 16, 2024. It is now read-only.

Add support for RHEL 7.x #36

Open
closertotheheart opened this issue Jan 11, 2022 · 2 comments

Comments

@closertotheheart

I know CentOS/RHEL support isn't official yet, but what is in the repo is based on RHEL 8.

It would be great to have a working RHEL 7.x (7.7 or later) Dockerfile and entrypoint.sh as well (we use Ubuntu 20.04, RHEL 7.7 and RHEL 8.4 at present).

I'd be happy to test it on RHEL 7.x before it is merged into master, if testing is an obstacle for the dev team.

@ReyRen

ReyRen commented Sep 25, 2023

I really need this.

@ReyRen

ReyRen commented Sep 25, 2023

I tried to build the image myself. Here is my Dockerfile:

ARG D_BASE_IMAGE=docker.io/library/centos:7.9.2009
FROM $D_BASE_IMAGE

ARG D_OFED_VERSION="5.6-2.0.9.0"
ARG D_OS_VERSION="7.9"
ARG D_OS="rhel${D_OS_VERSION}"
ENV D_OS=${D_OS}
ARG D_ARCH="x86_64"
ARG D_OFED_PATH="MLNX_OFED_LINUX-${D_OFED_VERSION}-${D_OS}-${D_ARCH}"
ENV D_OFED_PATH=${D_OFED_PATH}

ARG D_OFED_TARBALL_NAME="${D_OFED_PATH}.tgz"
ARG D_OFED_BASE_URL="http://privateAddress/ofed-driver"
ARG D_OFED_URL_PATH="${D_OFED_BASE_URL}/${D_OFED_TARBALL_NAME}"

ARG D_WITHOUT_FLAGS="--add-kernel-support"
ENV D_WITHOUT_FLAGS=${D_WITHOUT_FLAGS}

# Download and extract tarball
WORKDIR /root
RUN yum update -y
RUN yum -y install wget && wget -c ${D_OFED_URL_PATH}
RUN tar -xzf ${D_OFED_TARBALL_NAME}
RUN yum -y install autoconf automake binutils ethtool gcc-4.8.5 git hostname kmod libmnl libtool lsof make pciutils perl procps python36 python36-devel rpm-build tcl tk wget
RUN yum -y install perl pciutils python gcc-gfortran libxml2-python tcsh libnl.i686 libnl expat glib2 tcl libstdc++ bc tk gtk2 atk cairo numactl pkgconfig ethtool lsof
RUN yum -y install  python-devel
RUN wget -c ${D_OFED_BASE_URL}/kernel-devel-3.10.0-1160.el7.x86_64.rpm
RUN rpm -ivh kernel-devel-3.10.0-1160.el7.x86_64.rpm
RUN yum -y install dnf
RUN yum install -y kernel-devel-3.10.0-1160.el7.x86_64

WORKDIR /
ADD ./entrypoint.sh /root/entrypoint.sh

ENTRYPOINT ["/root/entrypoint.sh"]

The build succeeds and kernel-devel installs properly, but the error is still there:

(pytorch) [root@gpu5 ~]# docker run cfb5cfc70383
+ VENDOR=0x15b3
+ DRIVER_PATH=/sys/bus/pci/drivers/mlx5_core
++ uname -r
+ KVER=3.10.0-1160.el7.x86_64
+ unset_driver_readiness
+ rm -f /.driver-ready
+ ofed_exist_for_kernel
+ [[ -e /usr/lib/modules/3.10.0-1160.el7.x86_64/extra/mlnx-ofa_kernel/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko ]]
+ echo 'No OFED driver found for kernel 3.10.0-1160.el7.x86_64'
+ return 1
No OFED driver found for kernel 3.10.0-1160.el7.x86_64
+ [[ 1 -ne 0 ]]
+ _install_prerequisites
Enabling RHOCP and EUS RPM repos...
+ echo 'Enabling RHOCP and EUS RPM repos...'
++ cat /host/etc/os-release
++ grep '^ID='
cat: /host/etc/os-release: No such file or directory
+ eval local
++ local
++ cat /host/etc/os-release
++ grep '^VERSION_ID='
cat: /host/etc/os-release: No such file or directory
+ eval local
++ local
++ cat /host/etc/os-release
++ grep '^RHEL_VERSION='
cat: /host/etc/os-release: No such file or directory
+ eval
+ OPENSHIFT_VERSION=4.9
+ '[' '' = rhcos ']'
+ '[' '' = rhel ']'
++ cat /etc/os-release
++ grep '^VERSION_ID='
+ eval local 'VERSION_ID="7"'
++ local VERSION_ID=7
+ RHEL_VERSION=7
+ dnf config-manager --set-enabled rhocp-4.9-for-rhel-8-x86_64-rpms
No such command: config-manager. Please use /usr/bin/dnf --help
It could be a DNF plugin command, try: "dnf install 'dnf-command(config-manager)'"
+ true
+ dnf makecache --releasever=7
CentOS-7 - Base                                 2.2 MB/s |  10 MB     00:04
^C^C^C^C^CCentOS-7 - Updates                              2.2 MB/s |  29 MB     00:12
CentOS-7 - Extras                               2.2 MB/s | 360 kB     00:00
Metadata cache created.
+ dnf config-manager --set-enabled rhel-8-for-x86_64-baseos-eus-rpms
No such command: config-manager. Please use /usr/bin/dnf --help
It could be a DNF plugin command, try: "dnf install 'dnf-command(config-manager)'"
+ true
+ dnf makecache --releasever=7
CentOS-7 - Base                                 0.0  B/s |   0  B     00:00
CentOS-7 - Updates                              0.0  B/s |   0  B     00:00
CentOS-7 - Extras                               0.0  B/s |   0  B     00:00
Metadata cache created.
+ echo 'Installing dependencies'
Installing dependencies
+ dnf -q -y --releasever=7 install createrepo elfutils-libelf-devel kernel-rpm-macros numactl-libs
Error: Unable to find a match: kernel-rpm-macros
+ echo 'Installing Linux kernel headers...'
Installing Linux kernel headers...
+ dnf -q -y --releasever=7 install kernel-headers-3.10.0-1160.el7.x86_64 kernel-devel-3.10.0-1160.el7.x86_64
+ echo 'Installing Linux kernel module files...'
+ dnf -q -y --releasever=7 install kernel-core-3.10.0-1160.el7.x86_64
Installing Linux kernel module files...
Error: Unable to find a match: kernel-core-3.10.0-1160.el7.x86_64
+ touch /lib/modules/3.10.0-1160.el7.x86_64/modules.order
touch: cannot touch '/lib/modules/3.10.0-1160.el7.x86_64/modules.order': No such file or directory
+ touch /lib/modules/3.10.0-1160.el7.x86_64/modules.builtin
touch: cannot touch '/lib/modules/3.10.0-1160.el7.x86_64/modules.builtin': No such file or directory
+ depmod 3.10.0-1160.el7.x86_64
depmod: ERROR: could not open directory /lib/modules/3.10.0-1160.el7.x86_64: No such file or directory
depmod: FATAL: could not search modules: No such file or directory
+ echo 'Generating Linux kernel version string...'
Generating Linux kernel version string...
+ sh /usr/src/kernels/3.10.0-1160.el7.x86_64/scripts/extract-vmlinux /lib/modules/3.10.0-1160.el7.x86_64/vmlinuz
+ grep -E '^Linux version'
+ strings
+ sed 's/^\(.*\)\s\+(.*)$/\1/'
Usage: extract-vmlinux <kernel-image>
+ '[' -z '' ']'
+ echo 'Could not locate Linux kernel version string'
Could not locate Linux kernel version string
+ return 1
+ _install_ofed
+ /bin/bash -c '/root/${D_OFED_PATH}/mlnxofedinstall --add-kernel-support   --force -vvv --skip-repo'
TERM environment variable not set.
Distro was not provided, trying to auto-detect the current distro...
dist_rpm: centos-release-7-9.2009.1.el7.centos
Auto-detected RHEL7.9 distro.
glibc-devel 32bit is required to install 32-bit libraries.
Note: This program will create MLNX_OFED_LINUX TGZ for rhel7.9 under /tmp/MLNX_OFED_LINUX-5.6-2.0.9.0-3.10.0-1160.el7.x86_64 directory.
See log file /tmp/MLNX_OFED_LINUX-5.6-2.0.9.0-3.10.0-1160.el7.x86_64/mlnx_iso.80_logs/mlnx_ofed_iso.80.log

Checking if all needed packages are installed...
/lib/modules/3.10.0-1160.el7.x86_64/build/scripts is required to build mlnx-ofa_kernel RPM.
Please install the corresponding kernel-source or kernel-devel RPM.

Error: One or more required packages for installing OFED-internal are missing.
Please install the missing packages using your Linux distribution Package Management tool.
Run:
yum install kernel-devel-3.10.0-1160.el7.x86_64
Failed to build MLNX_OFED_LINUX for 3.10.0-1160.el7.x86_64
+ sed -i '/ESP_OFFLOAD_LOAD=yes/c\ESP_OFFLOAD_LOAD=no' /etc/infiniband/openib.conf
sed: can't read /etc/infiniband/openib.conf: No such file or directory
+ cp /root/MLNX_OFED_LINUX-5.6-2.0.9.0-rhel7.9-x86_64/docs/scripts/openibd-post-start-configure-interfaces/post-start-hook.sh /etc/infiniband/post-start-hook.sh
cp: cannot create regular file '/etc/infiniband/post-start-hook.sh': No such file or directory
+ chmod +x /etc/infiniband/post-start-hook.sh
chmod: cannot access '/etc/infiniband/post-start-hook.sh': No such file or directory
+ rebuild_driver
+ echo 'Rebuilding driver'
Rebuilding driver
+ fix_src_link
++ uname -m
+ local ARCH=x86_64
++ uname -r
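
Reading the trace above, several paths the entrypoint and mlnxofedinstall rely on are absent inside the container: /host/etc/os-release, /lib/modules/3.10.0-1160.el7.x86_64, and /lib/modules/3.10.0-1160.el7.x86_64/build/scripts. The kernel-devel files themselves appear to be present under /usr/src/kernels, since extract-vmlinux ran from there. As far as I can tell, on RHEL/CentOS the /lib/modules/<kver>/build link comes from the kernel package rather than from kernel-devel, which would explain why installing kernel-devel alone does not satisfy the check. Below is a minimal preflight sketch that lists which of these paths are missing. The function name `missing_paths` and its `root` prefix argument are hypothetical, introduced only for illustration, and the path list is taken from this trace rather than from the project's documentation.

```shell
#!/bin/sh
# Preflight sketch (not part of the project): print every path from the
# failing trace that does not exist under the given root prefix.
# "missing_paths" and its arguments are hypothetical names.
missing_paths() {
    root="$1"   # prefix to check under; pass "" for the real filesystem
    kver="$2"   # kernel version string, e.g. the output of `uname -r`
    for p in \
        "/host/etc/os-release" \
        "/lib/modules/${kver}" \
        "/lib/modules/${kver}/build/scripts" \
        "/usr/src/kernels/${kver}"
    do
        # -e follows symlinks, so a dangling build link counts as missing
        [ -e "${root}${p}" ] || printf '%s\n' "$p"
    done
}
```

Run inside the container as `missing_paths "" "$(uname -r)"`; an empty result means all four paths resolve. It only reports what is absent and does not attempt to fix anything.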
