[ci] build aarch64 wheel builds on a real aarch64 machine #6843

Open
wants to merge 16 commits into
base: master
1 change: 0 additions & 1 deletion .ci/setup.sh
@@ -3,7 +3,6 @@
 set -e -E -u -o pipefail
 
 # defaults
-AZURE=${AZURE:-"false"}
 IN_UBUNTU_BASE_CONTAINER=${IN_UBUNTU_BASE_CONTAINER:-"false"}
 SETUP_CONDA=${SETUP_CONDA:-"true"}

63 changes: 55 additions & 8 deletions .github/workflows/python_package.yml
@@ -18,7 +18,54 @@ env:
   SKBUILD_STRICT_CONFIG: true
 
 jobs:
-  test:
+  test-linux-aarch64:
+    name: bdist wheel (ubuntu-24.04-arm)
+    runs-on: ubuntu-24.04-arm
+    timeout-minutes: 60
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 5
+          submodules: true
+      - name: Setup and run tests
+        shell: bash
+        # this uses 'docker run' instead of just setting 'container:'
+        # because actions/checkout requires GLIBC 2.28 and that is too
+        # new for manylinux2014
+        env:
+          BUILD_DIRECTORY: /LightGBM
+        run: |
+          cat > ./docker-script.sh <<EOF
+          mkdir -p \$BUILD_ARTIFACTSTAGINGDIRECTORY
+          export CONDA=\$HOME/miniforge
+          export PATH=\$CONDA/bin:\$PATH
+          ${{ env.BUILD_DIRECTORY }}/.ci/setup.sh || exit 1
+          ${{ env.BUILD_DIRECTORY }}/.ci/test.sh || exit 1
+          EOF
+          IMAGE_URI="lightgbm/vsts-agent:manylinux2014_aarch64"
+          docker pull "${IMAGE_URI}" || exit 1
+          docker run \
+            --platform "${PLATFORM}" \
+            --rm \
+            --env BUILD_DIRECTORY=${{ env.BUILD_DIRECTORY }} \
+            --env BUILD_ARTIFACTSTAGINGDIRECTORY=${{ env.BUILD_DIRECTORY }}/artifacts/ \
+            --env COMPILER=gcc \
+            --env METHOD=wheel \
+            --env OS_NAME=linux \
+            --env PRODUCES_ARTIFACTS=true \
+            --env PYTHON_VERSION="3.13" \
+            --env TASK=bdist \
+            -v "${PWD}":"${{ env.BUILD_DIRECTORY }}" \
+            -w ${{ env.BUILD_DIRECTORY }} \
+            "${IMAGE_URI}" \
+            /bin/bash ./docker-script.sh
+      - name: upload wheels
+        uses: actions/upload-artifact@v4
+        with:
+          name: linux-aarch64-wheel
+          path: artifacts/*.whl
+  test-macos:
     name: ${{ matrix.task }} ${{ matrix.method }} (${{ matrix.os }}, Python ${{ matrix.python_version }})
     runs-on: ${{ matrix.os }}
     timeout-minutes: 60
@@ -65,17 +112,13 @@ jobs:
         run: |
           export TASK="${{ matrix.task }}"
           export METHOD="${{ matrix.method }}"
+          export OS_NAME="macos"
           export PYTHON_VERSION="${{ matrix.python_version }}"
           if [[ "${{ matrix.os }}" == "macos-14" ]]; then
               # use clang when creating macOS release artifacts
               export COMPILER="clang"
-              export OS_NAME="macos"
-          elif [[ "${{ matrix.os }}" == "macos-13" ]]; then
+          else
               export COMPILER="gcc"
-              export OS_NAME="macos"
-          elif [[ "${{ matrix.os }}" == "ubuntu-latest" ]]; then
-              export COMPILER="clang"
-              export OS_NAME="linux"
           fi
           export BUILD_DIRECTORY="$GITHUB_WORKSPACE"
           export CONDA=${HOME}/miniforge
@@ -152,7 +195,11 @@ jobs:
   all-python-package-jobs-successful:
     if: always()
     runs-on: ubuntu-latest
-    needs: [test, test-latest-versions, test-old-versions]
+    needs:
+      - test-latest-versions
+      - test-macos
+      - test-linux-aarch64
+      - test-old-versions
     steps:
       - name: Note that all tests succeeded
         uses: re-actors/alls-green@v1.2.2
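
The "Setup and run tests" step in the new job writes docker-script.sh through an unquoted heredoc. In an unquoted heredoc, anything not escaped is expanded at the moment the file is written, while names escaped as `\$` are left for the shell that later runs the script inside the container. A minimal sketch of that behavior, runnable without Docker. The helper name `write_script` and the temporary file are hypothetical, and a plain shell variable stands in for `${{ env.BUILD_DIRECTORY }}`, which GitHub Actions substitutes before bash ever sees the script:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: writes a script the way the workflow step does,
# using an unquoted heredoc delimiter (EOF rather than 'EOF').
write_script() {
    local out_file="$1"
    local build_directory="$2"
    cat > "${out_file}" <<EOF
export CONDA=\$HOME/miniforge
${build_directory}/.ci/setup.sh || exit 1
EOF
}

script_file="$(mktemp)"
write_script "${script_file}" "/LightGBM"

# The escaped \$HOME is written literally as $HOME, so it is resolved later
# by whichever shell runs the generated script. ${build_directory} has
# already been replaced with /LightGBM.
cat "${script_file}"
```

Quoting the delimiter (`<<'EOF'`) would disable all expansion, which is why the workflow escapes individual variables instead.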
76 changes: 0 additions & 76 deletions .vsts-ci.yml
@@ -223,81 +223,6 @@ jobs:
       inputs:
         filePath: $(Build.SourcesDirectory)/.ci/test.sh
         targetType: 'filePath'
-  ##################
-  # QEMU_multiarch #
-  ##################
-  - job: QEMU_multiarch
-    variables:
-      BUILD_DIRECTORY: /LightGBM
-      COMPILER: gcc
-      PRODUCES_ARTIFACTS: 'true'
-    pool:
-      vmImage: ubuntu-22.04
-    timeoutInMinutes: 180
-    strategy:
-      matrix:
-        bdist:
-          TASK: bdist
-          ARCH: aarch64
-    steps:
-      - script: |
-          sudo apt-get update
-          sudo apt-get install --no-install-recommends -y \
-            binfmt-support \
-            qemu \
-            qemu-user \
-            qemu-user-static
-        displayName: 'Install QEMU'
-      - script: |
-          docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
-        displayName: 'Enable Docker multi-architecture support'
-      - script: |
-          git clean -d -f -x
-        displayName: 'Clean source directory'
-      # LGBM_SKIP_DASK_TESTS=true is set below only because running the tests under emulation is so slow...
-      # in theory, 'lightgbm.dask' should work without issue on aarch64 Linux systems.
-      # That could probably be removed as part of https://github.com/microsoft/LightGBM/issues/6788
-      - script: |
-          cat > docker-script.sh <<EOF
-          export CONDA=\$HOME/miniforge
-          export PATH=\$CONDA/bin:/opt/rh/llvm-toolset-7.0/root/usr/bin:\$PATH
-          export LD_LIBRARY_PATH=/opt/rh/llvm-toolset-7.0/root/usr/lib64:\$LD_LIBRARY_PATH
Comment on lines -263 to -264
Collaborator Author

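I intentionally did not carry the llvm-toolset entries from these two lines over to the new workflow. They shouldn't be necessary, because the image already sets both PATH and LD_LIBRARY_PATH to include them: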

docker run \
  --rm \
  -it lightgbm/vsts-agent:manylinux2014_aarch64 \
  bash

echo $PATH
# /opt/rh/llvm-toolset-7.0/root/usr/bin:/opt/rh/devtoolset-10/root/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bi

echo $LD_LIBRARY_PATH
# /opt/rh/llvm-toolset-7.0/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib64:/opt/rh/devtoolset-10/root/usr/lib:/opt/rh/devtoolset-10/root/usr/lib64/dyninst:/opt/rh/devtoolset-10/root/usr/lib/dyninst:/usr/local/lib64
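
The check above was done by reading the output by eye. A small sketch of the same check as a script, which could run inside the container. `path_contains` is a hypothetical helper, and the sample value is a shortened copy of the PATH printed above:

```shell
# Hypothetical helper: succeeds if directory $2 is an exact entry of the
# colon-separated list $1.
path_contains() {
    local list="$1"
    local dir="$2"
    # Wrapping both sides in colons avoids matching a prefix such as
    # /usr against /usr/bin.
    case ":${list}:" in
        *":${dir}:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Shortened copy of the PATH reported by the image (an assumption for the demo).
image_path="/opt/rh/llvm-toolset-7.0/root/usr/bin:/opt/rh/devtoolset-10/root/usr/bin:/usr/local/bin:/usr/bin"

if path_contains "${image_path}" "/opt/rh/llvm-toolset-7.0/root/usr/bin"; then
    echo "llvm-toolset is already on PATH"
fi
```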

Collaborator Author

I can see that the correct compiler (gcc) is being used:

-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /opt/rh/devtoolset-10/root/usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /opt/rh/devtoolset-10/root/usr/bin/g++ - skipped

(build link)
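
The same confirmation could be scripted against a saved build log instead of being read by eye. A sketch, assuming the CMake output has been written to a file. `compiler_id_from_log` is a hypothetical helper, and the sample lines are copied from the log above:

```shell
# Hypothetical helper: prints the C compiler identification recorded by
# CMake in the given log file.
compiler_id_from_log() {
    sed -n 's/^-- The C compiler identification is \(.*\)$/\1/p' "$1" | head -n 1
}

# Sample log content taken from the build output quoted above.
log_file="$(mktemp)"
cat > "${log_file}" <<'EOF'
-- The C compiler identification is GNU 10.2.1
-- The CXX compiler identification is GNU 10.2.1
EOF

compiler_id_from_log "${log_file}"
```

CMake reports gcc as `GNU`, so a check for gcc would compare against that string rather than against `gcc`.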

-          \$BUILD_DIRECTORY/.ci/setup.sh || exit 1
-          \$BUILD_DIRECTORY/.ci/test.sh || exit 1
-          EOF
-          IMAGE_URI="lightgbm/vsts-agent:manylinux2014_aarch64"
-          docker pull "${IMAGE_URI}" || exit 1
-          PLATFORM=$(docker inspect --format='{{.Os}}/{{.Architecture}}' "${IMAGE_URI}") || exit 1
-          echo "detected image platform: ${PLATFORM}"
-          docker run \
-            --platform "${PLATFORM}" \
-            --rm \
-            --env AZURE=true \
-            --env BUILD_ARTIFACTSTAGINGDIRECTORY=$BUILD_ARTIFACTSTAGINGDIRECTORY \
-            --env BUILD_DIRECTORY=$BUILD_DIRECTORY \
-            --env COMPILER=$COMPILER \
-            --env LGBM_SKIP_DASK_TESTS=true \
-            --env METHOD=$METHOD \
-            --env OS_NAME=linux \
-            --env PRODUCES_ARTIFACTS=$PRODUCES_ARTIFACTS \
-            --env PYTHON_VERSION=$PYTHON_VERSION \
-            --env TASK=$TASK \
-            -v "$(Build.SourcesDirectory)":"$BUILD_DIRECTORY" \
-            -v "$(Build.ArtifactStagingDirectory)":"$(Build.ArtifactStagingDirectory)" \
-            "${IMAGE_URI}" \
-            /bin/bash $BUILD_DIRECTORY/docker-script.sh
-        displayName: 'Setup and run tests'
-      - task: PublishBuildArtifacts@1
-        condition: >
-          and(
-            succeeded(),
-            in(variables['TASK'], 'bdist'),
-            not(startsWith(variables['Build.SourceBranch'], 'refs/pull/'))
-          )
-        inputs:
-          pathtoPublish: '$(Build.ArtifactStagingDirectory)'
-          artifactName: PackageAssets
-          artifactType: container
   #########
   # macOS #
   #########
@@ -441,7 +366,6 @@ jobs:
     dependsOn:
       - Linux
       - Linux_latest
-      - QEMU_multiarch
       - macOS
       - Windows
       - R_artifact
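
The removed QEMU_multiarch job derived the value for `--platform` from the image, using `docker inspect --format='{{.Os}}/{{.Architecture}}'`. In the lines shown in this diff, the new test-linux-aarch64 job still passes `--platform "${PLATFORM}"`, but no line in that step sets `PLATFORM`. On a native runner, one option would be to derive the value from the host, which also confirms that no emulation is involved. A sketch, assuming the usual correspondence between `uname -m` output and Docker platform strings. `docker_platform_for_machine` is a hypothetical helper, not something from the LightGBM repository:

```shell
# Hypothetical helper: maps a machine type, as printed by `uname -m`,
# to the platform string expected by `docker run --platform`.
docker_platform_for_machine() {
    case "$1" in
        aarch64|arm64) echo "linux/arm64" ;;
        x86_64|amd64)  echo "linux/amd64" ;;
        *)
            echo "unsupported machine type: $1" >&2
            return 1
            ;;
    esac
}

# On an aarch64 Linux runner, `uname -m` prints aarch64.
if host_platform="$(docker_platform_for_machine "$(uname -m)")"; then
    echo "host platform: ${host_platform}"
fi
```

Comparing this value with the image's `{{.Os}}/{{.Architecture}}` before `docker run` would fail fast if the job ever landed on a runner of the wrong architecture.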
4 changes: 0 additions & 4 deletions tests/python_package_test/test_dask.py
@@ -55,10 +55,6 @@
     pytest.mark.skipif(getenv("TASK", "") == "mpi", reason="Fails to run with MPI interface"),
     pytest.mark.skipif(getenv("TASK", "") == "gpu", reason="Fails to run with GPU interface"),
     pytest.mark.skipif(getenv("TASK", "") == "cuda", reason="Fails to run with CUDA interface"),
-    pytest.mark.skipif(
-        getenv("LGBM_SKIP_DASK_TESTS", "") == "true",
-        reason="Skipping lightgbm.dask tests (found env variable LGBM_SKIP_DASK_TESTS=true)",
-    ),
]

