Motivation
As mentioned in the [RFC] Intel GPU Upstreaming, integrating the new Intel GPU device and its associated features into PyTorch requires PyTorch CI/CD tests designed specifically for Intel GPU devices. These tests will ensure the quality of incoming pull requests and gate their acceptance accordingly.
Design Philosophy
To enable this new part of CI/CD, we aim to leverage the existing PyTorch CI/CD infrastructure wherever possible. Intel GPU related tests will be dispatched to Intel Developer Cloud (IDC) instances, which provide Intel GPU hardware as self-hosted runners. Based on our understanding of the current PyTorch CI/CD tests, we will divide Intel GPU related tests into three categories: pull tests, inductor tests, and other tests. For inductor tests, we will extend the existing inductor test workflow to accommodate Intel GPU inductor testing. A new workflow will serve as the entry point for the other Intel GPU tests, mirroring the approach used for other devices. All Intel GPU tests follow the rules below.
- Docker-based builds and tests
- Multiple steps for both inductor and other tests
- The base Docker image, which provides the PyTorch build and test environment, is built on AWS instance runners.
- The wheel is built directly on AWS instance runners.
- Tests can be sharded and dispatched to IDC instance runners.
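The sharding rule can be illustrated with a minimal Python sketch. It is not PyTorch's actual sharding logic: `assign_shards` is a hypothetical helper, the round-robin strategy is an assumption chosen for simplicity, and the file names are placeholders.

```python
from typing import Dict, List


def assign_shards(test_files: List[str], num_shards: int) -> Dict[int, List[str]]:
    """Distribute test files over 1-based shard numbers in round-robin order."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    shards: Dict[int, List[str]] = {n: [] for n in range(1, num_shards + 1)}
    # Sort first so every runner computes the same assignment independently.
    for index, name in enumerate(sorted(test_files)):
        shards[index % num_shards + 1].append(name)
    return shards


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    files = ["test_ops", "test_nn", "test_torch", "test_autograd", "test_modules"]
    for shard, names in assign_shards(files, num_shards=2).items():
        print(f"shard {shard}/2: {names}")
```

Because the assignment depends only on the sorted file list and the shard count, each runner could derive its own share without coordinating with the others.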
Detail
Entry point for pull tests
For the basic build test on pull requests, we will add a new Intel GPU specific build section to .github/workflows/pull.yml, triggered by each pull request.
Entry point for inductor tests
For inductor tests, we will add a new section for Intel GPU specific tests to .github/workflows/inductor.yml, triggered by ciflow/inductor. To avoid breaking other inductor related PRs in the first stage, we plan to add a new ciflow label, ciflow/xpu, to .github/pytorch-probot.yml and to run the Intel GPU inductor tests only for PRs that have both ciflow/inductor and ciflow/xpu.
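The gating rule amounts to a subset check on the PR's labels. The Python sketch below only illustrates that condition. `should_run_xpu_inductor` is a hypothetical name, and the real check would be expressed in the workflow configuration rather than in a script like this.

```python
from typing import Iterable

# Both labels must be present before the Intel GPU inductor tests run.
REQUIRED_LABELS = frozenset({"ciflow/inductor", "ciflow/xpu"})


def should_run_xpu_inductor(pr_labels: Iterable[str]) -> bool:
    """Return True only when the PR carries every required ciflow label."""
    return REQUIRED_LABELS.issubset(set(pr_labels))


if __name__ == "__main__":
    print(should_run_xpu_inductor(["ciflow/inductor"]))
    print(should_run_xpu_inductor(["ciflow/inductor", "ciflow/xpu"]))
```

A PR labeled only ciflow/inductor keeps its current behavior under this rule, which is the point of requiring both labels during the first stage.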
Entry point for other tests
The remaining Intel GPU tests will be triggered by PRs labeled ciflow/xpu or on a regular schedule by a timer. To achieve this, we will add a new entry workflow, .github/workflows/xpu.yml, with content like the following.
    name: xpu

    on:
      push:
        branches:
          - main
          - release/*
        tags:
          - ciflow/xpu/*
      workflow_dispatch:
      schedule:
        - cron: 0 0 * * *

    concurrency:
      group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref_name }}-${{ github.ref_type == 'branch' && github.sha }}-${{ github.event_name == 'workflow_dispatch' }}-${{ github.event_name == 'schedule' }}
      cancel-in-progress: true

    jobs:
      linux-jammy-xpu-py3_8-build:
        name: linux-jammy-xpu-py3.8
        uses: ./.github/workflows/_linux-build.yml
        with:
          build-environment: linux-jammy-xpu-py3.8
          docker-image-name: pytorch-linux-jammy-xpu-n-py3
          sync-tag: xpu-build
          test-matrix: |
            { include: [
              { config: "default", shard: 1, num_shards: 2, runner: "linux.idc.xpu" },
              { config: "default", shard: 2, num_shards: 2, runner: "linux.idc.xpu" },
            ]}

      linux-jammy-xpu-py3_8-test:
        name: linux-jammy-xpu-py3.8
        uses: ./.github/workflows/_xpu-test.yml
        needs: linux-jammy-xpu-py3_8-build
        with:
          build-environment: linux-jammy-xpu-py3.8
          docker-image: ${{ needs.linux-jammy-xpu-py3_8-build.outputs.docker-image }}
          test-matrix: ${{ needs.linux-jammy-xpu-py3_8-build.outputs.test-matrix }}
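
The test-matrix entries in the workflow follow a regular pattern: one entry per shard, numbered from 1, all sharing the same config, num_shards, and runner. As a minimal sketch of that structure, the hypothetical helper `build_test_matrix` below generates the same entries in Python. It is illustrative only and is not part of the proposed workflow.

```python
import json
from typing import Any, Dict, List


def build_test_matrix(config: str, num_shards: int, runner: str) -> Dict[str, List[Dict[str, Any]]]:
    """Build a matrix with one include entry per 1-based shard number."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return {
        "include": [
            {"config": config, "shard": shard, "num_shards": num_shards, "runner": runner}
            for shard in range(1, num_shards + 1)
        ]
    }


if __name__ == "__main__":
    # Same values as the two entries in the workflow above.
    print(json.dumps(build_test_matrix("default", 2, "linux.idc.xpu"), indent=2))
```

Raising the shard count would then be a one-argument change, with each resulting entry mapping to one test job on a linux.idc.xpu runner.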
Build & Test
We will add an Intel GPU specific base image Dockerfile, .ci/docker/ubuntu-xpu/Dockerfile, and an Intel GPU section in the image build script .ci/docker/build.sh to support building Intel GPU based images on linux.2xlarge runners.
For the PyTorch wheel build, unlike other devices, we currently need to dispatch it to IDC instance runners. We will reuse .github/workflows/_linux-build.yml with an Intel GPU specific build-environment and add an Intel GPU section to the PyTorch build script .ci/pytorch/build.sh.
For the tests, we will add a new Intel GPU test workflow, .github/workflows/_xpu-test.yml, and the necessary GitHub actions such as setup-xpu and teardown-xpu. We will also add a new section to the test script .ci/pytorch/test.sh and a series of utility scripts for Intel GPU.
cc @seemethere @malfet @pytorch/pytorch-dev-infra @frank-wei @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10