Add support for targeted GPU architecture builds #171

LoserCheems · 2025-09-20T12:02:01Z

Introduce an architecture input parameter to the build workflow, allowing for targeted builds for specific GPU architectures. Enhance the publish workflow with an architecture matrix for broader compatibility across NVIDIA GPU generations. Implement automatic detection of the preferred SM architecture for improved wheel naming and artifact identification. Clean up the codebase by removing unused imports.

Enables targeting specific compute capabilities during the build process by adding an optional arch parameter to the workflow inputs. This provides more granular control over the build configuration while maintaining backward compatibility with the existing default behavior.

Enables building packages for multiple GPU architectures (compute capabilities 80, 86, 89, 90, 100, 120) by adding an architecture matrix parameter to the publish workflow. This expands compatibility across NVIDIA GPU generations, with each wheel compiled for its own target architecture.

Enables building for a single compute capability to reduce build time when targeting specific GPU architectures. Updates wheel naming convention to include arch identifier when specified, ensuring proper artifact identification for architecture-specific builds.

Introduces automatic detection of the preferred SM (Streaming Multiprocessor) architecture from the current CUDA device to improve wheel filename specificity. The detection function safely handles cases where CUDA is unavailable or detection fails by returning None. This enhancement allows for more precise wheel identification based on the actual hardware capabilities rather than relying solely on CUDA version information. Removes unused imports to clean up the codebase.
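As a rough illustration of the detection behavior described above, here is a minimal sketch. The function names `capability_to_sm` and `detect_preferred_sm_arch` are hypothetical and are not taken from the PR's `setup.py`; the sketch assumes the detection is done through PyTorch's `torch.cuda.get_device_capability`, which may differ from the actual implementation.

```python
from typing import Optional


def capability_to_sm(major: int, minor: int) -> str:
    """Format a (major, minor) compute capability as an SM tag, e.g. (8, 6) -> "86"."""
    return f"{major}{minor}"


def detect_preferred_sm_arch() -> Optional[str]:
    """Return the SM tag of the current CUDA device, or None if it cannot be determined.

    Hypothetical helper: the real function in setup.py may be named and written differently.
    """
    try:
        import torch  # imported lazily so the helper still works where torch is absent

        if not torch.cuda.is_available():
            return None
        major, minor = torch.cuda.get_device_capability()
        return capability_to_sm(major, minor)
    except Exception:
        # Missing torch, driver problems, and similar failures all mean "unknown".
        return None
```

Returning `None` rather than raising keeps the build usable on machines without a GPU, but it means every caller has to handle the `None` case explicitly when building a filename.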

Copilot

Pull Request Overview

This PR adds support for targeted GPU architecture builds by introducing an architecture input parameter to the build workflows. This allows building wheels optimized for specific NVIDIA GPU compute capabilities rather than the default multi-architecture approach.

Adds architecture matrix support in publish workflow for broader GPU compatibility
Implements automatic SM architecture detection for improved wheel naming
Introduces single-architecture build capability to speed up development builds

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File	Description
setup.py	Adds SM architecture detection function and updates wheel naming to include architecture info
.github/workflows/publish.yml	Adds architecture matrix for building wheels across multiple GPU generations
.github/workflows/build.yml	Adds architecture input parameter for targeted builds
.github/workflows/_build.yml	Implements architecture-specific build logic and wheel naming


Copilot · 2025-09-20T12:02:36Z

setup.py


    # Determine wheel URL based on CUDA version, torch version, python version and OS
-    wheel_filename = f"{PACKAGE_NAME}-{flash_version}+cu{cuda_version}torch{torch_version}cxx11abi{cxx11_abi}-{python_version}-{python_version}-{platform_name}.whl"
+    wheel_filename = f"{PACKAGE_NAME}-{flash_version}+sm{sm_arch}cu{cuda_version}torch{torch_version}cxx11abi{cxx11_abi}-{python_version}-{python_version}-{platform_name}.whl"


The wheel filename will include 'sm{sm_arch}' even when sm_arch is None, resulting in 'smNone' in the filename. This should be handled conditionally to maintain backward compatibility when CUDA is unavailable or detection fails.

Suggested change

-    wheel_filename = f"{PACKAGE_NAME}-{flash_version}+sm{sm_arch}cu{cuda_version}torch{torch_version}cxx11abi{cxx11_abi}-{python_version}-{python_version}-{platform_name}.whl"
+    sm_arch_str = f"sm{sm_arch}" if sm_arch is not None else ""
+    plus = "+" if sm_arch_str else ""
+    wheel_filename = f"{PACKAGE_NAME}-{flash_version}{plus}{sm_arch_str}cu{cuda_version}torch{torch_version}cxx11abi{cxx11_abi}-{python_version}-{python_version}-{platform_name}.whl"
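One detail worth checking in the suggestion above: when `sm_arch` is `None`, `plus` is also empty, so the `+` that preceded `cu{cuda_version}` in the pre-PR filename disappears. The following is a minimal sketch of a variant that always keeps the `+` separator and only makes the `sm` part conditional. The helper name `local_version_tag` and the example values are hypothetical and are not part of the PR.

```python
from typing import Optional


def local_version_tag(sm_arch: Optional[str], cuda_version: str,
                      torch_version: str, cxx11_abi: str) -> str:
    """Build the '+...' part of the wheel filename (hypothetical helper).

    The '+' is always emitted, so the result matches the pre-PR format
    when sm_arch is None.
    """
    sm_part = f"sm{sm_arch}" if sm_arch is not None else ""
    return f"+{sm_part}cu{cuda_version}torch{torch_version}cxx11abi{cxx11_abi}"


# Illustrative values only; they are not taken from the PR.
print(local_version_tag("86", "128", "2.8", "TRUE"))  # +sm86cu128torch2.8cxx11abiTRUE
print(local_version_tag(None, "128", "2.8", "TRUE"))  # +cu128torch2.8cxx11abiTRUE
```

The returned tag would be placed directly after `{flash_version}` in the f-string, replacing the `{plus}{sm_arch_str}cu...` portion.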

Copilot · 2025-09-20T12:02:37Z

.github/workflows/_build.yml

+            if [ -n "${MATRIX_ARCH}" ]; then
+              tmpname=sm${MATRIX_ARCH}cu${WHEEL_CUDA_VERSION}torch${MATRIX_TORCH_VERSION}cxx11abi${{ inputs.cxx11_abi }}
+            else
+              tmpname=cu${WHEEL_CUDA_VERSION}torch${MATRIX_TORCH_VERSION}cxx11abi${{ inputs.cxx11_abi }}
+            fi
            wheel_name=$(ls dist/*whl | xargs -n 1 basename | sed "s/-/+$tmpname-/2")


[nitpick] The wheel naming logic is duplicated between setup.py and the workflow file. Consider extracting this logic to a shared function or script to avoid inconsistencies and reduce maintenance burden.

Suggested change

-            if [ -n "${MATRIX_ARCH}" ]; then
-              tmpname=sm${MATRIX_ARCH}cu${WHEEL_CUDA_VERSION}torch${MATRIX_TORCH_VERSION}cxx11abi${{ inputs.cxx11_abi }}
-            else
-              tmpname=cu${WHEEL_CUDA_VERSION}torch${MATRIX_TORCH_VERSION}cxx11abi${{ inputs.cxx11_abi }}
-            fi
-            wheel_name=$(ls dist/*whl | xargs -n 1 basename | sed "s/-/+$tmpname-/2")
+            # Use shared Python script to generate wheel name suffix
+            wheel_suffix=$(python scripts/wheel_name.py \
+              --arch "${MATRIX_ARCH}" \
+              --cuda-version "${WHEEL_CUDA_VERSION}" \
+              --torch-version "${MATRIX_TORCH_VERSION}" \
+              --cxx11-abi "${{ inputs.cxx11_abi }}")
+            wheel_name=$(ls dist/*whl | xargs -n 1 basename | sed "s/-/+${wheel_suffix}-/2")
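The suggestion refers to `scripts/wheel_name.py`, which does not appear among the changed files listed in this PR. A minimal sketch of what such a script could contain is shown below, assuming the flag names used in the suggested call. The function names `build_suffix`, `insert_suffix`, and `main` are hypothetical, as are the example package name and version values.

```python
import argparse
from typing import List, Optional


def build_suffix(arch: str, cuda_version: str, torch_version: str, cxx11_abi: str) -> str:
    """Return the wheel suffix; the 'sm<arch>' prefix is added only when arch is non-empty."""
    prefix = f"sm{arch}" if arch else ""
    return f"{prefix}cu{cuda_version}torch{torch_version}cxx11abi{cxx11_abi}"


def insert_suffix(wheel_basename: str, suffix: str) -> str:
    """Mirror sed 's/-/+SUFFIX-/2': replace the second '-' with '+SUFFIX-'."""
    first = wheel_basename.find("-")
    second = wheel_basename.find("-", first + 1) if first != -1 else -1
    if second == -1:
        return wheel_basename  # fewer than two hyphens: leave the name unchanged
    return f"{wheel_basename[:second]}+{suffix}-{wheel_basename[second + 1:]}"


def main(argv: Optional[List[str]] = None) -> str:
    """Parse the flags from the suggested workflow call and return the suffix."""
    parser = argparse.ArgumentParser(description="Generate a wheel name suffix (sketch).")
    parser.add_argument("--arch", default="")
    parser.add_argument("--cuda-version", required=True)
    parser.add_argument("--torch-version", required=True)
    parser.add_argument("--cxx11-abi", required=True)
    args = parser.parse_args(argv)
    return build_suffix(args.arch, args.cuda_version, args.torch_version, args.cxx11_abi)


# Illustrative call with made-up values.
suffix = main(["--arch", "86", "--cuda-version", "128",
               "--torch-version", "2.8", "--cxx11-abi", "TRUE"])
print(insert_suffix("example_pkg-1.0.0-cp310-cp310-linux_x86_64.whl", suffix))
```

To use this from the workflow as the suggestion does, the script would also need to print the result of `main()` when run directly. `setup.py` could import `build_suffix` from the same module so that both places share one naming rule.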

algo-home added 4 commits September 20, 2025 19:54

Copilot AI review requested due to automatic review settings September 20, 2025 12:02


LoserCheems merged commit c60a014 into main Sep 20, 2025
1 check failed


3 participants

