
Custom PyTorch wheel support, GPU acceleration, and MPS device support #61

Merged
jewilder merged 7 commits into microsoft:main from philnach:customPyTorch
May 9, 2026


@philnach philnach commented May 9, 2026

Summary

TL;DR: The PyTorch inference test can now be told to run on GPU or CPU, and it can use a custom PyTorch wheel instead of the default one.

Adds the ability to run PyTorch inference with custom-built wheels, optional CUDA/cuDNN installation, GPU/CPU device selection, and Apple MPS (Metal Performance Shaders) support on macOS. All features are opt-in via independent parameters; existing default behavior (CPU vs. GPU inferencing) is unchanged.

New Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| custom_resources_path | '' | Local path to a folder containing custom installers/wheels. Uploaded to the DUT during prep. |
| use_custom_pytorch_wheel | false | Install torch-*.whl from the custom resources path instead of pip |
| install_cuda | false | Silently install cuda_*.exe from the custom resources path |
| install_cudnn | false | Silently install cudnn*.exe from the custom resources path |
| use_gpu | true | When false, forces CPU-only inference (--no-gpu) |

Each parameter is independent. Examples:

# Full custom stack: custom wheel + CUDA + cuDNN
pytorch_inf:custom_resources_path=C:\my_bits pytorch_inf:use_custom_pytorch_wheel=true pytorch_inf:install_cuda=true pytorch_inf:install_cudnn=true

# Official PyTorch + custom CUDA/cuDNN
pytorch_inf:custom_resources_path=C:\my_bits pytorch_inf:install_cuda=true pytorch_inf:install_cudnn=true

# Force CPU-only mode (any platform)
pytorch_inf:use_gpu=false
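
To make the `scenario:key=value` shape of these overrides concrete, here is a minimal sketch of how such tokens could be split apart. The `parse_overrides` helper is hypothetical and is not taken from the test harness; it only illustrates that splitting on the first `:` and the first `=` keeps Windows paths such as `C:\my_bits` intact.

```python
def parse_overrides(tokens):
    """Parse 'scenario:key=value' tokens into {scenario: {key: value}}.

    Hypothetical helper for illustration; the real harness may parse differently.
    """
    overrides = {}
    for token in tokens:
        # Split on the first ':' only, so a value like C:\my_bits survives.
        scenario, _, assignment = token.partition(":")
        # Split on the first '=' only, so values may themselves contain '='.
        key, sep, value = assignment.partition("=")
        if not scenario or not key or not sep:
            raise ValueError(f"expected scenario:key=value, got {token!r}")
        overrides.setdefault(scenario, {})[key] = value
    return overrides


tokens = [
    r"pytorch_inf:custom_resources_path=C:\my_bits",
    "pytorch_inf:install_cuda=true",
    "pytorch_inf:install_cudnn=true",
]
parsed = parse_overrides(tokens)
print(parsed["pytorch_inf"])
```

Values stay as strings here; converting `true`/`false` to booleans is left to whatever consumes the parameters.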

MPS Device Support

Updated inference.py (shared between Windows and macOS) to detect Apple Metal GPU:

if args.gpu and torch.cuda.is_available():
    device = 'cuda'
elif args.gpu and hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
    device = 'mps'
else:
    device = 'cpu'

macOS Apple Silicon now uses MPS acceleration by default instead of falling back to CPU. The hasattr guard ensures backward compatibility with older PyTorch versions.
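
The selection logic above can be exercised without a GPU by separating the priority rule from the backend probing. The following is a sketch under assumptions: `pick_device`, `probe_backends`, and `build_parser` are illustrative names, and the way `--no-gpu` maps onto `args.gpu` is a guess at the wiring, not a copy of inference.py.

```python
import argparse


def pick_device(use_gpu, cuda_available, mps_available):
    """Return 'cuda', 'mps', or 'cpu' using the same priority as the snippet above."""
    if use_gpu and cuda_available:
        return "cuda"
    if use_gpu and mps_available:
        return "mps"
    return "cpu"


def probe_backends():
    """Report (cuda_available, mps_available); both False if torch is not installed."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    # Older PyTorch builds have no torch.backends.mps, hence the hasattr guard.
    mps = hasattr(torch.backends, "mps") and torch.backends.mps.is_available()
    return cuda, mps


def build_parser():
    parser = argparse.ArgumentParser()
    # Assumed wiring: --no-gpu clears args.gpu, which otherwise defaults to True.
    parser.add_argument("--no-gpu", dest="gpu", action="store_false")
    return parser


args = build_parser().parse_args(["--no-gpu"])
device = pick_device(args.gpu, *probe_backends())
print(device)  # prints "cpu", because --no-gpu overrides any available accelerator
```

Keeping the priority rule in a pure function makes it easy to check each row of the device table with plain booleans.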

Device Priority

| Platform | Device used |
| --- | --- |
| Windows x64 (NVIDIA GPU) | cuda |
| Windows Arm64 (custom wheel) | cuda |
| macOS Apple Silicon | mps |
| Any platform with use_gpu=false | cpu |

Prep Script Changes (Windows)

  • Added -customResourcesPath, -installCuda, -installCudnn parameters
  • CUDA Toolkit: silent install from cuda_*.exe, PATH refresh, nvcc --version verification
  • cuDNN: silent install from cudnn*.exe
  • Custom wheel: installs torch-*.whl + requirements_custom.txt (tokenizers, transformers, accelerate)
  • Switched to the safetensors==0.8.0rc0 preview version, which now has native Arm64 support and so saves having to build safetensors.
  • When custom wheel is used, Python version bumps to 3.13 (x64: 3.13.1, Arm64: 3.13.1-arm)
  • prep_version bumped to "12"
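
The prep steps above all start by finding files in the custom resources folder by pattern. The real work happens in the PowerShell prep script; the sketch below only illustrates the lookup in Python, using the three patterns named in this PR. The `find_custom_resources` helper and the demo file names are made up for the example.

```python
from pathlib import Path
import tempfile


def find_custom_resources(folder):
    """Return the first match for each installer pattern, or None if absent."""
    patterns = {
        "cuda": "cuda_*.exe",    # CUDA Toolkit installer
        "cudnn": "cudnn*.exe",   # cuDNN installer
        "wheel": "torch-*.whl",  # custom PyTorch wheel
    }
    root = Path(folder)
    found = {}
    for name, pattern in patterns.items():
        # Sort so the choice is deterministic when several files match.
        matches = sorted(root.glob(pattern))
        found[name] = matches[0] if matches else None
    return found


# Demo with empty placeholder files; the file names are invented.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("cuda_example.exe", "torch-example.whl"):
        (Path(tmp) / name).touch()
    resources = find_custom_resources(tmp)
    found_names = {k: (v.name if v else None) for k, v in resources.items()}

print(found_names)
```

Returning `None` for a missing file lets the caller decide whether that is an error, which matters because each install parameter is independent.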

Run Script Changes

  • Windows: accepts -noGpu switch, passes --no-gpu to inference.py
  • macOS: accepts --no-gpu argument, passes through to inference.py
  • CUDA session PATH setup is architecture-neutral (applies whenever CUDA_PATH is set)
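
As an illustration of the architecture-neutral PATH setup, the sketch below keys only on whether `CUDA_PATH` is set and never inspects the CPU architecture. It is written in Python for consistency with inference.py, whereas the actual change lives in the run script. Prepending the `bin` subfolder is an assumption about which directory is added, and `with_cuda_on_path` is a hypothetical name.

```python
import os


def with_cuda_on_path(env, pathsep=os.pathsep):
    """Return a copy of env with CUDA_PATH's bin folder prepended to PATH."""
    updated = dict(env)
    cuda_root = updated.get("CUDA_PATH")
    if not cuda_root:
        return updated  # no CUDA_PATH set, so nothing changes on any architecture
    cuda_bin = os.path.join(cuda_root, "bin")  # assumed location of the CUDA binaries
    entries = [p for p in updated.get("PATH", "").split(pathsep) if p]
    if cuda_bin not in entries:
        entries.insert(0, cuda_bin)
    updated["PATH"] = pathsep.join(entries)
    return updated


session_env = with_cuda_on_path(os.environ)
print("CUDA_PATH" in session_env)
```

Working on a copy of the environment keeps the change scoped to the session, which matches the "session PATH" wording above.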

macOS Scenario Changes

  • Added use_gpu parameter to mac_pytorch_inf.py
  • mac_pytorch_inf_run.sh accepts --no-gpu and passes it to inference
  • inference.py is now identical between Windows and macOS (copyright header + MPS support)
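
A small sketch of how a scenario could translate the `use_gpu` parameter into the run-script flag on each platform. The flag names come from this PR (`-noGpu` for the Windows run script, `--no-gpu` for the macOS one); the `as_bool` and `run_script_args` helpers, and the assumption that parameters arrive as strings, are illustrative only.

```python
def as_bool(value):
    """Interpret a parameter string such as 'true' or 'false' (assumed format)."""
    return str(value).strip().lower() in ("1", "true", "yes")


def run_script_args(use_gpu, platform):
    """Return the extra run-script arguments for 'windows' or 'macos'."""
    if as_bool(use_gpu):
        return []  # GPU allowed: no extra flag, device selection happens in inference.py
    flags = {"windows": "-noGpu", "macos": "--no-gpu"}
    return [flags[platform]]


print(run_script_args("false", "macos"))
print(run_script_args("true", "windows"))
```

Passing nothing when the GPU is allowed keeps the default path identical to the behavior before the parameter existed.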

Example: HOBL.ini PyTorch configuration for Windows

; ============================================================
; PyTorch Inference - Custom wheel + CUDA + cuDNN (full custom)
; ============================================================
[pytorch_inf]
loops: 2
custom_resources_path: C:\pytorch_custom_bits
use_custom_pytorch_wheel: true
install_cuda: true
install_cudnn: true
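
For readers who want to sanity-check a section like this before a run, Python's standard `configparser` accepts both the `key: value` delimiter and `;` comment lines by default. The sketch below is not how the harness itself loads HOBL.ini (that is not shown in this PR); it just reads the example section and applies the defaults from the parameter table.

```python
import configparser

# Raw string so the backslash in the Windows path is kept literally.
HOBL_INI = r"""
; PyTorch Inference - Custom wheel + CUDA + cuDNN (full custom)
[pytorch_inf]
loops: 2
custom_resources_path: C:\pytorch_custom_bits
use_custom_pytorch_wheel: true
install_cuda: true
install_cudnn: true
"""

config = configparser.ConfigParser()
config.read_string(HOBL_INI)
section = config["pytorch_inf"]

settings = {
    "loops": section.getint("loops"),
    "custom_resources_path": section.get("custom_resources_path", fallback=""),
    "use_custom_pytorch_wheel": section.getboolean("use_custom_pytorch_wheel", fallback=False),
    "install_cuda": section.getboolean("install_cuda", fallback=False),
    "install_cudnn": section.getboolean("install_cudnn", fallback=False),
    # use_gpu is absent from the example, so the documented default applies.
    "use_gpu": section.getboolean("use_gpu", fallback=True),
}
print(settings)
```

The option name is split from its value at the first `:` only, so the drive-letter colon in the path stays part of the value.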

Files Changed

scenarios/windows/pytorch_inf/pytorch_inf.py                              # +40 lines
scenarios/windows/pytorch_inf/pytorch_inf_resources/pytorch_inf_prep.ps1  # +128 lines
scenarios/windows/pytorch_inf/pytorch_inf_resources/pytorch_inf_run.ps1   # +40 lines
scenarios/windows/pytorch_inf/pytorch_inf_resources/pytorch_inf_teardown.ps1
scenarios/windows/pytorch_inf/pytorch_inf_resources/inference.py          # MPS support
scenarios/windows/pytorch_inf/pytorch_inf_resources/requirements_custom.txt  # NEW
scenarios/windows/pytorch_inf/pytorch_inf_resources/requirements_win_arm64.txt
scenarios/macos/mac_pytorch_inf/mac_pytorch_inf.py                        # use_gpu param
scenarios/macos/mac_pytorch_inf/mac_pytorch_inf_resources/inference.py    # Synced with Windows
scenarios/macos/mac_pytorch_inf/mac_pytorch_inf_resources/mac_pytorch_inf_run.sh  # --no-gpu

@jewilder jewilder merged commit ce6f222 into microsoft:main May 9, 2026
1 check passed