Feat/amd rocm support #70
Closed
Justin Darnell (justindarnell) wants to merge 3 commits into Lightricks:main from
Conversation
Tests all critical capabilities needed for the LTX pipeline:
- ROCm/HIP detection via torch.version.hip
- bfloat16 support (critical: LTX uses bf16 globally)
- Core ops: SDPA, Conv3d, GroupNorm, LayerNorm, Generator seeding
- torch.compile / triton availability
- Memory management (empty_cache, synchronize, large alloc)
- FP8 guard logic validation
- Transformer-scale stress test (SDPA + FFN + 3D VAE conv)
- LTX pipeline import test (ltx-core, ltx-pipelines, transformers)

Run with:
    python scripts/test-rocm-feasibility.py --skip-ltx
    python scripts/test-rocm-feasibility.py --verbose

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
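A feasibility script like this typically probes each capability in isolation so one failure does not abort the remaining checks. A minimal sketch of that pattern, assuming a hypothetical `run_checks` helper (the actual scripts/test-rocm-feasibility.py may be structured differently); the `torch.version.hip` probe is the ROCm detection named above:

```python
import importlib.util

def run_checks(checks):
    """Run (name, fn) pairs; each fn returns truthy on success or raises."""
    results = {}
    for name, fn in checks:
        try:
            results[name] = bool(fn())
        except Exception:
            # A failed probe marks the capability unsupported, nothing more.
            results[name] = False
    return results

def rocm_available():
    """torch.version.hip is set only on ROCm builds of PyTorch."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return getattr(torch.version, "hip", None) is not None

if __name__ == "__main__":
    print(run_checks([("rocm_detected", rocm_available)]))
```

Each subsequent check (bf16 matmul, SDPA, Conv3d, and so on) would be another `(name, fn)` entry, letting `--verbose` print per-check results.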
Based on feasibility test results (33/33 critical tests pass on RDNA 3.5):
- VRAM: 87.9 GB detected (well above the 31 GB LTX threshold)
- bfloat16: fully supported on RDNA 3.5
- SDPA, Conv3d, all core ops: pass

Backend changes:
- services/services_utils.py: add is_rocm_device() helper; fix device_supports_fp8() to return False for ROCm (FP8 is not hardware-accelerated on RDNA 3.x) and to check sm_89+ for NVIDIA
- handlers/pipelines_handler.py: skip torch.compile for ROCm builds (triton is not available on ROCm Windows) alongside the existing MPS skip
- ltx2_server.py: auto-set TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 on ROCm to enable optimized Flash Attention (SDPA falls back to the slow math-attention path without this env var)
- pyproject.toml: bump transformers to >=4.55.5 (AMD requirement)

Build system:
- scripts/prepare-python.ps1: add -GpuBackend parameter (cuda|rocm). The ROCm path forces Python 3.12, filters CUDA-only packages (torch/sageattention/triton-windows), installs the ROCm SDK and PyTorch wheels from repo.radeon.com, and skips the Triton JIT headers step.

ROCm maps HIP onto the CUDA PyTorch API, so torch.cuda.is_available() returns True and no frontend or Electron changes are needed.

Usage: scripts/prepare-python.ps1 -GpuBackend rocm

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
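The FP8 guard described above reduces to two rules: never on ROCm, and only on NVIDIA sm_89+ (Ada or newer). A sketch of that decision logic, with the torch state passed in as parameters so it is testable without a GPU; names and signatures here are illustrative, while the real helpers in services/services_utils.py presumably query torch directly:

```python
def is_rocm_build(hip_version):
    """ROCm builds of PyTorch set torch.version.hip; CUDA builds leave it None."""
    return hip_version is not None

def device_supports_fp8(hip_version, cuda_capability):
    """FP8 guard: require NVIDIA sm_89+ (Ada or newer); never on ROCm.

    hip_version mirrors torch.version.hip; cuda_capability mirrors
    torch.cuda.get_device_capability(), e.g. (8, 9) for sm_89.
    """
    if is_rocm_build(hip_version):
        return False  # FP8 is not hardware-accelerated on RDNA 3.x
    if cuda_capability is None:
        return False  # CPU / MPS: no CUDA compute capability to check
    return tuple(cuda_capability) >= (8, 9)  # sm_89 = Ada Lovelace
```

Comparing capability tuples lexicographically makes the sm_89 cutoff a one-liner: Ampere (8, 6) fails, Ada (8, 9) and Hopper (9, 0) pass.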
- electron-builder.yml: use ${env.LTX_ARTIFACT_SUFFIX} in the win artifactName so ROCm builds produce '*-ROCm-Setup.exe' (an empty suffix yields the default CUDA name)
- scripts/create-installer.ps1: add -GpuBackend param; sets LTX_ARTIFACT_SUFFIX
- scripts/local-build.ps1: add -GpuBackend param; passed to prepare-python.ps1 and create-installer.ps1
- backend/tests/test_device_utils.py: unit tests for is_rocm_device() and device_supports_fp8() covering ROCm, NVIDIA pre-Ada, Ada, Hopper, CPU, MPS
- README.md: add an AMD ROCm row to the compatibility table; add a Windows AMD system requirements section with driver, BIOS, and security prerequisites

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
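The ltx2_server.py change that auto-enables the optimized Flash Attention path can be sketched as a small startup hook. The env var name comes from the commit message; the function name and signature are hypothetical, with the environment passed in so the behavior is testable:

```python
import os

def enable_rocm_flash_attention(hip_version, env=None):
    """On ROCm builds, opt SDPA into the AOTriton Flash Attention path.

    Without this env var, ROCm SDPA falls back to the slow math-attention
    kernel. setdefault leaves any explicit user override untouched.
    """
    if env is None:
        env = os.environ
    if hip_version is not None:  # torch.version.hip is set on ROCm builds
        env.setdefault("TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL", "1")
    return env
```

Using `setdefault` rather than assignment means a user who deliberately exported the variable (for example to `0` while debugging) is not overridden at server start.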
Why was this closed out? Is there some hard block preventing it from working on an AMD machine?
No description provided.