
Enable previously disabled FA related Operators in UTs #1389

Open
wants to merge 36 commits into
base: rocm6.2_internal_testing

xinyazhang

@xinyazhang xinyazhang commented Apr 8, 2024

They were disabled in AOTriton V1, but V2 should fix most of them.

Passed with

PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TESTING_DEVICE_ONLY_FOR="cuda" python test/test_meta.py -k flash_attention -v
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TESTING_DEVICE_ONLY_FOR="cuda" python test/test_ops.py -k flash_attention -v
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TESTING_DEVICE_ONLY_FOR="cuda" python test/test_meta.py -k functional_scaled_dot_product_attention_cuda -v
PYTORCH_TEST_WITH_ROCM=1 PYTORCH_TESTING_DEVICE_ONLY_FOR="cuda" python test/test_ops.py -k functional_scaled_dot_product_attention_cuda -v
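The four invocations above are the cross product of two test files and two `-k` filters under the same two environment variables. A minimal sketch of a driver that rebuilds them is below; the helper names `build_commands` and `run_all` are hypothetical and not part of the PyTorch test tooling, and the script assumes it is run from the root of a PyTorch checkout.

```python
import os
import subprocess
import sys

# Environment variables and arguments copied from the commands in the PR description.
ROCM_ENV = {
    "PYTORCH_TEST_WITH_ROCM": "1",
    "PYTORCH_TESTING_DEVICE_ONLY_FOR": "cuda",
}
TEST_FILES = ["test/test_meta.py", "test/test_ops.py"]
FILTERS = ["flash_attention", "functional_scaled_dot_product_attention_cuda"]


def build_commands(python=sys.executable):
    """Return the four argument lists, in the same order as listed in the PR."""
    return [[python, test_file, "-k", name_filter, "-v"]
            for name_filter in FILTERS
            for test_file in TEST_FILES]


def run_all():
    """Run every command with the ROCm variables added to the current environment.

    Hypothetical helper: stops at the first failing test run (check=True).
    """
    env = {**os.environ, **ROCM_ENV}
    for cmd in build_commands():
        subprocess.run(cmd, env=env, check=True)
```

Calling `run_all()` from the repository root would execute the same four test runs in sequence.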


pruthvistony and others added 30 commits March 12, 2024 11:53
* Changes to build CentOS Stream 9 images

* Added scripts for CentOS and CentOS Stream images

* Added an extra line

* Add ninja installation

* Optimized code

* Fixes

* Add comment

* Optimized code

* Added AMDGPU mapping for ROCm 5.2 and invalid-url for rocm_baseurl

Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
- Rocblas API support is requested
- SWDEV-383635 & sub task - SWDEV-390218
Co-authored-by: Jithun Nair <37884920+jithunnair-amd@users.noreply.github.com>
* Add hip_basic tensorpipe support to PyTorch

* Enabling hip_basic for Tensorpipe for PyTorch

* removing upstream tensorpipe module

* Adding ROCm specific tensorpipe submodule

* tensorpipe submodule updated

* Update the hip invalid device string

* Added ignore for tensorpipe git submodule

* Moved include of tensorpipe_cuda.h to hipify

* Updates based on review comments

* Defining the variable __HIP_PLATFORM_AMD__

* Enabling the UTs

Co-authored-by: Ronak Malik <Ronak.Malik@amd.com>
- Fortran package installation moved after gcc
- Update libtinfo search code in cmake1
- Install libstdc++.so
Reversed the condition as required
- Add missing common_utils.sh
- Update the install vision part
- Move to amdgpu rhel 9.3 builds
- Update to pick python from conda path
- Add a missing package
- Add ROCM_PATH and magma
- Updated repo radeon path
This also fixes a problem in gesvd driver when UV is not needed.
- build_environment is hard-coded to the upstream value from when the
  branch was created, since the dev/QA ENV build_environment
  value can vary
* Fix the parsing of /etc/os-release

The old code parses OS_DISTRO as 'PRETTY_Ubuntu' on Ubuntu and thus
never links to libtinfo correctly.

* Configurable CMAKE_PREFIX_PATH in CI script.
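
The /etc/os-release fix above concerns a distro name being read as 'PRETTY_Ubuntu'. The old parsing code is not shown here, so the following is only a sketch of one robust approach, under the assumption that the bad value came from matching `NAME` loosely enough to also hit the `PRETTY_NAME` field. It splits each line on the first `=` and looks fields up by exact key. The function name `parse_os_release` and the sample content are illustrative, not taken from the PR.

```python
import shlex


def parse_os_release(text):
    """Parse os-release content into a dict keyed by the exact field name."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # Values may be shell-quoted; shlex.split removes the quoting.
        parts = shlex.split(raw)
        fields[key] = parts[0] if parts else ""
    return fields


# Illustrative sample resembling an Ubuntu /etc/os-release (assumed content).
SAMPLE = '''PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
ID=ubuntu
'''

fields = parse_os_release(SAMPLE)
# Exact-key lookup: NAME and PRETTY_NAME can no longer be confused.
distro = fields["NAME"]
```

Because the lookup uses the full key, the order of `NAME` and `PRETTY_NAME` in the file does not matter. On Python 3.10 and later, the standard library's `platform.freedesktop_os_release()` returns a similar dictionary.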
- This is done per QA request; it needs to be reverted and
  does not need to be cherry-picked into later releases.
* Moved NAVI check to the test file

* Revised NAVI check as a function
TestReductionsCUDA.test_nansum_out_dtype_cuda_float32 would fail or pass
depending on the random inputs. Observed by ROCm internal QA testing.
IFU cherry-picks into rocm6.2_internal_testing
pruthvistony and others added 6 commits March 12, 2024 15:39
- Commit from branch pytorch/rocm6.2_internal_testing
* Running a Triton kernel on ROCm reports only one GB/s metric

* Update test_kernel_benchmark.py
C++20 mangling rules were recently added to hip-clang. This flag
maintains compatibility since pytorch is at C++17. Otherwise the linker
fails.
8 participants