Enable HIPRTC support as default from ROCm 5.0 #1237

atamazov · 2021-10-22T15:57:31Z

HIPRTC support is added and enabled by default starting from ROCm 5.0
- This will automatically enable HIPRTC testing after CI upgrade
- ⚠️ HIP version must be fixed in Mainline in order to enable HIPRTC testing by QA
Added MIOPEN_DEBUG_USE_HIPRTC env var, which can be used to fall back to COMGR.
Workarounds:
- Added W/A for SWDEV-308073 (WORKAROUND_ISSUE_HIPRTC_TRUE_TYPE)
- Added W/A for SWDEV-307838 (WORKAROUND_ISSUE_HIPRTC_HIPRTC_HEADER_H)
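
The MIOPEN_DEBUG_USE_HIPRTC fallback described above can be pictured with a small host-side sketch. This is not MIOpen's code: the names HipBuildBackend, IsDisabledValue and SelectBackend are hypothetical, and the set of accepted "disabled" spellings is an assumption, since the description only states that the variable can be used to fall back to COMGR.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// Hypothetical enum for illustration; not a MIOpen type.
enum class HipBuildBackend { Hiprtc, Comgr };

// Assumption: values such as "0", "off", "no" and "false" mean "disabled".
// The spellings really accepted by MIOpen are not stated in this thread.
inline bool IsDisabledValue(std::string value)
{
    std::transform(value.begin(), value.end(), value.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return value == "0" || value == "off" || value == "no" || value == "false";
}

// HIPRTC stays the default; only an explicit "disabled" value selects COMGR.
inline HipBuildBackend SelectBackend(const char* env_value)
{
    if(env_value != nullptr && IsDisabledValue(env_value))
        return HipBuildBackend::Comgr;
    return HipBuildBackend::Hiprtc;
}

// Reads the variable named in the PR description from the process environment.
inline HipBuildBackend SelectBackendFromEnvironment()
{
    return SelectBackend(std::getenv("MIOPEN_DEBUG_USE_HIPRTC"));
}
```

Keeping the parsing in SelectBackend, separate from the std::getenv call, lets the decision be exercised without touching the process environment.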

JehandadKhan · 2022-02-14T16:22:11Z

CMakeLists.txt


+# Do not enable HIPRTC by default for older ROCm versions in order to avoid
+# build time errors, because HIPRTC is a relatively new component.
+set_var_to_condition(MIOPEN_USE_HIPRTC_DEFAULT ${MIOPEN_USE_COMGR} AND (${MIOPEN_hip_VERSION_FLAT} GREATER 500000000))


MIOPEN_USE_COMGR does not have a default value, which causes a default CMake run to fail, for example:

CXX=/opt/rocm/llvm/bin/clang++ cmake ..

Please update the PR so that the value of MIOPEN_USE_COMGR is always specified.
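
The condition in the diff above combines two inputs: whether COMGR is enabled and whether the flattened HIP version exceeds 500000000. A minimal C++ model of that logic is sketched below. The version encoding in FlatVersion is an assumption (it is not shown in this thread), both function names are hypothetical, and treating an unset MIOPEN_USE_COMGR as "off" is a modeling choice; in CMake an unset variable expands to an empty string, which presumably leaves the condition malformed and is why the reviewer asks for an explicit value.

```cpp
#include <cstdint>
#include <optional>

// Assumed encoding, not confirmed by this thread:
// (major * 1000 + minor) * 100000 + patch, so 5.0.0 maps to 500000000.
constexpr std::int64_t FlatVersion(int major, int minor, int patch)
{
    return (static_cast<std::int64_t>(major) * 1000 + minor) * 100000 + patch;
}

// Hypothetical helper mirroring the CMake condition
//   MIOPEN_USE_COMGR AND (MIOPEN_hip_VERSION_FLAT GREATER 500000000).
// std::nullopt models a variable that was never set; it is treated as "off" here.
constexpr bool UseHiprtcByDefault(std::optional<bool> use_comgr,
                                  std::int64_t hip_version_flat)
{
    return use_comgr.value_or(false) && hip_version_flat > 500000000;
}
```

Under the assumed encoding, the strict comparison means a version that flattens to exactly 500000000 does not enable the default, while any later version does.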

JehandadKhan · 2022-02-14T16:57:49Z

src/comgr.cpp

+        auto opts =
+            miopen::SplitSpaceSeparated(options, miopen::comgr::compiler::lc::GetOptionsNoSplit());
+        compiler::lc::RemoveOptionsUnwanted(opts);
+        opts.push_back("-DWORKAROUND_ISSUE_HIPRTC_TRUE_TYPE"); // Workaround for SWDEV-308073


I suggest we refactor these common defines into a place where they can be used both from hip_build_utils.cpp and from here (comgr.cpp), so that when something needs to be fixed, it only needs to be fixed in one place.

atamazov · 2022-02-22

@JehandadKhan

so that we something needs to be fixed, it only needs to be fixed in one place.

I do not understand. Can you please clarify the use case?

JehandadKhan · 2022-02-14T17:11:04Z

src/composable_kernel/composable_kernel/include/utility/type.hpp

+/// /opt/rocm/include/hip/amd_detail/amd_hip_vector_types.h,
+/// which defines std::true_type as well (which is wrong).
+
+namespace std {


Can we move these to a common file and include that file everywhere instead ?

junliume · 2022-02-15

Could this W/A potentially be causing the numerical changes in #1237 (comment)?

junliume · 2022-02-16

Let's fix this issue in follow-up PRs.

atamazov · 2022-02-22

@JehandadKhan

Can we move these to a common file and include that file everywhere instead ?

No. This is a workaround. We do not know how the problem will evolve in the future. Applying the "good design practices" can be a waste of time.

junliume · 2022-02-15T19:19:11Z

@atamazov @JehandadKhan @asroy @zjing14 @qianfengz could you take a look at CK related changes?
The composable kernel-related changes in this PR might have affected accuracy:

# ./bin/test_reduce_test --float --D 64 3 280 81 --R 0 --ReduceOp 0 --CompType 1 --N 0 --I 0 --scales 1 0
./bin/test_reduce_test --float --D 64 3 280 81 --R 0 --ReduceOp 0 --CompType 1 --N 0 --I 0 --scales 1 0 
FAILED: 0.255031
Iteration: 0
verify_reduce_no_indices failed
Input Tensor 64, 3, 280, 81
Max diff: 2.27564
Mismatch at 0: 1.00436 != 2.00873

qianfengz · 2022-02-16T04:35:57Z

Very strange! I could not reproduce the issue on MI100 or MI50 using the hiprtc branch. In your test, the reduced value obtained on the host is just half of the GPU value.

junliume · 2022-02-16T08:40:21Z

This might not be hiprtc related but instead ROCm 5.0 related. Removing the blocker for this PR for now.

Comments resolved

qianfengz · 2022-02-16T11:23:15Z

This could be a compiler issue. I found that the warpSize used by my DirectWarpWise reduction kernels is 64, while the value of handle.GetWavefrontWidth() is 32. My understanding is that warpSize is provided as a constant by the compiler, while handle.GetWavefrontWidth() gets the warp size from the HIP Runtime device properties. So the constant value maintained by the compiler under ROCm 5.0 is problematic on Navi. The issue can be worked around by passing the value of handle.GetWavefrontWidth() to the kernel and letting the DirectWarpWise kernel use it instead of warpSize.
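
To illustrate why a disagreement between the two warp sizes matters, here is a plain host-side C++ model; it is not HIP code and not the actual reduction kernel. The function WarpWisePartialSums is hypothetical: it splits the input into warp-sized chunks and sums each one, with the warp width passed in at run time, in the spirit of the workaround described above. The sketch does not attempt to reproduce the exact failure mechanism, which the thread leaves open.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical host-side model of a warp-wise reduction: each "warp" sums one
// contiguous chunk of warp_size elements. The width is a runtime parameter
// instead of a compile-time constant, so caller and "kernel" cannot disagree.
inline std::vector<float> WarpWisePartialSums(const std::vector<float>& data,
                                              std::size_t warp_size)
{
    std::vector<float> partials;
    if(warp_size == 0)
        return partials; // guard against a zero step in the loop below

    for(std::size_t begin = 0; begin < data.size(); begin += warp_size)
    {
        const std::size_t end = std::min(begin + warp_size, data.size());
        const auto first      = data.begin() + static_cast<std::ptrdiff_t>(begin);
        const auto last       = data.begin() + static_cast<std::ptrdiff_t>(end);
        partials.push_back(std::accumulate(first, last, 0.0f));
    }
    return partials;
}
```

For 128 inputs equal to 1.0f, a width of 32 yields four partial sums of 32 and a width of 64 yields two partial sums of 64. Code that sizes its launch or output for one width while the kernel indexes with the other therefore works on a different layout, which is why both sides need to use the same value.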

atamazov · 2022-02-22T14:21:19Z

We should stop the chaos from spreading and start adding comments to the relevant tickets.

junliume · 2022-02-22T16:15:03Z

We should stop the chaos from spreading and start adding comments to the relevant tickets.

Yes, there are currently two priorities for this week: (1) what might have caused the workspace diffs in the last tuning updates; (2) warpSize being inconsistent between different HIP kernel building methods, e.g. hip-Clang and hipRTC.

Each is tracked by an issue with blocker urgency: #1429 and #1431. The first one is actively being resolved. The second has a workaround for now (not sure if there will be other issues, though).

junliume · 2022-03-21T06:41:23Z

@atamazov resnet is getting gradient overflow with this PR enabling hipRTC as default. Is it safe to revert it? Thanks!

atamazov · 2022-03-21T15:16:06Z

@junliume Just change MIOPEN_USE_HIPRTC_DEFAULT (line 226 in ./CMakeLists.txt) to something like ...GREATER 900000000)) as a workaround.

atamazov · 2022-03-21T15:18:28Z

Then you'll be able to use -DMIOPEN_USE_HIPRTC=On/Off for experiments.

junliume · 2022-03-21T15:20:06Z

@junliume Just change MIOPEN_USE_HIPRTC_DEFAULT (line 226 in ./CMakeLists.txt) to something like ...GREATER 900000000)) as a workaround.

Thank you! @atamazov

atamazov added 12 commits October 12, 2021 19:02

hiprtc(01) Added MIOPEN_USE_HIPRTC to cmake files

e480597

hiprtc(02) Draft implementation

22fef9a

hiprtc(03) W/A for missing __hipGetPCH in 4.4. MIOPEN_DEBUG_COMGR_HIP_PCH_ENFORCE: Removed possibility to enable PCH.

c5e00ad

hiprtc(04) [tests] Remove WORKAROUND_COMGR_WARNING_ISSUES (no issue in Mainline 8733)

977e141

hiprtc(05) [HIP iGemm kernels] Add some standard headers for correctness. [hiprtc] Added WORKAROUND_ISSUE_HIPRTC_TRUE_TYPE.

2c54302

hiprtc(06) [Naive conv] Fix to work correctly with COMGR+PCH and with HIPRTC.

ff7e5b4

hiprtc(07) [composable_kernel] Fix: Add some standard headers for missing types + WORKAROUND_ISSUE_HIPRTC_TRUE_TYPE

3d7fa2f

hiprtc(08) [composable_kernel] Rework: constexpr -> typedef as the former is not necessary.

dc1a2e1

hiprtc(09) W/A: disable warning about C++17 extensions (to be reverted when SWDEV-297217 is resolved)

b49d4fb

hiprtc(10) MIOPEN_BUILD_HIPRTC -> __HIPCC_RTC__. WORKAROUND_ISSUE_HIPRTC_HALF_CONVERSION. Host side changes.

1bb0596

hiprtc(11) [NFC] Remove useless stuff

187a63e

hiprtc(12) [ckip-ci][COMGR][HIPRTC] Quality fix: Return binary size 0 when "get binary" fails.

1ab3ee1

atamazov added enhancement ON_HOLD urgency_blocker labels Oct 22, 2021

atamazov added this to the ROCm 5.0 milestone Oct 22, 2021

atamazov added 4 commits October 22, 2021 19:01

hiprtc(13) [ckip-ci] Fix leftovers of MIOPEN_BUILD_HIPRTC -> __HIPCC_RTC__

d18c063

[skip-ci] Merge branch 'develop' into hiprtc

2ee18f4

[ci-skip] Merge branch 'develop' into hiprtc

68bb4df

[ci-skip] Merge branch 'develop' into hiprtc

018fedb

# RESOLVED Conflicts: # test/CMakeLists.txt

junliume added complexity_high and removed urgency_blocker labels Nov 4, 2021

atamazov modified the milestones: ROCm 5.0, ROCm 5.1 Dec 2, 2021

atamazov removed the ON_HOLD label Dec 22, 2021

atamazov added 5 commits January 27, 2022 01:18

Merge branch 'develop' into hiprtc

d48a744

# RESOLVED Conflicts: # CMakeLists.txt # src/comgr.cpp # test/CMakeLists.txt

hiprtc(16) Format

dc9b8cc

hiprtc(17) Add HIP source filename to the log after build error.

9da94a4

Merge branch 'develop' into hiprtc

a4f0d32

hiprtc(18) Fix some kernel build warnings

8d508e6

atamazov marked this pull request as ready for review February 11, 2022 22:45

atamazov requested review from asroy, shurale-nkn, jerryyin, junliume and JehandadKhan February 11, 2022 22:46

atamazov added 2 commits February 12, 2022 02:10

Revert changes of W/A for issue 898.

49cfd67

Revert whitespace changes

c7d8f52

JehandadKhan previously requested changes Feb 14, 2022

resolve comments and default value issues

bd2d1e3

Merge branch 'develop' into hiprtc

088d930

junliume added the TESTING_CI_PASSED label Feb 16, 2022

junliume approved these changes Feb 16, 2022

junliume changed the title ~~HIPRTC support~~ Enable HIPRTC support as default from ROCm 5.0 Feb 16, 2022

junliume merged commit b735eb2 into develop Feb 16, 2022

junliume mentioned this pull request Feb 16, 2022

[Navi21][ROCm5.0] warpSize inconsistent between different HIP kernel building methods #1431

Open

junliume deleted the hiprtc branch June 7, 2022 23:25

atamazov mentioned this pull request Jan 10, 2023

[HIP] We should not be including cmath in our kernels... #1926

Open

atamazov mentioned this pull request Jun 2, 2023

MiOpen Unit Test issue on Windows (redefinition of variables) #2184

Closed
