Error: There is no device can be used to do the computation on HIP/CUDA path #30

Closed
sriharikarnam opened this issue Jan 17, 2018 · 3 comments

Comments

@sriharikarnam

We get "Error: There is no device can be used to do the computation" on the HIP/CUDA path when running test cases after integrating MIOpen into the MXNet HIP port.
The pre-built MIOpen package was installed from ROCm ($ sudo apt-get install miopen-hip).

Query:
Does the pre-built MIOpen package support both the HIP/CUDA and HIP/ROCm platforms?

@dagamayank
Contributor

Please clarify what you mean by the HIP/CUDA path.

@sriharikarnam
Author

@dagamayank: We have ported MXNet to HIP, and the port is expected to work on both the NVCC platform and the HCC platform from a common code base. We have integrated MIOpen in place of cuDNN. After the integration, we get the error above while testing the port on the NVCC platform with MIOpen.

@dagamayank
Contributor

@sriharikarnam MIOpen is not meant to work on any platform other than ROCm. Therefore, it will not work with nvcc. If you want your HIP port to work with nvcc, please use cuDNN.
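
One way a common HIP code base could follow this advice is to pick the DNN library at compile time from the HIP platform macros. The following is only a minimal sketch, not code from the MXNet port or from MIOpen: `DnnBackend`, `select_dnn_backend` and `dnn_backend_name` are hypothetical names introduced for illustration, and it assumes the build defines one of the HIP platform macros (`__HIP_PLATFORM_HCC__` / `__HIP_PLATFORM_NVCC__` in older HIP releases, `__HIP_PLATFORM_AMD__` / `__HIP_PLATFORM_NVIDIA__` in newer ones), which hipcc normally does.

```cpp
#include <string>

// Hypothetical tag for the DNN library a HIP port should use on this platform.
enum class DnnBackend { MIOpen, CuDNN, Unknown };

// Select the backend from the HIP platform macros (assumed to be set by the build).
constexpr DnnBackend select_dnn_backend() {
#if defined(__HIP_PLATFORM_HCC__) || defined(__HIP_PLATFORM_AMD__)
    // ROCm path: a real port would include <miopen/miopen.h> here.
    return DnnBackend::MIOpen;
#elif defined(__HIP_PLATFORM_NVCC__) || defined(__HIP_PLATFORM_NVIDIA__)
    // CUDA path: a real port would include <cudnn.h> here.
    return DnnBackend::CuDNN;
#else
    // Plain host compiler with no HIP platform macro defined.
    return DnnBackend::Unknown;
#endif
}

// Human-readable name, e.g. for logging which library was compiled in.
inline std::string dnn_backend_name(DnnBackend backend) {
    switch (backend) {
        case DnnBackend::MIOpen: return "MIOpen";
        case DnnBackend::CuDNN:  return "cuDNN";
        default:                 return "unknown";
    }
}
```

With a guard like this, the cuDNN calls and the MIOpen calls would live behind the same wrapper, so the NVCC build never links against MIOpen and never reaches its device check.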

pfultz2 pushed a commit that referenced this issue Mar 30, 2018
ltqin pushed a commit that referenced this issue Oct 28, 2021
646fcc268 Merge pull request #47 from ROCmSoftwarePlatform/develop
6014185 [Bug Fix] GridwiseGemm_bk0mk1_bk0nk1_mn_xdlops_v2r4 loop issue (#44)
3e91137 Merge pull request #46 from ROCmSoftwarePlatform/miopen_downstream_all
211dae8 Merge branch 'develop' into miopen_downstream_all
5890e30 [Composable Kernel] update develop branch code to ck_upstream
d5297ab fix bug in gridwise gemm xdlops v2r3 (#45)
38a90b6ed Merge pull request #43 from ROCmSoftwarePlatform/develop
c301879 bug fix (#39)
fd49ff8 add nchw atomic, nhwc and nhwc atomic method for backward weight (#30)
b2dc55f [MIOpen Downstream] Fix Reduction Kernel (#34)
b3e8d57 Tweak GEMM kernel (#38)
846f462 Add VectorType support into StaticBuffer (#27)
dfb80c4 [Enhancements] Several bugfixes and refactoring of dynamic generic reduction  (#1156)
8557901 Merge pull request #1165 from ROCmSoftwarePlatform/develop
f305beb Merge pull request #31 from ROCmSoftwarePlatform/miopen_downstream-dynamic_reduction_pr
b725e3f Merge remote-tracking branch 'origin/develop' into miopen_downstream-dynamic_reduction_pr
88833bd9a Merge pull request #32 from ROCmSoftwarePlatform/develop
df0d681 Merge remote-tracking branch 'origin/develop' into CK_upstream
f3acd25 Add a version of Merge transform that use integer division and mod (#25)
1961390 GEMM driver and kernel (#29)
627d8ef Backward weight v4r4r2 with xdlops (#18)
10bb811 Misc fixes (#24)
9e80cdc [SWDEV-281541][MSRCHA-100] Implementation of Dynamic Generic Reduction  (#1108)
a7a758d GlobalAtomicAdd for fp32/int32 (#23)
9d3f634 Xdlops refactor fix (#22)
c6f26bb magic division use __umulhi() (#19)
6fe3627 Composable kernel init integration v3 (#1097)
a2ad6d3 refactor dynamic xdlops iGemm (#13)
ba6f79a Added host_conv_wrw for verification (#15)

git-subtree-dir: src/composable_kernel
git-subtree-split: 646fcc268ede841a16cdaafb68aa64803d8390e1
bghimireamd pushed a commit that referenced this issue Mar 24, 2023
* add new algorithm from v4r4r2

* program once issue

* add split k function

* redefine code

* add a matrix unmerge

* add b matrix unmerge k0

* trans a and b to gridegemm

* nhwc init

* no hacks and vector load

* add hacks

* modify some parameter

* fix tuning parameter for fp32

* fix tuning parameter for fp16

* start change gridwise k split

* init ok

* remove a b matrix k0mk1 desc in grid

* rewrite calculate gridsize

* add kbatch to CalculateBottomIndex

* remove some unused function

* add clear data function before call kernel

* out hacks

* in hacks

* rename device convolution file and function name

* modify kBatch value

* fix some tuning code

* start from v4r4 nhwc

* nhwc atomic is able to run

* just for fp32

* enable nchw atomic

* tweak

* tweak

* re-arrange gridwise gemm hot loop for wrw

* add wrw v4r5

* v4r4r5 fp16

* v4r4r4 fp16

* v4r4r2 fp16

* V4R4R4XDLNHWC fp16

* V4R4R2XDLATOMICNCHW fp16

* adjust for fp16

* input gridsize

* change kbatch to gridsize

* testing wrw

* clean up

* k_batch to gridsize

* fix bug

* wrw v4r4r4 kbatch change to grid size

* wrw v4r4r2 kbatch change to grid size

* after merge , change gridwise gemm v2r4

* change MakeCBlockClusterAdaptor

* other method use new gridwise gemm

* clean up

* change pad method to make_right_pad_transform

* kbatch out from transform function

* clean up and fix bug

* fix bug

* using function type reduce template parameters

* using auto replace define function type

* clean up

Co-authored-by: ltqin <letaoqin@amd.com>
Co-authored-by: Chao Liu <chao.liu2@amd.com>
Co-authored-by: Jing Zhang <jizhan@amd.com>