4gb of memory not enough to run AlexNet? #17
Comments
This is happening in the |
Thanks, here is the relevant portion of the output. Here, the size argument to hsa_memory_allocate is 0. This happens because the input hsaco file is empty, which definitely seems wrong. Do you have any idea why this is happening?
[INFO] Conv(11x11,pad=2,s=4) (128,3,224,224)->(128,64,55,55): 7.885 ms
Ah, after I deleted the kernel caches in .cache/miopen, it worked. Just saying, raising Thanks!
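For anyone hitting the same error, the workaround described above (deleting the kernel cache) can be sketched in a few lines of Python. This is only a sketch under assumptions: the cache is taken to live at `~/.cache/miopen` (the comment only says `.cache/miopen`), and the helper names `find_empty_files` and `clear_cache` are hypothetical, not part of MIOpen. Listing zero-byte files first may help confirm that an empty cached binary is the culprit before anything is deleted.

```python
import shutil
from pathlib import Path


def find_empty_files(cache_dir):
    """Return all zero-byte files under cache_dir, sorted by path."""
    root = Path(cache_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size == 0
    )


def clear_cache(cache_dir):
    """Delete cache_dir recursively. Returns True if something was removed."""
    root = Path(cache_dir).expanduser()
    if root.is_dir():
        shutil.rmtree(root)
        return True
    return False


if __name__ == "__main__":
    # Assumed location; adjust if your cache lives elsewhere.
    cache = "~/.cache/miopen"
    for path in find_empty_files(cache):
        print("empty cached file:", path)
    # Deliberately not deleting anything here; call clear_cache(cache)
    # by hand once you have checked the listing.
```

The kernels should be rebuilt on the next run after the cache is cleared, so the first run afterwards will likely be slower.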
Hi,
I am trying to run patflick's miopen-benchmark on my R9 Nano.
When I run his AlexNet benchmark, I get 'hipErrorOutOfMemory' from MIOpen, even though I believe 4 GB of RAM should be enough to run AlexNet.
Strangely, even if I change the batch size from 128 to 1, I still get the same 'hipErrorOutOfMemory' error (but from a different layer).
Below is the output when the batch size is 128. I am using a debug build for MIOpen.
I would really appreciate it if you could help me figure out what is going on.
$ ./alexnet
[INFO] Number of HIP devices found: 1
[INFO] Device 0: Fiji [Radeon R9 FURY / NANO Series]
[INFO] Arch: 803
[INFO] GMem: 4096 MiB
[INFO] warps: 64
[INFO] CUs: 64
[INFO] MaxClk: 1000000
[INFO] MemClk: 500000
[INFO] drm: /sys/class/drm/card0
[INFO] hwmon: /sys/class/drm/card0/device/hwmon/hwmon1
[DEBUG] Allocating Float Tensor (64,3,11,11), total size: 90 kB
[DEBUG] Allocating Float Tensor (64,3,11,11), total size: 90 kB
[DEBUG] Allocating Float Tensor (128,64,55,55), total size: 96800 kB
[DEBUG] Allocating Float Tensor (128,64,55,55), total size: 96800 kB
[DEBUG] Allocating Float Tensor (128,64,27,27), total size: 23328 kB
[DEBUG] Allocating Float Tensor (192,64,5,5), total size: 1200 kB
[DEBUG] Allocating Float Tensor (192,64,5,5), total size: 1200 kB
[DEBUG] Allocating Float Tensor (128,192,27,27), total size: 69984 kB
[DEBUG] Allocating Float Tensor (128,192,27,27), total size: 69984 kB
[DEBUG] Allocating Float Tensor (128,192,13,13), total size: 16224 kB
[DEBUG] Allocating Float Tensor (384,192,3,3), total size: 2592 kB
[DEBUG] Allocating Float Tensor (384,192,3,3), total size: 2592 kB
[DEBUG] Allocating Float Tensor (128,384,13,13), total size: 32448 kB
[DEBUG] Allocating Float Tensor (128,384,13,13), total size: 32448 kB
[DEBUG] Allocating Float Tensor (256,384,3,3), total size: 3456 kB
[DEBUG] Allocating Float Tensor (256,384,3,3), total size: 3456 kB
[DEBUG] Allocating Float Tensor (128,256,13,13), total size: 21632 kB
[DEBUG] Allocating Float Tensor (128,256,13,13), total size: 21632 kB
[DEBUG] Allocating Float Tensor (256,256,3,3), total size: 2304 kB
[DEBUG] Allocating Float Tensor (256,256,3,3), total size: 2304 kB
[DEBUG] Allocating Float Tensor (128,256,13,13), total size: 21632 kB
[DEBUG] Allocating Float Tensor (128,256,13,13), total size: 21632 kB
[DEBUG] Dims after Features: (128,256,6,6)
[DEBUG] Allocating Float Tensor (128,9216,1,1), total size: 4608 kB
[DEBUG] Allocating Float Tensor (4096,9216,1,1), total size: 147456 kB
[DEBUG] Allocating Float Tensor (4096,9216,1,1), total size: 147456 kB
[DEBUG] Allocating Float Tensor (128,4096,1,1), total size: 2048 kB
[DEBUG] Allocating Float Tensor (128,4096,1,1), total size: 2048 kB
[DEBUG] Allocating Float Tensor (4096,4096,1,1), total size: 65536 kB
[DEBUG] Allocating Float Tensor (4096,4096,1,1), total size: 65536 kB
[DEBUG] Allocating Float Tensor (128,4096,1,1), total size: 2048 kB
[DEBUG] Allocating Float Tensor (128,4096,1,1), total size: 2048 kB
[DEBUG] Allocating Float Tensor (1000,4096,1,1), total size: 16000 kB
[DEBUG] Allocating Float Tensor (1000,4096,1,1), total size: 16000 kB
[DEBUG] Allocating Float Tensor (128,3,224,224), total size: 75264 kB
[DEBUG] Allocating Float Tensor (128,256,6,6), total size: 4608 kB
[INFO] Init fwd
[DEBUG] Allocating Float Tensor (128,1000,1,1), total size: 500 kB
[DEBUG] Init fwd Conv(11x11,pad=2,s=4) (128,3,224,224)->(128,64,55,55) req workspace: 4392300
[DEBUG] >>> Resizing workspace 0 -> 4392300
runcl -DNUM_CH_PER_WG=1 -DNUM_IM_BLKS_X=2 -DNUM_IM_BLKS=14 -DLOCAL_MEM_SIZE=5265 -DSTRIDE_GT_1=1 -DTILE_SZ_X=32 -DTILE_SZ_Y=8 -DUSE_IM_OFF_GUARD=1 src/Kernels/MIOpenUtilKernels.cl -k Im2Col -dumpilisa -r 10 if#0: if#0: if#0: iv#0 10752,1,1/256,1,1
key: miopenIm2Col,
Kernel filename: MIOpenUtilKernels.cl
key: miopenConvolutionFwdAlgoGEMM,tC0_tA0_tB0_colMaj1_m3025_n64_k363_lda3025_ldb363_ldc3025_ws0_f32
key: miopenConvolutionFwdAlgoGEMM_beta,tC0_tA0_tB0_colMaj1_m3025_n64_k363_lda3025_ldb363_ldc3025_ws0_f32
key: miopenConvolutionFwdAlgoGEMM_beta tC0_tA0_tB0_colMaj1_m3025_n64_k363_lda3025_ldb363_ldc3025_ws0_f32
key: miopenConvolutionFwdAlgoGEMM tC0_tA0_tB0_colMaj1_m3025_n64_k363_lda3025_ldb363_ldc3025_ws0_f32
runcl -DMLO_DIR_FORWARD=1 -DMLO_GRP_SZ=256 -DMLO_GRP_SZ0=256 -DMLO_GRP_SZ1=1 -DMLO_GRP_SZ2=1 -DMLO_FILTER_SIZE0=11 -DMLO_FILTER_SIZE1=11 -DMLO_FILTER_PAD0=2 -DMLO_FILTER_PAD1=2 -DMLO_FILTER_STRIDE0=4 -DMLO_FILTER_STRIDE1=4 -DSTRIDE_W=4 -DSTRIDE_H=4 -DMLO_N_OUTPUTS=64 -DMLO_N_INPUTS=3 -DMLO_BATCH_SZ=128 -DMLO_N_BATCH_LOOPS=1 -DMLO_OUT_BATCH_STRIDE=193600 -DMLO_OUT_CHANNEL_STRIDE=3025 -DMLO_OUT_STRIDE=55 -DMLO_IN_BATCH_STRIDE=150528 -DMLO_IN_CHANNEL_STRIDE=50176 -DMLO_IN_STRIDE=224 -DMLO_WEI_BATCH_STRIDE=363 -DMLO_WEI_CHANNEL_STRIDE=121 -DMLO_IN_WIDTH=224 -DMLO_IN_HEIGHT=224 -DMLO_OUT_WIDTH=55 -DMLO_OUT_HEIGHT=55 -DMLO_IN_TILE1=1 -DMLO_IN_TILE0=1 -DMLO_N_LCL_BATCHS=1 -DMLO_N_LCL_OUT_MAPS=6 -DMLO_N_LCL_IN_MAPS=1 -DMLO_IN_PIX_TILE1=1 -DMLO_IN_PIX_TILE0=1 -DMLO_OUT_PIX_TILE1=1 -DMLO_OUT_PIX_TILE0=3 -DMLO_OUT_STACKS=1 -DMLO_IN_STACKS=1 -DMLO_N_WAVES=4 -DMLO_N_FILTER_SPLITS0=3 -DMLO_N_FILTER_SPLITS1=3 -DMLO_PROCESSING_WIDTH=19 -DMLO_OUT_EXTENT1=13 -DMLO_LAST_OUT_EXTENT1=3 -DMLO_N_LCL_BATCHS_PASS2=4 -DMLO_TILE_REPLICATE0=2 -DMLO_TILE_REPLICATE1=1 -DMLO_LCL_BWD_MEM_SZ=726 -DMLO_N_IN_BWD_HORIZ_READS=17 -DMLO_N_IN_BWD_VERT_READS=6 -DMLO_READ_TYPE=_FLOAT10 -DMLO_READ_UNIT=10 -DMLO_HW_WAVE_SZ=64 -DMLO_LG2_WAVE_SZ=6 -DMLO_N_WAVES_MASK=3 -DMLO_CONV_BIAS=0 -cl-denorms-are-zero src/Kernels/MIOpenConvFwd_LxL_11.cl -k MIOpenCvFwd11x11 -dumpilisa -r 10 if#77070336: if#92928: if#99123200: iv#0 1024,11,128/256,1,1
key: miopenConvolutionFwdAlgoDirect,3x224x224x11x11x64x55x55x128xNCHWxFP32x1
Kernel filename: MIOpenConvFwd_LxL_11.cl
runcl -DMLO_DIR_FORWARD=1 -DMLO_GRP_SZ=256 -DMLO_GRP_SZ0=256 -DMLO_GRP_SZ1=1 -DMLO_GRP_SZ2=1 -DMLO_FILTER_SIZE0=11 -DMLO_FILTER_SIZE1=11 -DMLO_FILTER_PAD0=2 -DMLO_FILTER_PAD1=2 -DMLO_FILTER_STRIDE0=4 -DMLO_FILTER_STRIDE1=4 -DSTRIDE_W=4 -DSTRIDE_H=4 -DMLO_N_OUTPUTS=64 -DMLO_N_INPUTS=3 -DMLO_BATCH_SZ=128 -DMLO_N_BATCH_LOOPS=1 -DMLO_OUT_BATCH_STRIDE=193600 -DMLO_OUT_CHANNEL_STRIDE=3025 -DMLO_OUT_STRIDE=55 -DMLO_IN_BATCH_STRIDE=150528 -DMLO_IN_CHANNEL_STRIDE=50176 -DMLO_IN_STRIDE=224 -DMLO_WEI_BATCH_STRIDE=363 -DMLO_WEI_CHANNEL_STRIDE=121 -DMLO_IN_WIDTH=224 -DMLO_IN_HEIGHT=224 -DMLO_OUT_WIDTH=55 -DMLO_OUT_HEIGHT=55 -DMLO_IN_TILE1=1 -DMLO_IN_TILE0=1 -DMLO_N_LCL_BATCHS=1 -DMLO_N_LCL_OUT_MAPS=6 -DMLO_N_LCL_IN_MAPS=1 -DMLO_IN_PIX_TILE1=1 -DMLO_IN_PIX_TILE0=1 -DMLO_OUT_PIX_TILE1=1 -DMLO_OUT_PIX_TILE0=3 -DMLO_OUT_STACKS=1 -DMLO_IN_STACKS=1 -DMLO_N_WAVES=4 -DMLO_N_FILTER_SPLITS0=3 -DMLO_N_FILTER_SPLITS1=3 -DMLO_PROCESSING_WIDTH=19 -DMLO_OUT_EXTENT1=13 -DMLO_LAST_OUT_EXTENT1=3 -DMLO_N_LCL_BATCHS_PASS2=4 -DMLO_TILE_REPLICATE0=2 -DMLO_TILE_REPLICATE1=1 -DMLO_LCL_BWD_MEM_SZ=726 -DMLO_N_IN_BWD_HORIZ_READS=17 -DMLO_N_IN_BWD_VERT_READS=6 -DMLO_READ_TYPE=_FLOAT10 -DMLO_READ_UNIT=10 -DMLO_HW_WAVE_SZ=64 -DMLO_LG2_WAVE_SZ=6 -DMLO_N_WAVES_MASK=3 -DMLO_CONV_BIAS=0 -cl-denorms-are-zero src/Kernels/MIOpenConvFwd_LxL_11.cl -k MIOpenCvFwd11x11_2 -dumpilisa -r 10 if#77070336: if#92928: if#99123200: iv#0 256,11,32/256,1,1
key: miopenConvolutionFwdAlgoDirect_pass2,3x224x224x11x11x64x55x55x128xNCHWxFP32x1x1
[INFO] MIOpen Found 2 fwd algorithms, choosing 1:
[INFO] 0) 1 - time: 5.10683, Memory: 0
[INFO] 1) 0 - time: 15.7082, Memory: 4392300
[DEBUG] Init fwd Conv(5x5,pad=2,s=1) (128,64,27,27)->(128,192,27,27) req workspace: 214466560
[DEBUG] >>> Resizing workspace 4392300 -> 214466560
runcl -DNUM_CH_PER_WG=1 -DNUM_IM_BLKS_X=1 -DNUM_IM_BLKS=4 -DLOCAL_MEM_SIZE=432 -DSTRIDE_GT_1=0 -DTILE_SZ_X=32 -DTILE_SZ_Y=8 -DUSE_IM_OFF_GUARD=1 src/Kernels/MIOpenUtilKernels.cl -k Im2Col -dumpilisa -r 10 if#0: if#0: if#0: iv#0 65536,1,1/256,1,1
key: miopenIm2Col,
Kernel filename: MIOpenUtilKernels.cl
key: miopenConvolutionFwdAlgoGEMM,tC0_tA0_tB0_colMaj1_m729_n192_k1600_lda729_ldb1600_ldc729_ws0_f32
key: miopenConvolutionFwdAlgoGEMM_beta,tC0_tA0_tB0_colMaj1_m729_n192_k1600_lda729_ldb1600_ldc729_ws0_f32
key: miopenConvolutionFwdAlgoGEMM_beta tC0_tA0_tB0_colMaj1_m729_n192_k1600_lda729_ldb1600_ldc729_ws0_f32
key: miopenConvolutionFwdAlgoGEMM tC0_tA0_tB0_colMaj1_m729_n192_k1600_lda729_ldb1600_ldc729_ws0_f32
key: miopenConvolutionFwdAlgoWinograd,64x27x27x5x5x192x27x27x128xNCHWxFP32x1
Kernel filename: conv_u1v1_wheel_alpha_v8_4_4_gfx803.so
runcl -DMLO_HW_WAVE_SZ=64 -DMLO_DIR_FORWARD=1 -DMLO_FILTER_SIZE0=5 -DMLO_FILTER_SIZE1=5 -DMLO_FILTER_PAD0=2 -DMLO_FILTER_PAD1=2 -DMLO_N_OUTPUTS=192 -DMLO_N_INPUTS=64 -DMLO_BATCH_SZ=128 -DMLO_OUT_WIDTH=27 -DMLO_OUT_HEIGHT=27 -DMLO_OUT_BATCH_STRIDE=139968 -DMLO_OUT_CHANNEL_STRIDE=729 -DMLO_OUT_STRIDE=27 -DMLO_IN_WIDTH=27 -DMLO_IN_HEIGHT=27 -DMLO_IN_BATCH_STRIDE=46656 -DMLO_IN_CHANNEL_STRIDE=729 -DMLO_IN_STRIDE=27 -DMLO_IN_TILE0=27 -DMLO_IN_TILE1=27 -DMLO_OUT_TILE0=27 -DMLO_OUT_TILE1=27 -DMLO_GRP_TILE0=16 -DMLO_GRP_TILE1=16 -DMLO_ACTIVE_ALUS=252 -DMLO_N_ALUTILES_PERSTACK=2 -DMLO_OUT_PIX_TILE0=3 -DMLO_OUT_PIX_TILE1=2 -DMLO_N_STACKS=1 -DMLO_N_OUT_TILES=7 -DMLO_N_OUT_TILES_PERSTACK=14 -DMLO_N_IN_TILES_PERSTACK=2 -DMLO_N_READ_PROCS=256 -DMLO_CONV_BIAS=0 -DMLO_ALU_VTILE0=9 -DMLO_ALU_VTILE1=14 src/Kernels/MIOpenConvDirUniC.cl -k MIOpenConvUniC -dumpilisa -r 10 if#23887872: if#1228800: if#71663616: iv#0 256,14,128/256,1,1
key: miopenConvolutionFwdAlgoDirect,64x27x27x5x5x192x27x27x128xNCHWxFP32x1
Kernel filename: MIOpenConvDirUniC.cl
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_fwd_in -dumpilisa -r 10 if#0: if#0: if#0: iv#0 524288,1,1/64,1,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_0
Kernel filename: MIOpenConvFFT.cl
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_fwd_we -dumpilisa -r 10 if#0: if#0: if#0: iv#0 786432,1,1/64,1,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_1
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_transpose_in -dumpilisa -r 10 if#0: if#0: if#0: iv#0 1114112,1,1/256,1,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_2
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_transpose_we -dumpilisa -r 10 if#0: if#0: if#0: iv#0 1671168,1,1/256,1,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_3
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_cgemm -dumpilisa -r 10 if#0: if#0: if#0: iv#0 48,32,544/16,16,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_4
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_transpose_out -dumpilisa -r 10 if#0: if#0: if#0: iv#0 3342336,1,1/256,1,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_5
runcl -DCFF_CGEMM_CHOICE_1=1 -DCFF_IMG_SZ_27_27 -DCFF_IMG_H=27 -DCFF_IMG_W=27 -DCFF_BATCH=128 -DCFF_NFILTER=192 -DCFF_CHANNELS=64 -DCFF_HALFW=13404160 src/Kernels/MIOpenConvFFT.cl -k MIOpenConvFFT_inv_out -dumpilisa -r 10 if#0: if#0: if#0: iv#0 1572864,1,1/64,1,1
key: miopenConvolutionFwdAlgoFFT,FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_6
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_0
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_1
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_2
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_3
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_4
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_5
key: miopenConvolutionFwdAlgoFFT FFT_x_in_h_27_in_w_27_in_n_128_in_c_64_out_c_192_kernel_6
[INFO] MIOpen Found 4 fwd algorithms, choosing 2:
[INFO] 0) 2 - time: 3.4942, Memory: 214466560
[INFO] 1) 3 - time: 6.27657, Memory: 0
[INFO] 2) 1 - time: 11.4962, Memory: 0
[INFO] 3) 0 - time: 21.2581, Memory: 4665600
[DEBUG] Init fwd Conv(3x3,pad=1,s=1) (128,192,13,13)->(128,384,13,13) req workspace: 0
key: miopenConvolutionFwdAlgoWinograd,192x13x13x3x3x384x13x13x128xNCHWxFP32x1
Kernel filename: conv_3x3_wheel_alpha_v3_0b_gfx803_m30.so
[INFO] MIOpen Found 1 fwd algorithms, choosing 3:
[INFO] 0) 3 - time: 2.32301, Memory: 0
[DEBUG] Init fwd Conv(3x3,pad=1,s=1) (128,384,13,13)->(128,256,13,13) req workspace: 0
key: miopenConvolutionFwdAlgoWinograd,384x13x13x3x3x256x13x13x128xNCHWxFP32x1
[INFO] MIOpen Found 1 fwd algorithms, choosing 3:
[INFO] 0) 3 - time: 3.01709, Memory: 0
[DEBUG] Init fwd Conv(3x3,pad=1,s=1) (128,256,13,13)->(128,256,13,13) req workspace: 0
key: miopenConvolutionFwdAlgoWinograd,256x13x13x3x3x256x13x13x128xNCHWxFP32x1
[INFO] MIOpen Found 1 fwd algorithms, choosing 3:
[INFO] 0) 3 - time: 2.0419, Memory: 0
[INFO] Begin warmup runs
[INFO] ======= BEGIN FWD =======
key: miopenConvolutionFwdAlgoDirect 3x224x224x11x11x64x55x55x128xNCHWxFP32x1
key: miopenConvolutionFwdAlgoDirect_pass2 3x224x224x11x11x64x55x55x128xNCHWxFP32x1x1
[INFO] Conv(11x11,pad=2,s=4) (128,3,224,224)->(128,64,55,55): 5.074 ms
runcl -DMLO_NRN_GROUP_SZ0=256 -DMLO_NRN_GROUP_SZ1=1 -DMLO_NRN_OP_ID=3 -DMLO_N_PIXS_OFF=0 -DMLO_MAP_SZ=24780800 -DMLO_MAP_SZ_ALIGNED=6195200 -DMLO_READ_UNIT=4 src/Kernels/MIOpenNeuron.cl -k MIOpenNeuronFwd -dumpilisa -r 10 if#0: if#0: if#0: iv#0 6195200,1,1/256,1,1
key: miopenActivationForward,64x55x55x3x3x64x55x55x128xNCHWxFP32x1
Kernel filename: MIOpenNeuron.cl
MIOpen Error: /home/masa/MIOpen/src/hipoc/hipoc_program.cpp:96: Failed creating module hipErrorOutOfMemory
error: 'StatusUnknownError '(7) at ./layers.hpp:277
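As a sanity check on the "4 GB should be enough" claim, the allocations in the log can be added up. The sketch below assumes 4-byte floats and that the logged kB figure is the byte count divided by 1024 and rounded down (both appear consistent with the log, e.g. (128,64,55,55) gives 96800 kB); the function names and the `SAMPLE` excerpt are illustrative only. Summing the `total size` figures in the full log above comes to roughly 1,070 MiB, plus a peak workspace of 214466560 bytes (about 205 MiB), which is well under the 4096 MiB the device reports.

```python
import re
from math import prod

ALLOC_RE = re.compile(
    r"Allocating Float Tensor \(([\d,]+)\), total size: (\d+) kB"
)
RESIZE_RE = re.compile(r"Resizing workspace \d+ -> (\d+)")


def tensor_kb(shape, bytes_per_elem=4):
    """Size of a dense tensor in kB (1024 bytes), rounded down."""
    return prod(shape) * bytes_per_elem // 1024


def summarize(log_text):
    """Return (sum of logged tensor sizes in kB, peak workspace in bytes)."""
    tensors_kb = sum(int(m.group(2)) for m in ALLOC_RE.finditer(log_text))
    workspace = max(
        (int(m.group(1)) for m in RESIZE_RE.finditer(log_text)), default=0
    )
    return tensors_kb, workspace


# A short excerpt copied from the log above; paste the full log to get
# the complete totals.
SAMPLE = """\
[DEBUG] Allocating Float Tensor (64,3,11,11), total size: 90 kB
[DEBUG] Allocating Float Tensor (128,64,55,55), total size: 96800 kB
[DEBUG] >>> Resizing workspace 0 -> 4392300
[DEBUG] >>> Resizing workspace 4392300 -> 214466560
"""

if __name__ == "__main__":
    kb, ws = summarize(SAMPLE)
    print(f"tensors: {kb / 1024:.1f} MiB, workspace: {ws / 2**20:.1f} MiB")
```

This only counts what the benchmark logs explicitly; memory used by the runtime or by compiled kernels is not included.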