Typo in STORE_OUTPUT; deconvolution f2s2 computes incorrectly on Adreno GPU #69

Closed
chillingche opened this issue Oct 20, 2021 · 1 comment

@chillingche

./test_deconvolution_ocl 24 256 128 1 2 2 2 0

[DEBUG] thread 13883 OCLContext 0x61531c6278 constructor start
[DEBUG] thread 13883 try to dlopen libQUALCOMM_Adreno_660_map.so failed, dlopen failed: library "libQUALCOMM_Adreno_660_map.so" not found, create kernel from source code
[DEBUG] thread 13883 gcl_kernel_source 0xb40000714c3a1250 constructor
[DEBUG] thread 13883 OCLContext 0x61531c6278 constructor end
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_12 runInfo: ls <0 0 0> executeTime = 153.856000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_22 runInfo: ls <0 0 0> executeTime = 130.816000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_32 runInfo: ls <0 0 0> executeTime = 153.088000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_42 runInfo: ls <0 0 0> executeTime = 122.880000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_14 runInfo: ls <0 0 0> executeTime = 143.872000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 102.144000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_34 runInfo: ls <0 0 0> executeTime = 118.016000 us
[DEBUG] thread 13883 enqueue_fill_image runInfo: executeTime = 15.872000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_trans_fltbuf_44 runInfo: executeTime = 5.888000 us
[DEBUG] thread 13883 DATATRANS>>> enqueue_write_buffer runInfo: executeTime = 129.024000 us
[DEBUG] thread 13883 KERNEL>>> unknow_mem_trans_om_nchw_to_nchwc4 runInfo: executeTime = 113.920000 us
[INFO] thread 13883 warm up gpu:
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 102.912000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 100.864000 us
[DEBUG] thread 13883 KERNEL>>> unknow_deconv_gemm_f2s2_qc_iom_24 runInfo: ls <0 0 0> executeTime = 98.048000 us
[DEBUG] thread 13883 KERNEL>>> unknow_mem_trans_im_nchwc4_to_nchw runInfo: executeTime = 51.968000 us
[DEBUG] thread 13883 DATATRANS>>> enqueue_read_buffer runInfo: executeTime = 16.896000 us
[INFO] thread 13883 16bit,         Deonvolution,                                    (1 24 256 128)+(24 1 2 2)/(2 0)=(1 1 512 256),    TIME    0.098ms,        GFLOPS   65.504
abs(diff) >= 1.000000e+00f, number = 23
abs(diff) >= 1.000000e-01f, number = 822
abs(diff) >= 1.000000e-02f, number = 164
abs(diff) >= 1.000000e-03f, number = 1084
abs(diff) >= 1.000000e-04f, number = 85300
abs(diff) >= 1.000000e-05f, number = 3176
abs(diff) >= 0.000000e+00f, number = 40503
maxabs = 1.530273, a = 0.000000, b = 1.530273 @ 428
maxrel = 976.562500, a = -0.000244, b = 0.000244 @ 73386
[DEBUG] thread 13883 OCLContext 0x61531c6278 deconstructor start
[DEBUG] thread 13883 gcl_kernel_source 0xb40000714c3a1250 constructor
[DEBUG] thread 13883 OCLContext 0x61531c6278 deconstructor end
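
For reference, the shape reported in the log line `(1 24 256 128)+(24 1 2 2)/(2 0)=(1 1 512 256)` can be checked with the standard transposed-convolution size formula. The sketch below is only a sanity check on the test configuration, assuming no dilation and no output padding. The helper name `deconv_output_size` is hypothetical and not part of the library.

```python
def deconv_output_size(in_size, kernel, stride, pad):
    """Output length of a transposed convolution along one spatial axis.

    Assumes no dilation and no output padding.
    """
    return (in_size - 1) * stride + kernel - 2 * pad


# Values read from the log line: input 256 x 128, kernel 2, stride 2, pad 0.
out_h = deconv_output_size(256, kernel=2, stride=2, pad=0)
out_w = deconv_output_size(128, kernel=2, stride=2, pad=0)
print(out_h, out_w)  # 512 256
```

The reported output shape is therefore the expected one, so the mismatches listed under `abs(diff)` are in the computed values rather than in the tensor dimensions.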
@yunfanxiao
Contributor

Thank you very much for the feedback! We will fix this soon.
