Skip to content

MarkIngramUK/ocl-compile-benchmark

Repository files navigation

Compilation times for the Addition kernel:

__kernel void Addition(__global float* input1, __global float* input2, __global float* output)  
{  
	int index = get_global_id(0);  
	output[index] = input1[index] + input2[index];  
}

AMD Radeon PRO W6800 (OpenCL 2.0 AMD-APP (3354.13)):

Run Duration (ms)
1 467.382
2 106.683
3 106.861
4 107.879
5 107.799
6 104.983
7 107.775
8 105.926
9 107.524
10 107.95
- -
Average 143.076

AMD Radeon RX 5700 XT (OpenCL 2.0 AMD-APP (3188.4)):

Run Duration (ms)
1 2729.53
2 1254.58
3 1242.51
4 1278.31
5 1202.36
6 1269.07
7 1237.95
8 1267.08
9 1252
10 1268.22
- -
Average 1400.16

AMD Radeon (TM) R9 390 Series (OpenCL 2.0 AMD-APP (3110.7)):

Run Duration (ms)
1 282.61
2 26.6979
3 27.009
4 27.0421
5 26.917
6 28.2667
7 29.3081
8 27.0859
9 25.7717
10 29.2355
- -
Average 52.9943

Intel(R) Core(TM) i9-7940X CPU @ 3.10GHz (OpenCL 2.1 (Build 0)):

Run Duration (ms)
1 47.1116
2 19.7415
3 18.9358
4 19.5806
5 19.0173
6 19.5361
7 19.1919
8 19.4186
9 19.511
10 19.8893
- -
Average 22.1934

Nvidia GeForce GTX 1650 (OpenCL 1.2 CUDA):

Run Duration (ms)
1 470.605
2 0.3794
3 0.3591
4 0.2699
5 0.2626
6 0.2632
7 0.2584
8 0.2573
9 0.2595
10 0.2619
- -
Average 47.3176

Note: The Nvidia driver appears to cache compiled kernels, leading to a more impressive average.

Intel(R) UHD Graphics 630 (OpenCL 2.1 NEO ):

Run Duration (ms)
1 647.542
2 319.691
3 315.673
4 322.019
5 317.625
6 319.044
7 319.101
8 364.771
9 347.076
10 382.742
- -
Average 365.528

About

Test project to demonstrate poor compilation performance in AMD OpenCL driver

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published