# ./conv_timing.py 36

Op: forward
Image shape, filter shape (128, 3, 128, 128) (96, 3, 5, 5)  Imagenet-like

MKL   OpenMP  Time(ms)  Speedup  Notes
1     None    733       0.44     single thread
36    None    324       1.00     BLAS only, baseline
None  36      247       1.31     only OMP set
2     1       461       0.70     BLAS only
4     1       371       0.87     BLAS only
8     1       354       0.92     BLAS only
16    1       342       0.95     BLAS only
32    1       331       0.98     BLAS only
1     2       609       0.53     OMP only
1     4       423       0.77     OMP only
1     8       353       0.92     OMP only
1     16      297       1.09     OMP only
1     32      227       1.43     OMP only
1     36      225       1.44     OMP only
36    1       328       0.99     MKL=36 OMP=1
18    2       296       1.10     MKL=18 OMP=2
12    3       264       1.23     MKL=12 OMP=3
9     4       253       1.28     MKL=9 OMP=4
6     6       231       1.40     MKL=6 OMP=6
4     9       214       1.51     MKL=4 OMP=9
3     12      205       1.58     MKL=3 OMP=12
2     18      208       1.56     MKL=2 OMP=18
1     36      229       1.42     MKL=1 OMP=36

Op: forward
Image shape, filter shape (128, 85, 2, 258) (64, 85, 2, 20)  Spectrogram-like

MKL   OpenMP  Time(ms)  Speedup  Notes
1     None    282       0.56     single thread
36    None    156       1.00     BLAS only, baseline
None  36      27        5.66     only OMP set
2     1       210       0.75     BLAS only
4     1       158       0.99     BLAS only
8     1       138       1.13     BLAS only
16    1       164       0.95     BLAS only
32    1       158       0.99     BLAS only
1     2       153       1.03     OMP only
1     4       88        1.78     OMP only
1     8       52        2.99     OMP only
1     16      31        4.93     OMP only
1     32      19        7.95     OMP only
1     36      26        5.86     OMP only
36    1       159       0.98     MKL=36 OMP=1
18    2       91        1.72     MKL=18 OMP=2
12    3       65        2.39     MKL=12 OMP=3
9     4       56        2.77     MKL=9 OMP=4
6     6       32        4.77     MKL=6 OMP=6
4     9       28        5.42     MKL=4 OMP=9
3     12      24        6.41     MKL=3 OMP=12
2     18      19        7.95     MKL=2 OMP=18
1     36      19        7.95     MKL=1 OMP=36

Op: gradInputs
Image shape, filter shape (128, 3, 128, 128) (96, 3, 5, 5)  Imagenet-like

MKL   OpenMP  Time(ms)  Speedup  Notes
1     None    1401      0.37     single thread
36    None    522       1.00     BLAS only, baseline
None  36      315       1.65     only OMP set
2     1       947       0.55     BLAS only
4     1       704       0.74     BLAS only
8     1       590       0.88     BLAS only
16    1       570       0.92     BLAS only
32    1       522       1.00     BLAS only
1     2       856       0.61     OMP only
1     4       651       0.80     OMP only
1     8       496       1.05     OMP only
1     16      371       1.41     OMP only
1     32      270       1.93     OMP only
1     36      266       1.96     OMP only
36    1       522       1.00     MKL=36 OMP=1
18    2       437       1.19     MKL=18 OMP=2
12    3       377       1.38     MKL=12 OMP=3
9     4       345       1.51     MKL=9 OMP=4
6     6       311       1.68     MKL=6 OMP=6
4     9       284       1.83     MKL=4 OMP=9
3     12      261       2.00     MKL=3 OMP=12
2     18      251       2.08     MKL=2 OMP=18
1     36      280       1.86     MKL=1 OMP=36

Op: gradInputs
Image shape, filter shape (128, 85, 2, 258) (64, 85, 2, 20)  Spectrogram-like

MKL   OpenMP  Time(ms)  Speedup  Notes
1     None    595       0.50     single thread
36    None    294       1.00     BLAS only, baseline
None  36      64        4.59     only OMP set
2     1       449       0.66     BLAS only
4     1       318       0.93     BLAS only
8     1       278       1.06     BLAS only
16    1       308       0.96     BLAS only
32    1       297       0.99     BLAS only
1     2       310       0.95     OMP only
1     4       172       1.71     OMP only
1     8       124       2.37     OMP only
1     16      68        4.32     OMP only
1     32      48        6.04     OMP only
1     36      50        5.82     OMP only
36    1       296       1.00     MKL=36 OMP=1
18    2       222       1.32     MKL=18 OMP=2
12    3       122       2.41     MKL=12 OMP=3
9     4       106       2.77     MKL=9 OMP=4
6     6       75        3.88     MKL=6 OMP=6
4     9       61        4.81     MKL=4 OMP=9
3     12      52        5.60     MKL=3 OMP=12
2     18      45        6.51     MKL=2 OMP=18
1     36      47        6.17     MKL=1 OMP=36

Op: gradWeights
Image shape, filter shape (128, 3, 128, 128) (96, 3, 5, 5)  Imagenet-like

MKL   OpenMP  Time(ms)  Speedup  Notes
1     None    695       0.26     single thread
36    None    177       1.00     BLAS only, baseline
None  36      829       0.21     only OMP set
2     1       411       0.43     BLAS only
4     1       274       0.65     BLAS only
8     1       213       0.83     BLAS only
16    1       197       0.90     BLAS only
32    1       178       1.00     BLAS only
1     2       361       0.49     OMP only
1     4       183       0.97     OMP only
1     8       155       1.14     OMP only
1     16      69        2.55     OMP only
1     32      38        4.61     OMP only
1     36      38        4.64     OMP only
36    1       181       0.98     MKL=36 OMP=1
18    2       104       1.70     MKL=18 OMP=2
12    3       77        2.28     MKL=12 OMP=3
9     4       87        2.03     MKL=9 OMP=4
6     6       57        3.07     MKL=6 OMP=6
4     9       47        3.72     MKL=4 OMP=9
3     12      44        3.98     MKL=3 OMP=12
2     18      40        4.40     MKL=2 OMP=18
1     36      40        4.40     MKL=1 OMP=36

Op: gradWeights
Image shape, filter shape (128, 85, 2, 258) (64, 85, 2, 20)  Spectrogram-like

MKL   OpenMP  Time(ms)  Speedup  Notes
1     None    247       0.49     single thread
36    None    120       1.00     BLAS only, baseline
None  36      33        3.64     only OMP set
2     1       187       0.64     BLAS only
4     1       191       0.63     BLAS only
8     1       141       0.85     BLAS only
16    1       139       0.87     BLAS only
32    1       122       0.98     BLAS only
1     2       134       0.90     OMP only
1     4       80        1.50     OMP only
1     8       55        2.17     OMP only
1     16      29        4.07     OMP only
1     32      21        5.60     OMP only
1     36      22        5.35     OMP only
36    1       116       1.04     MKL=36 OMP=1
18    2       67        1.79     MKL=18 OMP=2
12    3       54        2.23     MKL=12 OMP=3
9     4       51        2.35     MKL=9 OMP=4
6     6       34        3.54     MKL=6 OMP=6
4     9       31        3.84     MKL=4 OMP=9
3     12      22        5.47     MKL=3 OMP=12
2     18      25        4.80     MKL=2 OMP=18
1     36      22        5.35     MKL=1 OMP=36
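
The Speedup column appears to be the baseline time (the "BLAS only, baseline" row of each table) divided by that row's time, for example 324 / 733 ≈ 0.44 in the forward Imagenet-like table. The sketch below reproduces that arithmetic for three rows of that table. The helper name `speedup` and the dictionary layout are hypothetical and are not taken from conv_timing.py.

```python
# Times (ms) copied from the forward / Imagenet-like table.
# BASELINE_MS is the "BLAS only, baseline" row (MKL=36, OpenMP unset).
BASELINE_MS = 324

timings_ms = {
    "single thread": 733,
    "only OMP set": 247,
    "MKL=3 OMP=12": 205,
}


def speedup(baseline_ms, time_ms):
    """Hypothetical helper: ratio above 1.0 means faster than the baseline."""
    return baseline_ms / time_ms


# Round to two decimals, matching the precision printed in the tables.
speedups = {
    label: round(speedup(BASELINE_MS, t), 2) for label, t in timings_ms.items()
}

for label, s in speedups.items():
    print(f"{label:15s} {s:.2f}")
```

Recomputing from the printed milliseconds does not always match the printed speedup exactly (156 / 27 ≈ 5.78 against a printed 5.66 in the forward Spectrogram-like table), presumably because the script divides unrounded times.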