# ./conv_timing.py 36

Op: forward
Image shape, filter shape  (128, 3, 128, 128) (96, 3, 5, 5)  Imagenet-like

 MKL  OpenMP  Time(ms)  Speedup  Notes
   1    None       719     0.42  single thread
  36    None       300     1.00  BLAS only, baseline
None      36        68     4.41  only OMP set
   2       1       427     0.70  BLAS only
   4       1       328     0.92  BLAS only
   8       1       286     1.05  BLAS only
  16       1       279     1.08  BLAS only
  32       1       291     1.03  BLAS only
   1       2       344     0.87  OMP only
   1       4       186     1.61  OMP only
   1       8       162     1.85  OMP only
   1      16        82     3.63  OMP only
   1      32        56     5.35  OMP only
   1      36        56     5.30  OMP only
  36       1       307     0.98  MKL=36 OMP=1
  18       2       157     1.91  MKL=18 OMP=2
  12       3       117     2.56  MKL=12 OMP=3
   9       4        98     3.07  MKL=9 OMP=4
   6       6        76     3.95  MKL=6 OMP=6
   4       9        68     4.41  MKL=4 OMP=9
   3      12        62     4.82  MKL=3 OMP=12
   2      18        59     5.06  MKL=2 OMP=18
   1      36        56     5.30  MKL=1 OMP=36

Op: forward
Image shape, filter shape  (128, 85, 2, 258) (64, 85, 2, 20)  Spectrogram-like

 MKL  OpenMP  Time(ms)  Speedup  Notes
   1    None       281     0.56  single thread
  36    None       158     1.00  BLAS only, baseline
None      36        30     5.26  only OMP set
   2       1       217     0.73  BLAS only
   4       1       161     0.98  BLAS only
   8       1       135     1.16  BLAS only
  16       1       165     0.96  BLAS only
  32       1       160     0.99  BLAS only
   1       2       151     1.04  OMP only
   1       4       131     1.20  OMP only
   1       8        53     2.94  OMP only
   1      16        29     5.41  OMP only
   1      32        21     7.33  OMP only
   1      36        20     7.66  OMP only
  36       1       161     0.98  MKL=36 OMP=1
  18       2        96     1.64  MKL=18 OMP=2
  12       3        65     2.41  MKL=12 OMP=3
   9       4        44     3.53  MKL=9 OMP=4
   6       6        34     4.61  MKL=6 OMP=6
   4       9        27     5.72  MKL=4 OMP=9
   3      12        23     6.80  MKL=3 OMP=12
   2      18        21     7.23  MKL=2 OMP=18
   1      36        21     7.45  MKL=1 OMP=36

Op: gradInputs
Image shape, filter shape  (128, 3, 128, 128) (96, 3, 5, 5)  Imagenet-like

 MKL  OpenMP  Time(ms)  Speedup  Notes
   1    None      1395     0.35  single thread
  36    None       492     1.00  BLAS only, baseline
None      36       136     3.62  only OMP set
   2       1       933     0.53  BLAS only
   4       1       661     0.74  BLAS only
   8       1       528     0.93  BLAS only
  16       1       493     1.00  BLAS only
  32       1       483     1.02  BLAS only
   1       2       666     0.74  OMP only
   1       4       428     1.15  OMP only
   1       8       216     2.27  OMP only
   1      16       159     3.08  OMP only
   1      32       101     4.85  OMP only
   1      36       104     4.70  OMP only
  36       1       497     0.99  MKL=36 OMP=1
  18       2       281     1.75  MKL=18 OMP=2
  12       3       212     2.32  MKL=12 OMP=3
   9       4       157     3.14  MKL=9 OMP=4
   6       6       147     3.35  MKL=6 OMP=6
   4       9       119     4.12  MKL=4 OMP=9
   3      12       112     4.40  MKL=3 OMP=12
   2      18       104     4.73  MKL=2 OMP=18
   1      36       104     4.72  MKL=1 OMP=36

Op: gradInputs
Image shape, filter shape  (128, 85, 2, 258) (64, 85, 2, 20)  Spectrogram-like

 MKL  OpenMP  Time(ms)  Speedup  Notes
   1    None       595     0.49  single thread
  36    None       293     1.00  BLAS only, baseline
None      36        60     4.83  only OMP set
   2       1       473     0.62  BLAS only
   4       1       323     0.91  BLAS only
   8       1       279     1.05  BLAS only
  16       1       308     0.95  BLAS only
  32       1       299     0.98  BLAS only
   1       2       308     0.95  OMP only
   1       4       291     1.01  OMP only
   1       8       130     2.26  OMP only
   1      16        64     4.56  OMP only
   1      32        44     6.66  OMP only
   1      36        47     6.21  OMP only
  36       1       293     1.00  MKL=36 OMP=1
  18       2       173     1.69  MKL=18 OMP=2
  12       3       120     2.43  MKL=12 OMP=3
   9       4        97     3.02  MKL=9 OMP=4
   6       6        77     3.78  MKL=6 OMP=6
   4       9        65     4.48  MKL=4 OMP=9
   3      12        53     5.51  MKL=3 OMP=12
   2      18        48     6.02  MKL=2 OMP=18
   1      36        47     6.18  MKL=1 OMP=36

Op: gradWeights
Image shape, filter shape  (128, 3, 128, 128) (96, 3, 5, 5)  Imagenet-like

 MKL  OpenMP  Time(ms)  Speedup  Notes
   1    None       699     0.26  single thread
  36    None       179     1.00  BLAS only, baseline
None      36       802     0.22  only OMP set
   2       1       396     0.45  BLAS only
   4       1       277     0.65  BLAS only
   8       1       208     0.86  BLAS only
  16       1       194     0.92  BLAS only
  32       1       178     1.00  BLAS only
   1       2       359     0.50  OMP only
   1       4       269     0.66  OMP only
   1       8       155     1.15  OMP only
   1      16        64     2.80  OMP only
   1      32        41     4.28  OMP only
   1      36        40     4.46  OMP only
  36       1       177     1.01  MKL=36 OMP=1
  18       2       104     1.71  MKL=18 OMP=2
  12       3        80     2.22  MKL=12 OMP=3
   9       4        68     2.61  MKL=9 OMP=4
   6       6        70     2.53  MKL=6 OMP=6
   4       9        47     3.80  MKL=4 OMP=9
   3      12        42     4.21  MKL=3 OMP=12
   2      18        41     4.30  MKL=2 OMP=18
   1      36        42     4.22  MKL=1 OMP=36

Op: gradWeights
Image shape, filter shape  (128, 85, 2, 258) (64, 85, 2, 20)  Spectrogram-like

 MKL  OpenMP  Time(ms)  Speedup  Notes
   1    None       248     0.49  single thread
  36    None       121     1.00  BLAS only, baseline
None      36        30     3.95  only OMP set
   2       1       183     0.66  BLAS only
   4       1       178     0.68  BLAS only
   8       1       131     0.92  BLAS only
  16       1       130     0.94  BLAS only
  32       1       122     0.99  BLAS only
   1       2       206     0.59  OMP only
   1       4       100     1.21  OMP only
   1       8        52     2.30  OMP only
   1      16        27     4.49  OMP only
   1      32        23     5.27  OMP only
   1      36        22     5.42  OMP only
  36       1       123     0.98  MKL=36 OMP=1
  18       2        71     1.71  MKL=18 OMP=2
  12       3        54     2.24  MKL=12 OMP=3
   9       4        44     2.75  MKL=9 OMP=4
   6       6        34     3.49  MKL=6 OMP=6
   4       9        31     3.87  MKL=4 OMP=9
   3      12        23     5.19  MKL=3 OMP=12
   2      18        20     5.92  MKL=2 OMP=18
   1      36        24     5.00  MKL=1 OMP=36
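
The Speedup column appears to be the baseline time (the MKL=36, OpenMP unset row of each table) divided by the row's time. The sketch below recomputes it for a few rows of the first forward table. The helper name `speedup` is hypothetical and not taken from `conv_timing.py`, and because it works from the rounded Time(ms) values, results can differ from the printed column in the second decimal place (the script presumably divides unrounded timings).

```python
# Hypothetical helper, not part of conv_timing.py: recompute the Speedup
# column from the rounded Time(ms) values shown in the tables.
def speedup(baseline_ms, time_ms):
    """Speedup of a run relative to the baseline run (larger is faster)."""
    return baseline_ms / time_ms


# (MKL threads, OpenMP threads, Time(ms)) copied from the first
# forward / Imagenet-like table. None means the setting was left unset.
rows = [
    (1, None, 719),
    (36, None, 300),  # baseline row
    (None, 36, 68),
    (4, 9, 68),
    (1, 36, 56),
]

baseline_ms = 300  # Time(ms) of the "BLAS only, baseline" row

if __name__ == "__main__":
    print(" MKL  OpenMP  Time(ms)  Speedup")
    for mkl, omp, ms in rows:
        print(f"{str(mkl):>4}  {str(omp):>6}  {ms:>8}  {speedup(baseline_ms, ms):>7.2f}")
```

Read this way, a value below 1.00 means the configuration is slower than running with 36 MKL threads alone, as with the "only OMP set" row of the gradWeights Imagenet-like table (802 ms, 0.22).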