Chapter 8: Speed Up Your Program

Compiler options (-O3 for GCC, -Wall, etc.)

SIMD (SSE, AVX, NEON, RISC-V, Universal Intrinsics of OpenCV)

OpenMP

Memory Hierarchies and Speed

Crop ROI from a 2D Matrix

Intel, ARM and RISC-V Architectures

Lab:

Create two 1M x 1K float matrices, matA and matB, and compute matA + matB.

  • Compute the result row by row and then column by column, and compare the performance of the two traversals.
  • Compile with -O3 and measure how much the speed improves.
  • Vectorize the addition with SIMD. Does the speed improve? Why or why not?
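As a starting point for the first lab item, here is a minimal sketch of the two traversal orders in C. It assumes the matrices are stored row-major in one contiguous buffer, which the lab does not specify. The function names and the `compare_traversals` helper are hypothetical, made up for this illustration. Note that a 1M x 1K float matrix takes about 4 GB, so the sketch takes the size as a parameter and can be tried on smaller matrices first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Row by row: the inner loop walks contiguous memory (row-major layout assumed). */
void add_row_by_row(const float *a, const float *b, float *c,
                    size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++)
        for (size_t k = 0; k < cols; k++)
            c[r * cols + k] = a[r * cols + k] + b[r * cols + k];
}

/* Column by column: the inner loop jumps `cols` floats on every step. */
void add_col_by_col(const float *a, const float *b, float *c,
                    size_t rows, size_t cols)
{
    for (size_t k = 0; k < cols; k++)
        for (size_t r = 0; r < rows; r++)
            c[r * cols + k] = a[r * cols + k] + b[r * cols + k];
}

/* Hypothetical helper: runs both traversals, prints the CPU time of each,
 * and returns 0 if the results agree, 1 if they differ, -1 if allocation fails. */
int compare_traversals(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);
    size_t n = rows * cols;
    float *a  = malloc(n * sizeof *a);
    float *b  = malloc(n * sizeof *b);
    float *c1 = malloc(n * sizeof *c1);
    float *c2 = malloc(n * sizeof *c2);
    if (!a || !b || !c1 || !c2) {
        free(a); free(b); free(c1); free(c2);
        return -1;
    }

    /* Fill the inputs with simple, exactly representable values. */
    for (size_t i = 0; i < n; i++) {
        a[i] = (float)(i % 100);
        b[i] = 1.0f;
    }

    clock_t t0 = clock();
    add_row_by_row(a, b, c1, rows, cols);
    clock_t t1 = clock();
    add_col_by_col(a, b, c2, rows, cols);
    clock_t t2 = clock();

    printf("row by row: %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("col by col: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);

    /* Both traversals must produce the same matrix. */
    int mismatch = 0;
    for (size_t i = 0; i < n; i++) {
        if (c1[i] != c2[i]) { mismatch = 1; break; }
    }

    free(a); free(b); free(c1); free(c2);
    return mismatch;
}
```

To try it, call `compare_traversals(rows, cols)` from your own `main`, for example with 10000 rows and 1000 columns, and build once with `gcc -O0` and once with `gcc -O3` to see how the timings change. The measured times depend on the machine and compiler, so none are quoted here.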