Remove GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK
Fixes #117
jart committed Jan 12, 2018
1 parent fcf32e7 commit 3bdc185
Showing 4 changed files with 4 additions and 32 deletions.
11 changes: 4 additions & 7 deletions README.md
@@ -54,13 +54,10 @@ Current optimized code paths:
* ARM with NEON (both 32bit and 64bit).
* Intel x86 with SSE 4.1 (both 32bit and 64bit).

-If you are building for x86, it's important that you pass in the `-msse4.1`
-compiler flag when building, or you'll end up using slow reference code. If
-you're building with Bazel, you can do this by running `bazel build gemmlowp:all
---copt=-msse4.1`. If you're building for a machine with no SIMD support in
-gemmlowp then by default you'll see an error. If you want to run with the
-reference implementations anyway, you can override the error by specifying
-`GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK` as a build define.
+When building for x86, it's important to pass `-msse4.1` or `-march=native` to
+the compiler, otherwise gemmlowp will use slow reference code. Bazel users can
+compile by running `bazel build --config=opt //gemmlowp:all` or alternatively
+pass the `--copt=-msse4.1` flag.

Details of what it takes to make an efficient port of gemmlowp, namely writing a
suitable GEMM kernel and accompanying packing code, are explained in this file:
5 changes: 0 additions & 5 deletions contrib/CMakeLists.txt
@@ -12,11 +12,6 @@ set(CMAKE_CXX_STANDARD 11)

get_filename_component(gemmlowp_src ${gemmlowp_SOURCE_DIR} PATH)

-# Enabling SIMD is recommended for modern x86 machines
-#add_definitions(-msse4)
-# However here we take the best compatibility
-add_definitions(-DGEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK)
-
if(WIN32)
# one can enable simd from the cmake command line, ie -DCMAKE_CXX_FLAGS="/arch:AVX2
add_definitions(-DNOMINMAX -DWIN64 -DWIN32_LEAN_AND_MEAN -DNOGDI)
3 changes: 0 additions & 3 deletions eight_bit_int_gemm/eight_bit_int_gemm.cc
@@ -12,9 +12,6 @@
// See the License for the specific language governing permissions and
// limitations under the License.

-#ifndef GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK
-#define GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK
-#endif
#include "eight_bit_int_gemm.h"

#include <memory>
17 changes: 0 additions & 17 deletions internal/kernel_default.h
@@ -79,23 +79,6 @@ GEMMLOWP_SET_DEFAULT_KERNEL(false, false, SSE4_32_Kernel4x4Depth2)
#include "kernel_sse.h"
GEMMLOWP_SET_DEFAULT_KERNEL(false, false, SSE4_64_Kernel12x4Depth2)
#else
-#ifndef GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK
-#if defined __ARM_ARCH_5TE__
-// SIMD is not available on this platform. The slow fallback will be used.
-// Don't require GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK because there's nothing
-// the user can do about it.
-#elif defined __powerpc__
-// There is currently no fast kernel using SIMD instructions on POWER. Don't
-// require GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK because there's nothing the user
-// can do about it.
-#else
-#error \
-"SIMD not enabled, you'd be getting a slow software fallback. Consider \
-enabling SIMD extensions (for example using -msse4 if you're on modern x86). \
-If that's not an option, and you would like to continue with the \
-slow fallback, define GEMMLOWP_ALLOW_SLOW_SCALAR_FALLBACK."
-#endif
-#endif
#include "kernel_reference.h"
namespace gemmlowp {
typedef ReferenceKernel<KernelFormat<
