[C++] Dynamic hook simd level seems not involve sse4.2 #38623

Closed
chasingegg opened this issue Nov 7, 2023 · 7 comments
Labels
Component: C++ Type: usage Issue is a user question

Comments

@chasingegg

Describe the usage question you have. Please include as many useful details as possible.

I briefly went through the build-related code, and it seems that we cannot use the SSE4_2 SIMD level at runtime?

template <typename KernelType>
const KernelType* DispatchExactImpl(const std::vector<KernelType*>& kernels,
                                    const std::vector<TypeHolder>& values) {
  const KernelType* kernel_matches[SimdLevel::MAX] = {nullptr};

  // Validate arity
  for (const auto& kernel : kernels) {
    if (kernel->signature->MatchesInputs(values)) {
      kernel_matches[kernel->simd_level] = kernel;
    }
  }

  // Dispatch as the CPU feature
#if defined(ARROW_HAVE_RUNTIME_AVX512) || defined(ARROW_HAVE_RUNTIME_AVX2)
  auto cpu_info = arrow::internal::CpuInfo::GetInstance();
#endif
#if defined(ARROW_HAVE_RUNTIME_AVX512)
  if (cpu_info->IsSupported(arrow::internal::CpuInfo::AVX512)) {
    if (kernel_matches[SimdLevel::AVX512]) {
      return kernel_matches[SimdLevel::AVX512];
    }
  }
#endif
#if defined(ARROW_HAVE_RUNTIME_AVX2)
  if (cpu_info->IsSupported(arrow::internal::CpuInfo::AVX2)) {
    if (kernel_matches[SimdLevel::AVX2]) {
      return kernel_matches[SimdLevel::AVX2];
    }
  }
#endif
  if (kernel_matches[SimdLevel::NONE]) {
    return kernel_matches[SimdLevel::NONE];
  }

  return nullptr;
}

ARROW_HAVE_RUNTIME_SSE4_2 is never used at runtime? Am I missing something? If this is a bug, I'd like to fix it.

Component(s)

C++

@chasingegg chasingegg added the Type: usage Issue is a user question label Nov 7, 2023
@kou kou changed the title Dynamic hook simd level seems not involve sse4.2 [C++] Dynamic hook simd level seems not involve sse4.2 Nov 7, 2023
@kou
Member

kou commented Nov 7, 2023

It seems that we don't have any kernel for SSE4.2. So I think that this is not a bug.

Anyway, do you want to implement a kernel for SSE4.2? Then we can support SSE4.2 in the dispatch code.
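
For illustration, here is a minimal sketch of what the dispatch could look like with an SSE4.2 branch added. It is a simplified, self-contained model and not Arrow's actual code: `Kernel`, `CpuFeatures`, and `PickKernel` are hypothetical names, and the reduced `SimdLevel` enum is an assumption made to keep the example short.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for Arrow's SimdLevel, kernel, and CpuInfo types.
enum class SimdLevel { NONE = 0, SSE4_2, AVX2, AVX512, MAX };

struct Kernel {
  SimdLevel simd_level;
  const char* name;
};

struct CpuFeatures {
  bool sse4_2 = false;
  bool avx2 = false;
  bool avx512 = false;
};

// Pick the kernel with the highest SIMD level the CPU supports,
// falling back to the scalar (NONE) kernel, or nullptr if none is registered.
const Kernel* PickKernel(const std::vector<Kernel>& kernels, const CpuFeatures& cpu) {
  std::array<const Kernel*, static_cast<std::size_t>(SimdLevel::MAX)> matches{};
  for (const auto& k : kernels) {
    matches[static_cast<std::size_t>(k.simd_level)] = &k;
  }
  auto at = [&](SimdLevel level) { return matches[static_cast<std::size_t>(level)]; };

  if (cpu.avx512 && at(SimdLevel::AVX512)) return at(SimdLevel::AVX512);
  if (cpu.avx2 && at(SimdLevel::AVX2)) return at(SimdLevel::AVX2);
  // The additional branch: only reached when no wider kernel is usable.
  if (cpu.sse4_2 && at(SimdLevel::SSE4_2)) return at(SimdLevel::SSE4_2);
  return at(SimdLevel::NONE);
}
```

In this model the SSE4.2 branch only matters once some kernel is actually registered at that level, which is why the missing branch is harmless today.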

@js8544
Collaborator

js8544 commented Nov 8, 2023

Adding to kou's comment, the code you pasted is specific to the Arrow compute functions. Currently no compute function has an SSE kernel implementation, so there's no need yet to check for SSE at runtime.

There are other places where Arrow uses SSE, such as in handling CSV and JSON files. ARROW_HAVE_SSE4_2 is used there instead of ARROW_HAVE_RUNTIME_SSE4_2 because those code paths don't require runtime dispatching, unlike compute functions.
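
To make the compile-time case concrete, here is a minimal sketch of code selected by a preprocessor constant rather than by a runtime check. It assumes the compiler-defined macro `__SSE4_2__` as a stand-in for ARROW_HAVE_SSE4_2, and `Crc32cByte` and `Crc32c` are hypothetical helpers, not Arrow functions.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__SSE4_2__)
#include <nmmintrin.h>  // SSE4.2 intrinsics, including _mm_crc32_u8
#endif

// Hypothetical helper: update a CRC32C value with one byte.
// Which branch ends up in the binary is decided when the code is compiled.
inline std::uint32_t Crc32cByte(std::uint32_t crc, std::uint8_t byte) {
#if defined(__SSE4_2__)
  // Hardware CRC32C instruction; the binary now requires an SSE4.2 CPU.
  return _mm_crc32_u8(crc, byte);
#else
  // Portable bitwise fallback using the reflected CRC32C polynomial.
  crc ^= byte;
  for (int i = 0; i < 8; ++i) {
    crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  }
  return crc;
#endif
}

// CRC32C over a buffer, with the usual initial value and final inversion.
inline std::uint32_t Crc32c(const void* data, std::size_t size) {
  const auto* bytes = static_cast<const std::uint8_t*>(data);
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < size; ++i) {
    crc = Crc32cByte(crc, bytes[i]);
  }
  return crc ^ 0xFFFFFFFFu;
}
```

Both branches compute the same checksum; the difference is that the intrinsic version is baked in at build time, so there is nothing left to dispatch when the program runs.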

@chasingegg
Author

chasingegg commented Nov 8, 2023

Thanks for your clarification! When I compile Arrow with the default options, is it supposed to run on any x86_64 platform, even without sse4_2, since it will dynamically choose the max SIMD level we can use?

@js8544
Collaborator

js8544 commented Nov 8, 2023

is it supposed to run any x86_64 platforms even without sse4_2

It depends. Do you plan to compile it on a machine with SSE4.2 support and run it on machines without? Then you should probably pass -DARROW_C_FLAGS_${BUILD_TYPE}=-march=<your target architecture> (i.e. cross compiling) to CMake to disable auto-generated SSE4.2 instructions (see this), as well as -DARROW_SIMD_LEVEL=NONE for Arrow's explicit SSE4.2 intrinsics.

Also if you plan to use it on Windows, POPCNT may be a required instruction (see #21840).

it will dynamically choose max simd level we can use

Nope. Only the compute functions do this. Other places use the compile time constant ARROW_HAVE_SSE4_2.
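
Since the non-compute code is fixed at build time, one option is to verify the baseline yourself before calling into the library. The following is only a sketch under stated assumptions: `CpuHasSse42` and `BaselineCheck` are hypothetical helpers, and `__builtin_cpu_supports` is a GCC/Clang builtin for x86, so other compilers and targets take the conservative fallback.

```cpp
#include <string>

// Returns true if the running CPU reports SSE4.2 support.
// On non-x86 targets or compilers without the builtin, this sketch
// conservatively returns false.
inline bool CpuHasSse42() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("sse4.2") != 0;
#else
  return false;
#endif
}

// Hypothetical guard: report whether a binary built with a given baseline
// can be expected to run on the current machine.
inline std::string BaselineCheck(bool built_with_sse42, bool cpu_has_sse42) {
  if (built_with_sse42 && !cpu_has_sse42) {
    return "unsupported: binary requires SSE4.2";
  }
  return "ok";
}
```

A caller could run `BaselineCheck(true, CpuHasSse42())` at startup and exit with a clear message instead of hitting an illegal-instruction fault later.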

@chasingegg
Author

is it supposed to run any x86_64 platforms even without sse4_2

It depends. Do you plan to compile it on a machine with SSE4.2 support and run it on machines without? Then you should probably pass -DARROW_C_FLAGS_${BUILD_TYPE}=-march=<your target architecture> (i.e. cross compiling) to CMake to disable auto-generated SSE4.2 instructions (see this), as well as -DARROW_SIMD_LEVEL=NONE for Arrow's explicit SSE4.2 intrinsics.

Also if you plan to use it on Windows, POPCNT may be a required instruction (see #21840).

it will dynamically choose max simd level we can use

Nope. Only the compute functions do this. Other places use the compile time constant ARROW_HAVE_SSE4_2.

Thank you for your reply, and sorry that I didn't make my case clear. I want to compile it on a machine with AVX and run it on an SSE4.2 machine; that should work, right? Also, as you mentioned, some places use the compile-time constant ARROW_HAVE_SSE4_2.

@js8544
Collaborator

js8544 commented Nov 8, 2023

I see. The default ARROW_SIMD_LEVEL is SSE4_2 so it should be fine.

@chasingegg
Author

Thanks for your help!
