Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVX2 support doesn't compile with MSVC #3193

Closed
2 of 4 tasks
borrrden opened this issue Jan 5, 2024 · 3 comments
Closed
2 of 4 tasks

AVX2 support doesn't compile with MSVC #3193

borrrden opened this issue Jan 5, 2024 · 3 comments

Comments

@borrrden
Copy link
Contributor

borrrden commented Jan 5, 2024

Summary

The AVX2 implementation of FAISS appears to use intrinsics that are GNU specific (or at least don't port to Windows via MSVC).

Platform

Windows 10.0.19045
MSVC 2022

Faiss version: git tag v1.7.4

Installed from: Attempting to compile

Faiss compilation options:

Probably doesn't much matter but using OpenBLAS compiled myself
CMake set(FAISS_OPT_LEVEL "avx2" CACHE INTERNAL "") and link with faiss_avx2 instead of faiss

Running on:

  • CPU
  • GPU

Interface:

  • C++
  • Python

Reproduction instructions

  1. Get any BLAS implementation in order to satisfy the dependency requirements for building this library
  2. Standard CMake dance (build directory, etc)
  3. cmake -DFAISS_OPT_LEVEL=avx2 -DFAISS_ENABLE_GPU=OFF -DFAISS_ENABLE_PYTHON=OFF .. (assuming the source directory is one level up from the build directory)
  4. cmake --build . --config Release --target faiss_avx2 ..

Fix instructions

This boils down to 3 distinct issues:

  1. The definition of _mm_prefetch in MSVC (and actually in the Intel Intrinsics Guide suggesting GCC / clang are mistaken) takes a const char * as the first argument, not const void *. This base is peppered with calls directly passing float pointers.
  2. __m128i_u and __m256i_u are not portable. However they are the exact same datatype as the non _u variants, just with some implied side effects so I'm guessing it is safe to define them away to their non _u variants #if defined(_MSC_VER) && !defined(__clang__).
  3. By default Visual Studio implemented OpenMP 2.0, which doesn't allow signed indexes in for loops. The following code violates this
    #pragma omp parallel for schedule(dynamic)
    for (size_t i = 0; i < nx_p; i += NX_POINTS_PER_LOOP) {
    kernel<DIM, NX_POINTS_PER_LOOP, NY_POINTS_PER_LOOP>(
    x, y, y_transposed.data(), ny, res, y_norms, i);
    }

However, issue 3 can be worked around with some trickery, involving override CMake's FindMP module by setting OpenMP_CXX_FLAGS="-openmp:llvm". This turns on the "better" support (which is OpenMP 2.5 with a few 3.1 features including unsigned index types) for OpenMP in Visual Studio as noted here

@algoriddle
Copy link
Contributor

I will add AVX2 to our contbuild to verify.

@algoriddle
Copy link
Contributor

Confirmed that MSVC is not supported by the current implementation of AVX2. We welcome any PRs.

@borrrden
Copy link
Contributor Author

Any opinions on the way to approach this? I had simply cast all the prefetch calls blindly from float * to char *. I'm assuming that this function, taking a void * in the case of Linux, is not going to be doing anything special due to the fact that it is a float. It compiles and appears to function but if someone who knows better than me has a different approach let me know.

Also do you care if the iterator above is signed or unsigned? Does it have any implications? Because as I mentioned it can be either fixed in code or in build option (though the build option will create a dependency on LLVM libomp which I think at this point might be included in the MSVC runtime but I haven't looked carefully into that yet)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants