Add Metal GPU backend for Apple Silicon (IndexFlat) #5144

Closed
Evandabest wants to merge 17 commits into facebookresearch:main from Evandabest:metal-backend-initial

@Evandabest Evandabest commented Apr 24, 2026

Summary

Adds a new Metal compute backend so FAISS can run GPU-accelerated similarity search on Apple Silicon (M1/M2/M3/M4/M5). This initial PR introduces the build system integration, core Metal infrastructure, and a working IndexFlat with L2 and inner product search. I plan to submit follow-up PRs with more functionality.

Addresses #2386: FAISS currently has no GPU acceleration path on Apple Silicon, forcing CPU-only execution. This Metal backend enables GPU-accelerated search on M-series chips, with follow-up PRs adding IVF indexes and kernel optimizations.

All Metal code lives in a new faiss/gpu_metal/ directory, keeping faiss/gpu/ (CUDA/ROCm) completely untouched. The backend is opt-in via cmake -DFAISS_ENABLE_METAL=ON -DFAISS_ENABLE_GPU=OFF.

What's included

  • Build system: New FAISS_ENABLE_METAL CMake option. When enabled, requires Apple platform, links Metal/MetalKit/MPS/Foundation frameworks, and builds libfaiss_metal.a. Three small additions to root CMakeLists.txt.
  • MetalResources: Owns MTLDevice, MTLCommandQueue, and buffer allocation/deallocation. Mirrors GpuResources roles.
  • MetalIndex: Base class for Metal-backed indexes (mirrors GpuIndex). Single-device only (device 0).
  • MetalIndexFlat: Flat index storing vectors in an MTLBuffer. Supports add, add_with_ids, reset, search for both METRIC_L2 and METRIC_INNER_PRODUCT.
  • MSL compute kernels: l2_squared_matrix, ip_matrix (distance computation), and topk (sorted insertion, k ≤ 256). Embedded as a compile-time string in MetalFlatKernels.mm.
  • Backend abstraction: get_num_gpus(), index_cpu_to_metal_gpu(), index_metal_gpu_to_cpu(), StandardMetalResources, and a GpuIndexFlat typedef for API parity.
  • Tests: 8 gtest cases in TestMetalIndexFlat.mm — L2/IP search correctness vs CPU IndexFlat, add_with_ids, reset, empty search, get_num_gpus, and CPU↔Metal cloning in both directions. All tests skip gracefully if no Metal device is available.
  • Python/SWIG bindings
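
For readers unfamiliar with the two-stage flat search described in the kernel bullet above (a distance matrix followed by a sorted-insertion top-k), here is a minimal CPU reference sketch in C++. It is not the MSL code from this PR: the names `Hit`, `l2_squared_row`, and `topk_smallest` are hypothetical, and it covers only the L2 case. For METRIC_INNER_PRODUCT the distance would be a dot product and the comparisons would flip to keep the largest scores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical result record: one (distance, id) pair.
struct Hit {
    float distance;
    std::int64_t id;
};

// One row of the distance matrix: squared L2 distance from `query` to each
// of the `nb` base vectors (row-major, `d` floats per vector).
std::vector<float> l2_squared_row(
        const float* query,
        const float* base,
        std::size_t nb,
        std::size_t d) {
    assert(d > 0); // dimension must be positive
    std::vector<float> out(nb);
    for (std::size_t j = 0; j < nb; ++j) {
        float acc = 0.0f;
        for (std::size_t i = 0; i < d; ++i) {
            const float diff = query[i] - base[j * d + i];
            acc += diff * diff;
        }
        out[j] = acc;
    }
    return out;
}

// Keep the k smallest distances in ascending order using sorted insertion.
// Ties keep the lower id first because the comparisons are strict.
std::vector<Hit> topk_smallest(const std::vector<float>& dist, std::size_t k) {
    std::vector<Hit> best;
    if (k == 0) {
        return best;
    }
    best.reserve(k);
    for (std::size_t j = 0; j < dist.size(); ++j) {
        if (best.size() == k) {
            if (dist[j] >= best.back().distance) {
                continue; // not better than the current worst entry
            }
            best.pop_back(); // drop the worst entry to make room
        }
        // Append, then shift the new entry left past larger distances.
        best.push_back({dist[j], static_cast<std::int64_t>(j)});
        std::size_t pos = best.size() - 1;
        while (pos > 0 && best[pos - 1].distance > best[pos].distance) {
            std::swap(best[pos - 1], best[pos]);
            --pos;
        }
    }
    return best;
}
```

Sorted insertion costs up to k shifts per accepted candidate, so it suits small k. A CPU reference of this shape is also the kind of baseline a GPU result can be compared against in a correctness test.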

Planned follow-up PRs

  • Larger k support (tiling, heap-based top-k)
  • IVFFlat, IVFPQ, IVFSQ, BinaryFlat
  • Kernel optimizations (fused distance+top-k, GEMM-tiled distance, simdgroup/threadgroup parallel selection)

Build and test

mkdir -p build_metal && cd build_metal
cmake .. -DFAISS_ENABLE_METAL=ON -DFAISS_ENABLE_GPU=OFF \
         -DBUILD_TESTING=ON -DFAISS_ENABLE_PYTHON=OFF \
         -DCMAKE_PREFIX_PATH="/opt/homebrew/opt/libomp;/opt/homebrew"
cmake --build . --target faiss_metal TestMetalIndexFlat
ctest -R TestMetalIndexFlat

The mkdir -p avoids the error if the directory exists, and -DFAISS_ENABLE_PYTHON=OFF skips the Python dependency that isn't needed for this PR.

@meta-cla meta-cla Bot added the CLA Signed label Apr 24, 2026

meta-codesync Bot commented Apr 24, 2026

@trang-nm-nguyen has imported this pull request. If you are a Meta employee, you can view this in D102284936.

@mnorris11

Thanks for this contribution!

GitHub Actions supports Apple Silicon. Let's add a new GitHub Actions job whose runs-on is a macOS runner, such as macos-15 or macos-latest.

We can add it to the build pull request workflow, similar to this block:

  linux-x86_64-GPU-cmake:
    name: Linux x86_64 GPU (cmake)
    needs: linux-x86_64-cmake
    runs-on: 4-core-ubuntu-gpu-t4
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Build and Test (cmake)
        uses: ./.github/actions/build_cmake
        with:
          gpu: ON

@Evandabest

@mnorris11 Thank you for taking a look at my PR. I have added a new GitHub Actions runner for macos-15 in my latest commit.

Comment thread faiss/gpu_metal/MetalIndexFlat.mm Outdated
}

void MetalIndexFlat::add_with_ids(idx_t n, const float* x, const idx_t* xids) {
if (n == 0) {

We can just FAISS_THROW_IF_NOT_MSG(!ids, "add_with_ids not supported"); to start with, to match existing behavior. Right now copyTo doesn't preserve the ids.
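
The guard suggested in this comment can be illustrated with a small self-contained C++ sketch. Everything here is a stand-in: `SketchFlatIndex` is a hypothetical class, not MetalIndexFlat, and `std::runtime_error` takes the place of the exception that the FAISS macro would raise. The point is only the control flow: a call that passes explicit ids is rejected up front instead of silently accepting ids that a later copy would drop.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

using idx_t = std::int64_t;

// Hypothetical flat index used only to illustrate the guard.
class SketchFlatIndex {
public:
    explicit SketchFlatIndex(int d) : d_(d) {
        assert(d > 0); // dimension must be positive
    }

    // Append n vectors of dimension d_; ids are implicit (0, 1, 2, ...).
    void add(idx_t n, const float* x) {
        data_.insert(data_.end(), x, x + n * d_);
        ntotal_ += n;
    }

    // Reject caller-supplied ids, since this sketch cannot preserve them.
    // A null id pointer falls through to the plain add path.
    void add_with_ids(idx_t n, const float* x, const idx_t* xids) {
        if (xids != nullptr) {
            throw std::runtime_error("add_with_ids not supported");
        }
        add(n, x);
    }

    idx_t ntotal() const {
        return ntotal_;
    }

private:
    int d_;
    idx_t ntotal_ = 0;
    std::vector<float> data_;
};
```

Throwing before any state changes means a rejected call leaves the index exactly as it was.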

Comment on lines +145 to +155
cmake -B build \
-DFAISS_ENABLE_METAL=ON \
-DFAISS_ENABLE_GPU=OFF \
-DFAISS_ENABLE_PYTHON=OFF \
-DBUILD_TESTING=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_PREFIX_PATH="$(brew --prefix libomp)" \
.
cmake --build build --target faiss_metal TestMetalIndexFlat -j$(sysctl -n hw.logicalcpu)
- name: Test
run: cd build && ctest -R TestMetalIndexFlat --output-on-failure

@mnorris11 mnorris11 Apr 24, 2026

Can we use the existing file ./.github/actions/build_cmake and update it with any required commands?

@Evandabest

@mnorris11 I addressed your comments in the latest commits. add_with_ids now uses FAISS_THROW_MSG("add_with_ids not supported") to match existing behavior. I also added Metal-specific steps to the build and gated the conda-dependent steps (conda isn't available on the macOS runner) with inputs.metal != 'ON'.

@mnorris11

@Evandabest you might need to update the cmake a bit, looks like an error:

CMake Error at perf_tests/CMakeLists.txt:21 (find_package):
  By not providing "Findgflags.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "gflags", but
  CMake did not find one.

  Could not find a package configuration file provided by "gflags" with any
  of the following names:

    gflags.cps
    gflagsConfig.cmake
    gflags-config.cmake

  Add the installation prefix of "gflags" to CMAKE_PREFIX_PATH or set
  "gflags_DIR" to a directory containing one of the above files.  If "gflags"
  provides a separate development package or SDK, be sure it has been
  installed.


-- Configuring incomplete, errors occurred!

@Evandabest Evandabest force-pushed the metal-backend-initial branch from af374fe to 503e84d Compare April 27, 2026 05:08

Evandabest commented Apr 27, 2026

@mnorris11 Thank you for bringing that up. I added gflags to the brew install step.

mdouze commented Apr 28, 2026

Thanks for the PR.
Please convert the .mm test to C++ and implement the Python wrapper + test.

Comment thread faiss/gpu_metal/GpuIndexFlat.h Outdated
Comment on lines +1 to +22
// @lint-ignore-every LICENSELINT
/**
* Copyright (c) Meta Platforms, Inc. and its affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*
* Unified name for flat GPU index when Metal backend is built.
* Include this when using the Metal backend for API parity with
* faiss::gpu::GpuIndexFlat.
*/

#pragma once

#include <faiss/gpu_metal/MetalIndexFlat.h>

namespace faiss {

/// When FAISS is built with Metal backend, GpuIndexFlat is MetalIndexFlat.
using GpuIndexFlat = gpu_metal::MetalIndexFlat;

} // namespace faiss

Team discussed this PR: Let's remove the GpuIndexFlat file in this PR. AMD ROCm is based on CUDA, so it makes sense to share the GpuIndexFlat with NVIDIA for classic Faiss GPU (non-cuVS), but Metal will be a different implementation, so users can simply use MetalIndexFlat.

@Evandabest Evandabest Apr 29, 2026

Thank you all for reviewing this. I have removed the GpuIndexFlat file from this PR. That makes sense, since Metal's implementation is separate from CUDA/ROCm. I will keep this in mind for future implementations (MetalIndexIVFFlat, MetalIndexIVFPQ, etc.).

@Evandabest Evandabest force-pushed the metal-backend-initial branch from 7081695 to 78997b6 Compare April 29, 2026 04:37
@Evandabest Evandabest force-pushed the metal-backend-initial branch from 78997b6 to 7081695 Compare April 29, 2026 05:24

Evandabest commented Apr 29, 2026

@mnorris11 Some checks are failing after I pushed my changes. After I removed the GpuIndexFlat.h file, the Facebook internal linter fails. I would love to know what's going on and how I can fix it, if possible. Also, I merged in the main branch while fixing a merge conflict in .github/actions/build_cmake/action.yml. After that, the Linux x86_64 Dynamic Dispatch (cmake) CI test seems to be failing, and the failure appears unrelated to the changes I made. Thank you.

@mnorris11

No worries about the internal linter, I can just format that before we merge.
For the DD signal, I will retry the job. We can ignore this too. This test seems a bit flaky, will discuss with the team.


meta-codesync Bot commented Apr 30, 2026

@mnorris11 merged this pull request in 66cea52.
