faiss-gpu: index_cpu_to_gpu() hangs (doesn't occur with the conda package) #54

Closed
igor0 opened this issue Apr 2, 2022 · 18 comments
Labels
bug Something isn't working

Comments

@igor0

igor0 commented Apr 2, 2022

When I install faiss-gpu via pip, index_cpu_to_gpu() seems to hang forever. For example, this code sample hangs for me:

import faiss

index_flat = faiss.IndexFlatL2(16)
gpu_index_flat = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, index_flat)

The index_cpu_to_gpu() call hangs, spinning one CPU core at 100% seemingly forever, with most of the time spent in libnvidia-ptxjitcompiler.so.510.54.

I confirmed this on two different machines with different GPUs (A100 and A10G).

I found two workarounds:

  1. Downgrade to faiss==1.5.3. I hit the problem with 1.6.0 or later.
  2. Use the conda package. I don't observe the issue with conda, even with the latest version of faiss-gpu (1.7.2)

Maybe the faiss-gpu wheel isn't built with CUDA 11 support, and that's why it doesn't work with the A100 / A10G?
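
For anyone trying to reproduce this without tying up a terminal, here is a minimal sketch that runs the repro above in a child process with a time budget. The helper name `run_with_timeout` and the 600-second budget mentioned below are illustrative assumptions, not part of faiss; the repro itself still needs faiss-gpu and a CUDA device.

```python
import subprocess
import sys

# The repro from this report, kept as a string so it can run in a child
# process. It needs faiss-gpu and a CUDA device to do anything useful.
REPRO = """
import faiss
index_flat = faiss.IndexFlatL2(16)
gpu_index_flat = faiss.index_cpu_to_gpu(faiss.StandardGpuResources(), 0, index_flat)
"""


def run_with_timeout(code, timeout_s):
    """Run `code` in a fresh interpreter; return 'ok', 'error', or 'timeout'."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,  # the child is killed once the budget is spent
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "error"
```

Calling `run_with_timeout(REPRO, timeout_s=600)` returns "timeout" if the call is still running after ten minutes. A timeout alone cannot distinguish a true hang from a very long JIT compilation, so the budget is a judgment call.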

@rom1504

rom1504 commented Apr 3, 2022

https://github.com/kyamagu/faiss-wheels/blob/main/scripts/build_Linux.sh#L8 It's indeed built for CUDA 10.
It could probably be updated there.
(You can also do a custom build with these instructions: https://github.com/kyamagu/faiss-wheels#prerequisite-1 )

@kyamagu
Owner

kyamagu commented Apr 3, 2022

This is likely due to CMAKE_CUDA_ARCHITECTURES=35-real;50-real;60-real;70-real;75 in the build script, which does not include the Ampere architecture at all. Adding 80 here probably resolves the hang, though I suspect the current behavior is only a long JIT compilation rather than a true hang.

https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#gpu-compilation
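
To make the reasoning above concrete, here is a rough sketch of how that architecture list maps onto a given GPU. It assumes the usual CMake convention (an `NN-real` entry embeds native code only, a bare `NN` entry embeds native code plus PTX, and `NN-virtual` embeds PTX only) and a simplified compatibility rule: native code runs on the same major compute capability with an equal or higher minor, while PTX can be JIT-compiled for any equal or newer capability. The function names are made up for illustration.

```python
def parse_architectures(spec):
    """Split a CMAKE_CUDA_ARCHITECTURES value into (native, ptx) lists of ints."""
    native, ptx = [], []
    for entry in spec.split(";"):
        number, _, suffix = entry.partition("-")
        cc = int(number)
        if suffix in ("", "real"):
            native.append(cc)  # machine code for this exact architecture
        if suffix in ("", "virtual"):
            ptx.append(cc)  # PTX that the driver can JIT for newer GPUs
    return native, ptx


def classify(spec, device_cc):
    """Return 'native', 'jit', or 'unsupported' for a device such as 80 (A100)."""
    native, ptx = parse_architectures(spec)
    # Simplified rule: same major version, build minor <= device minor.
    if any(cc // 10 == device_cc // 10 and cc <= device_cc for cc in native):
        return "native"
    if any(cc <= device_cc for cc in ptx):
        return "jit"
    return "unsupported"


BUILD = "35-real;50-real;60-real;70-real;75"
print(classify(BUILD, 80))          # A100 (compute capability 8.0)
print(classify(BUILD + ";80", 80))  # same list with 80 appended
```

Under these assumptions the A100 and the A10G (compute capability 8.6) both fall into the "jit" case with the current list, which would explain the time spent in the PTX JIT compiler.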

@kyamagu
Owner

kyamagu commented Apr 3, 2022

It turns out nvcc cannot target compute capability 8.0 with CUDA 10.x. I will check what a CUDA upgrade involves, such as the minimum driver requirement.

Meanwhile, you can check whether the application is really hanging by setting CUDA_FORCE_PTX_JIT=1, as described in the documentation.
https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html#verifying-ampere-compatibility
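
One way to apply that setting from Python is to launch the script in a child process, so the variable is in place before CUDA initializes. This is only a sketch: `run_with_forced_ptx_jit` is a hypothetical helper, and the code string passed to it is whatever script you want to check.

```python
import os
import subprocess
import sys


def run_with_forced_ptx_jit(code):
    """Run `code` in a child interpreter with CUDA_FORCE_PTX_JIT=1 set."""
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, CUDA_FORCE_PTX_JIT="1")
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )
```

For example, `run_with_forced_ptx_jit("import faiss; ...")` returns a `CompletedProcess` whose `returncode`, `stdout`, and `stderr` can be inspected. The parent process environment is left untouched.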

@kyamagu added the bug (Something isn't working) label Apr 3, 2022
@kyamagu
Owner

kyamagu commented Apr 4, 2022

Check if the artifact in this CI run solves the issue. These wheels are built with CUDA 11.0.

@igor0
Author

igor0 commented Apr 5, 2022

@kyamagu: I tried out the wheels, but not successfully: [..]

EDIT: Looks like this was about Python version. Let me try to fix that.
EDIT2: I've tried various Python versions (3.6, 3.7, 3.8), but I can't get any of the wheels installed. I always get "... is not a supported wheel on this platform." But it's possible that I'm doing something wrong. I'm on a g5.2xlarge AWS instance, so it should just be x86_64 Linux.

@kyamagu
Owner

kyamagu commented Apr 5, 2022

@igor0 What is the message if you do python -m pip install /path/to/specific/faiss_gpu-1.7.2-cp37-cp37-manylinux_2_17_x86_64.manylinux2014_x86_64.whl?

@igor0
Author

igor0 commented Apr 7, 2022

OK, yeah, that seems to work! (Not sure how that's different from what I was trying previously.)

That wheel seems to fix my trivial repro, so it addresses the issue, at least as far as I can tell. Presumably this particular wheel won't work for someone who has CUDA 10, but that's a different problem. That's why PyTorch versions are so complicated.

@kyamagu closed this as completed Apr 11, 2022
@Doragd

Doragd commented Jun 7, 2022

Hi, how can I access the wheel from the github action?

@Doragd

Doragd commented Jun 7, 2022

I installed faiss-gpu from PyPI, and it does not work. I am also using an A100.

@kyamagu
Owner

kyamagu commented Jun 7, 2022

The wheel is not yet available on PyPI. An issue with the wheel package size prevents the CUDA 11 based wheel from being uploaded to PyPI.

@Doragd

Doragd commented Jun 7, 2022

How can I download the wheel file from the action you mentioned above?

@kyamagu
Owner

kyamagu commented Jun 7, 2022

#54 (comment)

@Doragd

Doragd commented Jun 7, 2022

Sorry to bother you, but I do not know how to download the wheel from the action. All I can see is the build log when I click this link.

@kyamagu
Owner

kyamagu commented Jun 7, 2022

See the artifact in the actions

@Doragd

Doragd commented Jun 7, 2022

Thanks~ It works~

@anubhav562

Hello @kyamagu !

The artifact that you mentioned has already expired! Can you please tell us an alternative to try if possible?

@anubhav562

Hey everyone!

Thanks a lot for this chain! My issue got solved!

Issue: I was not able to use FAISS on the NVIDIA A100. The FAISS index was not getting pushed to the GPU!

For people who want the solution in one place, follow the steps below:

  • faiss-gpu 1.7.3 is not available on pip, so download the artifact directory (a directory containing multiple wheel files for different systems) from here -> https://github.com/kyamagu/faiss-wheels/actions/runs/3487300515 . Go to the bottom of the page, locate the artifacts, and download them to your machine.

  • Once you have the artifacts folder, find the file that best suits your system. For example, I used the following file: faiss_gpu-1.7.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl . The filename itself indicates the system configuration: cp38 means CPython 3.8. The artifacts directory has many wheels for different systems, so locate the one compatible with your machine.

  • Once you have located your compatible wheel, just run the following:
    python -m pip install /path/to/wheel/faiss_gpu-1.7.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

And you are good to go!

Thanks to @kyamagu for maintaining the repo!
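
For picking the right file, the wheel filename can be split into its tags with a few lines of standard-library Python. The sketch below assumes the standard wheel naming layout (name, version, optional build tag, Python tag, ABI tag, platform tag); both helper names are hypothetical. It only compares the Python tag, whereas pip also checks the ABI and platform tags, so a match here is necessary but not sufficient.

```python
import sys


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version, and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError("unexpected wheel filename: " + filename)
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        # Each tag may be a dot-separated set of alternatives.
        "python": python_tag.split("."),
        "abi": abi_tag.split("."),
        "platform": platform_tag.split("."),
    }


def matches_running_cpython(info):
    """True if the wheel's Python tags include the running interpreter's cpXY tag."""
    return "cp{}{}".format(*sys.version_info[:2]) in info["python"]


info = parse_wheel_filename(
    "faiss_gpu-1.7.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(info["python"], info["platform"], matches_running_cpython(info))
```

If the Python tag does not match the interpreter you run pip with, the "is not a supported wheel on this platform" error seen earlier in this thread is the expected outcome.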

@iamlockelightning

https://github.com/kyamagu/faiss-wheels/releases/tag/v1.7.3
