
Error when install flash-attn #36

Closed
Richar-Du opened this issue Apr 22, 2023 · 7 comments

@Richar-Du

When I run pip install flash-attn, it raises an error:
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects

However, I have run pip install -e . and successfully installed llava. Do you know how to solve this problem?

@haotian-liu
Owner

Hi @Richar-Du, thank you for your interest in our work.

flash-attn is not required for running the inference of LLaVA, so the error message you were seeing regarding flash-attn should not be related to LLaVA at all (including the pyproject.toml part).

Can you provide: (1) the full error log, wrapped with ```; (2) your system environment, including the OS, CUDA version, and GPU type?
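
For anyone hitting the same thing, one way to collect that information from a terminal (a generic sketch, not LLaVA-specific; on Windows, replace uname with systeminfo):

```bash
# OS / kernel
uname -a

# GPU model and driver version, as reported by the NVIDIA driver
nvidia-smi

# CUDA toolkit (nvcc) version
nvcc --version

# PyTorch version, the CUDA version it was built with, and whether it sees a GPU
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
```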

@brightdeng

The following is my full error log:

```
(llava) E:\LLaVA>pip install flash-attn
Collecting flash-attn
Using cached flash_attn-1.0.3.post0.tar.gz (2.0 MB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in c:\users\admin.conda\envs\llava\lib\site-packages (from flash-attn) (2.0.0)
Collecting einops (from flash-attn)
Using cached einops-0.6.1-py3-none-any.whl (42 kB)
Requirement already satisfied: packaging in c:\users\admin.conda\envs\llava\lib\site-packages (from flash-attn) (23.1)
Requirement already satisfied: filelock in c:\users\admin.conda\envs\llava\lib\site-packages (from torch->flash-attn) (3.12.0)
Requirement already satisfied: typing-extensions in c:\users\admin.conda\envs\llava\lib\site-packages (from torch->flash-attn) (4.5.0)
Requirement already satisfied: sympy in c:\users\admin.conda\envs\llava\lib\site-packages (from torch->flash-attn) (1.11.1)
Requirement already satisfied: networkx in c:\users\admin.conda\envs\llava\lib\site-packages (from torch->flash-attn) (3.1)
Requirement already satisfied: jinja2 in c:\users\admin.conda\envs\llava\lib\site-packages (from torch->flash-attn) (3.1.2)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\admin.conda\envs\llava\lib\site-packages (from jinja2->torch->flash-attn) (2.1.2)
Requirement already satisfied: mpmath>=0.19 in c:\users\admin.conda\envs\llava\lib\site-packages (from sympy->torch->flash-attn) (1.3.0)
Building wheels for collected packages: flash-attn
Building wheel for flash-attn (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [127 lines of output]
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.1'

  Warning: Torch did not find available GPUs on this system.
   If your intention is to cross-compile, this is not an error.
  By default, Apex will cross-compile for Pascal (compute capabilities 6.0, 6.1, 6.2),
  Volta (compute capability 7.0), Turing (compute capability 7.5),
  and, if the CUDA version is >= 11.0, Ampere (compute capability 8.0).
  If you wish to cross-compile for a single specific architecture,
  export TORCH_CUDA_ARCH_LIST="compute capability" before running setup.py.



  torch.__version__  = 2.0.0+cpu


  fatal: not a git repository (or any of the parent directories): .git
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build\lib.win-amd64-cpython-310
  creating build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\attention_kernl.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\bert_padding.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attention.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_interface.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_triton.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_triton_og.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_triton_single_query.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_triton_tmp.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_triton_tmp_og.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_attn_triton_varlen.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_blocksparse_attention.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\flash_blocksparse_attn_interface.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\fused_softmax.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\rotary.py -> build\lib.win-amd64-cpython-310\flash_attn
  copying flash_attn\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn
  creating build\lib.win-amd64-cpython-310\flash_attn\layers
  copying flash_attn\layers\patch_embed.py -> build\lib.win-amd64-cpython-310\flash_attn\layers
  copying flash_attn\layers\rotary.py -> build\lib.win-amd64-cpython-310\flash_attn\layers
  copying flash_attn\layers\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\layers
  creating build\lib.win-amd64-cpython-310\flash_attn\losses
  copying flash_attn\losses\cross_entropy.py -> build\lib.win-amd64-cpython-310\flash_attn\losses
  copying flash_attn\losses\cross_entropy_apex.py -> build\lib.win-amd64-cpython-310\flash_attn\losses
  copying flash_attn\losses\cross_entropy_parallel.py -> build\lib.win-amd64-cpython-310\flash_attn\losses
  copying flash_attn\losses\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\losses
  creating build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\bert.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\gpt.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\gptj.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\gpt_j.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\gpt_neox.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\llama.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\opt.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\vit.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  copying flash_attn\models\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\models
  creating build\lib.win-amd64-cpython-310\flash_attn\modules
  copying flash_attn\modules\block.py -> build\lib.win-amd64-cpython-310\flash_attn\modules
  copying flash_attn\modules\embedding.py -> build\lib.win-amd64-cpython-310\flash_attn\modules
  copying flash_attn\modules\mha.py -> build\lib.win-amd64-cpython-310\flash_attn\modules
  copying flash_attn\modules\mlp.py -> build\lib.win-amd64-cpython-310\flash_attn\modules
  copying flash_attn\modules\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\modules
  creating build\lib.win-amd64-cpython-310\flash_attn\ops
  copying flash_attn\ops\activations.py -> build\lib.win-amd64-cpython-310\flash_attn\ops
  copying flash_attn\ops\fused_dense.py -> build\lib.win-amd64-cpython-310\flash_attn\ops
  copying flash_attn\ops\gelu_activation.py -> build\lib.win-amd64-cpython-310\flash_attn\ops
  copying flash_attn\ops\layer_norm.py -> build\lib.win-amd64-cpython-310\flash_attn\ops
  copying flash_attn\ops\rms_norm.py -> build\lib.win-amd64-cpython-310\flash_attn\ops
  copying flash_attn\ops\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\ops
  creating build\lib.win-amd64-cpython-310\flash_attn\triton
  copying flash_attn\triton\fused_attention.py -> build\lib.win-amd64-cpython-310\flash_attn\triton
  copying flash_attn\triton\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\triton
  creating build\lib.win-amd64-cpython-310\flash_attn\utils
  copying flash_attn\utils\benchmark.py -> build\lib.win-amd64-cpython-310\flash_attn\utils
  copying flash_attn\utils\distributed.py -> build\lib.win-amd64-cpython-310\flash_attn\utils
  copying flash_attn\utils\generation.py -> build\lib.win-amd64-cpython-310\flash_attn\utils
  copying flash_attn\utils\pretrained.py -> build\lib.win-amd64-cpython-310\flash_attn\utils
  copying flash_attn\utils\__init__.py -> build\lib.win-amd64-cpython-310\flash_attn\utils
  running build_ext
  C:\Users\admin\.conda\envs\llava\lib\site-packages\torch\utils\cpp_extension.py:359: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
    warnings.warn(f'Error checking compiler version for {compiler}: {error}')
  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "C:\Users\admin\AppData\Local\Temp\pip-install-tp31ysl7\flash-attn_b02fe69769be4953b2cf3debf59648b6\setup.py", line 163, in <module>
      setup(
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\__init__.py", line 87, in setup
      return distutils.core.setup(**attrs)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\core.py", line 185, in setup
      return run_commands(dist)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\core.py", line 201, in run_commands
      dist.run_commands()
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\dist.py", line 1208, in run_command
      super().run_command(command)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\wheel\bdist_wheel.py", line 325, in run
      self.run_command("build")
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\dist.py", line 1208, in run_command
      super().run_command(command)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\command\build.py", line 132, in run
      self.run_command(cmd_name)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\cmd.py", line 318, in run_command
      self.distribution.run_command(command)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\dist.py", line 1208, in run_command
      super().run_command(command)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\dist.py", line 988, in run_command
      cmd_obj.run()
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\command\build_ext.py", line 84, in run
      _build_ext.run(self)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 346, in run
      self.build_extensions()
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\torch\utils\cpp_extension.py", line 499, in build_extensions
      _check_cuda_version(compiler_name, compiler_version)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\torch\utils\cpp_extension.py", line 383, in _check_cuda_version
      torch_cuda_version = packaging.version.parse(torch.version.cuda)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\pkg_resources\_vendor\packaging\version.py", line 49, in parse
      return Version(version)
    File "C:\Users\admin\.conda\envs\llava\lib\site-packages\pkg_resources\_vendor\packaging\version.py", line 264, in __init__
      match = self._regex.search(version)
  TypeError: expected string or bytes-like object
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for flash-attn
Running setup.py clean for flash-attn
Failed to build flash-attn
ERROR: Could not build wheels for flash-attn, which is required to install pyproject.toml-based projects
```

@haotian-liu
Owner

Hi, it seems that this is a Windows machine and it cannot find the CUDA runtime/GPU. I am not familiar with compiling these on Windows, so I may not be able to offer much help on this.
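
In the log above, torch.__version__ is 2.0.0+cpu, i.e. a CPU-only PyTorch build, so torch.version.cuda is None and packaging.version.parse(None) raises the TypeError shown in the traceback. A quick way to confirm which build is installed (a generic check, run inside the llava environment):

```bash
# A "+cpu" suffix or a CUDA version of "None" means a CPU-only PyTorch build,
# which cannot compile or run flash-attn's CUDA kernels.
python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
```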

One thing I would like to mention is that flash attention is only needed for training, so you can go ahead without installing flash-attn and run the demo/inference.

@Richar-Du
Author

> Hi @Richar-Du, thank you for your interest in our work.
>
> flash-attn is not required for running the inference of LLaVA, so the error message you were seeing regarding flash-attn should not be related to LLaVA at all (including the pyproject.toml part).
>
> Can you provide: (1) the full error log, wrapped with ```; (2) your system environment, including the OS, CUDA version, and GPU type?

I want to run the training code and the error is:
flash-attn.log

The OS is CentOS Linux release 7.6.1810 (Core) x86_64, the CUDA is 11.4, and the GPU is NVIDIA A100-SXM4-80GB.

Thanks in advance.

@haotian-liu
Owner

Hi @Richar-Du, sorry I just saw your comment. I am not sure if this is the cause, as it is an issue with flash-attn and not our repo itself, but it seems that you may need a newer GCC compiler. Can you try putting a newer gcc compiler in your $PATH and rerunning pip install? Thanks!

```
/home/hadoop-ba-dealrank/dolphinfs_hdd_hadoop-ba-dealrank/duyifan04/miniconda3/envs/llava/lib/python3.8/site-packages/torch/include/c10/util/C++17.h:16:2: error: #error "You're trying to build PyTorch with a too old version of GCC. We need GCC 5 or later."
 #error \
  ^
```
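
One possible way to put a newer GCC on $PATH on CentOS 7 (a sketch, assuming the devtoolset-9 Software Collection is already installed; any sufficiently new GCC works the same way):

```bash
# Check which compiler the build currently picks up
gcc --version

# Switch this shell to a newer GCC from Software Collections (CentOS 7)
source scl_source enable devtoolset-9

# Some build setups also read CC/CXX explicitly
export CC=$(which gcc)
export CXX=$(which g++)

# Retry the build
pip install flash-attn
```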

@XipengY

XipengY commented May 6, 2023

@Richar-Du Maybe you can check your NVCC version; you'd better use NVCC > 11.7. Hope this helps!
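
The build's CUDA check (the _check_cuda_version call in the traceback above) compares nvcc's CUDA version against the one PyTorch was built with, so it is worth confirming both (generic checks):

```bash
# CUDA toolkit (nvcc) version that will compile the flash-attn kernels
nvcc --version

# CUDA version the installed PyTorch wheel was built against; the major versions should match
python -c "import torch; print(torch.version.cuda)"
```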

@shinganEuler

I met the same problem. Here's how I solved it:

1. Install CUDA 11.7; CUDA 12 or above will fail.
2. Install GCC 11 / g++ 11 or below, as CUDA 11.7 requires. Run g++ --version to make sure the correct version is picked up.
3. conda install -c nvidia/label/cuda-11.7.0 cuda-nvcc
4. export CPATH=/usr/local/cuda-11.7/targets/x86_64-linux/include:$CPATH
   export LD_LIBRARY_PATH=/usr/local/cuda-11.7/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
   export PATH=/usr/local/cuda-11.7/bin:$PATH
5. Finally, run pip install flash-attn (a consolidated sketch of these steps follows below).
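
Put together, the above looks roughly like this (a sketch, assuming CUDA 11.7 is installed under the default /usr/local/cuda-11.7 prefix and a conda environment is active):

```bash
# Provide nvcc 11.7 inside the conda environment
conda install -c nvidia/label/cuda-11.7.0 cuda-nvcc

# Make the CUDA 11.7 headers, libraries, and binaries visible to the build
export CPATH=/usr/local/cuda-11.7/targets/x86_64-linux/include:$CPATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.7/targets/x86_64-linux/lib:$LD_LIBRARY_PATH
export PATH=/usr/local/cuda-11.7/bin:$PATH

# Sanity checks: the toolchain CUDA 11.7 expects
g++ --version    # should be 11 or below
nvcc --version   # should report 11.7

pip install flash-attn
```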
