
No Module Named 'torch' #246

Open
MilesQLi opened this issue May 28, 2023 · 46 comments

Comments

@MilesQLi

When I run pip install flash-attn, it fails with ModuleNotFoundError: No module named 'torch'. But the error is obviously wrong: torch is installed. See screenshot.

[screenshot]

@ulysses500

Same issue for me.

@ulysses500

ulysses500 commented May 28, 2023

Workaround: install the previous version pip install flash_attn==1.0.5

@smeyerhot

smeyerhot commented May 28, 2023

I am seeing the same problem on every flash_attn version. I am using CUDA 12.1 on the new G2 VM instance from GCP: https://cloud.google.com/compute/docs/accelerator-optimized-machines#g2-vms. The underlying GPU is the NVIDIA L4, which uses the Ada architecture.

@smeyerhot

Workaround: install the previous version pip install flash_attn==1.0.5

This might work in some scenarios but not all.

@tridao
Contributor

tridao commented May 28, 2023

Can you try python -m pip install flash-attn?
It's possible that pip and python -m pip refer to different environments.

Getting the dependencies right for all setups is hard. We had torch in the dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one. So for 1.0.6 we leave torch out of the dependency.
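
A quick sanity check that pip and python -m pip resolve to the same environment (generic commands, adjust to your setup):

which python && which pip
python -m pip --version   # shows which Python this pip is bound to
pip --version             # if the path differs, pip lives in another environment
python -c "import torch; print(torch.__version__)"   # confirm torch is importable there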

@official-elinas

Getting the same issue. I also tried python -m pip install flash-attn as you suggested with the same failure.

@nrailg

nrailg commented May 29, 2023

same problem here.

@tridao
Contributor

tridao commented May 29, 2023

I don't know of a solution that works for all setups; happy to hear suggestions.

We recommend the PyTorch container from NVIDIA, which has all the required tools to install FlashAttention.

@smeyerhot

I believe this is an incompatibility issue with the CUDA 12.1 build of torch.

Using the following torch version solves my problem.

torch==2.0.0+cu117

@MilesQLi
Author

@smeyerhot I use that exact version, but it doesn't work. See the screenshot.

[screenshot]

@smeyerhot

smeyerhot commented May 29, 2023

@MilesQLi

I believe this is an incompatibility issue with the CUDA 12.1 build of torch.

Using the following torch version solves my problem.

torch==2.0.0+cu117

Sorry! This didn't fix things... apologies for the false hope.

@MilesQLi
Author

@smeyerhot No problem. Thanks a lot anyway!

@jzsbioinfo

same problem

1 similar comment
@quant-cracker

same problem

@leucocyte123

pip install flash-attn==1.0.5 might help. I am using torch 1.13 and cuda 12.0.

@Maykeye

Maykeye commented May 30, 2023

I had the same issue with pip. The workaround was to compile from source, which worked like a charm.

In [1]: import flash_attn

In [2]: import torch

In [3]: torch.__version__
Out[3]: '2.0.1+cu117'

In [4]: flash_attn.__version__
Out[4]: '1.0.6'
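
A rough sketch of the compile-from-source route (the repo URL is the project's current home; pick whichever release tag you need):

pip install packaging ninja   # build helpers, in case they are missing
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention
pip install . --no-build-isolation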

@Evan-aja

I had the same issue with pip. The workaround was to compile from source, which worked like a charm.

In [1]: import flash_attn

In [2]: import torch

In [3]: torch.__version__
Out[3]: '2.0.1+cu117'

In [4]: flash_attn.__version__
Out[4]: '1.0.6'

I also had the same issue, but my system needs CUDA 12.1 (2x NVIDIA L4), so using torch cu117 is not an option.

This is also my workaround and it works like a charm.

My system uses Fedora Server.

@official-elinas

official-elinas commented May 31, 2023

I compiled it myself using a Docker container, and I still get this when executing:
RuntimeError: Expected q_dtype == torch::kFloat16 || ((is_sm8x || is_sm90) && q_dtype == torch::kBFloat16) to be true

@xwyzsn

xwyzsn commented May 31, 2023

Trying pip install flash-attn --no-build-isolation fixed my problem (pip docs).
To fix this problem, maybe adding a torch dependency to pyproject.toml would help.
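
Note that --no-build-isolation only works if the build dependencies are already present; a rough sketch (the exact package list is a guess based on this thread):

pip install packaging ninja wheel   # build helpers reported missing elsewhere in this thread
pip install torch                   # or a specific +cuXXX build matching your toolkit
pip install flash-attn --no-build-isolation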

@official-elinas

@xwyzsn Unfortunately this only worked on my Windows system, not Linux. But I feel we're making progress.

@tridao
Contributor

tridao commented May 31, 2023

To fix this problem, maybe adding a torch dependency to pyproject.toml would help.

We had torch in the dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one.
I'm not really an expert in Python packaging, so it's possible I'm doing something wrong.

@xwyzsn

xwyzsn commented Jun 1, 2023

@xwyzsn Unfortunately this only worked on my Windows system, not Linux. But I feel we're making progress.

Hi, actually I am using Linux, and it also worked well there. I assume you may be missing some other package needed to build this on your Linux system.

From the pip docs: --no-build-isolation ... Build dependencies specified by PEP 518 must be already installed if this option is used.

@bansky-cl

Same problem for me; I solved it by checking my device and torch CUDA versions.

@Wraken

Wraken commented Jun 1, 2023

Trying pip install flash-attn --no-build-isolation fixed my problem (pip docs). To fix this problem, maybe adding a torch dependency to pyproject.toml would help.

This fixed the torch problem, but now I get another error. Might be related to something else though.

        435 |         function(_Functor&& __f)
            |                                                                                                                                                 ^
      /usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
      /usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
        530 |         operator=(_Functor&& __f)
            |                                                                                                                                                  ^
      /usr/include/c++/11/bits/std_function.h:530:146: note:         ‘_ArgTypes’
      error: command '/usr/bin/nvcc' failed with exit code 255

@vchiley

vchiley commented Jun 2, 2023

[screenshot]

@xwyzsn ninja was removed, then torch was removed, then ninja was re-added. The next logical step is to re-add torch, right? 😄

@CallShaul

Same issue in Kubuntu 20 with torch 2.0.1, CUDA 11.8, Python 3.9 / 3.10, and flash-attn versions 0.2.8 / 1.0.4 / 1.0.5 / 1.0.6 / 1.0.7, with and without the --no-build-isolation flag.

@BaohaoLiao

Thanks to the previous answers, I can install it successfully. Here is my experience:
Environment
torch 2.0.0 + CUDA 11.7 on Ubuntu

  1. I hit the error ModuleNotFoundError: No module named 'torch', so I install with pip install flash-attn --no-build-isolation.
  2. It raises another error, ModuleNotFoundError: No module named 'packaging', so I install that package with pip install packaging.
  3. Re-running the installation, another error comes up: RuntimeError: The current installed version of g++ (4.8.5) is less than the minimum required version by CUDA 11.7 (6.0.0). Please make sure to use an adequate version of g++ (>=6.0.0, <12.0).
  4. I switch to a higher version of g++ (9.x), and it finally works; consolidated commands below.
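
Roughly, the consolidated commands (the g++ version here is just an example that satisfies the >=6.0.0, <12.0 check):

pip install packaging
# point the extension build at an adequate compiler, e.g. g++ 9:
CXX=g++-9 pip install flash-attn --no-build-isolation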

@Talkvibes

Workaround: install the previous version pip install flash_attn==1.0.5

[screenshot]

How do I tackle this?

@Martion-z

Trying pip install flash-attn --no-build-isolation fixed my problem (pip docs). To fix this problem, maybe adding a torch dependency to pyproject.toml would help.

This fixed the torch problem, but now I get another error. Might be related to something else though.

        435 |         function(_Functor&& __f)
            |                                                                                                                                                 ^
      /usr/include/c++/11/bits/std_function.h:435:145: note:         ‘_ArgTypes’
      /usr/include/c++/11/bits/std_function.h:530:146: error: parameter packs not expanded with ‘...’:
        530 |         operator=(_Functor&& __f)
            |                                                                                                                                                  ^
      /usr/include/c++/11/bits/std_function.h:530:146: note:         ‘_ArgTypes’
      error: command '/usr/bin/nvcc' failed with exit code 255

I had the same issue. Did you manage to solve it?

@tridao
Contributor

tridao commented Jul 5, 2023

@Martion-z This looks the same as #172 and #225. Some versions of CUDA don't like gcc 11. Downgrading to gcc 10 might work.
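
A rough sketch of forcing gcc/g++ 10 for the build without changing the system default (package names assume a Debian/Ubuntu-style system):

sudo apt-get install gcc-10 g++-10
CC=gcc-10 CXX=g++-10 pip install flash-attn --no-build-isolation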

@shahules786

conda install -c conda-forge cudatoolkit-dev
pip install flash_attn==1.0.5

This worked for me.

@Crysflair

pip install flash-attn==1.0.5

Thanks! This solves the above error. But a new one occurs: The detected CUDA version (12.1) mismatches the version that was used to compile.

@Crysflair

pip install flash-attn==1.0.5

Thanks! This solves the above error. But a new one occurs: The detected CUDA version (12.1) mismatches the version that was used to compile.

Problem solved. I installed the nightly version of pytorch and then installed flash-attn with the --no-build-isolation option.

pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
# Successfully installed torch-2.1.0.dev20230710+cu121
pip install flash-attn --no-build-isolation
# Successfully built flash-attn
# Installing collected packages: ninja, flash-attn
# Successfully installed flash-attn-1.0.8 ninja-1.11.1

Note that the wheel building process takes a long time. Don't kill it; just wait.
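
If the build runs out of memory (which can also make it crawl), torch's extension builder honors MAX_JOBS to cap the number of parallel compile jobs (the value here is just an example):

MAX_JOBS=4 pip install flash-attn --no-build-isolation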

@Nan-Do

Nan-Do commented Jul 12, 2023

There are a lot of things that can go wrong when installing this package.
Next I'm going to share a recipe that should work right now using conda.

Some remarks:

  • Don't use pip to install any CUDA or pytorch libraries (this includes any package that might reference them before they have been installed via conda).
  • Make sure you don't have several versions of CUDA installed; if you do, the installation might fail.
  • Try to install ninja from your distribution's package manager. The process might work without it installed, but it does a much better job detecting the environment.
  • This works with the current torch version and CUDA 11.8; no warranties that it will keep working for future versions.
  • If you couldn't install it before, I strongly recommend starting from a fresh conda environment.

Create a new environment if you don't already have one.

conda create -n flash_attn python=3.10.11
conda activate flash_attn

These are the required packages with their required versions.

conda install -c conda-forge gcc=11.3
conda install -c conda-forge gxx=11.3
conda install cuda -c nvidia/label/cuda-11.8.0
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip install packaging
pip install flash_attn --no-build-isolation

@jinghan23

There are a lot of things that can go wrong when installing this package. Next I'm going to share a recipe that should work right now using conda.

Some remarks:

  • Don't use pip to install any CUDA or pytorch libraries (this includes any package that might reference them before they have been installed via conda).
  • Make sure you don't have several versions of CUDA installed; if you do, the installation might fail.
  • Try to install ninja from your distribution's package manager. The process might work without it installed, but it does a much better job detecting the environment.
  • This works with the current torch version and CUDA 11.8; no warranties that it will keep working for future versions.
  • If you couldn't install it before, I strongly recommend starting from a fresh conda environment.

Create a new environment if you don't already have one.

conda create -n flash_attn python=3.10.11
conda activate flash_attn

These are the required packages with their required versions.

conda install -c conda-forge gcc=11.3
conda install -c conda-forge gxx=11.3
conda install cuda -c nvidia/label/cuda-11.8.0
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip install packaging
pip install flash_attn --no-build-isolation

Thanks for your kind and detailed recipe! But I still hit this problem...

Collecting flash_attn                                                                                                                                                
  Using cached flash_attn-1.0.9.tar.gz (1.8 MB)                                                                                                                      
  Preparing metadata (setup.py) ... error                                                                                                                            
  error: subprocess-exited-with-error                                                                                                                                
                                                                                                                                                                     
  × python setup.py egg_info did not run successfully.                                                                                                               
  │ exit code: 1                                                                                                                                                     
  ╰─> [21 lines of output]                                                                                                                                           
      Traceback (most recent call last):                                                                                                                             
        File "<string>", line 2, in <module>                                                                                                                         
        File "<pip-setuptools-caller>", line 34, in <module>                                                                                                         
        File "/tmp/pip-install-57_5gf64/flash-attn_939101cdc95c431a947f582f325cfb21/setup.py", line 111, in <module>                                                 
          _, bare_metal_version = get_cuda_bare_metal_version(CUDA_HOME)                                                                                             
        File "/tmp/pip-install-57_5gf64/flash-attn_939101cdc95c431a947f582f325cfb21/setup.py", line 26, in get_cuda_bare_metal_version                               
          raw_output = subprocess.check_output([cuda_dir + "/bin/nvcc", "-V"], universal_newlines=True)                                                              
        File "/home/adseadmin/miniconda3/envs/lmtest0/lib/python3.10/subprocess.py", line 421, in check_output                                                       
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,                                                                                           
        File "/home/adseadmin/miniconda3/envs/lmtest0/lib/python3.10/subprocess.py", line 503, in run                                                                
          with Popen(*popenargs, **kwargs) as process:                                                                                                               
        File "/home/adseadmin/miniconda3/envs/lmtest0/lib/python3.10/subprocess.py", line 971, in __init__                                                           
          self._execute_child(args, executable, preexec_fn, close_fds,                                                                                               
        File "/home/adseadmin/miniconda3/envs/lmtest0/lib/python3.10/subprocess.py", line 1863, in _execute_child                                                    
          raise child_exception_type(errno_num, err_msg, err_filename)                                                                                               
      FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/cuda-11.5/bin/nvcc'                                                                        
                                                                                                                                                                     
                                                                                                                                                                     
      torch.__version__  = 2.0.1                                                                                                                                     
                                                                                                                                                                     
                                                                                                                                                                     
      [end of output]                                                                                                                                                
                                                                                                                                                                     
  note: This error originates from a subprocess, and is likely not a problem with pip.                                                                               
error: metadata-generation-failed                                                                                                                                    
                                                                                                                                                                     
× Encountered error while generating package metadata.                                                                                                               
╰─> See above for output.                                                                                                                                            
                                                                                                                                                                     
note: This is an issue with the package mentioned above, not pip.                                                                                                    
hint: See above for details. 

@Nan-Do

Nan-Do commented Jul 17, 2023

@jinghan23 See point two: you have several versions of CUDA installed, and the installation is failing because of that.
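
A quick way to check which CUDA toolkit the build will pick up versus the one torch was built against (generic commands):

echo $CUDA_HOME                        # setup.py resolves nvcc from here / from PATH
which nvcc && nvcc -V                  # the toolkit the build will actually use
python -c "import torch; print(torch.version.cuda)"   # CUDA version torch was built with
ls -d /usr/local/cuda*                 # spot leftover system toolkits like cuda-11.5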

@Wraken

Wraken commented Jul 19, 2023

The solution I've found to work:

  • Find the version of CUDA used by torch
  • Install the matching CUDA toolkit; for me it was CUDA 11.7 -> follow the instructions here
    Example:
$ wget -d https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.0-1_all.deb
$ sudo dpkg -i cuda-keyring_1.0-1_all.deb
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/3bf863cc.pub
$ add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/ /"
$ add-apt-repository contrib
$ sudo apt-get update
$ sudo apt-get -y install cuda-11-7
  • Install g++-10 and gcc-10
  • Install flash-attn like this: CXX=g++-10 CC=gcc-10 LD=g++-10 pip3 install flash-attn==v1.0.3.post0

I haven't tried other versions of flash-attn, but I think it should work too.

@karandua2016

karandua2016 commented Aug 10, 2023

The workaround that worked for me was to downgrade the CUDA runtime version. My driver version is still 12.2, but the runtime version is now 11.7. It is also much faster than installing with --no-build-isolation.

conda install -c "nvidia/label/cuda-11.7.0" cuda-toolkit
pip3 install flash-attn==1.0.5

@HadiAskari

flash-attn is a very problematic library

@xylcbd

xylcbd commented Feb 21, 2024

You MUST install flash-attn after torch has been installed.

@fakerybakery

Solution: Install PyTorch first, then install FA2.

Example:

WRONG:

$ pip install torch flash-attn

CORRECT:

$ pip install torch
$ pip install flash-attn

(source)

@phoerious

phoerious commented Apr 24, 2024

I've tried adding torch as a build dependency, not just a runtime dependency, in my pyproject.toml. It doubled the install time, but actually ended up not working.

My workaround now is to have it as an optional dependency:

[project.optional-dependencies]
# Flash attention cannot be installed alongside normal dependencies,
# since it requires torch during build time. Install with
#     pip install '.[flash-attn]'
# after installing everything else first.
flash-attn = [
    "flash-attn>=2.5.7"
]

And then do two pip installs. One without [flash-attn] and then one with [flash-attn].
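
Concretely, the two installs look something like this from the project root (extra name as in the snippet above):

pip install .                  # everything except flash-attn
pip install '.[flash-attn]'    # now torch is present, so the extra can build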

It's also worth noting that flash-attn is extremely picky when it comes to the pip and wheel versions. With the following build requirements:

[build-system]
requires = ["setuptools>=69.0.0", "wheel>=0.43.0"]

and a pip install --upgrade pip before everything, it works. Without that, I get strange build errors due to missing wheel or packaging.
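
In shell form, the pre-install step above amounts to something like:

pip install --upgrade pip 'setuptools>=69.0.0' 'wheel>=0.43.0' packaging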

@cclough

cclough commented May 9, 2024

pip install flash-attn==1.0.5 might help. I am using torch 1.13 and cuda 12.0.

After I do this, I get this error:

TypeError: MHA.__init__() got an unexpected keyword argument 'num_heads_kv'

@mirekphd

mirekphd commented May 24, 2024

We had torch in the dependency in 1.0.5, but for some users it would download a new version of torch instead of using the existing one. I'm not really an expert in Python packaging, so it's possible I'm doing something wrong.

You are right that omitting such a slow-to-install (compilation-requiring) and popular dependency as torch from your requirements is best practice (judging from the wrappers around "classic" ML algos such as xgboost or lightgbm).

The problem here is that your installer tries to import torch, which is not a good idea, because it fails unless developers/maintainers can guarantee the expected installation sequence (first torch, then flash-attn), and that really should not be expected from batch installation processes or in new environments. This assumption routinely fails in the Dockerfiles of GPU-enabled containers: there we install GPU-enabled packages such as torch last, precisely because some of their wrappers still (despite all the educational efforts :) list CPU-only versions as requirements, and we need to swap the wrong CPU-only torch (tensorflow, xgboost, ...) for the correct GPU-enabled version. We do that at the very end of the installation process, after all their wrappers and reverse dependencies have already been installed.
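
A rough sketch of that container install ordering (the package name and wheel index URL are only illustrative):

pip install some-wrapper-package            # may pull in a CPU-only torch as a dependency
pip install --force-reinstall torch --index-url https://download.pytorch.org/whl/cu118   # swap in the GPU build
pip install flash-attn --no-build-isolation # only after the GPU-enabled torch is in place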
