
ImportError: This modeling file requires... flash_attn #8

Closed
llimllib opened this issue May 4, 2023 · 14 comments
Assignees: pirroh, madhavatreplit
Labels: inference (Everything about model inference)

Comments

llimllib commented May 4, 2023

Trying to follow the instructions on an M1 Mac, I get the error in the title.

Unfortunately, attempting to install flash_attn does not succeed either; it fails with RuntimeError: flash_attn was requested, but nvcc was not found. That may just be an unfortunate consequence of not having an NVIDIA card.

Anyway, the point is that you should probably add flash_attn to your list of required modules?

pirroh (Member) commented May 4, 2023

From the error message, it looks like the CUDA drivers are not installed.
Can you check whether the commands nvidia-smi and nvcc --version run successfully?

Also, we already list flash_attn among the suggested dependencies -- check the README!
We haven't tested the model on M1/M2 Macs yet, so if you hit further blockers, I recommend running with the default attention implementation in PyTorch.
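
For anyone without an NVIDIA GPU, here is a minimal sketch of what loading with the default PyTorch attention could look like. It assumes the checkpoint's custom modeling code uses an MPT-style config that exposes an attn_config dict with an attn_impl key; check the model's config files to confirm the exact field names.

from transformers import AutoConfig, AutoModelForCausalLM

# Assumption: the remote code selects its attention backend via
# config.attn_config['attn_impl'] (MPT-style); 'torch' means plain
# PyTorch attention, so flash_attn and CUDA are not required.
config = AutoConfig.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)
config.attn_config['attn_impl'] = 'torch'

model = AutoModelForCausalLM.from_pretrained(
    'replit/replit-code-v1-3b',
    config=config,
    trust_remote_code=True,
)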

pirroh self-assigned this May 4, 2023
pirroh added the inference (Everything about model inference) label May 4, 2023

llimllib (Author) commented May 4, 2023

First of all, you need to install the latest versions of the following dependencies:

einops
sentencepiece
torch
transformers

That is the section I read. If flash_attn is listed there, I don't see it.

llimllib (Author) commented May 4, 2023

(An M1 Mac has no NVIDIA card, so I don't think I can install nvcc. Too bad, but I get that some things can't run without an NVIDIA card.)

llimllib (Author) commented May 4, 2023

Now I see that you listed it in the model description, but it appears to be necessary for inference as well, so what I mean is that it should be included in that list of required Python packages.

pirroh (Member) commented May 4, 2023

You don't need flash attention for inference -- it's a "nice to have" that makes inference faster, but to my knowledge it works only on NVIDIA GPUs (as you need CUDA).
In your case, you should load the model as indicated in the first half of that section:

from transformers import AutoModelForCausalLM

# load model
model = AutoModelForCausalLM.from_pretrained('replit/replit-code-v1-3b', trust_remote_code=True)

Hope this helps. Also, make sure to run on the latest version of the Transformers library!

madhavatreplit self-assigned this May 4, 2023

llimllib (Author) commented May 4, 2023

That's exactly what I did, and that's what caused the error!

pirroh (Member) commented May 4, 2023

Can you run pip install --upgrade transformers, and try again?

llimllib (Author) commented May 4, 2023

I will do so tomorrow (I have to re-download the model now), but I was working in a clean virtualenv.

llimllib (Author) commented May 4, 2023

(Which I assume means pip will download the newest version of a library? But maybe that assumption is false if there's a previously cached version.)
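
For the record, a quick way to confirm which version actually ended up in the virtualenv, regardless of what pip may have cached:

import transformers

# Print the version that is importable in this environment; if it is old,
# running pip install --upgrade transformers inside the virtualenv should bump it.
print(transformers.__version__)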

llimllib (Author) commented May 4, 2023

I'm unable to reproduce. Sincere apologies for the noise and for wasting your time, and thanks for the model!

llimllib closed this as completed May 4, 2023

pirroh (Member) commented May 4, 2023

No problem! Glad it worked in the end :)

omaratef3221 commented Jan 1, 2024

I have a Mac with an M2 Max and 32 GB. pip install --upgrade transformers worked perfectly for me. Thanks @pirroh!

kabelklaus commented

For me, it doesn't work with pip install --upgrade transformers.

tim3in commented Nov 6, 2024

Can you run pip install --upgrade transformers, and try again?

Yes, it worked. Thank you. :)
