AttributeError: 'LlamaLikeModel' object has no attribute 'layers' #5778
Comments
Bumping this, happens to all the AWQ (TheBloke) models I've tried. EDIT: try ticking no_inject_fused_attention |
Thanks, ticking no_inject_fused_attention works. But as far as I understand it, this slows it down a bit. The speed is still high, but I think this needs fixing, especially because it worked on the older versions. (Should have made a backup before updating.) Maybe I will look into how to go back to an older version using git, if that is possible. Sadly this is still just a workaround, not a real fix.
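For what it's worth, my understanding of what that checkbox does (an assumption based on AutoAWQ's public API, not on reading the loader code) is that it just turns off AutoAWQ's fused layers, roughly:

```python
# Sketch of what ticking no_inject_fused_attention presumably amounts to
# (the model name is just the one from this issue, used as an example).
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "TheBloke/Mythalion-Kimiko-v2-AWQ"

# fuse_layers=True wraps the decoder in AutoAWQ's fused modules (the
# LlamaLikeModel from the error message); fuse_layers=False skips that
# wrapping, which is slower but avoids the AttributeError.
model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=False)
tokenizer = AutoTokenizer.from_pretrained(model_path)
```

If that mapping is right, it would explain the speed difference: the fused modules are the fast path, and they are also the ones that blow up. |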
Agreed, it seems to ruin the output quality as well. Bumping |
As far as I understand it, it just slows the process. But my understanding is very limited. |
Yes. |
I also have the exact same problem. The quality of the model output is noticeably worse for me. Speed is actually about the same after ticking the fused-attention do-dad. |
Same issue here. Are there solid alternatives to using AWQ (TheBloke) models? |
For quality, use gguf, for speed, use exl2. |
Any recommended gguf models for a 24gb 3090 ti + 32gb of RAM? Focused on comprehensive roleplay. Direct sources would be great. |
You mentioned the bug appeared after an update. Did you try to install an older snapshot? What version were you on before updating? |
I have not tested it with an older version. Installing it fresh does not work either. I am not that well versed in GitHub and have not looked into how to install an older version. It would be great if you could give me a short tutorial on how to do that. |
Had the same problem, switched to GPTQ. |
I just think there are not that many people affected. I don't think we are doing something wrong; it seems to happen to everybody who tries TheBloke models on an updated installation. You can try to replicate it yourself to confirm. |
Sorry, accidentally tapped the close issue button |
I deleted ROCm itself from my system like this (uninstalling chapter): https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/native-install/ubuntu.html# And yet models are loading, and text is generating with no_inject_fused_attention ticked. So maybe the issue is connected to a wrong ROCm installation. Oh, you have NVIDIA. But still, it seems like some necessary parts of the system are not detected. |
Ticking no_inject_fused_attention is a workaround that makes it work, but it also makes the output worse. We get the error message on default settings. If I understand your message right, you thought we only get the error with it ticked, but it is the other way around. |
Agree with the above. With it unticked/not checked, it throws the error and does not work. Ticking it, aka turning it off, makes it work, but the output is very low quality. I tried rolling back to the previous version but it did not work. Same error. |
Can you try going back to a version from January? That's what I had before I updated.
|
Yes. I will roll back to a January version and report back.
|
My previous rollback was to the version right before the most recent update.
|
Yeah, that's too recent I would say.
Ok great |
No, my problem is the same as yours: ticking that flag lets me generate text, but I don't want to tick it. And what is strange is that ticking that box allowed me to generate text even without ROCm. I tried to roll back and got nothing. Well, actually, I got a bunch of different errors. Maybe I just couldn't do it right. |
Switched to trying a GGUF model. There is another current bug with those. (#5812) |
I'm having the exact same issue. Turning the mentioned option off also works, as do other models such as GGUF. Freshly installed it last week. |
GGUF is premium-tier quality depending on the model and what you look for (roleplay in OP's case), but it is painfully slow, sadly (especially evaluation). |
Same here, with AWQ models |
A small trick I found for GGUF, though it lowers quality: change the context size to something lower.
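(If you are loading GGUF through llama-cpp-python, that is just the n_ctx parameter; a minimal sketch, with the path and numbers as placeholders:)

```python
from llama_cpp import Llama

# A smaller context window uses less memory and speeds up prompt
# evaluation, at the cost of how much history the model can attend to.
llm = Llama(
    model_path="./models/your-model.Q4_K_M.gguf",  # placeholder path
    n_ctx=2048,       # reduced context size
    n_gpu_layers=-1,  # offload all layers to the GPU if it fits
)
out = llm("Hello,", max_tokens=32)
print(out["choices"][0]["text"])
```

In the webui that should correspond to the n_ctx setting on the model loader tab. |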
I also have this issue. I'm not sure if it's an issue with this repo itself, or a bug in one of the libraries that was also updated when I ran the update_windows.bat file. Unfortunately I also didn't make a backup before running the update (lesson learned ><), but I'm fairly proficient with git, so I will try to disable the git/code update, then set my version back to an older one and report back. |
I downloaded "text-generation-webui-snapshot-2024-02-11" (that was roughly when I first installed the tool) and commented out line 271 in one_click.py to prevent it from updating. Then I ran the normal start_windows.bat and let it download/install all the libraries and such as expected. After that I was able to load the same AWQ model that threw the error on the later versions and chat without errors. The model used was "TheBloke_Wizard-Vicuna-7B-Uncensored-AWQ" (the exact same files, just copied over from the models folder). In the latest version this model will load, but when you attempt to chat it gives the error "AttributeError: 'LlamaLikeModel' object has no attribute 'layers'", as others mentioned above. I know there are a lot of snapshots between 2024-02-11 and latest, but I wanted to go back to a version that I was super confident worked.
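(For anyone who would rather use git than the snapshot zips: the repo appears to tag its snapshot releases, so something like this should land you on the same code. The tag name here is assumed to match the zip name; check `git tag` first.)

```
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui
git tag --list "snapshot-*"        # confirm which snapshot tags exist
git checkout snapshot-2024-02-11   # assumed tag, matching the zip above
```

The self-update step in one_click.py would still need to be skipped, as described above. |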
So @jgreiner1024 has found a good workaround people can try if they want to revert and use AWQ on an older version. I will keep the issue open and hope that somebody fixes this in a future version, so people don't need to go through reverting to older versions. |
What even is the difference between the versions that causes the error? |
Looks like nobody here knows. I was hoping somebody able to identify the problem would arrive here, but that has not happened yet. |
@VinPre I'm going to find out exactly which snapshot breaks it this weekend. I'll work my way through them until it breaks, then do a diff between that snapshot and the previous one; that should help identify at least where to start looking. |
That will be inefficient; use binary search instead.
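(git bisect automates exactly that. A sketch, assuming the known-good snapshot tag below exists and that each step can be tested by loading an AWQ model and generating:)

```
git bisect start
git bisect bad HEAD                    # current version shows the error
git bisect good snapshot-2024-02-11    # known-good version (assumed tag)
# git now checks out a midpoint commit; install/test it, then mark it
# with `git bisect good` or `git bisect bad` and repeat until git
# reports the first bad commit. Afterwards:
git bisect reset
```

One caveat: if the regression is actually in a dependency (as suspected later in the thread), this will only find the commit that bumped the requirement, not the root cause. |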
yes that was my plan, working through them now, wish me luck! lol |
Ok |
Same issue here. Godspeed o7 |
I have been working on snapshot 2024-03-10. There's a second git pull somewhere that only happens during the conda download and install; if I copy over an existing conda install, it won't update the code. From my other testing so far, I believe it may be a dependency issue rather than an issue with the code in this repo, but I haven't confirmed that yet, since this second git pull kept updating the code on me. I was hoping to have an answer, but the second git pull threw me for a loop; it was a while before I noticed it was happening, and it messed up a lot of my testing. I'm taking a break for the night and will look into it more tomorrow. PS: if someone could comment where the git pull during the conda download/install happens, that would be very helpful so I don't have to chase it down tomorrow. |
Chiming in to say I'm also seeing this behavior. Did a fresh install of Windows today (unrelated) and cloned the latest repository of oobabooga, and most things work as expected; however, AWQ does throw 'LlamaLikeModel' object has no attribute 'layers' when trying to generate a response with no_inject_fused_attention not ticked. This was working on a previous build, but I believe that one was originally from around February 2024. |
I've fixed this locally by changing line 1250 in modeling_llama.py (see the traceback above for the full path) to this: I doubt it's the proper fix; I'm not familiar with this code, it's just what I found by looking at the LlamaLikeModel class. It's probably some incompatibility between the transformers and awq module versions. Anyway, with this change I can get responses from AWQ models. Maybe it'll be useful for someone to get this fixed properly.
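In spirit, the change makes the lookup tolerant of AutoAWQ's fused wrapper, which (as far as I can tell) stores the decoder stack as .blocks where plain transformers models use .layers. A rough sketch of the idea, not the literal patched line:

```python
# Hypothetical helper illustrating the mismatch: transformers' Llama code
# assumes the wrapped model has `.layers`, while AutoAWQ's fused
# LlamaLikeModel appears to expose `.blocks` instead.
def get_decoder_layers(model):
    for attr in ("layers", "blocks"):
        layers = getattr(model, attr, None)
        if layers is not None:
            return layers
    raise AttributeError(f"{type(model).__name__} has neither 'layers' nor 'blocks'")
```

The proper fix presumably belongs in AutoAWQ (keeping the fused wrapper compatible with what transformers expects) rather than in local edits to modeling_llama.py. |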
I ticked the
I tried this and it worked for me. I was only getting one model to work from the "Bloke" collection. It was a Mistral one. |
I think this is an AutoAWQ issue and needs to be reported to the AutoAWQ repo: github.com/casper-hansen/AutoAWQ. I did post a discussion thread in Q&A: casper-hansen/AutoAWQ#462 |
Also having this issue. |
Describe the bug
I am not able to generate text using AWQ models since I updated.
I am able to load the model, but once I write something, no text gets returned and the console displays an AttributeError.
Is there an existing issue for this?
Reproduction
1. Update oobabooga, or reinstall it
2. Load a model like TheBloke_Mythalion-Kimiko-v2-AWQ
3. Go to chat
4. Write something to get an answer
5. An empty answer gets returned
Logs
System Info