AttributeError: 'LlamaLikeModel' object has no attribute 'layers' #5778

Open
VinPre opened this issue Mar 31, 2024 · 43 comments
Labels
bug Something isn't working

Comments

@VinPre

VinPre commented Mar 31, 2024

Describe the bug

I am not able to generate text using AWQ models since I updated.
I am able to load the model, but once I write something, no text gets returned and the console displays an AttributeError.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

Update oobabooga / Or reinstall it

Load a model like TheBloke_Mythalion-Kimiko-v2-AWQ
Go to chat
Write something to get an answer
An empty answer gets returned

Screenshot

No response

Logs

19:36:40-544335 INFO     Starting Text generation web UI
19:36:40-549404 INFO     Loading the extension "gallery"

Running on local URL:  http://127.0.0.1:7860

19:37:27-118196 INFO     Loading "TheBloke_Mythalion-Kimiko-v2-AWQ"
Replacing layers...: 100%|█████████████████████████████████████████████████████████████| 40/40 [00:02<00:00, 14.02it/s]
Fusing layers...: 100%|████████████████████████████████████████████████████████████████| 40/40 [00:00<00:00, 49.76it/s]
19:37:33-908362 INFO     LOADER: "AutoAWQ"
19:37:33-909891 INFO     TRUNCATION LENGTH: 4096
19:37:33-910910 INFO     INSTRUCTION TEMPLATE: "Metharme"
19:37:33-911947 INFO     Loaded the model in 6.79 seconds.
Traceback (most recent call last):
  File "E:\gpt\text-generation-webui\modules\callbacks.py", line 61, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\gpt\text-generation-webui\modules\text_generation.py", line 389, in generate_with_callback
    shared.model.generate(**kwargs)
  File "E:\gpt\text-generation-webui\installer_files\env\Lib\site-packages\awq\models\base.py", line 110, in generate
    return self.model.generate(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\gpt\text-generation-webui\installer_files\env\Lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "E:\gpt\text-generation-webui\installer_files\env\Lib\site-packages\transformers\generation\utils.py", line 1575, in generate
    result = self._sample(
             ^^^^^^^^^^^^^
  File "E:\gpt\text-generation-webui\installer_files\env\Lib\site-packages\transformers\generation\utils.py", line 2694, in _sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\gpt\text-generation-webui\installer_files\env\Lib\site-packages\transformers\models\llama\modeling_llama.py", line 1250, in prepare_inputs_for_generation
    past_key_values = getattr(getattr(self.model.layers[0], "self_attn", {}), "past_key_value", None)
                                      ^^^^^^^^^^^^^^^^^
  File "E:\gpt\text-generation-webui\installer_files\env\Lib\site-packages\torch\nn\modules\module.py", line 1688, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'LlamaLikeModel' object has no attribute 'layers'
Output generated in 0.26 seconds (0.00 tokens/s, 0 tokens, context 72, seed 234193679)

System Info

NVIDIA GeForce RTX 3090 Ti
32 GB RAM
AMD Ryzen 7 7800X3D
Windows 10
@VinPre VinPre added the bug Something isn't working label Mar 31, 2024
@realcoloride

realcoloride commented Apr 1, 2024

Bumping this, happens to all the AWQ (thebloke) models I've tried

EDIT: try ticking no_inject_fused_attention
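
(For reference, the no_inject_fused_attention checkbox appears to map onto AutoAWQ's fuse_layers option. A minimal sketch of the same workaround outside the webui, with the model path, device, and generation settings as placeholders:)

    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model_path = "TheBloke/Mythalion-Kimiko-v2-AWQ"  # placeholder: any AWQ repo
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    # fuse_layers=False skips the fused-layer injection, so the LlamaLikeModel
    # code path (and this AttributeError) is never reached
    model = AutoAWQForCausalLM.from_quantized(model_path, fuse_layers=False)

    # assumes a CUDA GPU is available
    input_ids = tokenizer("Hello", return_tensors="pt").input_ids.to("cuda")
    print(tokenizer.decode(model.generate(input_ids, max_new_tokens=32)[0]))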

@VinPre
Author

VinPre commented Apr 1, 2024

Bumping this, happens to all the AWQ (thebloke) models I've tried

EDIT: try ticking no_inject_fused_attention

Thanks, ticking no_inject_fused_attention works. But as far as I understand it, this slows things down a bit. The speed is still high, but I think this needs fixing, especially because it worked on the older versions.

(Should have done a backup before updating.) Maybe I will look into how to go back to an older version using git, if that is possible.

Sadly this is still just a workaround, not a real fix.

@realcoloride

Agreed, it seems to ruin the output quality as well. Bumping

@VinPre
Author

VinPre commented Apr 2, 2024

Agreed, it seems to ruin the output quality as well. Bumping

As far as I understand it, it just slows the process, but my understanding is very limited.
Do you mean the quality also suffers?

@realcoloride

Yes.

@calmtortoise

I also have the exact same problem. The quality of model output is noticeably worse for me. Speed is actually about the same after checking the no_inject_fused_attention option.

@JaiQiBoi

JaiQiBoi commented Apr 4, 2024

Same issue here. Are there solid alternatives to using AWQ (thebloke) models?

@realcoloride

For quality, use gguf, for speed, use exl2.

@VinPre
Author

VinPre commented Apr 6, 2024

For quality, use gguf, for speed, use exl2.

Any recommended gguf models for a 24gb 3090 ti + 32gb of RAM? Focused on comprehensive roleplay. Direct sources would be great.

@Kitterss

Kitterss commented Apr 6, 2024

You mentioned the bug appeared after an update. Did you try installing an older snapshot? What version were you on before updating?
Hard to believe that this bug really appears for everyone and only 5 people reacted. Maybe we just did something wrong.

@VinPre
Author

VinPre commented Apr 6, 2024

You mentioned the bug appeared after an update. Did you try installing an older snapshot? What version were you on before updating? Hard to believe that this bug really appears for everyone and only 5 people reacted. Maybe we just did something wrong.

I have not tested it with an older version. Installing it fresh does not work.

I am not that well versed in using GitHub and have not looked into how to install an older version. It would be great if you could give me a short tutorial on how to do that.

@feldgendler

Had the same problem, switched to GPTQ.

@VinPre
Author

VinPre commented Apr 6, 2024

You mentioned the bug appeared after an update. Did you try installing an older snapshot? What version were you on before updating? Hard to believe that this bug really appears for everyone and only 5 people reacted. Maybe we just did something wrong.

I just think there are not that many people who
- use TheBloke's models
- did an update
- searched for the solution and found this

I think we are not doing something wrong. It seems to be happening to everybody who tries TheBloke's models on an updated installation.

You can try to replicate it yourself to confirm.

@VinPre VinPre closed this as completed Apr 6, 2024
@VinPre VinPre reopened this Apr 6, 2024
@VinPre
Author

VinPre commented Apr 6, 2024

Sorry, accidentally tapped the close issue button

@Kitterss

Kitterss commented Apr 6, 2024

I deleted ROCm itself from my system like this (uninstalling chapter):
https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/native-install/ubuntu.html#
And yet models are loading and text is generating with no_inject_fused_attention ticked.
So maybe the issue is connected to a broken ROCm installation.

Oh, you have NVIDIA. But still, it seems like some necessary parts of the system are not detected.

@VinPre
Author

VinPre commented Apr 6, 2024

I deleted ROCm itself from my system like this (uninstalling chapter): https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/native-install/ubuntu.html# And yet models are loading and text is generating with no_inject_fused_attention ticked. So maybe the issue is connected to a broken ROCm installation.

Oh, you have NVIDIA. But still, it seems like some necessary parts of the system are not detected.

Ticking no_inject_fused_attention is a workaround that makes it work, but it also makes the output worse.

We get the error message on default settings.

If I understand your message right, then you thought we only get the error with it ticked, but it is the other way around.

@calmtortoise

calmtortoise commented Apr 6, 2024 via email

@VinPre
Author

VinPre commented Apr 6, 2024

Can you try going back to a version from January? That's what I had before I updated.

I agree with the above. With it unticked/not checked it throws the error and does not work. Ticking it (i.e. turning the fused attention off) makes it work, but the output is very low quality. I tried rolling back to the previous version but it did not work. Same error.

@calmtortoise

calmtortoise commented Apr 6, 2024 via email

@calmtortoise

calmtortoise commented Apr 6, 2024 via email

@VinPre
Author

VinPre commented Apr 6, 2024

My previous rollback was to the version right before the most recent update.

Yeah, that's too recent I would say.

Yes. I will roll back to a January version and report back.

Ok great

@Kitterss

Kitterss commented Apr 6, 2024

I deleted ROCm itself from my system like this (uninstalling chapter): https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/native-install/ubuntu.html# And yet models are loading and text is generating with no_inject_fused_attention ticked. So maybe the issue is connected to a broken ROCm installation.
Oh, you have NVIDIA. But still, it seems like some necessary parts of the system are not detected.

Ticking no_inject_fused_attention is a workaround that makes it work, but it also makes the output worse.

We get the error message on default settings.

If I understand your message right, then you thought we only get the error with it ticked, but it is the other way around.

No, my problem is the same as yours: ticking that flag lets it generate text, but I don't want to tick it. And what is strange is that ticking that box allowed me to generate text even without ROCm.

I tried to roll back and got nothing. Well, actually, I got a bunch of different errors. Maybe I just couldn't do it right.

@Kitterss

Kitterss commented Apr 7, 2024

Switched to trying a GGUF model. There is another current bug with those (#5812).
But that thread gave me an idea about why we can't roll back: #5812 (comment)
one_click.py (which is loaded after running start_*****.sh) has an update function in it. So one does not simply roll back; some tricks are needed.
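
A rough sketch of one way to pin an existing install to an older version despite the updater (the commit hash is a placeholder, and the git pull call in one_click.py still has to be commented out or the launcher will fast-forward back to main):

    cd text-generation-webui
    git log --oneline                 # find the snapshot/commit you want
    git checkout <older-commit-hash>  # placeholder, e.g. a January snapshot
    # then comment out the "git pull --autostash" call in one_click.py so that
    # start_windows.bat / start_linux.sh does not pull the latest code again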

@ruizcrp

ruizcrp commented Apr 8, 2024

You mentioned the bug appeared after an update. Did you try installing an older snapshot? What version were you on before updating? Hard to believe that this bug really appears for everyone and only 5 people reacted. Maybe we just did something wrong.

I'm having the exact same issue. Turning the mentioned option off also works, as do other formats such as GGUF. Freshly installed it last week.

@realcoloride

GGUF is premium-tier quality depending on the model and what you're looking for (roleplay in OP's case), but it is sadly painfully slow (especially evaluation).

@EvgeneKuklin

Same here, with AWQ models

@realcoloride

A small trick I found for GGUF, though it lowers quality, is changing the context size to something lower.

@jgreiner1024

I also have this issue. I'm not sure if it's an issue with this itself or a bug in one of the libraries that was also updated when I ran the update_windows.bat file. Unfortunately I also didn't make a backup before running the update (lesson learned ><), but I'm fairly proficient in Git, so I will try to disable the Git/code update, reset my version back to an older one, and report back.

@jgreiner1024

I downloaded "text-generation-webui-snapshot-2024-02-11" (that was roughly when I first installed the tool). I commented out line 271 in onclick.py to prevent it from updating
(for non-programmers just add a pound sign in front of this linelike below)
#run_cmd("git pull --autostash", assert_success=True, environment=True)

Then I ran the normal start_windows.bat and let it download/install all the libraries and such as expected.

After that I was able to load the same AWQ model that threw the error on the later versions and chat without errors.

The model used was "TheBloke_Wizard-Vicuna-7B-Uncensored-AWQ".

In the latest version, this model will load but when you attempt to chat it will give the error "AttributeError: 'LlamaLikeModel' object has no attribute 'layers'" as others mentioned above. When loading the exact same model (just copied from models folder)

I know there are a lot of snapshots between the 2024-02-11 and latest, but I wanted to go back to a version that I was super confident worked.

@VinPre
Author

VinPre commented Apr 12, 2024

I downloaded "text-generation-webui-snapshot-2024-02-11" (that was roughly when I first installed the tool). I commented out line 271 in onclick.py to prevent it from updating (for non-programmers just add a pound sign in front of this linelike below) #run_cmd("git pull --autostash", assert_success=True, environment=True)

Then I ran the normal start_windows.bat and let it download/install all the libraries and such as expected.

After that I was able to load the same AWQ model that threw the error on the later versions and chat without errors.

The model used was "TheBloke_Wizard-Vicuna-7B-Uncensored-AWQ".

In the latest version, this model will load but when you attempt to chat it will give the error "AttributeError: 'LlamaLikeModel' object has no attribute 'layers'" as others mentioned above. When loading the exact same model (just copied from models folder)

I know there are a lot of snapshots between the 2024-02-11 and latest, but I wanted to go back to a version that I was super confident worked.

So @jgreiner1024 has found a good workaround people can try if they want to revert and use AWQ on an older version. I will keep the issue open and hope that somebody fixes this in future versions so people don't need to go through reverting to older versions.

@realcoloride

What even is the difference that causes the error?

@VinPre
Author

VinPre commented Apr 12, 2024

What even is the difference that causes the error?

Looks like nobody here knows. I was hoping somebody able to identify the problem would arrive here, but that has not happened yet.

@jgreiner1024

@VinPre I'm going to find out exactly which snapshot breaks it this weekend. I'll work my way through them until it breaks, then do a diff between that snapshot and the previous one; that should help identify at least where to start looking.

@EvgeneKuklin

EvgeneKuklin commented Apr 12, 2024

@jgreiner1024 I'll work my way through them until it breaks

That will be inefficient, use binary search instead
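
(git can run this search automatically; a rough sketch, where the known-good commit is a placeholder and the one_click.py auto-update needs to stay disabled between runs:)

    cd text-generation-webui
    git bisect start
    git bisect bad                        # current HEAD reproduces the AWQ error
    git bisect good <known-good-commit>   # placeholder, e.g. the 2024-02-11 snapshot
    # git checks out a midpoint; relaunch the webui, try an AWQ model, then mark
    # the result and repeat until git reports the first bad commit
    git bisect good    # or: git bisect bad
    git bisect reset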

@jgreiner1024

@jgreiner1024 I'll work my way through them until it breaks

That will be inefficient, use binary search instead

yes that was my plan, working through them now, wish me luck! lol

@VinPre
Author

VinPre commented Apr 12, 2024

yes that was my plan, working through them now, wish me luck! lol

Ok

@architectdrone

@jgreiner1024 I'll work my way through them until it breaks

That will be inefficient, use binary search instead

yes that was my plan, working through them now, wish me luck! lol

Same issue here. Godspeed o7

@jgreiner1024

I have been working on snapshot 2024-03-10. There's a second git pull somewhere that only happens during the conda download and install; if I copy over an existing conda install, it won't update the code.

With my other testing so far, I believe it may be a dependency issue and not actually an issue with the code inside this git repo, but I haven't confirmed that yet, as I ran into this second git pull that updates the code on me.

I was hoping to have an answer, but the second git pull threw me for a loop; it was a while before I noticed it was happening and it messed up a lot of my testing. I'm taking a break for the night and will try to look into it more tomorrow.

PS: if someone knows where the git pull that happens during the conda download/install is, a comment would be very helpful so I don't have to chase it down tomorrow.

@clrabbit

Chiming in to say I'm also seeing this behavior. I did a fresh install of Windows today (unrelated), so I cloned the latest repository of OB, and most things work as expected; however, AWQ does throw 'LlamaLikeModel' object has no attribute 'layers' when trying to generate a response with no_inject_fused_attention not ticked. This was working on a previous build, but I believe that one was originally from around February 2024.

@gilith1

gilith1 commented Apr 18, 2024

I've fixed this locally by changing line 1250 in modeling_llama.py (see the traceback above for the full path) to this:
past_key_values = getattr(getattr(self.model.blocks[0], "self_attn", {}), "past_key_value", None)

I doubt it's the proper fix; I'm not familiar with this code, it's just what I found by looking at the LlamaLikeModel class. It's probably some incompatibility between the transformers and awq module versions. Anyway, with this change I can get responses from AWQ models. Maybe it'll be useful to someone for getting this fixed properly.
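
(If editing the installed transformers package is undesirable, an equivalent runtime alias might work. This is an untested sketch; the attribute path to the fused LlamaLikeModel is only a guess based on the traceback above:)

    # run once after the AWQ model has loaded (e.g. in modules/models.py)
    inner = shared.model.model.model   # AWQ wrapper -> causal LM -> LlamaLikeModel (assumed path)
    if not hasattr(inner, "layers") and hasattr(inner, "blocks"):
        inner.layers = inner.blocks    # alias so prepare_inputs_for_generation finds the blocks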

@HaysianSmelt

Bumping this, happens to all the AWQ (thebloke) models I've tried

EDIT: try ticking no_inject_fused_attention

I ticked no_inject_fused_attention and it worked for me. I was only getting one model from the "Bloke" collection to work; it was a Mistral one.

@alok-abhishek

I think this is an AutoAWQ issue and needs to be reported to AutoAWQ github.com/casper-hansen/AutoAWQ

I did post a discussion thread in Q&A casper-hansen/AutoAWQ#462

@jllrr

jllrr commented May 30, 2024

Also having this issue.
But commenting out line 271 as suggested doesn't work for me; it just freezes on the miniconda install. How do I roll back to an older version without it updating back to the main branch?
