
Model "t5-base" on the Hub doesn't have a tokenizer #20

Closed
lqiang2003cn opened this issue Jul 26, 2023 · 7 comments
Closed

Model "t5-base" on the Hub doesn't have a tokenizer #20

lqiang2003cn opened this issue Jul 26, 2023 · 7 comments

Comments

@lqiang2003cn
Copy link

Hi, thanks for sharing this work.
I followed the instructions and built VIMA and VIMA-Bench successfully, but when I ran a command like this:

```
python3 scripts/example.py --ckpt=2M.ckpt --device=cuda --partition=novel_object_generalization --task=pick_in_order_then_restore
```

I got the following error:

```
pybullet build time: May 20 2022 19:45:31
[INFO] 17 tasks loaded
[2023-07-26T12:45:49Z ERROR cached_path::cache] ETAG fetch for https://huggingface.co/t5-base/resolve/main/tokenizer.json failed with fatal error
Traceback (most recent call last):
  File "/home/lq/ws_vima/VIMA/scripts/example.py", line 74, in <module>
    tokenizer = Tokenizer.from_pretrained("t5-base")
Exception: Model "t5-base" on the Hub doesn't have a tokenizer
```

Any ideas? Thanks in advance.

@lqiang2003cn
Author

It turned out to be a network-blocking problem, which I solved by using a proxy (sketched at the end of this comment). Then I tried the command again and the program continued executing until it failed with the following error:

```
Traceback (most recent call last):
  File "/home/lq/ws_vima/VIMA/scripts/example.py", line 506, in <module>
    main(arg)
  File "/home/lq/ws_vima/vima_env/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/lq/ws_vima/VIMA/scripts/example.py", line 84, in main
    policy = create_policy_from_ckpt(cfg.ckpt, cfg.device)
  File "/home/lq/ws_vima/VIMA/vima/__init__.py", line 11, in create_policy_from_ckpt
    policy_instance.load_state_dict(
  File "/home/lq/ws_vima/vima_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for VIMAPolicy:
	Unexpected key(s) in state_dict: "xattn_gpt.h.0.attn.bias".
```

Am I using the wrong checkpoint file, or does it not match the code? Thanks.
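For reference, a minimal sketch of the proxy workaround mentioned above, assuming the underlying HTTP client honors the standard proxy environment variables; the proxy address and port are hypothetical placeholders for whatever proxy you actually run:

```python
import os

# Route Hub requests through a local proxy. The address/port below are
# placeholders; substitute your own proxy endpoint.
os.environ["HTTP_PROXY"] = "http://127.0.0.1:7890"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:7890"

from tokenizers import Tokenizer

# With the proxy set, the tokenizer download from the Hub succeeds.
tokenizer = Tokenizer.from_pretrained("t5-base")
```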

@liuqinglong110

> RuntimeError: Error(s) in loading state_dict for VIMAPolicy: Unexpected key(s) in state_dict: "xattn_gpt.h.0.attn.bias".
>
> Am I using the wrong checkpoint file, or does it not match the code?

Yes, I encountered the same problem. Similar errors also occur with other models.

@lqiang2003cn
Author

In the file `__init__.py` in the `vima` folder, I changed line 13 from `strict=True` to `strict=False`, and then the model loaded correctly and the program continued (a sketch of the change is below). It's rather a hack, and I'm not sure of the consequences since I didn't debug further into the code; the change may leave some features missing. Looking forward to the author's response.
@liuqinglong110 you can try it out.
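For concreteness, a sketch of what the patched loader looks like, assuming `vima/__init__.py` has the shape implied by the traceback above (names and line numbers may differ between versions):

```python
import os

import torch

from .policy import *  # provides VIMAPolicy


def create_policy_from_ckpt(ckpt_path, device):
    assert os.path.exists(ckpt_path), "Checkpoint path does not exist"
    ckpt = torch.load(ckpt_path, map_location=device)
    policy_instance = VIMAPolicy(**ckpt["cfg"])
    # strict=False makes load_state_dict ignore the unexpected
    # "xattn_gpt.h.*.attn.bias" keys instead of raising a RuntimeError.
    policy_instance.load_state_dict(
        {k.replace("policy.", ""): v for k, v in ckpt["state_dict"].items()},
        strict=False,  # was strict=True
    )
    policy_instance.eval()
    return policy_instance
```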

@liuqinglong110
commented Jul 29, 2023

Thank you very much. I can now run it successfully using the method you provided. Before that, I had changed `__init__.py` in the `vima` folder to the following:
```python
import os

import torch

from .policy import *


def create_policy_from_ckpt(ckpt_path, device):
    assert os.path.exists(ckpt_path), "Checkpoint path does not exist"
    checkpoint = torch.load(ckpt_path, map_location=device)
    policy_instance = VIMAPolicy(**checkpoint["cfg"])

    state_dict = checkpoint["state_dict"]
    new_state_dict = {k.replace("policy.", ""): v for k, v in state_dict.items()}

    try:
        policy_instance.load_state_dict(new_state_dict, strict=True)
        print("Successfully loaded the state dictionary.")
    except RuntimeError as e:
        # Output the error message.
        print("An error occurred while loading the state dictionary: {}".format(e))

        # Traverse the keys of the state dictionary, skipping keys that fail to load.
        for key in state_dict.keys():
            try:
                policy_instance.load_state_dict(
                    {key.replace("policy.", ""): state_dict[key]}, strict=False
                )
                print("Successfully loaded key: {}".format(key))
            except Exception:
                # Output keys that could not be loaded.
                print("Unable to load key: {}".format(key))

    policy_instance.eval()
    return policy_instance
```

This method can also run the demo, but it is obviously not elegant. Thanks again.

@lqiang2003cn
Author

You're welcome. Closing this; I hope it helps others who encounter the same problem.

yunfanjiang added a commit that referenced this issue Jul 29, 2023
@yunfanjiang
Member

Hi @lqiang2003cn and @liuqinglong110,

Thanks for your interest in our project and for bringing this to our attention. It turns out that HF Transformers recently changed the causal mask in the attention block to a non-persistent buffer, which means it is no longer included in the state dict. For compatibility I forced it to be included in the state dict per commit d165e53. Feel free to let me know if there are further questions. Thanks.
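For context, a minimal illustration (not VIMA's actual module) of how a buffer's `persistent` flag controls whether it appears in the state dict, which is why old checkpoints containing the mask now trigger "Unexpected key(s)":

```python
import torch
import torch.nn as nn


class TinyAttnMask(nn.Module):
    def __init__(self, n: int = 4):
        super().__init__()
        mask = torch.tril(torch.ones(n, n)).view(1, 1, n, n)
        # persistent=True (the default): the buffer is saved in state_dict(),
        # so checkpoints include it.
        self.register_buffer("bias_persistent", mask, persistent=True)
        # persistent=False: the buffer is rebuilt in __init__ and never saved,
        # so a checkpoint that does contain it raises "Unexpected key(s)"
        # under strict loading.
        self.register_buffer("bias_nonpersistent", mask, persistent=False)


print(TinyAttnMask().state_dict().keys())
# odict_keys(['bias_persistent'])
```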

@Chris-Chow

I used the function `Tokenizer.from_file('tokenizer.json')` instead of `Tokenizer.from_pretrained('t5-base')` to solve the problem, where `tokenizer.json` is the tokenizer config of the "t5-base" model, downloaded from its Hugging Face page.
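A minimal sketch of that workaround; the local path is a placeholder for wherever you saved the downloaded file:

```python
from tokenizers import Tokenizer

# tokenizer.json fetched manually from
# https://huggingface.co/t5-base/resolve/main/tokenizer.json
# (the path below is a placeholder).
tokenizer = Tokenizer.from_file("/path/to/tokenizer.json")
```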
