Pretrained efficientnet_b0_rwightman-3dd342df state_dict fails sha256 check #7744

Closed
ptrblck opened this issue Jul 13, 2023 · 20 comments · Fixed by #7898

@ptrblck
Contributor

ptrblck commented Jul 13, 2023

🐛 Describe the bug

Originally reported in the forum.

I am able to reproduce the issue using torchvision==0.16.0.dev20230709+cu121:

from torchvision import models
model = models.efficientnet_b0(pretrained=True)

/home/pbialecki/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
/home/pbialecki/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=EfficientNet_B0_Weights.IMAGENET1K_V1`. You can also use `weights=EfficientNet_B0_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Downloading: "https://download.pytorch.org/models/efficientnet_b0_rwightman-3dd342df.pth" to /home/pbialecki/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-3dd342df.pth
100%|██████████| 20.5M/20.5M [00:00<00:00, 52.8MB/s]
Traceback (most recent call last):

  Cell In[75], line 1
    model = models.efficientnet_b0(pretrained=True)

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/_utils.py:142 in wrapper
    return fn(*args, **kwargs)

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/_utils.py:228 in inner_wrapper
    return builder(*args, **kwargs)

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/efficientnet.py:770 in efficientnet_b0
    return _efficientnet(

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/efficientnet.py:360 in _efficientnet
    model.load_state_dict(weights.get_state_dict(progress=progress, check_hash=True))

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torchvision/models/_api.py:90 in get_state_dict
    return load_state_dict_from_url(self.url, *args, **kwargs)

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torch/hub.py:757 in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)

  File ~/miniforge3/envs/nightly_pip_cu121/lib/python3.10/site-packages/torch/hub.py:653 in download_url_to_file
    raise RuntimeError('invalid hash value (expected "{}", got "{}")'

RuntimeError: invalid hash value (expected "3dd342df", got "7f5810bc96def8f7552d5b7e68d53c4786f81167d28291b21c0d90e1fca14934")

Downloading the state_dict manually also confirms the sha256 checksum:

wget https://download.pytorch.org/models/efficientnet_b0_rwightman-3dd342df.pth
sha256sum efficientnet_b0_rwightman-3dd342df.pth 
7f5810bc96def8f7552d5b7e68d53c4786f81167d28291b21c0d90e1fca14934  efficientnet_b0_rwightman-3dd342df.pth

which does not match the one stored in the file name.
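
For context, torch.hub's check compares the hex string embedded in the file name (the part between the hyphen and the extension) with the start of the file's SHA-256 digest. A minimal sketch of that comparison using only the standard library follows; `hash_prefix_from_name`, `sha256_of`, and `matches_name` are hypothetical helper names for illustration, not torch.hub functions:

```python
import hashlib
import re
from pathlib import Path
from typing import Optional

# Hypothetical helpers for illustration; not part of torch.hub.
_HASH_IN_NAME = re.compile(r"-([a-f0-9]+)\.")


def hash_prefix_from_name(filename: str) -> Optional[str]:
    """Return the hex prefix embedded in a name like 'model-3dd342df.pth'."""
    match = _HASH_IN_NAME.search(filename)
    return match.group(1) if match else None


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_name(path: Path) -> bool:
    """True if the file's digest starts with the prefix in its file name."""
    prefix = hash_prefix_from_name(path.name)
    return prefix is not None and sha256_of(path).startswith(prefix)
```

Calling `matches_name` on the downloaded efficientnet_b0 checkpoint from this report would be expected to return False, since the digest starts with 7f5810bc while the name says 3dd342df.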

Versions

torchvision==0.16.0.dev20230709+cu121

@NicolasHug
Member

Thanks for the report @ptrblck. I'll take a look at the S3 bucket and see if the file has changed. That would be strange though; I suspect / hope it was just a typo we introduced originally.

NicolasHug added the bug label Jul 14, 2023
@atalman
Contributor

atalman commented Jul 18, 2023

Hi @NicolasHug, any progress on this?

@malfet
Contributor

malfet commented Jul 18, 2023

Hmm, it was last modified in Aug 2021:

% aws s3 ls s3://pytorch/models/efficientnet_b0_rwightman-3dd342df.pth
2021-08-25 06:11:27   21444401 efficientnet_b0_rwightman-3dd342df.pth
% shasum -a 256 ~/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-3dd342df.pth
7f5810bc96def8f7552d5b7e68d53c4786f81167d28291b21c0d90e1fca14934  /Users/nshulga/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-3dd342df.pth

[Edit] Torchvision checked out at 2925df7 works, but latest trunk 29418e3 fails

[Edit2] #7219 added sha-sum checks, but I guess we've never validated that the published checksums are correct.

@ptrblck
Contributor Author

ptrblck commented Jul 18, 2023

> Torchvision checked out at 2925df7 works, but latest trunk 29418e3 fails

Is it because #7219 landed between these commits?

@malfet
Contributor

malfet commented Jul 18, 2023

@ptrblck bingo :) So it was like that for a while, but it was not checked before... Running the script to validate published checksums right now...

@NicolasHug
Member

NicolasHug commented Aug 29, 2023

Thanks a lot all for the reports and investigations.

I re-uploaded the weights with the proper hashes on S3 and also fixed the URL in #7898.
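
Under the naming convention used for these checkpoints, a file name can be regenerated from the file's actual digest. A hedged sketch of that renaming step follows. `corrected_name` is a hypothetical helper, the 8-character prefix length is an assumption based on the names seen in this thread, and this is not the script from #7898:

```python
import re


def corrected_name(filename: str, sha256_hex: str, prefix_len: int = 8) -> str:
    """Swap the hex suffix before the extension for the digest's prefix."""
    new_name, count = re.subn(
        r"-[a-f0-9]+(\.[^.]+)$",
        lambda m: f"-{sha256_hex[:prefix_len]}{m.group(1)}",
        filename,
    )
    if count == 0:
        raise ValueError(f"no hash suffix found in {filename!r}")
    return new_name


print(
    corrected_name(
        "efficientnet_b0_rwightman-3dd342df.pth",
        "7f5810bc96def8f7552d5b7e68d53c4786f81167d28291b21c0d90e1fca14934",
    )
)
# prints: efficientnet_b0_rwightman-7f5810bc.pth
```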

@ar0ck

ar0ck commented Oct 10, 2023

I can still reproduce this issue with torchvision==0.17.0.dev20231009.

@NicolasHug, running your script from #7898 I get:

Following models need to be fixed:
efficientnet_b0 EfficientNet_B0_Weights.IMAGENET1K_V1 invalid hash value (expected "3dd342df", got "7f5810bc96def8f7552d5b7e68d53c4786f81167d28291b21c0d90e1fca14934")
efficientnet_b2 EfficientNet_B2_Weights.IMAGENET1K_V1 invalid hash value (expected "bcdf34b7", got "c35c147384e385a5bab5a8eabdabbe5a3df0487ee4a554108626ae474a5bf755")
efficientnet_b3 EfficientNet_B3_Weights.IMAGENET1K_V1 invalid hash value (expected "cf984f9c", got "b3899882250c22946d0229d266049fcd133c169233530b36b9ffa7983988362f")
efficientnet_b4 EfficientNet_B4_Weights.IMAGENET1K_V1 invalid hash value (expected "7eb33cd5", got "23ab8bcd5bdbef61a7a43b91adcad81f622fd7f36fb4935a569828d77888c44e")
efficientnet_b5 EfficientNet_B5_Weights.IMAGENET1K_V1 invalid hash value (expected "b6417697", got "1a07897c0d357db7981640f6be44a63420f11deb932344a69768b62ebe272946")
efficientnet_b6 EfficientNet_B6_Weights.IMAGENET1K_V1 invalid hash value (expected "c76e70fd", got "24a108a596a00ad522bdfb9a2d98a9bd31819dfd7e37dd96bad6bb5bc00d6015")
efficientnet_b7 EfficientNet_B7_Weights.IMAGENET1K_V1 invalid hash value (expected "dcc49843", got "c5b4e57e0da1a10bb4f8a5b410dd3b7b6e7461f9b757209be29409707c3d87dd")

@NicolasHug
Member

Thanks for the report @ar0ck. I'm confused as to how I missed those in the past, but I can reproduce...

We'll fix those in the next [bugfix] release.

@marcellobullo

Thanks for the effort. Any workaround to fix it temporarily?

@NicolasHug
Member

NicolasHug commented Oct 11, 2023

EDIT: This is fixed in torchvision 0.17. Installing "torchvision>0.16" should resolve it. If you're stuck with 0.16, use the workaround described below.


Original comment:

Overriding torchvision.models._api.WeightsEnum.get_state_dict() should do the trick, LMK if it doesn't work

from torchvision.models import efficientnet_b0, EfficientNet_B0_Weights
from torchvision.models._api import WeightsEnum
from torch.hub import load_state_dict_from_url

def get_state_dict(self, *args, **kwargs):
    # Drop check_hash so the hash in the file name is not verified.
    # The None default avoids a KeyError when check_hash was not passed.
    kwargs.pop("check_hash", None)
    return load_state_dict_from_url(self.url, *args, **kwargs)

# Monkey-patch the private WeightsEnum API; this affects every weights enum.
WeightsEnum.get_state_dict = get_state_dict

efficientnet_b0(weights=EfficientNet_B0_Weights.IMAGENET1K_V1)
efficientnet_b0(weights="DEFAULT")
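
Because the assignment to WeightsEnum.get_state_dict replaces the method for the whole process, every later weight download also skips the hash check. One way to limit that, assuming the override is only needed while loading a single model, is a small context manager that restores the original attribute afterwards. In this sketch `temporary_attr` is a hypothetical helper, and `FakeWeights` is a stand-in class so the example runs without torchvision:

```python
from contextlib import contextmanager


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value inside the with-block, then restore it."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield
    finally:
        setattr(obj, name, original)


class FakeWeights:
    """Stand-in for torchvision's WeightsEnum (illustration only)."""

    def get_state_dict(self, *args, **kwargs):
        return "checked"


def unchecked(self, *args, **kwargs):
    # Mirrors the workaround: discard check_hash before loading.
    kwargs.pop("check_hash", None)
    return "unchecked"


w = FakeWeights()
with temporary_attr(FakeWeights, "get_state_dict", unchecked):
    inside = w.get_state_dict(check_hash=True)
outside = w.get_state_dict(check_hash=True)
print(inside, outside)  # prints: unchecked checked
```

With torchvision, the same pattern would wrap only the efficientnet_b0(...) call, with WeightsEnum in place of FakeWeights.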

@fafqqeew

Overriding torchvision.models._api.WeightsEnum.get_state_dict() did it, thank you.

@dimakpa

dimakpa commented Oct 24, 2023

Guys, an error started to occur when loading the model. A week ago everything was loading fine.

model = efficientnet_b6(pretrained=True)

RuntimeError: invalid hash value (expected "c76e70fd", got "24a108a596a00ad522bdfb9a2d98a9bd31819dfd7e37dd96bad6bb5bc00d6015")

@NicolasHug
Member

@dimakpa please see #7744 (comment) until the fix is available in the next bugfix release (planned for mid Nov)

@William171409

Thank you very much! Looking forward to the final solution.

@korablique

> return load_state_dict_from_url(self.url, *args, **kwargs)

What is self.url?

@NicolasHug
Member

It's an attribute of the WeightsEnum class. It is available as self.url once the function is attached to the class as a method, which happens after doing

WeightsEnum.get_state_dict = get_state_dict
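
To make the self.url part concrete: assigning a plain function to a class attribute turns it into a method, so when it is called on an enum member, self is that member. A toy illustration with a stand-in enum follows. `ToyWeights` and its URL are invented for this sketch, and torchvision's real WeightsEnum is more involved:

```python
from enum import Enum


class ToyWeights(Enum):
    """Stand-in enum; the URL below is a placeholder, not a real download."""

    V1 = "https://example.com/toy-weights.pth"

    @property
    def url(self):
        return self.value


def get_state_dict(self, *args, **kwargs):
    # `self` is the enum member the method is called on.
    return f"would download {self.url}"


# Attaching the function to the class makes it a method of every member.
ToyWeights.get_state_dict = get_state_dict

print(ToyWeights.V1.get_state_dict())
# prints: would download https://example.com/toy-weights.pth
```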

@mustious

mustious commented Feb 4, 2024

For anyone still stuck with the same error: the issue comes from torchvision==0.16.0. An easy workaround is to install a newer version or a nightly. Tested and working on torchvision v0.16.1, v0.16.2, and v0.17.0.

pip install "torchvision>0.16.0" 

@winnie128

I had a similar problem when running code in a Jupyter Notebook.

I could download the efficientnet_b3 model in PyCharm without any issue:
model_ft = models.efficientnet_b3(weights=models.EfficientNet_B3_Weights.IMAGENET1K_V1)
The torchvision version was 0.15.2.

I installed torchvision version 0.15.2 for Jupyter Notebook and ran the same model loading. I got this error message:

RuntimeError: invalid hash value (expected "cf984f9c", got "b3899882250c22946d0229d266049fcd133c169233530b36b9ffa7983988362f")

@NicolasHug
Member

@winnie128 please install a more up-to-date version of torchvision

@winnie128

The problem was solved after installing torchvision version 0.18.0. Thanks!
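
When the same code behaves differently in two tools (here PyCharm and Jupyter), one possible explanation, which this thread does not confirm, is that they run different Python environments with different torchvision versions. A quick check to run in each one follows; `installed_version` is a hypothetical helper built on the standard library:

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run this in both environments and compare the output.
print("interpreter:", sys.executable)
print("torchvision:", installed_version("torchvision"))
```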
