Failed to load .safetensors as state dict with error from torch.frombuffer in safetensors.torch.load #442

Open
1 of 2 tasks
RunDevelopment opened this issue Feb 16, 2024 · 9 comments

@RunDevelopment

System Info

OS: Win 10 64 bit
Python: 3.9.13
SafeTensors: 0.4.2

Information

  • The official example scripts
  • My own modified scripts

Reproduction

Loading the attached failed.safetensors file with safetensors.torch.load_file directly works, but reading the file into a bytes object first and then loading it with safetensors.torch.load fails.

I get the following error:

Traceback (most recent call last):
  File "C:\Users\micha\Git\spandrel\test.py", line 10, in <module>
    state_dict = safetensors.torch.load(b)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\site-packages\safetensors\torch.py", line 338, in load
    return _view2torch(flat)
  File "C:\Users\micha\AppData\Local\Programs\Python\Python39\lib\site-packages\safetensors\torch.py", line 386, in _view2torch
    arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])
ValueError: both buffer length (0) and count (-1) must not be 0

Same error with pytest formatting:

    def _view2torch(safeview) -> Dict[str, torch.Tensor]:
        result = {}
        for k, v in safeview:
            dtype = _getdtype(v["dtype"])
>           arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])
E           ValueError: both buffer length (0) and count (-1) must not be 0

Steps to reproduce:

  1. Download failed.safetensors.
  2. Read failed.safetensors into a bytes object.
  3. Call safetensors.torch.load.

I used the following script to get the above error:

import safetensors.torch

file_path = "./failed.safetensors"
with open(file_path, "rb") as f:
    b = f.read()
state_dict = safetensors.torch.load(b)
print(state_dict.keys())

Expected behavior

safetensors.torch.load_file and safetensors.torch.load should produce the same result and load the state dict correctly.

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Mar 18, 2024
@RunDevelopment
Author

Still a problem.

@github-actions github-actions bot removed the Stale label Mar 19, 2024
@github-actions github-actions bot added the Stale label Apr 18, 2024
@RunDevelopment
Author

Still a problem.

@github-actions github-actions bot removed the Stale label Apr 19, 2024
@fpgaminer

fpgaminer commented May 3, 2024

Ran into this issue as well. Adding a check before this line seems like it would work around the issue:

arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])

Check whether the buffer has length 0 and, if so, create an empty tensor instead of calling frombuffer.

Would the maintainers like a pull request to that effect?

EDIT:

Here's my patched version of that function (works for me, but not fully tested):

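import sys

import safetensors.torch
import torch


def _view2torch(safeview) -> dict[str, torch.Tensor]:
	result = {}
	for k, v in safeview:
		dtype = safetensors.torch._getdtype(v["dtype"])
		if len(v["data"]) == 0:
			# torch.frombuffer rejects empty buffers, so build the empty tensor directly.
			# A tensor has no data as soon as any one dimension is 0, e.g. shape (0, 5).
			assert any(x == 0 for x in v["shape"])
			arr = torch.empty(v["shape"], dtype=dtype)
		else:
			arr = torch.frombuffer(v["data"], dtype=dtype).reshape(v["shape"])
		if sys.byteorder == "big":
			# Tensor data is stored little-endian; swap bytes on big-endian hosts.
			arr = torch.from_numpy(arr.numpy().byteswap(inplace=False))
		result[k] = arr

	return result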
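
The empty-buffer branch can also be exercised on its own, without touching safetensors internals. This is only a sketch of the guard idea: `tensor_from_buffer` is a hypothetical helper, not part of safetensors or torch, and the shapes are made up for illustration.

```python
import torch


def tensor_from_buffer(data, dtype, shape):
    """Hypothetical helper: build a tensor from raw bytes, tolerating empty buffers."""
    if len(data) == 0:
        # torch.frombuffer raised ValueError on a zero-length buffer in the
        # traceback above, so allocate the (element-free) tensor directly.
        return torch.empty(shape, dtype=dtype)
    return torch.frombuffer(data, dtype=dtype).reshape(shape)


# An empty tensor needs only one zero-sized dimension, e.g. shape (0, 5).
empty = tensor_from_buffer(bytearray(), torch.float32, [0, 5])
# Four float32 values occupy 16 bytes.
full = tensor_from_buffer(bytearray(16), torch.float32, [2, 2])
print(tuple(empty.shape), tuple(full.shape))
```

A bytearray is used here instead of bytes because torch.frombuffer warns about read-only buffers.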

@github-actions github-actions bot added the Stale label Jun 3, 2024
@fpgaminer

Bump

@github-actions github-actions bot removed the Stale label Jun 4, 2024
@github-actions github-actions bot added the Stale label Jul 4, 2024
@fpgaminer

Bamp

@github-actions github-actions bot removed the Stale label Jul 5, 2024