Cannot convert bigger models #12
Ok, so that was not the way to do it, it returned errors. This script worked instead:

```python
import torch
import sys
import os

if len(sys.argv) < 3:
    print("Usage: python combine_model_files.py <model_directory> <output_file_name>")
    sys.exit(1)

model_directory = sys.argv[1]
output_file_name = sys.argv[2]

# Collect the checkpoint shards (pytorch_model-0000x-of-0000y.bin)
model_files = [file for file in os.listdir(model_directory) if file.startswith("pytorch_model-") and file.endswith(".bin")]
if not model_files:
    print("No model files found in the specified directory.")
    sys.exit(1)

# Load each shard on CPU and merge its tensors into a single state dict
combined_state_dict = {}
for model_file in sorted(model_files):
    file_path = os.path.join(model_directory, model_file)
    partial_state_dict = torch.load(file_path, map_location="cpu")
    combined_state_dict.update(partial_state_dict)

output_file_path = os.path.join(model_directory, output_file_name)
torch.save(combined_state_dict, output_file_path)
print(f"Combined model saved as {output_file_path}")
```

The docker instance launched successfully, and I could convert and quantize with no problem.
Hey @PierreFrn, thanks for doing some digging here. It's strange that you were getting that error before. Can I double-check what OS you're running and whether you were using python3 with the requirements from Transformers?

```
$ conda create -n tbp python=3.10
Collecting package metadata (current_repodata.json): done
Solving environment: done
...
```

Then:

```
conda activate tbp
pip install -r turbopilot/requirements.txt
```

Then if I activate the environment and run the script, it loads the shards:

```
python turbopilot/convert-codegen-to-ggml.py ./codegen-6B-multi-gptj 1
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
```

I'm guessing there's some discrepancy between operating systems or library behaviour at play here. Thanks for supplying the conversion script you used. I will add this thread to the documentation in case others run into the same limitation.
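For reference, that "Loading checkpoint shards" progress bar is what Transformers prints when from_pretrained finds a pytorch_model.bin.index.json next to the shard files and assembles them itself, so no manual merge should be needed on that path. A minimal sketch of that loading step (the model directory is the one from this thread; the dtype argument is optional):

```python
from transformers import AutoModelForCausalLM

# Point from_pretrained at the directory containing the pytorch_model-*.bin shards;
# the accompanying index file tells Transformers how to reassemble the state dict.
model = AutoModelForCausalLM.from_pretrained(
    "./codegen-6B-multi-gptj",
    torch_dtype="auto",  # keep the checkpoint's dtype; optional
)
print(sum(p.numel() for p in model.parameters()), "parameters loaded")
```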
(Leaving open in case you reply and we're able to work out what is going on)
I am using NixOS. I made a virtual env in a Nix shell as follows:

```nix
{ pkgs ? import <nixpkgs> {} }:

(pkgs.buildFHSUserEnv {
  name = "pipzone";
  targetPkgs = pkgs: (with pkgs; [
    python310
    python310Packages.pip
    python310Packages.virtualenv
  ]);
  runScript = "bash";
}).env
```

Then:

I will try to see if I can find anything.
Thanks a lot, let me know if you figure out the weirdness. For now, I have documented this thread in the model conversion wiki page.
Hello,
I wanted to try the bigger models, but these come in several .bin files. When I launch the converter Python script, it complains that it cannot find `pytorch_model.bin` (since there are 4 `pytorch_model-0000x-of-00004.bin` shards instead). Do I have to merge them using `cat *.bin > pytorch_model.bin` (ensuring there is no other .bin file in the directory, obviously)?