improvement(tools): optimize convert-pth-to-ggml #232

Closed
tpoisonooo wants to merge 3 commits into ggml-org:master from tpoisonooo:optimize-convert
Conversation

@tpoisonooo

This optimizes the convert tool with argparse:

$ python3 convert-pth-to-ggml.py -h
usage: convert-pth-to-ggml.py [-h] dir_model {f32,f16} out_dir

Convert ckpt models to ggml models. For example: python3 convert-pth-to-ggml.py ../llama-models/7B/ f32 models/llama-7B

positional arguments:
  dir_model   Directory path of the checkpoint model
  {f32,f16}   Data type of the converted tensor, f32 or f16
  out_dir     Directory path for storing ggml model

options:
  -h, --help  show this help message and exit

Tested on 7B/30B models, it works well.

$ tree models/
models/
├── 7B
├── llama-30B
│   ├── ggml-model-f16.bin
│   ├── ggml-model-f16.bin.1
│   ├── ggml-model-f16.bin.2
│   └── ggml-model-f16.bin.3
└── llama-7B
    ├── ggml-model.bin -> ggml-model-f16.bin
    ├── ggml-model-f16.bin
    └── ggml-model-f32.bin

@tpoisonooo
Author

cc @ggerganov

@tpoisonooo tpoisonooo changed the title from "Optimize convert" to "improvement(tools): optimize convert-pth-to-ggml" on Mar 17, 2023
@gjmulder gjmulder added the enhancement New feature or request label Mar 17, 2023
@tpoisonooo
Author

Conflict fixed and tested on 7B/30B.

The diff against master is only these three lines:

if os.path.exists(fname_out):
    print(f"Skip conversion, it already exists: {fname_out}")
    sys.exit(0)

cc @gjmulder
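
The three-line guard above can be sketched as a self-contained snippet (the surrounding function and the way `fname_out` is computed are assumptions for illustration):

```python
# Self-contained sketch of the skip-if-exists guard quoted above.
# How fname_out is derived in the real script is not shown here.
import os
import sys

def maybe_skip(fname_out: str) -> None:
    # Exit early instead of re-converting an existing ggml model file.
    if os.path.exists(fname_out):
        print(f"Skip conversion, it already exists: {fname_out}")
        sys.exit(0)
```

Since `sys.exit(0)` raises `SystemExit`, the conversion simply stops with a success status when the output file is already present.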

import numpy as np
import torch
import argparse
import os
Copy link
Contributor

os is already imported

@sw
Contributor

sw commented Mar 18, 2023

Can you please update the README? The choice "1" for ftype will no longer work.

@sw
Contributor

sw commented Mar 18, 2023

This is almost a duplicate of #109

@ggerganov
Member

Decided to go with #109

@ggerganov ggerganov closed this Mar 19, 2023
SamuelOliveirads pushed a commit to SamuelOliveirads/llama.cpp that referenced this pull request Dec 29, 2025
…gml-org#232)

* Give the user the option to override where model weights are stored

* Fix ggml_nbytes() problem and cleanup

For a tensor with zero elements ggml_nbytes() was returning
uint64_t::max, and this was causing graph allocation failure.

* Add timing info to CUDA graph evaluation

* Add more timing info

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>