@jooray
Contributor

@jooray jooray commented Mar 11, 2023

No description provided.

which can be scripted like this if you are lazy (for 65B model):

```bash
for i in models/65B/ggml-model-f16.bin*;do quantized=`echo "$i" | sed -e 's/f16/q4_0/'`; ./quantize "$i" "$quantized" 2 ;done
```

Contributor

@prusnak prusnak Mar 11, 2023

sed is not necessary; bash, zsh, and other modern shells can perform pattern replacement on a variable:

Suggested change
-for i in models/65B/ggml-model-f16.bin*;do quantized=`echo "$i" | sed -e 's/f16/q4_0/'`; ./quantize "$i" "$quantized" 2 ;done
+for i in models/65B/ggml-model-f16.bin* ; do ./quantize "$i" "${i/f16/q4_0}" 2 ;done


@s-and-witch s-and-witch Mar 12, 2023

This will generate paths such as 'models/65B/ggml-model-q4_0/.bin.2' and fail with errors. The right command (in bash) should be: `for i in models/65B/ggml-model-f16.bin* ; do ./quantize "$i" "${i/f16/q4_0}" 2 ;done`
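
For context, the reported path is what bash produces when the replacement part of the expansion ends in a slash. The sketch below assumes that was the problem (the earlier version of the suggestion is no longer visible) and uses an illustrative file name:

```bash
#!/usr/bin/env bash
# Illustrative path only; no file needs to exist for the expansion to work.
i="models/65B/ggml-model-f16.bin.2"

# ${var/pattern/replacement} replaces the first match of "f16" with "q4_0".
good="${i/f16/q4_0}"

# Assumed mistake: a trailing slash becomes part of the replacement ("q4_0/").
bad="${i/f16/q4_0/}"

echo "$good"   # models/65B/ggml-model-q4_0.bin.2
echo "$bad"    # models/65B/ggml-model-q4_0/.bin.2
```

Everything after the second `/` up to the closing brace is the replacement text, so a trailing slash ends up in the output path.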

Contributor

@prusnak prusnak Mar 12, 2023

@Player-205 right, updated the suggestion above, thanks

@ggerganov
Member

Let's put this in a quantize.sh script that accepts an argument like 7B, 13B, etc. and update the instructions to just run the script:

source quantize.sh 7B

That should be much easier to follow.
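
A rough sketch of what such a `quantize.sh` could look like, reusing the loop from this thread. The function name `quantize_model` and the `QUANTIZE_BIN` override are made up for illustration and are not part of the repository:

```bash
#!/usr/bin/env bash
# quantize.sh (sketch): quantize every f16 part of one model size to q4_0.
# Assumed layout, taken from the loop in this thread:
#   models/<size>/ggml-model-f16.bin*

quantize_model() {
    local size="$1"
    # QUANTIZE_BIN is a hypothetical override; it defaults to ./quantize.
    local bin="${QUANTIZE_BIN:-./quantize}"
    local i

    if [ -z "$size" ]; then
        echo "usage: quantize.sh <model size, e.g. 7B>" >&2
        return 1
    fi

    for i in "models/$size"/ggml-model-f16.bin*; do
        # Skip the unexpanded glob when no files match.
        [ -e "$i" ] || continue
        # Same call as the one-liner: input, output, type 2.
        "$bin" "$i" "${i/f16/q4_0}" 2 || return 1
    done
}

# Run only when an argument was given, so sourcing without one is harmless.
if [ "$#" -gt 0 ]; then
    quantize_model "$@"
fi
```

Wrapping the loop in a function keeps the loop variable out of the caller's shell when the file is sourced.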

@leszekhanusz

Note that if disk space is limited, it is still useful to quantize each file separately so that each intermediate file can be deleted in between.
In my case I added an rm command because I did not have enough disk space otherwise:

for i in models/65B/ggml-model-f16.bin* ; do ./quantize "$i" "${i/f16/q4_0}" 2 ; rm "$i"; done

@ggerganov
Member

Good point. The script should have a second parameter for "keep f16", which is on by default.
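
One way the "keep f16" parameter might look, as a sketch only. The function name, the `0`/`1` convention for the second argument, and the `QUANTIZE_BIN` override are assumptions, not an existing interface:

```bash
#!/usr/bin/env bash
# Sketch: quantize one model size, optionally deleting each f16 part afterwards.
# Hypothetical usage: quantize_model <size> [keep_f16]
#   keep_f16 defaults to 1 (keep); pass 0 to remove each f16 file once quantized.

quantize_model() {
    local size="$1"
    local keep_f16="${2:-1}"
    local bin="${QUANTIZE_BIN:-./quantize}"   # hypothetical override
    local i

    for i in "models/$size"/ggml-model-f16.bin*; do
        # Skip the unexpanded glob when no files match.
        [ -e "$i" ] || continue
        # Stop before deleting anything if quantization fails.
        "$bin" "$i" "${i/f16/q4_0}" 2 || return 1
        if [ "$keep_f16" = "0" ]; then
            rm -- "$i"
        fi
    done
}
```

Deleting each f16 part only after its quantized counterpart has been written keeps peak disk usage low without risking the source file when quantization fails.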

@prusnak
Contributor

prusnak commented Mar 13, 2023

Superseded by #92

@ggerganov ggerganov closed this Mar 13, 2023
SlyEcho pushed a commit to SlyEcho/llama.cpp that referenced this pull request Jun 11, 2023
jesusmb1995 pushed a commit to jesusmb1995/llama.cpp that referenced this pull request Sep 29, 2025
QVAC-5545: Use char instead uint8_t for streams