
Update quantize command to properly suffix input file #9

Conversation

@rizenfrmtheashes
Contributor

I believe the original code repeatedly quantizes the first file instead of every input file for models that have more than one quantize input. This update should fix that (a sketch of the idea is below).
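A minimal sketch of the idea (not the exact diff from this PR), assuming llama.cpp's shard naming, where the f16 weights for larger models are split across `ggml-model-f16.bin`, `ggml-model-f16.bin.1`, `ggml-model-f16.bin.2`, ... and each part needs its own quantize invocation with a matching output suffix:

```js
const fs = require("fs");
const path = require("path");
const { execSync } = require("child_process");

function quantizeAll(modelDir) {
  const base = "ggml-model-f16.bin";
  // Pick up every shard: "ggml-model-f16.bin", "ggml-model-f16.bin.1", ...
  const parts = fs.readdirSync(modelDir).filter((f) => f.startsWith(base));
  for (const part of parts) {
    // "" for the first file, ".1", ".2", ... for the rest
    const suffix = part.slice(base.length);
    const input = path.join(modelDir, part);
    const output = path.join(modelDir, "ggml-model-q4_0.bin" + suffix);
    // llama.cpp's quantize CLI: <input> <output> <type>, where 2 = q4_0
    execSync(`./quantize ${input} ${output} 2`, { stdio: "inherit" });
  }
}
```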

@fermigas

This is one of the reasons that only the 7B model runs.

Another reason may be line 124 in dalai/index.js:

```js
model: `./models/${req.model || "7B"}/ggml-model-q4_0.bin`
```

It doesn't seem to be aware that the larger models have more than one q4 file.

Also, in the form (index.ejs, 57-60), pressing autocomplete results in this request:

```js
socket.emit('request', {
  prompt: input.textContent,
  n_predict: 256
})
```

which does not specify a model, so the server defaults to 7B.
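A minimal sketch of what the form would need to send instead (the `modelSelect` element here is hypothetical; the form has no model picker yet):

```js
// Hypothetical: a model <select> in index.ejs whose value rides along with
// the request, so the server's `req.model || "7B"` fallback only applies
// when nothing was chosen.
socket.emit('request', {
  model: modelSelect.value, // e.g. "13B"
  prompt: input.textContent,
  n_predict: 256
})
```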

@rizenfrmtheashes
Contributor Author

rizenfrmtheashes commented Mar 13, 2023

> Another reason may be line 124 in dalai/index.js:
>
> ```js
> model: `./models/${req.model || "7B"}/ggml-model-q4_0.bin`
> ```
>
> It doesn't seem to be aware that the larger models have more than one q4 file.

Actually, this is fine. The underlying llama.cpp package only requires referencing the first model weights file, even when there are multiple, so this should be okay. This is based on my manual testing of the underlying package, not an end-to-end test.
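For illustration, the invocation stays single-file even for sharded models (a sketch, not dalai's exact exec code; my understanding is that llama.cpp locates the extra shards on its own):

```js
// llama.cpp is handed only the first shard; it works out how many parts the
// model has and opens ggml-model-q4_0.bin.1, .2, ... by itself.
const cmd = `./main -m ./models/${req.model || "7B"}/ggml-model-q4_0.bin -p "${prompt}" -n 256`;
```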

As for your latter comment, yeah, with the quickstart setup there is no way to use anything other than the 7B model. I ended up manually modifying the source to work around this. I'm not confident enough in my JS skills to make the model a selectable option in the web UI and propagate that choice through the server request down to the underlying cpp package call.

@cocktailpeanut cocktailpeanut merged commit 8093d76 into cocktailpeanut:main Mar 13, 2023
@marcuswestin
Contributor

marcuswestin commented Mar 13, 2023

@rizenfrmtheashes I implemented that web UI in #16; would love a code review if you're up for it.

And I just verified that the requested model gets propagated all the way down to the model execution call.

@fermigas

@marcuswestin a lot of great usability fixes/adds in those commits! LGTM
