
Is wizard 2 working with llamacpp? #6691

Open
mirek190 opened this issue Apr 15, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@mirek190

Is WizardLM 2 working with llama.cpp?
On paper it looks insane... better than GPT-4, Mistral Large, Claude 3 Sonnet, etc.

@mirek190 mirek190 added the enhancement New feature or request label Apr 15, 2024
@sorasoras

It should work. They're a family of finetunes based on Mixtral and Mistral 7B.

@Jeximo

Jeximo commented Apr 15, 2024

Is WizardLM 2 working with llama.cpp?

Yes, it works. I think this is how to run it in main:

./main -m ~/WizardLM-2-7B-IQ4_XS.gguf -i --interactive-first --penalize-nl --temp 0 --in-prefix "USER: " --in-suffix "ASSISTANT: " -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s> USER: Who are you? ASSISTANT: I am WizardLM.</s>"

@21stcaveman

./main -m ~/WizardLM-2-7B-IQ4_XS.gguf -i --interactive-first --penalize-nl --temp 0 --in-prefix "USER: " --in-suffix "ASSISTANT: " -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s> USER: Who are you? ASSISTANT: I am WizardLM.</s>"

This only generates "##########" for me! (a bunch of pound signs)

@4onen

4onen commented Apr 16, 2024

./main -m ~/WizardLM-2-7B-IQ4_XS.gguf -i --interactive-first --penalize-nl --temp 0 --in-prefix "USER: " --in-suffix "ASSISTANT: " -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s> USER: Who are you? ASSISTANT: I am WizardLM.</s>"

This only generates "##########" for me! (a bunch of pound signs)

Your prompt ends with an end-of-sequence token and provides no new sequence for the model to begin with. Your in-prefix and in-suffix also don't have the newline characters required by the Vicuna prompt format.

I ran a couple of queries using Mikupad and the llama.cpp server, and the imatrix quant I pulled from here worked just fine with proper Vicuna formatting.
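As a sketch of the Vicuna-style formatting discussed here (the system line is the one from the thread's command; the helper name and turns are illustrative, not part of llama.cpp):

```shell
# Build a Vicuna-formatted prompt: system line, then newline-separated
# "USER:" / "ASSISTANT:" turns. Completed assistant turns end with </s>.
SYSTEM="A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."

build_vicuna_prompt() {
  # $1 = the new user message; the prompt is left open at "ASSISTANT:"
  # so the model has a fresh sequence to continue.
  printf '%s\nUSER: %s\nASSISTANT:' "$SYSTEM" "$1"
}

# Prints the assembled prompt, ending with an open "ASSISTANT:" turn.
build_vicuna_prompt "Who are you?"
```

With main, the equivalent would be passing the newlines through the prefix/suffix flags, e.g. --in-prefix $'\nUSER: ' --in-suffix $'\nASSISTANT: ' (bash $'...' quoting expands the \n).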

@Jeximo

Jeximo commented Apr 16, 2024

[Screenshot attached: Screenshot_20240416_161036]

@4onen Worked for me. Anyway, there are no newlines between the end of the input and the beginning of ASSISTANT in your link. It's consistent with the Vicuna prompt template, showing separation by a space and then a stop token.

This only generates "##########" for me! (a bunch of pound signs)

@21stcaveman I'm not sure what caused the pound symbols. main should print the prompt, then --interactive-first will take your input.

@4onen

4onen commented Apr 16, 2024

My mistake on the prompt formatting. Visually it looks like interactive mode is adding newlines there, but I don't have much experience with local interactive mode, because I want to be able to go back and edit both my inputs and the AI output, which, last time I checked, interactive mode didn't support.

@21stcaveman

@21stcaveman I'm not sure what caused pound symbols. main should print the prompt, then --interactive-first will take your input.

Can confirm the original example by @Jeximo works. The problem seems to have been the GGUF copy I had; downloading from the link provided by @4onen fixed the issue.
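A corrupted download is one common cause of pure-garbage output. A quick check is comparing the file's SHA-256 against the hash published on the model page. A minimal sketch (the temp file stands in for the real .gguf path, and the expected hash would come from the hosting page):

```shell
# Stand-in file for the downloaded model (a real check would point at
# the .gguf path, e.g. ~/WizardLM-2-7B-IQ4_XS.gguf).
f="$(mktemp)"
printf 'model bytes' > "$f"

# Print the local checksum; compare it against the hash published
# alongside the quant before loading the model.
sha256sum "$f" | cut -d' ' -f1
```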
