
Is wizard 2 working with llamacpp? #6691

Open
mirek190 opened this issue Apr 15, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@mirek190

Is WizardLM 2 working with llama.cpp?
On paper it looks insane... better than GPT-4, Mistral Large, Claude 3 Sonnet, etc.

@mirek190 mirek190 added the enhancement New feature or request label Apr 15, 2024
@sorasoras

It should work. They're a family of finetunes based on Mixtral and Mistral 7B.

@Jeximo

Jeximo commented Apr 15, 2024

Is WizardLM 2 working with llama.cpp?

Yes, it works. I think this is how to run it in main:

./main -m ~/WizardLM-2-7B-IQ4_XS.gguf -i --interactive-first --penalize-nl --temp 0 --in-prefix "USER: " --in-suffix "ASSISTANT: " -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s> USER: Who are you? ASSISTANT: I am WizardLM.</s>"

@21stcaveman

./main -m ~/WizardLM-2-7B-IQ4_XS.gguf -i --interactive-first --penalize-nl --temp 0 --in-prefix "USER: " --in-suffix "ASSISTANT: " -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s> USER: Who are you? ASSISTANT: I am WizardLM.</s>"

This only generates "##########" for me! (a bunch of pound signs)

@4onen

4onen commented Apr 16, 2024

./main -m ~/WizardLM-2-7B-IQ4_XS.gguf -i --interactive-first --penalize-nl --temp 0 --in-prefix "USER: " --in-suffix "ASSISTANT: " -p "A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s> USER: Who are you? ASSISTANT: I am WizardLM.</s>"

This only generates "##########" for me! (a bunch of pound signs)

Your prompt ends with an end-of-sequence token and provides no new sequence for the model to begin with. Your in-prefix and in-suffix also don't have the newline characters required by the Vicuna prompt format.

I ran a couple of queries using Mikupad and the llama.cpp server, and the imatrix quant I pulled from here worked just fine with proper Vicuna formatting.
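As a sketch of the Vicuna-style formatting discussed here (the system line is the one from the thread's command; the helper name and turns are illustrative, not part of llama.cpp):

```shell
# Build a Vicuna-formatted prompt: system line, then newline-separated
# "USER:" / "ASSISTANT:" turns. Completed assistant turns end with </s>.
SYSTEM="A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions."

build_vicuna_prompt() {
  # $1 = the new user message; the prompt is left open at "ASSISTANT:"
  # so the model has a fresh sequence to continue.
  printf '%s\nUSER: %s\nASSISTANT:' "$SYSTEM" "$1"
}

# Prints the assembled prompt, ending with an open "ASSISTANT:" turn.
build_vicuna_prompt "Who are you?"
```

With main, the equivalent would be passing the newlines through the prefix/suffix flags, e.g. --in-prefix $'\nUSER: ' --in-suffix $'\nASSISTANT: ' (bash $'...' quoting expands the \n).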

@Jeximo

Jeximo commented Apr 16, 2024

[Screenshot attached: Screenshot_20240416_161036]

@4onen Worked for me. Anyway, there are no newlines between the end of the input and the beginning of ASSISTANT in your link. It's consistent with the Vicuna prompt template, showing separation by a space and then a stop token.

This only generates "##########" for me! (a bunch of pound signs)

@21stcaveman I'm not sure what caused the pound symbols. main should print the prompt, then --interactive-first will take your input.

@4onen

4onen commented Apr 16, 2024

My mistake on the prompt formatting. Visually it looks like interactive mode is adding newlines there, but I don't have much experience with local interactive mode, because I want to be able to go back and edit both my inputs and the AI output, which, last time I checked, interactive mode didn't support.

@21stcaveman

@21stcaveman I'm not sure what caused pound symbols. main should print the prompt, then --interactive-first will take your input.

Can confirm the original example by @Jeximo works. The problem seems to have been the GGUF copy I had; downloading from the link provided by @4onen fixed the issue.
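A corrupted download is one common cause of pure-garbage output. A quick check is comparing the file's SHA-256 against the hash published on the model page. A minimal sketch (the temp file stands in for the real .gguf path, and the expected hash would come from the hosting page):

```shell
# Stand-in file for the downloaded model (a real check would point at
# the .gguf path, e.g. ~/WizardLM-2-7B-IQ4_XS.gguf).
f="$(mktemp)"
printf 'model bytes' > "$f"

# Print the local checksum; compare it against the hash published
# alongside the quant before loading the model.
sha256sum "$f" | cut -d' ' -f1
```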
