Tuning the ethical guidelines of ExLlamaV2 #335
Comments
If I'm understanding correctly, you're suggesting that by quantizing the model to exl2... it somehow became censored? And you want that?
You keep asking the model to be ethical and you beg, I think that works because it's AGI
FEEL THE AGI
You should probably look into LoRAs or actually finetuning the model. Quantization is not a way to do model tuning. Consider reading up on how to finetune or train a LoRA for an LLM. Also, in what context do you want to add ethical constraints? Corporate?
ExLlamaV2 doesn't do anything to make inference more or less ethical, it just runs the model. Quantization introduces some level of inaccuracy, which means the response from a quantized model is never going to be exactly the same as the original model's for a given prompt. Likewise, sampling options will affect the output in various unpredictable ways.

If you're running the chat example in llama mode, you can try adjusting the system prompt. The default prompt is the one originally provided by Meta, and it's extremely "aligned", to the point of being ridiculous. Try it with something like `-sp "Just answer the questions."` instead, or with a blank string.

As for tuning alignment in general, that's a whole science. LoRAs are an option, or you can pick from any of the thousands of compatible models on HF finetuned for various purposes.
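As a rough sketch of the suggestion above (the model directory path here is hypothetical, and flag spellings should be checked against the version of the chat example you are running), overriding the default system prompt might look like:

```shell
# Run the exllamav2 chat example in llama prompt mode with a custom
# system prompt instead of Meta's heavily-aligned default.
# /models/llama2-13b-exl2 is a placeholder for your local exl2 model dir.
python examples/chat.py -m /models/llama2-13b-exl2 -mode llama \
    -sp "Just answer the questions."

# Or try it with a blank system prompt:
python examples/chat.py -m /models/llama2-13b-exl2 -mode llama -sp ""
```

This only changes the instruction text prepended to the conversation; the model's finetuned alignment behavior itself is untouched.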
Thank you for the helpful reply!
I have used the raw FB Llama2 models in developing my application. When interacting with the model, I did not encounter any ethical constraints: as far as I could tell, I could ask any question and get an answer, which can be problematic for a user-facing application. However, after converting the Meta model to ExLlamaV2, I ask questions and am hitting: "As a responsible AI language model, I cannot fulfill that request..."
I want ethical constraints, but I want to tune them. How can I do this? Looking through the code, I do not see where this is being set.
Thank you!