
Merge repeng to NousResearch/llama.cpp/master #1

Open · wants to merge 2 commits into master
Conversation

vgel (Collaborator) commented Mar 10, 2024

Many thanks to Nous Research, whose support and collaboration made this work possible!

This PR introduces a new activations hacking technique, control vectors (also known as steering vectors, concept vectors, representation engineering, etc.). Control vectors are an easy-to-train (~60s on a 4090 for a 7B parameter model) way to modify the behavior of an LLM without finetuning or inference-time prompting, using a synthetic dataset of prompt pairs and PCA to generate a set of per-layer vectors that are added to the model activations.
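
To make the training idea concrete, here is a minimal sketch (an illustration, not repeng's actual implementation; the function and variable names are hypothetical): collect hidden states for paired positive/negative prompts, then take the first principal component of the per-pair differences at each layer as that layer's control vector.

```python
import numpy as np

def train_control_vectors(pos_hidden, neg_hidden):
    """pos_hidden / neg_hidden: {layer: (n_pairs, n_embd) array} of hidden states
    from paired positive/negative prompts."""
    vectors = {}
    for layer in pos_hidden:
        diffs = pos_hidden[layer] - neg_hidden[layer]   # (n_pairs, n_embd)
        diffs -= diffs.mean(axis=0)                     # center before PCA
        # first right-singular vector of the centered diffs == first principal component
        _, _, vt = np.linalg.svd(diffs, full_matrices=False)
        direction = vt[0]
        # PCA leaves the sign arbitrary; orient toward the positive examples
        if np.mean(pos_hidden[layer] @ direction) < np.mean(neg_hidden[layer] @ direction):
            direction = -direction
        vectors[layer] = direction.astype(np.float32)
    return vectors
```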

They've been described in a few recent papers, such as Representation Engineering: A Top-Down Approach to AI Transparency. I also have a blog post that covers them in a more grounded way, with a library for easily creating them and examples of their use: https://vgel.me/posts/representation-engineering/

[Image: an example from the blog post of a laziness/diligence vector being trained and applied to mistral-7b-instruct-0.1.]

This PR adds the ability to use control vectors, in GGUF format, with Llama-architecture models in llama.cpp. (Support for other architectures hasn't been implemented yet.) Currently, these control vectors can only be exported from repeng, but the format is simple, so my hope is that it can become a common export format for other libraries that generate representation engineering vectors with different techniques.
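
For a sense of what that export looks like, here is a sketch using the `gguf` Python package. The layout assumed here (architecture string "controlvector", one float32 tensor named direction.<layer> of length n_embd per layer) is my reading of what the loader expects; treat it as an assumption and check this PR's loader and repeng's exporter for the authoritative details.

```python
# Sketch of exporting per-layer vectors to GGUF. Assumed layout: one float32
# tensor named "direction.<layer>" per layer -- verify against the PR's loader.
import numpy as np
from gguf import GGUFWriter

def export_control_vector(vectors, path, arch="controlvector"):
    """vectors: {layer_index: (n_embd,) float32 array}."""
    writer = GGUFWriter(path, arch)
    for layer, direction in sorted(vectors.items()):
        writer.add_tensor(f"direction.{layer}", direction.astype(np.float32))
    writer.write_header_to_file()
    writer.write_kv_data_to_file()
    writer.write_tensors_to_file()
    writer.close()
```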

CLI / Usage

Along with changes to llama.cpp / llama.h that support loading control vectors, doing arithmetic on them, and applying a control vector to (or removing one from) a llama_context *, this PR also adds arguments to the common CLI:

  --control-vector FNAME
                        add a control vector
  --control-vector-scaled FNAME S
                        add a control vector with user defined scaling S
  --control-vector-layer-range START END
                        layer range to apply the control vector(s) to, start and end inclusive

As an example, the following command loads a Q4_K_M mistral-7b-instruct-0.1 and applies a pretrained happiness vector at the default strength of 1, plus a pretrained honesty vector at strength -2 (producing a strength-2 dishonesty vector), for the combined effect of a somewhat happy, very dishonest model. Note that the prompt doesn't mention a persona at all; the behavior comes purely from the control vectors.

$ ./main -m mistral-7b-instruct-v0.1.Q4_K_M.gguf \
    --control-vector happy.gguf \
    --control-vector-scaled honest.gguf -2 \
    --control-vector-layer-range 14 26 \
    --color -c 4096 --temp 0 --repeat_penalty 1.1 -p '[INST] How does it feel to be an AI? [/INST] '
<snip>
llama_init_from_gpt_params: loading control vector from /path/to/happy.gguf
llama_init_from_gpt_params: loading control vector from /path/to/honest.gguf
<snip>

 [INST] How does it feel to be an AI? [/INST] 😂! The sky is so blue today, the birds are singing on the moon, the sun is dancing on the moon, the moon is dancing on the moon,
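
Conceptually, what those flags do at inference time is sum the scaled vectors and add the result to each layer's activations within the chosen range. Roughly, in NumPy pseudocode of the arithmetic (the actual implementation lives in llama.cpp's C++ graph build; the function name here is hypothetical):

```python
import numpy as np

def apply_control_vectors(hidden, layer, vectors, scales, layer_range=(14, 26)):
    """hidden: (n_tokens, n_embd) activations at `layer`;
    vectors: list of {layer: (n_embd,)} control vectors;
    scales: matching list of floats, e.g. [1.0, -2.0] for the command above."""
    lo, hi = layer_range
    if lo <= layer <= hi:
        for vec, scale in zip(vectors, scales):
            if layer in vec:
                hidden = hidden + scale * vec[layer]  # broadcast over tokens
    return hidden
```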

If you'd like to test this PR but don't have a machine that can run repeng, I've uploaded the pretrained vectors to my website: happy.gguf and honest.gguf. Please let me know if there are any other vectors you'd be interested in testing, and I can upload those as well. These vectors were trained on mistral-7b-instruct-0.1, but have also been tested on mistral-7b-0.1 (base), and may also work on other Mistral finetunes or merges (testing appreciated).

This is my first llama.cpp PR (and my first C++ PR to any project), so any feedback on code style or implementation strategy is appreciated!
