
[Feature]: Control vectors #3451

Open
generalsvr opened this issue Mar 17, 2024 · 6 comments
@generalsvr

🚀 The feature, motivation and pitch

Add support for control vectors

See https://github.com/vgel/repeng and ggerganov/llama.cpp#5970
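The core idea behind control vectors (as used in repeng and the llama.cpp PR above) is simple: add a scaled direction vector to the hidden states at one or more layers during the forward pass. A minimal sketch, using numpy as a stand-in for the model's actual tensor library; the function name and shapes are illustrative, not vLLM API:

```python
import numpy as np

def apply_control_vector(hidden_states: np.ndarray,
                         control_vector: np.ndarray,
                         coeff: float) -> np.ndarray:
    """Add a scaled control vector to every token's hidden state.

    hidden_states:  (seq_len, hidden_dim) activations at one layer.
    control_vector: (hidden_dim,) steering direction, e.g. extracted
                    with repeng from contrastive prompt pairs.
    coeff:          steering strength; positive pushes the activations
                    toward the concept, negative pushes away from it.
    """
    # Broadcasting adds the same (hidden_dim,) vector to each token row.
    return hidden_states + coeff * control_vector

# Toy example: 3 tokens, hidden dim 4.
h = np.zeros((3, 4))
v = np.array([1.0, 0.0, -1.0, 0.0])
steered = apply_control_vector(h, v, coeff=2.0)
```

In a real integration this addition would run once per selected layer per forward pass, so the per-token overhead is a single vector add.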

Alternatives

No response

Additional context

No response

@justinphan3110

justinphan3110 commented Apr 13, 2024

@simon-mo @generalsvr I should be able to help with this. Let me know how to start.

For more context about control vectors: Representation Engineering: A Top-Down Approach to AI Transparency

@Kaiyang-Chen

We can achieve this by loading the control vectors when initializing the cache engine and applying the change in forward() of the specified QKVLinear layers, but such changes would have to be added for all models and all kinds of linear methods, which introduces extra complexity to the codebase. Do you have any hints on how we can abstract this logic and make the integration clean? @simon-mo

@sapountzis

Something additional to consider is specifying different control vectors (and coefficients) per request which then get stacked into a control matrix with one dimension equal to the batch size.

This can be useful when serving users that require different styles of responses at the same time.

Not sure about the impact on latency.
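The per-request idea above can be sketched as follows: stack each request's (possibly different) control vector, pre-scaled by its coefficient, into a (batch, hidden_dim) matrix, then broadcast it across the sequence dimension. This is a hedged illustration in numpy, not vLLM code; the function names are made up:

```python
import numpy as np

def build_control_matrix(vectors, coeffs):
    """Stack per-request control vectors into one control matrix.

    vectors: list of (hidden_dim,) arrays, one per request in the batch.
    coeffs:  list of floats, one steering strength per request.
    Returns a (batch, hidden_dim) matrix.
    """
    return np.stack([c * v for v, c in zip(vectors, coeffs)])

def apply_per_request(hidden_states, control_matrix):
    """Apply each request's control vector to its own rows.

    hidden_states:  (batch, seq_len, hidden_dim)
    control_matrix: (batch, hidden_dim)
    """
    # Insert a length-1 sequence axis so each request's vector
    # broadcasts over all of that request's tokens.
    return hidden_states + control_matrix[:, None, :]

# Two requests with different directions and strengths, hidden dim 2.
vecs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
coeffs = [0.5, -1.0]
cm = build_control_matrix(vecs, coeffs)   # shape (2, 2)
h = np.zeros((2, 3, 2))                   # (batch, seq_len, hidden_dim)
out = apply_per_request(h, cm)
```

Latency impact should be small since this is a single broadcast add per steered layer, though continuous batching would complicate which row maps to which request.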

@raywanb

raywanb commented Apr 25, 2024

currently working on an implementation by wrapping the decoder layer and changing the forward pass. lmk if you wanna collaborate on this
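The wrapping approach described here can be sketched roughly like this: a thin wrapper that delegates to the original decoder layer and adds the (pre-scaled) control vector to its output. Class and method names below are hypothetical, and DummyLayer stands in for a real decoder layer:

```python
import numpy as np

class ControlledDecoderLayer:
    """Wraps a decoder layer and steers its output with a control vector."""

    def __init__(self, layer, control_vector, coeff=1.0):
        self.layer = layer
        # Pre-scale once so the forward pass is a single add.
        self._delta = coeff * control_vector

    def forward(self, hidden_states):
        # Run the wrapped layer unchanged, then shift its output
        # along the control direction.
        return self.layer.forward(hidden_states) + self._delta

class DummyLayer:
    """Stand-in for a real decoder layer: just doubles its input."""
    def forward(self, hidden_states):
        return hidden_states * 2.0

wrapped = ControlledDecoderLayer(DummyLayer(),
                                 np.array([1.0, -1.0]),
                                 coeff=0.5)
out = wrapped.forward(np.ones((2, 2)))
```

One appeal of wrapping over editing each linear method is that it keeps the steering logic in one place, addressing the abstraction concern raised above.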

@DreamGenX

DreamGenX commented Apr 28, 2024

@raywanb something worth looking into would also be the technique presented here, which might be superior in some regards:

https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction

It comes with a nice colab as well: https://colab.research.google.com/drive/1a-aQvKC9avdZpdyBn4jgRQFObTPy1JZw?usp=sharing&authuser=1

There's a discussion in the comments with the authors of the Representation Engineering paper.

@heraclex12

> @raywanb something worth looking into would also be the technique presented here, which might be superior in some regards:
>
> https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction
>
> It comes with a nice colab as well: https://colab.research.google.com/drive/1a-aQvKC9avdZpdyBn4jgRQFObTPy1JZw?usp=sharing&authuser=1
>
> There's a discussion in the comments with the authors of the Representation Engineering paper.

It seems that the colab link doesn't work.
