[Feature]: Control vectors #3451
Comments
@simon-mo @generalsvr I should be able to help with this. Let me know how to start. For more context about control vectors, see: Representation Engineering: A Top-Down Approach to AI Transparency
We can achieve this by loading the control vectors when initializing the cache engine and apply the change to |
Something additional to consider is specifying different control vectors (and coefficients) per request, which then get stacked into a control matrix with one dimension equal to the batch size. This can be useful when serving users who need different styles of responses at the same time. Not sure about the impact on latency.
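To make the per-request idea concrete, here is a minimal sketch of the stacking step. It is not vLLM code: `build_control_matrix` and `apply_control` are hypothetical names, plain Python lists stand in for tensors, and it assumes one hidden state per request (as in a single decode step). Requests without a control vector get an all-zero row so they pass through unchanged.

```python
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def build_control_matrix(
    requests: Sequence[Optional[Tuple[Vector, float]]],
    hidden_size: int,
) -> List[Vector]:
    """Stack per-request (vector, coefficient) pairs into a [batch, hidden] matrix.

    Hypothetical helper: a request given as None contributes an all-zero row.
    """
    matrix: List[Vector] = []
    for req in requests:
        if req is None:
            matrix.append([0.0] * hidden_size)
            continue
        vector, coeff = req
        if len(vector) != hidden_size:
            raise ValueError("control vector size does not match hidden size")
        # Fold the coefficient into the row so applying it is a plain add.
        matrix.append([coeff * v for v in vector])
    return matrix


def apply_control(hidden: List[Vector], control: List[Vector]) -> List[Vector]:
    """Add row i of the control matrix to the hidden state of request i."""
    return [
        [h + c for h, c in zip(h_row, c_row)]
        for h_row, c_row in zip(hidden, control)
    ]
```

With real tensors the add would be a single batched operation, so the extra cost per layer is probably small, but that would need measuring.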
Currently working on an implementation that wraps the decoder layer and changes the forward pass. lmk if you wanna collaborate on this.
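For anyone following along, the wrapping approach can be sketched roughly like this. This is an illustration of the pattern only, not the actual implementation mentioned above: `ControlledLayer` is a hypothetical name, the wrapped layer is assumed to be a simple callable from hidden states to hidden states, and plain lists stand in for tensors. Real decoder layers take extra arguments and may return tuples, which this ignores.

```python
from typing import Callable, List, Optional

Hidden = List[List[float]]  # [num_tokens, hidden_size]


class ControlledLayer:
    """Wrap a layer and add a scaled control vector to its output.

    Hypothetical sketch: assumes layer(hidden) returns hidden states only.
    """

    def __init__(self, layer: Callable[[Hidden], Hidden]) -> None:
        self.layer = layer
        self.control: Optional[List[float]] = None
        self.coeff = 1.0

    def set_control(self, vector: List[float], coeff: float = 1.0) -> None:
        self.control = vector
        self.coeff = coeff

    def __call__(self, hidden: Hidden) -> Hidden:
        out = self.layer(hidden)
        if self.control is None:
            return out  # no control vector set: behave like the original layer
        # Add coeff * control to the hidden state of every token.
        return [
            [x + self.coeff * c for x, c in zip(row, self.control)]
            for row in out
        ]
```

Keeping the vector on the wrapper means the underlying layer stays untouched and control can be switched off by setting `control` back to `None`.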
@raywanb something also worth looking into is the technique presented here, which might be superior in some regards: https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction It comes with a nice colab as well: https://colab.research.google.com/drive/1a-aQvKC9avdZpdyBn4jgRQFObTPy1JZw?usp=sharing&authuser=1 There's a discussion in the comments with the authors of the Representation Engineering paper.
It seems that the colab link doesn't work.
🚀 The feature, motivation and pitch
Add support for control vectors
See https://github.com/vgel/repeng and ggerganov/llama.cpp#5970
Alternatives
No response
Additional context
No response