Steering Vectors

Steering vectors / representation engineering for transformer language models in PyTorch / Hugging Face

Check out our example notebook.

Full docs: https://steering-vectors.github.io/steering-vectors

About

This library provides utilities for training and applying steering vectors to language models (LMs) from Hugging Face, like GPT, LLaMa, Gemma, Mistral, Pythia, and many more!
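To give a rough feel for what "training" and "applying" a steering vector can mean, here is a toy, dependency-free sketch. It is not this library's API: the function names `mean_difference_vector` and `apply_steering` are hypothetical, the activations are made-up numbers, and it assumes one common formulation (the difference between mean activations on positive and negative examples, added back to a hidden state with a multiplier).

```python
from statistics import fmean


def mean_difference_vector(positive_acts, negative_acts):
    """Hypothetical helper: per-dimension mean of the positive activations
    minus the per-dimension mean of the negative activations."""
    dims = range(len(positive_acts[0]))
    return [
        fmean(act[d] for act in positive_acts)
        - fmean(act[d] for act in negative_acts)
        for d in dims
    ]


def apply_steering(hidden_state, steering_vector, multiplier=1.0):
    """Hypothetical helper: add the scaled steering vector to a hidden state."""
    return [h + multiplier * v for h, v in zip(hidden_state, steering_vector)]


# Made-up 3-dimensional "activations" standing in for one model layer.
positive = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
negative = [[0.0, 2.0, 1.0], [0.0, 2.0, 1.0]]

vector = mean_difference_vector(positive, negative)
steered = apply_steering([0.5, 0.5, 0.5], vector, multiplier=2.0)

print(vector)   # [2.0, 0.0, -1.0]
print(steered)  # [4.5, 0.5, -1.5]
```

In a real model the activations would come from a chosen layer for contrasting prompts, and the vector would be added during the forward pass; see the full docs for how this library actually does it.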

For more info on steering vectors and representation engineering, check out the following work:

Installation

pip install steering-vectors

Check out the full documentation for more usage info.

Contributing

Any contributions to improve this project are welcome! Please open an issue or pull request in this repo with any bugfixes / changes / improvements you have.

This project uses Ruff for code formatting and linting, MyPy for type checking, and Pytest for tests. Make sure any changes you submit pass these code checks in your PR. If you have trouble getting these to run, feel free to open a pull request regardless and we can discuss further in the PR.

License

This code is released under an MIT license.