llm_steer-oobabooga

Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors, now integrated within the oobabooga text generation webui!

llm_steer, the underlying codebase utilized for this extension, was created by https://github.com/Mihaiii

Note: This extension only works for models loaded using the "transformers" backend.

Installation

pip3 install llm_steer (Make sure pip3 is the pip used by oobabooga. For me it's the pip3 located at /home/(user)/text-generation-webui/installer_files/env/bin/pip3. Otherwise oobabooga won't pick up the installed package.)
Run oobabooga and navigate to the Session page. Copy and paste the GitHub URL (https://github.com/Hellisotherpeople/llm_steer-oobabooga) into the install box and press Enter.
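
To check that the package landed in the environment oobabooga actually uses, a quick sanity check can be run with that environment's Python interpreter. This is a minimal sketch using only the standard library; the helper name `is_importable` is made up for illustration, and it assumes the import name `llm_steer` matches the pip package name.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if this interpreter can locate the given top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # sys.executable shows which Python (and therefore which site-packages) is in use.
    print("Interpreter:", sys.executable)
    # Assumption: the package installs a module named "llm_steer".
    print("llm_steer importable:", is_importable("llm_steer"))
```

If the interpreter path printed is not the one inside the oobabooga install directory, the package was probably installed with a different pip.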

Usage

There are three values:

Layer Index (int): Which layer should the steering vector be inserted into?

This is not well understood, but in general, the earlier layers are supposedly more "general" and potentially more "impactful". Results will vary.

Mistral models usually have at least 24 layers.

Coefficient (float): The intensity of the vector. Gives fully granular control over the impact of the vector. Can be negative.

Steering Text (string): The prompt used for creating the vector.
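
One way to find out how many layers a model has is to read `num_hidden_layers` from the `config.json` that Hugging Face transformers models ship with. The sketch below parses the text of such a file. The helper names and the sample config are illustrative only, and the assumption that valid indices run from 0 to `num_hidden_layers - 1` should be checked against llm_steer itself.

```python
import json


def layer_count(config_text: str) -> int:
    """Read the number of transformer layers from the text of a config.json."""
    config = json.loads(config_text)
    return int(config["num_hidden_layers"])


def is_valid_layer_index(index: int, n_layers: int) -> bool:
    """Assumption: layer indices are zero-based, so the last one is n_layers - 1."""
    return 0 <= index < n_layers


# Illustrative sample, not copied from any particular checkpoint.
sample_config = '{"model_type": "mistral", "num_hidden_layers": 32}'

n_layers = layer_count(sample_config)
print("layers:", n_layers)
print("index 20 valid:", is_valid_layer_index(20, n_layers))
print("index 32 valid:", is_valid_layer_index(32, n_layers))
```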

Set these values and click "Add Steering Vector". Any combination of steering vectors can be used at the same time.

To reset and delete all Steering Vectors, click "Reset Steering Vectors".

To view the currently applied Steering Vectors, click "Get Steering Vectors".
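
The three buttons can be thought of as add, list, and clear operations on a collection of (layer, coefficient, text) entries. The following toy sketch models only that bookkeeping. `SteeringVector` and `SteeringRegistry` are hypothetical names, not the extension's or llm_steer's actual classes, and no model is touched.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SteeringVector:
    layer_idx: int   # which layer the vector is inserted into
    coeff: float     # intensity; may be negative
    text: str        # prompt used to create the vector


class SteeringRegistry:
    """Hypothetical stand-in for the state behind the three buttons."""

    def __init__(self) -> None:
        self._vectors: List[SteeringVector] = []

    def add(self, layer_idx: int, coeff: float, text: str) -> None:
        # "Add Steering Vector": entries accumulate, so several can be active at once.
        self._vectors.append(SteeringVector(layer_idx, coeff, text))

    def get_all(self) -> List[SteeringVector]:
        # "Get Steering Vectors": return a copy so callers cannot mutate the state.
        return list(self._vectors)

    def reset_all(self) -> None:
        # "Reset Steering Vectors": delete every entry.
        self._vectors.clear()


registry = SteeringRegistry()
registry.add(layer_idx=20, coeff=0.4, text="sad")
registry.add(layer_idx=10, coeff=-0.2, text="tax preparation")
print(registry.get_all())
```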

Why is this a big deal?

Several reasons!

You don't consume any tokens this way, leaving the remaining system prompt tokens to have a stronger impact
You can dial the particular intensity/attention of a token up or down, and apply it to any layer or combination of layers that you'd like
Supports negative coefficient values, which effectively gives faster "negative prompting" than the existing classifier-free guidance built into oobabooga.
Makes it pretty easy to implement personalization, or alignment/unalignment.
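
The effect of the coefficient's sign can be illustrated with a toy example: adding a scaled direction to a layer's activations moves them toward that direction when the coefficient is positive and away from it when negative. This is a sketch of the general idea of activation addition with made-up numbers; in the extension the direction comes from the steering text, and the real computation inside llm_steer may differ.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Dot product, used here to measure alignment with the steering direction."""
    return sum(x * y for x, y in zip(a, b))


def steer(hidden: List[float], direction: List[float], coeff: float) -> List[float]:
    """Add the steering direction, scaled by coeff, to a hidden state."""
    return [h + coeff * d for h, d in zip(hidden, direction)]


# Made-up 3-dimensional hidden state and steering direction.
hidden = [0.5, -1.0, 2.0]
direction = [1.0, 0.0, 0.0]

toward = steer(hidden, direction, 0.4)   # positive coefficient
away = steer(hidden, direction, -0.4)    # negative coefficient

print("baseline alignment:", dot(hidden, direction))
print("positive coeff:    ", dot(toward, direction))
print("negative coeff:    ", dot(away, direction))
```

No prompt tokens are involved at any point; only the activations change.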

Further Background on Steering Vectors:

Related ideas/inspiration:

Screenshots

(No vector)

(Add Sad Vector)

(Add Tax Preparation Vector)
