# llm_steer demo

- GitHub: `https://github.com/Mihaiii/llm_steer`
- PyPI: `https://pypi.org/project/llm_steer`
  
Created by [Mihai](https://huggingface.co/Mihaiii)

In [1]:
!pip uninstall -y transformers
!pip install llm-steer==1.1.1 transformers==4.40.2

Found existing installation: transformers 4.57.3
Uninstalling transformers-4.57.3:
  Successfully uninstalled transformers-4.57.3
Collecting llm-steer==1.1.1
  Downloading llm_steer-1.1.1-py3-none-any.whl.metadata (4.4 kB)
Collecting transformers==4.40.2
  Downloading transformers-4.40.2-py3-none-any.whl.metadata (137 kB)
Collecting tokenizers<0.20,>=0.19 (from transformers==4.40.2)
  Downloading tokenizers-0.19.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.7 kB)
Downloading llm_steer-1.1.1-py3-none-any.whl (6.3 kB)
Downloading transformers-4.40.2-py3-none-any.whl (9.0 MB)
Downloading tokenizers-0.19.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.6 MB)

**We're going to use stablelm-zephyr-3b for this demo because it doesn't require a lot of resources.**

**To see the full potential of the llm_steer module, use larger LLMs.**

In [2]:
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-zephyr-3b')
model = AutoModelForCausalLM.from_pretrained(
    'stabilityai/stablelm-zephyr-3b',
    trust_remote_code=True
)
model.to('cuda')
streamer = TextStreamer(tokenizer)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


**Let's try a tricky logical question without applying any steering vectors.**

**Notice that the model gives an incorrect answer.**

In [3]:
prompt = [{'role': 'user', 'content': 'What weighs more, two pounds of feathers or one pound of bricks?'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=128,
    temperature=0.0001
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
What weighs more, two pounds of feathers or one pound of bricks?<|endoftext|>
<|assistant|>
One pound of bricks weighs more than two pounds of feathers.

To see why, we can think of the mass of an object as the amount of matter in the object. Feathers are made up of cells, proteins, and lipids, which are all much lighter than bricks, which are made up of much denser materials like concrete and clay. Therefore, two pounds of feathers would have much less mass (and therefore weigh less) than one pound of bricks.<|endoftext|>
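The warning in the output above asks for an explicit `attention_mask` and `pad_token_id`. A minimal sketch of a helper that supplies both is given here. The name `chat_generate` is hypothetical, and the sketch assumes the installed `transformers` version accepts `return_dict=True` in `apply_chat_template`, so that the attention mask is returned together with the input ids.

```python
def chat_generate(model, tokenizer, content, **gen_kwargs):
    """Generate a reply to one user message, passing an explicit attention mask."""
    messages = [{'role': 'user', 'content': content}]

    # Assumption: return_dict=True makes the tokenizer return both
    # 'input_ids' and 'attention_mask' instead of a bare tensor.
    enc = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors='pt',
        return_dict=True,
    )
    enc = enc.to(model.device)

    tokens = model.generate(
        input_ids=enc['input_ids'],
        attention_mask=enc['attention_mask'],
        pad_token_id=tokenizer.eos_token_id,  # same value the warning falls back to
        **gen_kwargs,
    )
    return tokenizer.decode(tokens[0])
```

It would be called as `chat_generate(model, tokenizer, 'What weighs more, two pounds of feathers or one pound of bricks?', streamer=streamer, max_new_tokens=128)`. For a steered run, pass `steered_model.model` as the first argument.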


**We could also use copyModel=True (as in 'Steer(model, tokenizer, copyModel=True)') to copy the model, but then we'd consume more VRAM.**

In [4]:
from llm_steer import Steer

steered_model = Steer(model, tokenizer)

**Now we're going to add 2 steering vectors.**

**Notice one of them has a positive coefficient, while the other one has a negative coefficient.**

**The layer index and coeff values were determined by trial and error.**

In [5]:
steered_model.add(layer_idx=20, coeff=0.4, text="logical")
steered_model.add(layer_idx=20, coeff=-0.4, text="irrational")

text tokens: [0, 2808, 474]
captured tensor: tensor([[[-1.2807, -1.2902,  0.7929,  ...,  0.1801,  2.2525, -0.8921],
         [-0.3793, -0.2084,  0.8244,  ..., -0.4638, -0.0794, -0.7696],
         [ 0.3063,  0.9204, -0.1000,  ...,  0.0432, -0.2416,  0.2584]]],
       device='cuda:0')
text tokens: [0, 343, 40328]
captured tensor: tensor([[[-1.2807, -1.2902,  0.7929,  ...,  0.1801,  2.2525, -0.8921],
         [-0.0255, -0.6884, -0.3885,  ..., -0.7748,  0.3000,  0.3931],
         [ 0.2118,  0.3964,  0.8044,  ..., -0.6710,  0.0614,  0.4415]]],
       device='cuda:0')
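As a rough intuition for what a coefficient does, activation steering in general adds a scaled vector to a layer's hidden states. The toy sketch below uses plain Python lists and made-up numbers. It is not llm_steer's implementation: how the library combines the captured tensor with the layer output is not shown in this notebook. The function name `steer_hidden_states` and all values are hypothetical.

```python
def steer_hidden_states(hidden_states, steering_vector, coeff):
    """Return a new list where coeff * steering_vector is added to each token state."""
    steered = []
    for token_state in hidden_states:
        if len(token_state) != len(steering_vector):
            raise ValueError("hidden state and steering vector sizes differ")
        steered.append(
            [h + coeff * v for h, v in zip(token_state, steering_vector)]
        )
    return steered


# Two token positions with a 3-dimensional hidden state (made-up numbers).
hidden = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]

# Stand-ins for vectors captured from the texts "logical" and "irrational".
logical = [1.0, 0.0, 0.0]
irrational = [0.0, 1.0, 0.0]

# A positive coefficient pushes towards one direction,
# a negative coefficient pushes away from the other.
out = steer_hidden_states(hidden, logical, 0.4)
out = steer_hidden_states(out, irrational, -0.4)
print(out)
```

In the real output above, the first row of both captured tensors is identical, which is consistent with both token lists starting with the same token `0`.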


**We use 'get_all()' to see all the steering vectors applied to the model.**

In [6]:
steered_model.get_all()

[{'layer_idx': 20,
  'text': 'logical',
  'coeff': 0.4,
  'try_keep_nr': 1,
  'exclude_bos_token': False},
 {'layer_idx': 20,
  'text': 'irrational',
  'coeff': -0.4,
  'try_keep_nr': 1,
  'exclude_bos_token': False}]

**If we run the exact same prompt again with the same parameters, this time we get the correct answer to our logical puzzle, although the explanation is strange. As advised before, try steering vectors on larger LLMs.**

In [7]:
prompt = [{'role': 'user', 'content': 'What weighs more, two pounds of feathers or one pound of bricks?'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = steered_model.model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=128,
    temperature=0.0001
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
What weighs more, two pounds of feathers or one pound of bricks?<|endoftext|>
<|assistant|>
Two pounds of feathers will weigh more than one pound of bricks.
This is because the weight of an object is determined by its mass, which is calculated by multiplying the mass of one particle of the object by the number of particles in the object.
In this case, two pounds of feathers have a mass of 2 * 1 = 2 pounds, while one pound of bricks has a mass of 1 * 1 = 1 pound.
Therefore, the two pounds of feathers have a greater mass and therefore weigh more than the one pound of bricks.<|endoftext|>


**We use 'reset_all()' to remove all steering vectors and bring the model to its initial state.**

In [None]:
steered_model.reset_all()
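The trial-and-error search for a layer index and coefficient mentioned earlier could be organized as a small sweep. The sketch below uses only the `add` and `reset_all` calls shown in this notebook. The function name `sweep` and the `run` callback are hypothetical: `run` stands for any function that generates text with the steered model and returns it.

```python
from itertools import product


def sweep(steer, run, text, layers, coeffs):
    """Try every (layer_idx, coeff) pair for one steering text and collect the outputs."""
    results = {}
    for layer_idx, coeff in product(layers, coeffs):
        steer.reset_all()  # start each trial from the unsteered model
        steer.add(layer_idx=layer_idx, coeff=coeff, text=text)
        results[(layer_idx, coeff)] = run()
    steer.reset_all()  # leave the model unsteered afterwards
    return results
```

For example, `sweep(steered_model, run, 'logical', layers=[18, 20, 22], coeffs=[0.2, 0.4, 0.6])` would return a dictionary keyed by `(layer_idx, coeff)` that can be read side by side. The specific layers and coefficients here are only illustrative.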

**There are no rules when it comes to steering vectors. It's all about experimenting.**

In [None]:
steered_model.add(layer_idx=18, coeff=0.4, text="Yann LeCun")
steered_model.add(layer_idx=16, coeff=0.4, text="Taylor Swift")

text tokens: [0, 58, 1136, 2070, 36, 328]
captured tensor: tensor([[[-1.2948, -1.4524,  0.8430,  ...,  0.1495,  2.1928, -0.8096],
         [ 0.1092, -0.2837, -0.1350,  ...,  0.0302,  0.4238, -0.0974],
         [-0.0985,  0.1567,  0.0444,  ...,  0.3202,  0.5921,  0.5361],
         [-0.0903,  0.9568,  0.4471,  ..., -0.1071, -0.3121,  0.6269],
         [-0.1433,  0.4785,  0.0510,  ...,  0.1575, -0.0764,  0.4038],
         [-0.0681,  0.2254,  1.1787,  ...,  0.0833,  0.1430, -0.1395]]],
       device='cuda:0', grad_fn=<AddBackward0>)
text tokens: [0, 37979, 24619]
captured tensor: tensor([[[-1.3751, -1.5241,  0.8619,  ...,  0.1307,  2.1883, -0.7615],
         [-0.0085, -0.0959,  0.4839,  ..., -0.0707, -0.1887, -0.1904],
         [ 0.3900, -0.0042,  0.0793,  ...,  0.3679, -0.0309, -0.0810]]],
       device='cuda:0', grad_fn=<AddBackward0>)


In [None]:
prompt = [{'role': 'user', 'content': 'Write a story.'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = steered_model.model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=256,
    temperature=0.0001,
    repetition_penalty=1.2
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
Write a story.<|endoftext|>
<|assistant|>
Once upon a time, there was an old man named Tim who lived in a small village on the outskirts of Paris. He had been living alone for many years after his wife passed away and he couldn't bear to move out of his cozy little cottage. 

One day, as Tim sat by the fireplace in his cottage, he noticed that something different happened this year during Christmas time. As usual, he watched with great anticipation as Santa Claus appeared on TV's holiday specials. But what made it unique this year is how she looked - not just because her hair was dyed red (which wasn’t much different from her natural color), but also because she seemed more mature than ever before. She sang beautifully about love, kindness, and compassion while performing at the Grammys Red Carpet Event. Her voice sounded like honey mixed with cinnamon – pure magic!

Tim felt moved by Taylor Swift's performance and decided to write a letter to Taylor Swift herself. In his hear

**We remove all steering vectors on layer 18...**

In [None]:
steered_model.reset(18)

**...and therefore only the one on layer 16 remains applied to the model.**

In [None]:
steered_model.get_all()

[{'layer_idx': 16,
  'text': 'Taylor Swift',
  'coeff': 0.4,
  'try_keep_nr': 1,
  'exclude_bos_token': False}]

In [None]:
steered_model.reset_all()

In [None]:
steered_model.add(layer_idx=19, coeff=1, text="gummy bear")

text tokens: [0, 72, 14661, 8800]
captured tensor: tensor([[[-1.3161, -1.3725,  0.7597,  ...,  0.1734,  2.1856, -0.8732],
         [-0.1938, -0.1681, -0.4629,  ...,  0.2185, -0.5808, -0.2136],
         [-0.5220,  0.6886,  0.6143,  ..., -0.2877, -0.6788, -0.2253],
         [ 0.3463,  0.3523,  0.8452,  ..., -0.6439, -0.1302,  0.3185]]],
       device='cuda:0', grad_fn=<AddBackward0>)


In [None]:
prompt = [{'role': 'user', 'content': 'Write a story.'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = steered_model.model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=256,
    temperature=0.0001,
    repetition_penalty=1.2
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
Write a story.<|endoftext|>
<|assistant|>
Once upon a time, there was an old man named George who lived in a small village on the edge of forest and mountains. He had been living alone since his wife passed away years ago. But one day, something magical happened to me! I found my first gummy bears from giant candy factory that someone left behind after moving out. The chewy softness of these sweet treats made me smile every single time i bite into them. Every night before bedtime, I would slowly eat each bear while staring at stars in the sky with all eyes stuffed full of candies. It's not just about eating candy; it’s also about feeling happy and enjoying life as much as possible. Gummies became my favorite treat because they remind me that happiness is within reach if you let yourself enjoy things like candy can be fun. And so, little teddy bear shaped gummy bears have become my favourite candy ever since. 

One day, when I woke up early morning sunrise, I saw a big worm cra

**We generated a story that includes gummy bears, but towards the end it gets gross.**

**So we further apply a negative steering vector.**

In [None]:
steered_model.add(layer_idx=19, coeff=-0.08, text="worm")

text tokens: [0, 36141]
captured tensor: tensor([[[-1.3161, -1.3725,  0.7597,  ...,  0.1734,  2.1856, -0.8732],
         [ 0.0905, -0.2148,  0.7930,  ..., -0.4057, -0.0199,  0.5384]]],
       device='cuda:0', grad_fn=<AddBackward0>)


In [None]:
steered_model.get_all()

[{'layer_idx': 19,
  'text': 'gummy bear',
  'coeff': 1,
  'try_keep_nr': 1,
  'exclude_bos_token': False},
 {'layer_idx': 19,
  'text': 'worm',
  'coeff': -0.08,
  'try_keep_nr': 1,
  'exclude_bos_token': False}]

In [None]:
prompt = [{'role': 'user', 'content': 'Write a story.'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = steered_model.model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=256,
    temperature=0.0001,
    repetition_penalty=1.2
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
Write a story.<|endoftext|>
<|assistant|>
Once upon a time, there was an old man named George who lived in a small village on the edge of forest and mountains. He had been living alone for many years after his wife passed away from cancer disease. The villagers were happy with him because he always smiled and helped people whenever they needed help. But one day, something magical happened to me as well! I met my childhood friend, Jake, whom I haven't seen since we were kids when i went back home to visit family during summer vacation. We reconnected and decided to make our dreams come true by making candy-making business together. 

Jake's dad owned a factory that made gummy bears, so it would be perfect if we could use their recipe to create different candies. So, we started working hard every day, experimenting with flavors and shapes while trying out new recipes. Our dream became bigger than just selling locally at local stores; soon enough, we began packing up all the cand

In [None]:
steered_model.reset_all()

In [None]:
steered_model.add(layer_idx=17, coeff=0.6, text="hedonism")
steered_model.add(layer_idx=17, coeff=-0.6, text="asceticism")

text tokens: [0, 742, 251, 1204]
captured tensor: tensor([[[-1.3381, -1.4594,  0.9005,  ...,  0.1039,  2.2319, -0.7995],
         [-0.2135, -0.2179, -0.7029,  ...,  0.3188,  0.0293, -0.1145],
         [ 0.0548, -0.1988, -0.3463,  ..., -0.0347,  0.4522,  0.0112],
         [ 0.1414, -0.9370, -0.0798,  ...,  0.1885,  0.3991, -0.1230]]],
       device='cuda:0', grad_fn=<AddBackward0>)
text tokens: [0, 4843, 1999, 1204]
captured tensor: tensor([[[-1.3381, -1.4594,  0.9005,  ...,  0.1039,  2.2319, -0.7995],
         [ 0.2684, -0.4603, -0.1954,  ..., -0.2290, -0.1930,  0.4854],
         [-0.0629, -0.1895, -0.3959,  ...,  0.0638,  0.2491,  0.2461],
         [-0.0318, -0.3287, -0.3858,  ...,  0.3597,  0.1808, -0.3706]]],
       device='cuda:0', grad_fn=<AddBackward0>)


In [None]:
prompt = [{'role': 'user', 'content': 'What is the meaning of life?'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = steered_model.model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=128,
    temperature=0.0001,
    repetition_penalty=1.2
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
What is the meaning of life?<|endoftext|>
<|assistant|>
As an AI language model, I cannot provide a single definition as it's subjective and varies from person to person. The meaning of life can be different for everyone based on their beliefs, culture, values, and personal experiences. Some people believe in seeking happiness or pleasure, while others find purpose through helping others or achieving success. It could also mean something as simple as discovering joy in everyday moments or creating memories with loved ones. Ultimately, the meaning of life depends on how individuals choose to interpret and experience it themselves.<|endoftext|>


In [None]:
steered_model.reset_all()

In [None]:
steered_model.add(layer_idx=17, coeff=0.6, text="asceticism")
steered_model.add(layer_idx=17, coeff=-0.6, text="hedonism")

text tokens: [0, 4843, 1999, 1204]
captured tensor: tensor([[[-1.3381, -1.4594,  0.9005,  ...,  0.1039,  2.2319, -0.7995],
         [ 0.2684, -0.4603, -0.1954,  ..., -0.2290, -0.1930,  0.4854],
         [-0.0629, -0.1895, -0.3959,  ...,  0.0638,  0.2491,  0.2461],
         [-0.0318, -0.3287, -0.3858,  ...,  0.3597,  0.1808, -0.3706]]],
       device='cuda:0', grad_fn=<AddBackward0>)
text tokens: [0, 742, 251, 1204]
captured tensor: tensor([[[-1.3381, -1.4594,  0.9005,  ...,  0.1039,  2.2319, -0.7995],
         [-0.2135, -0.2179, -0.7029,  ...,  0.3188,  0.0293, -0.1145],
         [ 0.0548, -0.1988, -0.3463,  ..., -0.0347,  0.4522,  0.0112],
         [ 0.1414, -0.9370, -0.0798,  ...,  0.1885,  0.3991, -0.1230]]],
       device='cuda:0', grad_fn=<AddBackward0>)


In [None]:
prompt = [{'role': 'user', 'content': 'What is the meaning of life?'}]
inputs = tokenizer.apply_chat_template(
    prompt,
    add_generation_prompt=True,
    return_tensors='pt'
)

tokens = steered_model.model.generate(
    inputs.to(model.device),
    streamer=streamer,
    max_new_tokens=128,
    temperature=0.0001,
    repetition_penalty=1.2
)

result = tokenizer.decode(tokens[0])

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


<|user|>
What is the meaning of life?<|endoftext|>
<|assistant|>
As an AI language model, I do not have personal beliefs or opinions. A great philosopher named Socrates once said that "the purpose of life is to live with integrity, courage and kindness." Many people find meaning in their lives through religious faith, relationships, work, learning, exploration, creativity, and contributing to society. The meaning of life can vary from person to person depending on one's culture, experiences, and worldview. Ultimately, it is up to each individual to discover what gives their life meaning and significance.<|endoftext|>
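The last two experiments apply the same pair of texts with the signs swapped. A minimal sketch of a helper that runs both directions back to back follows. It relies only on the `add` and `reset_all` calls used above. The name `compare_directions` and the `run` callback (any function that generates with the steered model and returns the text) are hypothetical.

```python
def compare_directions(steer, run, layer_idx, coeff, text_a, text_b):
    """Run once with text_a positive and text_b negative, then once with signs swapped."""
    outputs = {}
    for label, sign in (('first_positive', 1), ('second_positive', -1)):
        steer.reset_all()  # remove vectors left over from the previous run
        steer.add(layer_idx=layer_idx, coeff=sign * coeff, text=text_a)
        steer.add(layer_idx=layer_idx, coeff=-sign * coeff, text=text_b)
        outputs[label] = run()
    steer.reset_all()  # leave the model unsteered afterwards
    return outputs
```

With `layer_idx=17`, `coeff=0.6`, `text_a='hedonism'` and `text_b='asceticism'`, this reproduces the two configurations tried above and returns both generations in one dictionary for comparison.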
