# Context Steering

This notebook demonstrates how to use multiple contexts to steer the generation for a single prompt.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
from cos.utils import load_hf_model_and_tokenizer
from cos.core import multi_contextual_steering_hf

In [3]:
# Load the model. You can also experiment with other chat models, such as T0pp and Mistral.
model, tokenizer = load_hf_model_and_tokenizer(model_name='llama-2-7b-chat')

## Personalization

Let's have the LLM advise us on an upcoming deadline under two competing contexts: one where we want to spend evenings with family and one where we want to stay late at work to finish critical projects. Each context gets its own influence level, set through `all_lambdas`, which modulates how strongly that context shapes the generation. Sweeping both levels over 0, 1, and 2 covers all nine combinations.
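
To build intuition for how a per-context influence level can change the output, here is a toy sketch on made-up next-token logits. It assumes a combination rule in which each context's effect (its logits minus the context-free logits) is scaled by its lambda and added to the context-free logits. This rule, the `steer_logits` helper, and all numbers are illustrative assumptions; the actual logic inside `multi_contextual_steering_hf` may differ, including what a lambda of 0 means.

```python
import math


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def steer_logits(base_logits, context_logits_list, lambdas):
    """Hypothetical rule: add each context's effect, scaled by its lambda,
    to the context-free logits. Not taken from the cos library."""
    steered = list(base_logits)
    for ctx_logits, lam in zip(context_logits_list, lambdas):
        for i, (c, b) in enumerate(zip(ctx_logits, base_logits)):
            steered[i] += lam * (c - b)
    return steered


# Made-up logits over a 3-token vocabulary.
base = [1.0, 1.0, 1.0]        # prompt only
with_ctx_a = [2.0, 1.0, 0.5]  # prompt + context A (favors token 0)
with_ctx_b = [0.5, 1.0, 2.0]  # prompt + context B (favors token 2)

for lam_a, lam_b in [(0, 0), (2, 0), (0, 2), (2, 2)]:
    steered = steer_logits(base, [with_ctx_a, with_ctx_b], [lam_a, lam_b])
    probs = [round(p, 3) for p in softmax(steered)]
    print(f"lambda a {lam_a} lambda b {lam_b}: {probs}")
```

Under this assumed rule, raising only one lambda pushes probability toward the token that context favors, while raising both equally leaves the two favored tokens balanced.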

In [4]:
contexts_a = [
    "I want to spend time with families during evenings.",
]
contexts_b = [
    "I want to stay late at work to finish the critical projects.",
]
prompts = [
    "I have an important deadline tomorrow. What should I do?",
]
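# Influence levels, one list per context (same order as all_contexts below).
# Position i in each list gives the (lambda a, lambda b) pair for generation i;
# together the pairs cover every combination of 0, 1, and 2.
all_lmbds = [
    [0, 1, 2, 0, 1, 2, 0, 1, 2],  # lambda for contexts_a
    [0, 0, 0, 1, 1, 1, 2, 2, 2],  # lambda for contexts_b
]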
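
The hand-written grid above can also be generated, which helps when sweeping more values or more contexts. The sketch below is one way to do it with `itertools.product`; `build_lambda_grid` is a hypothetical helper, not part of the `cos` package.

```python
from itertools import product


def build_lambda_grid(values, num_contexts):
    """Return one list of lambdas per context, covering every combination.

    product() varies its last position fastest, so each combination is
    reversed to make the first context vary fastest, matching the
    hand-written layout.
    """
    combos = [combo[::-1] for combo in product(values, repeat=num_contexts)]
    return [list(column) for column in zip(*combos)]


grid = build_lambda_grid([0, 1, 2], num_contexts=2)
for row in grid:
    print(row)
```

The result should be usable in place of `all_lmbds`, since it has the same nested-list shape.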

In [None]:
outs = multi_contextual_steering_hf(
    model, 
    tokenizer,
    prompts=prompts,
    all_contexts=[contexts_a, contexts_b],
    all_lambdas=all_lmbds, 
    put_context_first=True, 
    max_gen_len=256,
    show_progress=False,
    verbose=True,
    temperature=1
)

In [None]:
# Print each generation alongside the lambda pair that produced it.
for lmbda_a, lmbda_b, generation in zip(*all_lmbds, outs["generation"]):
    print("=" * 100)
    print(f"lambda a {lmbda_a} lambda b {lmbda_b}: \n{generation}")
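
To compare specific settings side by side, it can help to index the generations by their lambda pair instead of by position. The sketch below assumes, as the loop above does, that the generations are aligned position by position with the lambda lists. `index_by_lambdas` is a hypothetical helper, and the data here is a stand-in for `all_lmbds` and `outs["generation"]`.

```python
def index_by_lambdas(all_lambdas, generations):
    """Map each tuple of per-context lambdas to the generation it produced."""
    keys = list(zip(*all_lambdas))
    if len(keys) != len(generations):
        raise ValueError("expected one generation per lambda combination")
    return dict(zip(keys, generations))


# Stand-in data; in the notebook, pass all_lmbds and outs["generation"].
example_lambdas = [
    [0, 1, 0, 1],  # lambda a
    [0, 0, 1, 1],  # lambda b
]
example_generations = ["gen 0", "gen 1", "gen 2", "gen 3"]

by_lambdas = index_by_lambdas(example_lambdas, example_generations)
print(by_lambdas[(1, 0)])  # generation for lambda a = 1, lambda b = 0
```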