# Contrastive Decoding Exploration

Use this notebook after installing the project requirements to compare baseline greedy decoding with contrastive decoding across multiple prompts.

In [None]:
import transformers as tr

from contrastive_decoding import CDConfig, contrastive_decode, format_prompt, load_model, pick_device_dtype
from contrastive_decoding.utils import ensure_padding_token

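# Small "amateur" model and larger "expert" model to contrast against each other.
amateur_path = "Qwen/Qwen2.5-Coder-0.5B-Instruct"
expert_path = "Qwen/Qwen2.5-1.5B-Instruct"

device, dtype = pick_device_dtype()

# Both models are scored over the same token ids, so a single tokenizer
# (the expert's) is used for both.
tokenizer = tr.AutoTokenizer.from_pretrained(expert_path)
ensure_padding_token(tokenizer)

expert = load_model(expert_path, device, dtype)
amateur = load_model(amateur_path, device, dtype)

# Decoding settings shared by every prompt below.
cfg = CDConfig(max_new_tokens=40, alpha=0.1, top_k=50, stream=False)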

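def read_prompt(path):
    # Read a prompt file and close the handle right away.
    with open(path, encoding="utf-8") as f:
        return f.read().strip()


# The example paths are relative, so run the notebook from the project root.
prompts = {
    "Haiku": read_prompt("examples/haiku.txt"),
    "Docstring": read_prompt("examples/docstring.txt"),
    "Custom": "Summarize the key goals of contrastive decoding in two sentences.",
}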

for name, user_prompt in prompts.items():
    formatted = format_prompt(tokenizer, user_prompt, use_chat_template=True)
    print(f"\n## {name}\n")
    print(contrastive_decode(amateur, expert, tokenizer, formatted, cfg, device))


Experiment with `alpha`, `top_k`, and `tau_amateur` to see how the generations change. For longer generations, you may also want to disable streaming (`--no-stream`) when using the CLI.
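
To build intuition for what `alpha` and `tau_amateur` control, here is a minimal sketch of one contrastive scoring step on toy logits. It follows the commonly published formulation of contrastive decoding (keep tokens whose expert probability is at least `alpha` times the expert's top probability, then rank them by expert log-probability minus amateur log-probability). The helper names `log_softmax` and `contrastive_scores` are hypothetical, and this is not necessarily how the `contrastive_decoding` package implements the step; `top_k` is not covered here.

```python
import math


def log_softmax(logits, temperature=1.0):
    # Numerically stable log-softmax over a plain list of floats.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def contrastive_scores(expert_logits, amateur_logits, alpha=0.1, tau_amateur=1.0):
    # Hypothetical helper: score each token for one decoding step.
    expert_lp = log_softmax(expert_logits)
    # Assumption: tau_amateur is a temperature applied to the amateur only.
    amateur_lp = log_softmax(amateur_logits, tau_amateur)
    # Plausibility constraint: p_expert(x) >= alpha * max p_expert,
    # written in log space.
    cutoff = max(expert_lp) + math.log(alpha)
    return [
        e - a if e >= cutoff else float("-inf")
        for e, a in zip(expert_lp, amateur_lp)
    ]


# Toy vocabulary of four tokens. The amateur strongly prefers token 0,
# while the expert considers tokens 0 and 1 both plausible.
expert_logits = [4.0, 3.5, 0.0, -2.0]
amateur_logits = [4.0, 1.0, 0.0, 3.0]

for alpha in (0.1, 0.9):
    scores = contrastive_scores(expert_logits, amateur_logits, alpha=alpha)
    best = max(range(len(scores)), key=scores.__getitem__)
    print(f"alpha={alpha}: best token index = {best}")
```

With these toy values, a small `alpha` keeps tokens 0 and 1 in the candidate set, and the contrast favors token 1 because the amateur rates it much lower than the expert does. A large `alpha` shrinks the candidate set to the expert's top token, so the choice collapses to the expert's greedy pick.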