# Instruction Tuning

- 📺 **Video:** [https://youtu.be/YT3VSlDjrVU](https://youtu.be/YT3VSlDjrVU)

## Overview
- Align LLMs with natural-language instructions by fine-tuning on curated prompt-response pairs.
- The training data is typically a mixture of many tasks, filtered for quality before fine-tuning.
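
As a rough illustration of the two data steps named above, the sketch below filters a tiny set of prompt-response pairs and then samples a weighted multi-task mixture. It is a minimal sketch, not a recipe from the lecture: the function names (`quality_filter`, `sample_mixture`), the length threshold, and the mixture weights are all assumptions chosen for the example.

```python
import random

def quality_filter(pairs, min_response_chars=2):
    """Drop exact duplicates and pairs whose response is too short (hypothetical rule)."""
    seen, kept = set(), []
    for prompt, response in pairs:
        key = (prompt.strip().lower(), response.strip().lower())
        if len(response.strip()) < min_response_chars or key in seen:
            continue
        seen.add(key)
        kept.append((prompt, response))
    return kept

def sample_mixture(tasks, weights, n, seed=0):
    """Draw n (task_name, pair) examples, picking the task by weight each time."""
    rng = random.Random(seed)
    names = list(tasks)
    task_weights = [weights[name] for name in names]
    batch = []
    for _ in range(n):
        name = rng.choices(names, weights=task_weights)[0]
        batch.append((name, rng.choice(tasks[name])))
    return batch

raw_sentiment = [
    ("Classify the review as positive or negative: I loved the soundtrack.", "positive"),
    ("Classify the review as positive or negative: I loved the soundtrack.", "positive"),  # duplicate
    ("Classify the review as positive or negative: This was awful.", ""),  # empty response
    ("Classify the review as positive or negative: This was awful.", "negative"),
]
raw_yes_no = [
    ("Answer yes or no: Is snow cold?", "yes"),
    ("Answer yes or no: Is fire wet?", "no"),
]

tasks = {"sentiment": quality_filter(raw_sentiment), "yes_no": quality_filter(raw_yes_no)}
mixture = sample_mixture(tasks, weights={"sentiment": 0.5, "yes_no": 0.5}, n=6)
for task_name, (prompt, response) in mixture:
    print(task_name, "|", prompt, "-->", response)
```

Real pipelines use far richer filters (deduplication by near-match, toxicity or length heuristics, human review); the point here is only the shape of the two steps.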

## Key ideas
- **Instruction datasets:** gather diverse tasks phrased as instructions.
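- **Supervised fine-tuning:** train the model to reproduce the reference (gold) response for each instruction, typically with a next-token cross-entropy loss.
- **Generalization:** models learn to parse the instruction format itself and produce structured outputs, which can carry over to tasks not seen during tuning.
- **Safety:** curated data can reduce harmful completions, though it does not eliminate them.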
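
To make the supervised fine-tuning idea concrete, the sketch below turns one instruction-response pair into a training sequence plus a loss mask that covers only the response tokens. Everything specific here is an assumption for illustration: the `### Instruction:` / `### Response:` template, the `<eos>` marker, whitespace tokenization, and the choice to mask the prompt out of the loss (a common option, but not the only one).

```python
def build_example(instruction, response, tokenize=str.split):
    """Return (tokens, loss_mask) for one instruction-response pair.

    loss_mask[i] is 1 where the training loss would be computed (response
    tokens) and 0 where it would be skipped (prompt tokens).
    """
    # Hypothetical prompt template; real datasets define their own.
    prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    prompt_tokens = tokenize(prompt)
    response_tokens = tokenize(response) + ["<eos>"]  # assumed end-of-sequence marker
    tokens = prompt_tokens + response_tokens
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask

tokens, loss_mask = build_example("Is snow cold?", "yes")
for token, flag in zip(tokens, loss_mask):
    print(f"{flag}  {token}")
```

In an actual training loop the mask would multiply the per-token loss so that only the response positions contribute to the gradient.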

## Demo
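Train a simple text classifier (TF-IDF features plus logistic regression) on synthetic instruction-response pairs, echoing the lecture (https://youtu.be/4P2x-h8xMKo). This only mimics instruction tuning: the classifier picks one of four fixed response labels and does not generate text.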

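In [1]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Four synthetic "instructions" covering two tasks: sentiment and yes/no questions.
instructions = [
    'Classify the review as positive or negative: I loved the soundtrack.',
    'Classify the review as positive or negative: This was awful.',
    'Answer yes or no: Is snow cold?',
    'Answer yes or no: Is fire wet?'
]
# One gold response per instruction; each distinct response is a class label.
responses = ['positive', 'negative', 'yes', 'no']

# Turn each instruction into a TF-IDF bag-of-words vector.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(instructions)
# Note: multi_class is deprecated since scikit-learn 1.5; newer releases may warn or reject it.
clf = LogisticRegression(max_iter=1000, multi_class='ovr', random_state=0)
clf.fit(X, responses)

# Held-out prompts that reuse the two instruction phrasings with new content.
test_prompts = [
    'Classify the review as positive or negative: The acting was gripping.',
    'Answer yes or no: Is the sun hot?'
]

# Reuse the fitted vectorizer so test prompts share the training vocabulary.
pred = clf.predict(vectorizer.transform(test_prompts))
for prompt, label in zip(test_prompts, pred):
    print(prompt, '-->', label)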


Classify the review as positive or negative: The acting was gripping. --> negative
Answer yes or no: Is the sun hot? --> yes
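
The first prediction above is wrong: "The acting was gripping" is a positive review, yet the classifier outputs `negative`. A likely contributor is token overlap with the four training prompts. The sketch below checks this with the standard library only; the regular expression approximates `TfidfVectorizer`'s default tokenization (lowercased tokens of two or more word characters), and the helper and variable names are illustrative.

```python
import re

def tokens(text):
    # Approximates TfidfVectorizer's defaults: lowercase, tokens of 2+ word characters.
    return set(re.findall(r"\b\w\w+\b", text.lower()))

train = {
    "positive": "Classify the review as positive or negative: I loved the soundtrack.",
    "negative": "Classify the review as positive or negative: This was awful.",
    "yes": "Answer yes or no: Is snow cold?",
    "no": "Answer yes or no: Is fire wet?",
}
test_prompt = "Classify the review as positive or negative: The acting was gripping."

vocab = set().union(*(tokens(text) for text in train.values()))
test_tokens = tokens(test_prompt)

# Tokens the model never saw, and tokens tied to exactly one sentiment example.
unseen = sorted(test_tokens - vocab)
positive_only = tokens(train["positive"]) - tokens(train["negative"])
negative_only = tokens(train["negative"]) - tokens(train["positive"])

print("unseen at training time:", unseen)
print("seen only in the positive example:", sorted(test_tokens & positive_only))
print("seen only in the negative example:", sorted(test_tokens & negative_only))
```

The content words `acting` and `gripping` are out of vocabulary, and the only test token that appears in exactly one of the two sentiment examples is `was`, which occurs in the negative one. With four training rows the model has no real signal about sentiment, so its output reflects such incidental overlaps rather than the meaning of the review.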




## Try it
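- Modify the demo: add several more training pairs per task and check whether the prediction for "The acting was gripping." changes.
- Add a tiny dataset or counter-example, such as a prompt that uses one task's phrasing with the other task's content, and see which label the classifier picks.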


## References
- [Language Models are Unsupervised Multitask Learners](https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf)
- [Language Models are Few-Shot Learners](https://arxiv.org/abs/2005.14165)
- [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/)
- [Demystifying Prompts in Language Models via Perplexity Estimation](https://arxiv.org/abs/2212.04037)
- [Calibrate Before Use: Improving Few-Shot Performance of Language Models](https://arxiv.org/abs/2102.09690)
- [Holistic Evaluation of Language Models](https://arxiv.org/abs/2211.09110)
- [Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?](https://arxiv.org/abs/2202.12837)
- [In-context Learning and Induction Heads](https://arxiv.org/abs/2209.11895)
- [Multitask Prompted Training Enables Zero-Shot Task Generalization](https://arxiv.org/abs/2110.08207)
- [Scaling Instruction-Finetuned Language Models](https://arxiv.org/abs/2210.11416)
- [Training language models to follow instructions with human feedback](https://arxiv.org/abs/2203.02155)
- [Stanford Alpaca: An Instruction-following LLaMA Model](https://crfm.stanford.edu/2023/03/13/alpaca.html) (website)
- [Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation](https://arxiv.org/abs/2212.07981)
- [WiCE: Real-World Entailment for Claims in Wikipedia](https://arxiv.org/abs/2303.01432)
- [SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization](https://arxiv.org/abs/2111.09525)
- [FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation](https://arxiv.org/abs/2305.14251)
- [RARR: Researching and Revising What Language Models Say, Using Language Models](https://arxiv.org/abs/2210.08726)


*Links only; we do not redistribute slides or papers.*