From c643093d5a92f1c2f57c9493db8355f028d39ac5 Mon Sep 17 00:00:00 2001 From: younesbelkada Date: Mon, 24 Jul 2023 14:41:10 +0000 Subject: [PATCH 1/2] add `DataCollatorForCompletionOnlyLM` in the docs --- docs/source/sft_trainer.mdx | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) diff --git a/docs/source/sft_trainer.mdx b/docs/source/sft_trainer.mdx index 6fa0ece923..c9c06f57a7 100644 --- a/docs/source/sft_trainer.mdx +++ b/docs/source/sft_trainer.mdx @@ -50,6 +50,41 @@ The above snippets will use the default training arguments from the [`transforme ## Advanced usage +### Train on completions only + +You can use the `DataCollatorForCompletionOnlyLM` to train your model on the completions (responses) only, masking out the prompt tokens from the loss. Note that this works only in the case when `packing=False`. +To instantiate that collator, pass a response template and the tokenizer. Here is an example of how it would work to fine-tune `opt-350m` on completions only on the CodeAlpaca dataset: + +```python +from transformers import AutoModelForCausalLM, AutoTokenizer +from datasets import load_dataset +from trl import SFTTrainer, DataCollatorForCompletionOnlyLM + +dataset = load_dataset("lucasmccabe-lmi/CodeAlpaca-20k", split="train") + +model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m") +tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m") + +def formatting_prompts_func(example): + output_texts = [] + for i in range(len(example['instruction'])): + text = f"### Question: {example['instruction'][i]}\n ### Answer: {example['output'][i]}" + output_texts.append(text) + return output_texts + +response_template = " ### Answer:" +collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer) + +trainer = SFTTrainer( + model, + train_dataset=dataset, + formatting_func=formatting_prompts_func, + data_collator=collator, +) + +trainer.train() +``` + ### Format your input prompts For instruction fine-tuning, it is quite common to have two columns inside the 
dataset: one for the prompt & the other for the response. From 00eb75ba1c7d6537ae6cf7ef5a4b7423316caf2c Mon Sep 17 00:00:00 2001 From: younesbelkada Date: Mon, 24 Jul 2023 14:42:22 +0000 Subject: [PATCH 2/2] nit --- docs/source/sft_trainer.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/sft_trainer.mdx b/docs/source/sft_trainer.mdx index c9c06f57a7..312dec3765 100644 --- a/docs/source/sft_trainer.mdx +++ b/docs/source/sft_trainer.mdx @@ -50,7 +50,7 @@ The above snippets will use the default training arguments from the [`transforme ## Advanced usage -### Train on completions only +### Train on completions only You can use the `DataCollatorForCompletionOnlyLM` to train your model on the completions (responses) only, masking out the prompt tokens from the loss. Note that this works only in the case when `packing=False`. To instantiate that collator, pass a response template and the tokenizer. Here is an example of how it would work to fine-tune `opt-350m` on completions only on the CodeAlpaca dataset: