# Groupe Relative Policy Optimization (GRPO)

Install the Hugging Face libraries to run this notebook.

In [10]:
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

  from .autonotebook import tqdm as notebook_tqdm


In [3]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

Your goal is to fill in the `GRPOTrainer` class. You have two options (and you can do both):
* the "normal GRPO" with clipped surrogate objective
* or the "vanilla GRPO" with original objective

In [30]:
questions = [
    "A train takes 3 hours to travel from A to B at an average speed of 60 km/h. How long would the trip take if the train traveled at 80 km/h?",
    "A snail climbs a 10-meter wall. It climbs 3 meters during the day and slips 2 meters at night. How many days will it take to reach the top?",
    'If a liar says, "I always lie," is he telling the truth?',
    'Can we say that "this sentence is false"? Explain why.',
    "Paul is twice the age Pierre was when Paul was the age Pierre is today. If Pierre is 20 years old, how old is Paul?",
    "A father and his son together are 36 years old. The father is exactly three times the son's age. How old is the son?",
    "All the cats I have met so far were black. Can I conclude that all cats are black? Why?",
    "If all humans are mortal and Socrates is human, what can we conclude?",
    "If a shirt costs twice as much as a pair of pants and the pants cost 30€, how much does the shirt cost?",
    'Jean says: "All my friends are football players." Pierre is Jean’s friend. Can we conclude that Pierre is a football player?',
    "You are in a train and must choose between switching the direction of the train to avoid five people tied to one track, but in doing so, you will kill one person on the other track. What do you do and why?",
    "A doctor has five patients in need of organ transplants, and a perfectly healthy patient comes in for a routine check-up. Should the doctor sacrifice this patient to save the five others?",
    "What would happen if gravity on Earth were twice as strong?",
    "If humans could read minds, how would that change society?"
]

dataset = []
for question in questions:
    dataset.append(f"A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The assistant first thinks about the reasoning process in the mind and then provides the user with the answer. The reasoning process and answer are enclosed within <think> </think> and <answer> <\answer> tags, respectively, i.e., <think> reasoning process here <\think> <answer> answer here <\answer>. User: {question}. Assistant:")

In [103]:
class GRPOConfiguration:
    def __init__(self,
                 model_name="gpt2",
                 learning_rate=1e-5,
                 temperature=0.9,
                 max_prompt_length=200,
                 max_output_length=200,
                 device="cpu",
                 num_generations=3,
                 num_iterations=2,
                 beta=0.1,
                 epsilon=1e-5,
                 reward_func=None,
                 print_outputs=False,
                 print_advantages=False):
        
        self.model_name = model_name
        self.learning_rate = learning_rate
        self.temperature = temperature
        self.max_prompt_length = max_prompt_length
        self.max_output_length = max_output_length
        self.device = device
        self.num_generations = num_generations # number of generations per prompts
        self.num_iterations = num_iterations    # number of iterative optimization steps per prompts
        self.beta = beta # KL coefficient
        self.epsilon = epsilon
        self.reward_func = reward_func # reward function
        self.print_outputs = print_outputs
        self.print_advantages = print_advantages


In [111]:
#Inspired from HuggingFace
import re
import copy


class GRPOTrainer:
    def __init__(self, config: GRPOConfiguration):
        """
        Initialize a GRPO Trainer
        Args:
            config: GRPO Configuration
        """
        self.device = config.device
        self.model_name = config.model_name
        self.model = AutoModelForCausalLM.from_pretrained(config.model_name).to(self.device) # model to optimize
        self.ref_model = None # reference model
        self.tokenizer = AutoTokenizer.from_pretrained(config.model_name, padding=True, padding_side="left")
        self.tokenizer.pad_token = self.tokenizer.eos_token  #gpt2 n'a pas de padding token
        self.optimizer = torch.optim.AdamW(self.model.parameters(), lr=config.learning_rate)
        self.temperature = config.temperature
        self.max_prompt_length = config.max_prompt_length
        self.max_output_length = config.max_output_length
        self.eps = config.epsilon
        self.num_generations = config.num_generations # num of generation per prompts
        self.num_iterations = config.num_generations
        self.beta = config.beta
        self.reward_func = config.reward_func if config.reward_func else self._default_reward_func
        self.print_outputs = config.print_outputs
        self.print_adantages = config.print_advantages

    
    def _default_reward_func(self, prompt, outputs, **kwargs):
        """Reward function that checks if the completion has a specific format."""
        
        pattern = r"^<think>.*?</think><answer>.*?</answer>$"

        matches = [re.match(pattern, content) for content in outputs]
        return [1.0 if match else 0.0 for match in matches]
        

    def _get_per_token_logps(self, model, input_ids, attention_mask, logits_to_keep):
        """
        Get the per-token log propabilities for the outputs.
        Args:
            model: model to compute per-token log probabilities
            input_ids: sequence of tokens #(G, L)
            attention_mak: mapping of tokens to keep to compute attention (excluding padding)
            logits_to_keep: number of logits to keep to compute the loss
        """
        logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
        logits = logits[:,:-1,:] # (G, L-1, V) exclude the last logit corresponding to next token prediction

        input_ids = input_ids[:, -logits_to_keep:]
        logits = logits[:, -logits_to_keep:]
        log_probs = logits.log_softmax(-1) # softmax to get probabilities
        return torch.gather(log_probs, dim=-1, index=input_ids.unsqueeze(-1)).squeeze(-1) # keeping only input_ids
    

    def _generate_and_score_outputs(self, prompt: str):
        """
        Generate answers with old_model to the prompt, compute associated reward and advantages and those of ref_model.
        Args:
            prompt: the prompt
        """
        device = self.device

        # Tokenization
        prompt_inputs = self.tokenizer(prompt, return_tensors="pt", padding=False).to(device)
        prompt_ids, prompt_mask = prompt_inputs["input_ids"], prompt_inputs["attention_mask"]

        # Setting to prompt lenght
        if self.max_prompt_length is not None:
            prompt_ids = prompt_ids[:, -self.max_prompt_length :]
            prompt_mask = prompt_mask[:, -self.max_prompt_length :]

        # Generate answers with old_model
        with torch.inference_mode():
            prompt_output_ids = self.model.generate(
                prompt_ids,
                attention_mask=prompt_mask,
                num_return_sequences=self.num_generations, # number of generations
                do_sample=True,
                max_length=prompt_ids.size(1) + self.max_output_length,  # (P+O)
                eos_token_id=self.tokenizer.eos_token_id
            )

        if self.print_outputs:
            for i in range(prompt_output_ids.size(0)):
                decoded_sequence = self.tokenizer.convert_ids_to_tokens(prompt_output_ids[i, :].tolist())
                print(f"Réponse générée (génération {i}) de len {len(decoded_sequence)}: {' '.join(decoded_sequence)}")

        # Splitting prompt and outputs tokens
        prompt_length = prompt_ids.size(1)
        prompt_ids = prompt_output_ids[:, :prompt_length]  # (G, P)
        output_ids = prompt_output_ids[:, prompt_length:]  # (G, O)


        # Padding with EOS after first EOS in outputs
        is_eos = output_ids == self.tokenizer.eos_token_id  # (G, O)
        eos_idx = torch.full((is_eos.size(0),), is_eos.size(1), dtype=torch.long, device=device)  # (G,)
        has_eos = is_eos.any(dim=1)  # (G,)
        first_eos_idx = is_eos.int().argmax(dim=1)  # Index of first EOS token (G,)
        eos_idx[has_eos] = first_eos_idx[has_eos]  # (G,)
        sequence_indices = torch.arange(is_eos.size(1), device=device).expand_as(is_eos)  # (G, O)
        output_mask = (sequence_indices <= eos_idx.unsqueeze(1)).int()  # (G, O) Mask for padding

        # Merging prompt and output  mask
        attention_mask = torch.cat([prompt_mask.expand(self.num_generations, -1), output_mask], dim=1)  # (G, P+O)

        logits_to_keep = output_ids.size(1) # Logits to keep for loss computation

        # Compute log-probabilities for the prompt with model_ref and model_old
        self.model.eval()
        with torch.inference_mode():
            # When using num_iterations == 1, old_per_token_logps == per_token_logps
            if self.num_iterations > 1:
                old_per_token_logps = self._get_per_token_logps(self.model, prompt_output_ids, attention_mask, logits_to_keep)
            else:
                old_per_token_logps = None

            ref_per_token_logps = self._get_per_token_logps(self.ref_model, prompt_output_ids, attention_mask, logits_to_keep)
        
        # Decoding outputs
        outputs = self.tokenizer.batch_decode(output_ids, skip_special_tokens=True)

        # Compute rewards
        output_rewards = self.reward_func(prompt=prompt, outputs=outputs)
        rewards = torch.tensor(output_rewards, dtype=torch.float32, device=self.device) #(G,)

        # Compute advantages
        mean_grouped_rewards = rewards.view(-1, self.num_generations).mean(dim=1)
        mean_grouped_rewards = mean_grouped_rewards.repeat_interleave(self.num_generations, dim=0)
        std_grouped_rewards = rewards.view(-1, self.num_generations).std(dim=1)
        std_grouped_rewards = std_grouped_rewards.repeat_interleave(self.num_generations, dim=0)

        advantages = (rewards - mean_grouped_rewards) / (std_grouped_rewards + 1e-4) # (G,)

        return {
            "prompt_ids": prompt_ids,
            "prompt_mask": prompt_mask,
            "output_ids": output_ids,
            "output_mask": output_mask,
            "old_per_token_logps": old_per_token_logps, 
            "ref_per_token_logps": ref_per_token_logps,
            "advantages": advantages,}



    def compute_loss(self, model, inputs):
        """
        Compute Loss according to GRPO paper, using advantages, per_token probabilities and KL divergence approximator to reference model
        Args:
            model: enlever
            inputs: {"prompt_ids", "prompt_mask", "output_ids", "output_mask", "old_per_token_logps", "ref_per_token_logps", "advantages"}
        """

        # Compute the per-token log probabilities for the current model
        prompt_ids, prompt_mask = inputs["prompt_ids"], inputs["prompt_mask"]
        output_ids, output_mask = inputs["output_ids"], inputs["output_mask"]
        input_ids = torch.cat([prompt_ids, output_ids], dim=1)
        
        prompt_mask = prompt_mask.expand(output_mask.size(0), -1)
        attention_mask = torch.cat([prompt_mask, output_mask], dim=1)
        logits_to_keep = output_ids.size(1)  # we only need to compute the logits for the completion tokens

        per_token_logps = self._get_per_token_logps(model, input_ids, attention_mask, logits_to_keep) # policy of current model

        # Compute the KL divergence between the current model and the reference model
        if self.beta != 0.0:
            ref_per_token_logps = inputs["ref_per_token_logps"]
            per_token_kl = (torch.exp(ref_per_token_logps - per_token_logps) - (ref_per_token_logps - per_token_logps) - 1 ) # (G, O) KL differentiable approximator

        ### Compute the loss
        
        advantages = inputs["advantages"]
        if self.print_adantages:
            print( "advantages :", advantages)

        # When using num_iterations == 1, old_per_token_logps == per_token_logps
        old_per_token_logps = inputs["old_per_token_logps"] if self.num_iterations > 1 else per_token_logps.detach()
        policy_ratio = torch.exp(per_token_logps - old_per_token_logps) # (G, O)
        policy_ratio_clipped = torch.clamp(policy_ratio, 1 - self.eps, 1 + self.eps) #clipped ratio

        per_token_loss1 = policy_ratio * advantages.unsqueeze(1)
        per_token_loss2 = policy_ratio_clipped * advantages.unsqueeze(1)
        per_token_loss = -torch.min(per_token_loss1, per_token_loss2)

        if self.beta != 0.0:
            per_token_loss = per_token_loss + self.beta * per_token_kl # adding KL

        loss = (per_token_loss * output_mask).sum() / output_mask.sum() # excluding padding in loss computation

        return loss
    

    def train(self, num_epochs: int, dataset: list):
        """
        Train the model on the dataset for specific number of epochs.
        Args:
            num_epochs: number of epochs 
            dataset: dataset
        """
        self.ref_model = copy.deepcopy(self.model)  # Initialize reference model

        for epoch in range(num_epochs):
            total_loss = 0
            for i, prompt in enumerate(dataset):
                print(f"Training on prompt {i}")
                loss = self.train_step(prompt)
                total_loss += loss
            
            avg_loss = total_loss / len(dataset)
            print(f"Epoch {epoch+1}/{num_epochs} - Loss: {avg_loss:.4f}") 
            self.ref_model = copy.deepcopy(self.model)


    def train_step(self, prompt: str):
        """
            Do a traing step on a specific prompt with multiple iteration steps.
            Args:
                prompt: prompt to optimize the model on it
        """
        
        self.model.train()

        # Generate outputs with old_model, compute log-probabilities for old_model and ref_model
        inputs = self._generate_and_score_outputs(prompt)

        # For self.num_iterations, iterativelt update the model on the same prompt
        for i in range(1, self.num_iterations): 
            
            print(f"GRPO Iteration {i}")

            loss = self.compute_loss(self.model, inputs)

            # Backpropagation
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()

        return loss.item()



In [113]:
def short_reward_func(prompt, outputs, **kwargs):
    """Reward function that gives higher scores to longer completions."""
    print(type(outputs))
    return [float(len(completion)) for completion in outputs]

dataset = ["The sky is", "Yesterday, I went to", "I hope I will", "My car is", "I have to go to", "My friend want"]


config = GRPOConfiguration(num_generations=3, num_iterations=4, print_outputs=True, print_advantages=True, reward_func=short_reward_func)
trainer = GRPOTrainer(config)
trainer.train(3, dataset)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 0
Réponse générée (génération 0) de len 203: The Ġsky Ġis Ġa Ġmess ." Ċ Ċ The Ġman Ġhas Ġmore Ġthan Ġ10 Ġyears Ġin Ġthe Ġnews Ġbusiness Ġand Ġhe Ġhas Ġno Ġdesire Ġto Ġbe Ġseen Ġof Ġany . ĠHe Ġis Ġonly Ġlooking Ġat Ġhis Ġjob , Ġthe Ġvery Ġvery Ġnext Ġday . Ċ Ċ You Ġsee Ġit Ġwhen Ġyou Ġsee Ġit ... Ċ Ċ Ċ " It 's Ġabout Ġthe Ġway Ġthat Ġit 's Ġdone ..." Ċ Ċ As Ġyou Ġwatch Ġhim Ġmake Ġone Ġfinal Ġmove , Ġyou Ġrealize Ġthat Ġhe Ġis Ġonly Ġhalf Ġright Ġand Ġthat Ġthat Ġfact Ġis Ġall Ġyou Ġcan Ġsee Ġbut Ġyou Ġcan 't Ġquite Ġsee Ġwhat 's Ġbeing Ġreported . Ċ Ċ The Ġsky Ġis Ġactually Ġa Ġvery Ġbusy Ġsky Ġon Ġyour Ġown Ġand Ġall Ġyou Ġcan Ġsee Ġis Ġthat Ġpart Ġof Ġwhat Ġyou Ġwish Ġyou Ġcould Ġunderstand . Ċ Ċ The Ġworld Ġwe Ġsee Ġis Ġa Ġbig Ġmess Ġin Ġthe Ġworld Ġof Ġnews Ġthat Ġis Ġonly Ġhalf Ġof Ġthe Ġworld Ġyou Ġcould Ġsee Ġat Ġthe Ġmoment . ĠAs Ġa Ġlittle Ġkid Ġyou Ġthink Ġabout Ġit Ġas Ġthey Ġsee . ĠBut Ġyou Ġdon 't Ġsee Ġthem Ġbecause Ġyou Ġonly Ġsaw Ġthem Ġin Ġpictures .

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 1
Réponse générée (génération 0) de len 205: Yesterday , ĠI Ġwent Ġto Ġthe Ġgym Ġwith Ġsome Ġfriends Ġthat Ġday , Ġit Ġwas Ġgoing , Ġ' I 'm Ġgoing Ġto Ġgo Ġto Ġher '. ĠTo Ġknow Ġher Ġso Ġfar , Ġshe 's Ġbeen Ġvery Ġstrong , Ġshe 's Ġnot Ġa Ġreal Ġhot Ġchick , Ġwho ĠI Ġgot Ġto Ġhave Ġit Ġwith , Ġeven Ġthough ĠI Ġdon 't Ġget Ġto Ġhave Ġmany Ġfriends , Ġshe Ġwas Ġa Ġmuch Ġbetter Ġperson Ġwith Ġme Ġthat Ġnight , Ġso Ġthat 's Ġa Ġgood Ġthing , Ġthat 's Ġwhat ĠI Ġexpected . ĊĊ Ċ But Ġyeah , ĠI Ġgot Ġlucky Ġon Ġthe Ġlast Ġday , ĠI Ġget Ġlucky Ġon Ġthe Ġlast Ġday , ĠI Ġwent Ġto Ġthe Ġgym Ġwith Ġa Ġfriend , ĠI Ġthink Ġit Ġwas Ġprobably Ġthe Ġsame Ġgirl Ġyou Ġgot Ġto Ġhave Ġa Ġlittle Ġbit Ġof Ġexcitement Ġfor Ġme Ġthat Ġthe Ġday Ġthat ĠI Ġcame Ġfor Ġme ? ĊĊ Ċ D erek ĊĊ Ċ [ 1 - 04 - 15 ] ĠI Ġfeel Ġbad Ġfor Ġmyself Ġbecause ĠI Ġdon 't Ġknow , ĠI Ġjust Ġcouldn 't . ĠI Ġthink ĠI Ġmade Ġfun Ġof Ġthe Ġworld . ĠIt Ġreally , Ġit 's Ġall Ġthat . ĠAnd , ĠI Ġthink Ġmaybe Ġthat 's Ġa Ġlitt

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 2
Réponse générée (génération 0) de len 204: I Ġhope ĠI Ġwill Ġget Ġa Ġlot Ġof Ġwork Ġfrom Ġyou . Ċ Ċ I Ġknow ĠI Ġwill Ġdo Ġa Ġlot Ġof Ġwork Ġfrom Ġyou , Ġbut ĠI 'm Ġnot Ġsure Ġof Ġthat . Ċ Ċ I Ġdon 't Ġsee Ġanything Ġin Ġthis Ġsituation . ĠIt Ġis Ġso Ġfar Ġremoved Ġfrom Ġyou . Ċ Ċ There Ġare Ġso Ġmany Ġthings Ġwe 're Ġtrying Ġto Ġdo Ġin Ġthis Ġoffice . Ċ Ċ My Ġjob Ġis Ġa Ġvery Ġhard Ġman . Ġ. Ġ. Ċ Ċ I Ġwork Ġso Ġhard Ġfor Ġthe Ġlittle Ġman . ĠDo Ġyou Ġexpect Ġnothing Ġat Ġall . Ċ Ċ I 'm Ġjust Ġa Ġman , Ġbut Ġthat 's Ġnot Ġall ĠI Ġknow Ġon Ġthe Ġother Ġside Ġof Ġthings . Ċ Ċ We 're Ġa Ġtight Ġjob . ĠSome Ġthings Ġyou Ġmight Ġnot Ġthink Ġis Ġall . Ċ Ċ What Ġabout Ġmen Ġwho ĠI Ġcan Ġsee ? Ċ Ċ Ċ I Ġgot Ġsome Ġwork Ġdone . ĠI Ġdon 't . Ċ Ċ Do Ġyou Ġtake Ġan Ġoath , Ġand Ġdo Ġyou Ġdo Ġthat Ġwith Ġa Ġlittle Ġbit Ġof Ġa Ġrisk , Ġand Ġa Ġlittle Ġbit Ġof Ġwork ? Ċ Ċ I Ġcan Ġgive Ġyou Ġa Ġbit Ġof Ġa Ġhint . Ċ Ċ And
Réponse générée (génération 1) de len 204: I Ġhope ĠI Ġwill Ġf

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 3
Réponse générée (génération 0) de len 203: My Ġcar Ġis Ġcurrently Ġrunning Ġon Ġthe Ġlast Ġday ĠI Ġbought Ġthis Ġcar , Ġbut Ġto Ġbe Ġgenerous , Ġit Ġwill Ġrun Ġfor Ġabout Ġtwo Ġdays Ġwithout Ġstopping . ĊĊ Ċ Ċ My Ġcurrent Ġcar Ġis Ġthe ĠB - 11 J Ġand Ġit Ġis Ġreally Ġthat Ġgood . ĠI Ġfeel Ġit Ġis Ġmy Ġprimary Ġcar Ġand ĠI Ġwould Ġlove Ġto Ġchange Ġthat Ġthis Ġwas Ġmy Ġonly Ġ" car ." ĠIt Ġwould Ġbe Ġgreat Ġif Ġall Ġowners Ġwould Ġhave Ġtaken Ġthe Ġ" tr unk " Ġin Ġtheir Ġcars . ĠBut ĠI Ġwill Ġnever Ġbuy Ġit . ĠThe Ġ" tr unk " Ġwould Ġmake Ġit Ġso Ġmuch Ġnicer Ġfor Ġme Ġand Ġto Ġbe Ġhonest , ĠI Ġknow Ġmy Ġdad Ġdoesn 't Ġcare . ĠI Ġdon 't Ġwish Ġthat Ġhe Ġcould , Ġin Ġthe Ġleast Ġconsider Ġthe Ġ" tr unk " Ġthe Ġnew Ġone Ġif ĠI Ġwere Ġone Ġof Ġthe Ġones Ġwho Ġdid Ġit Ġanyway . ĊĊ Ċ I Ġhave Ġpurchased Ġall Ġthree Ġcars , Ġbut Ġhave Ġno Ġplans Ġto Ġincrease Ġany Ġof Ġthe Ġcars ĠI Ġhave Ġin Ġit . ĠI Ġhave Ġalso Ġno Ġplans Ġto Ġbuy Ġone Ġof Ġthe Ġcars Ġor Ġthe Ġcar - I 'm Ġ

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 4
Réponse générée (génération 0) de len 205: I Ġhave Ġto Ġgo Ġto Ġwork Ġfor Ġschool , Ġdon 't ĠI Ġjust Ġhave Ġto Ġwork Ġhard , Ġit Ġgets Ġbetter Ġand Ġbetter Ġbecause Ġwe Ġhave Ġto Ġdo Ġreally Ġwell Ġwith Ġboth ." Ċ Ċ After Ġhe Ġhas Ġhad Ġhis Ġfirst Ġtwo Ġyears , Ġ" I 'm Ġa Ġlittle Ġjealous Ġof Ġthem Ġgoing Ġout Ġhere Ġwith Ġme Ġafter Ġwork , Ġyou Ġknow , Ġyou Ġgo Ġtake Ġcare Ġof Ġmy Ġchildren , Ġyou Ġknow , Ġthat 's Ġwhat ĠI Ġdid ," Ġbut , Ġ" I Ġdon 't Ġdo Ġanything Ġgood . ĠIt Ġgets Ġbetter Ġand Ġbetter , Ġthen ĠI 'm Ġa Ġgood Ġstudent , Ġyou Ġknow Ġthere 's Ġlots Ġof Ġthings Ġthat ĠI Ġcan Ġdo Ġbetter Ġwith , Ġbut Ġafter , ĠI Ġjust Ġneed Ġtime Ġto Ġfigure Ġ[ a Ġbetter ] Ġin Ġschool ." Ċ Ċ T . J .: Ġ" If Ġyou Ġthink Ġabout Ġit , Ġyou Ġknow Ġto Ġsay , ĠI 'm Ġnot Ġa Ġgood Ġstudent , Ġbut ĠI Ġjust Ġkind Ġof Ġwork Ġhard , Ġso ĠI 'll Ġmake Ġthis , Ġhe Ġdoesn 't Ġcare . ĠI 'm Ġa Ġgood Ġstudent , Ġso Ġit 's Ġnot Ġabout Ġschool , Ġyou Ġknow , ĠI 'm Ġnot Ġa Ġgood Ġstudent . Ġ

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 5
Réponse générée (génération 0) de len 203: My Ġfriend Ġwant Ġto Ġbe Ġhere Ġbecause Ġif ĠI 'm Ġhonest , Ġshe Ġcan 't Ġhelp Ġme !" Ċ Ċ You Ġcan Ġtell Ġfrom Ġthese Ġquotes Ġthat ĠI Ġactually Ġdid Ġcare Ġabout Ġthem . ĠI Ġjust Ġwish Ġthey Ġdidn 't Ġbe Ġwritten Ġin . ĠThis Ġwas Ġanother Ġexample Ġof Ġhow Ġpeople Ġcan Ġlie Ġabout Ġothers Ġwhen Ġthey Ġshould Ġcare Ġabout Ġthem . ĠThat 's Ġwhat Ġthis Ġentire Ġconversation Ġshould Ġhave Ġbeen Ġabout , Ġtoo , Ġbut Ġit Ġwasn 't Ġabout Ġme , Ġand ĠI Ġdidn 't Ġcare Ġif Ġthings Ġlike Ġwhat Ġhappened Ġin Ġthe Ġpast Ġmade Ġme Ġlook Ġlike Ġhe pt . Ċ Ċ And ĠI Ġwas Ġglad Ġto Ġfind Ġout Ġexactly Ġwhat Ġthey Ġdid Ġwrong . Ċ Ċ I Ġdon 't Ġknow Ġhow Ġlong Ġthat Ġwould Ġhave Ġbeen Ġin Ġthe Ġworld âĢĶ maybe Ġone Ġday ĠI Ġwill Ġhave Ġto Ġgive Ġyou Ġmy Ġheart Ġback . Ċ Ċ So Ġwhat Ġto Ġdo ? ĠI Ġfeel Ġsorry Ġfor Ġit , Ġas Ġthat Ġis Ġwhat ĠI Ġam Ġabout Ġto Ġgo Ġthrough . ĠI Ġhope ĠI Ġdon 't Ġever Ġsuffer Ġanother Ġdisaster Ġlike Ġthis Ġone . Ċ Ċ

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Epoch 1/3 - Loss: 0.0012
Training on prompt 0
Réponse générée (génération 0) de len 203: The Ġsky Ġis Ġa Ġvast Ġsky . ĠA Ġgreat Ġmany Ġpeople Ġhave Ġbeen Ġthere Ġby Ġday , Ġso Ġthat Ġthey Ġcame Ġwhen Ġall Ġthat Ġearth Ġcan Ġdo , Ġand Ġso Ġthat Ġyou Ġhave Ġto Ġsay , Ġthe Ġsun Ġhas Ġbeen Ġthere Ġbefore , Ġand Ġis Ġalways Ġthere . ĠAnd Ġso Ġit Ġlooks , Ġand Ġso Ġdid Ġit Ġdoes Ġat Ġthe Ġday Ġwhen Ġthe Ġsun Ġcame , Ġthe Ġday , Ġthe Ġsun Ġcame . ĠAnd Ġso Ġdid Ġit Ġand Ġso Ġdid Ġit Ġat Ġthe Ġtime Ġthat ĠI Ġbegan Ġbefore Ġit , Ġbefore Ġthe Ġtime Ġthat Ġall Ġthe Ġearth , Ġthe Ġearth Ġand Ġso Ġdid Ġit , Ġand Ġso Ġdid Ġit , Ġas Ġthe Ġangels Ġof Ġlight Ġsaid , Ġsaid , Ġso Ġdid Ġit . ĠI Ġknow Ġthat Ġthe Ġday Ġhas Ġcome , Ġbecause Ġthen Ġwas Ġthe Ġday Ġbefore ĠI Ġbegan . ĠBut Ġwe Ġdo Ġnot Ġknow Ġwhat Ġwill Ġhappen Ġwhen Ġthe Ġsun Ġcame . ĠFor Ġthere Ġare Ġalso Ġother Ġthings Ġto Ġhappen Ġwhen Ġit Ġcomes , Ġand Ġas ĠI Ġsay Ġis Ġthat Ġthere Ġare Ġother Ġthings Ġto Ġhappen Ġwhen Ġhe Ġcomes , Ġbut Ġbeca

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 1
Réponse générée (génération 0) de len 205: Yesterday , ĠI Ġwent Ġto Ġwork Ġthis Ġafternoon . Ċ Ċ " Oh , Ġyou Ġdon 't Ġhave Ġany Ġsleep ?" Ċ Ċ " My Ġhands ." Ċ Ċ On Ġthe Ġother Ġhand Ġthe Ġsame Ġthing Ġon Ġthis Ġoccasion . ĠIt Ġappears Ġas Ġif Ġthere Ġwould Ġbe Ġa Ġsudden Ġrush Ġof Ġsleep Ġwhen Ġhe Ġfinished Ġhis Ġfirst Ġbook - reading Ġand Ġthis Ġis Ġa Ġdefinite Ġsign Ġthat Ġsleep Ġis Ġa Ġmatter Ġof Ġlife Ġand Ġindeed . Ċ Ċ " Do Ġyou Ġnot Ġsee Ġso ?" Ċ Ċ " Then Ġgo Ġfind Ġthe Ġdoctor . ĠThe Ġother Ġday Ġhe Ġtold Ġus Ġthat Ġthere Ġwas Ġan Ġanomaly ." Ċ Ċ " Your Ġsecond Ġbook Ġsays Ġthe Ġsame Ġthing ? ĠThat Ġwas Ġthe Ġsame Ġthing . ĠWas Ġthat Ġbecause Ġthe Ġfirst Ġbook Ġwas Ġread , Ġand Ġthe Ġthird Ġand Ġfourth Ġeditions Ġwere Ġalso Ġread ?" Ċ Ċ " Yes ." Ċ Ċ " So Ġthat Ġthird Ġedition Ġwas Ġbought Ġat Ġbook store Ġon Ġthe Ġday Ġbefore Ġthe Ġsecond Ġpart . ĠDid Ġthat Ġdate Ġdiffer Ġfrom Ġlast Ġtime ?" Ċ Ċ " You Ġsee . Ċ Ċ " Is Ġit ?" Ċ Ċ " Yes ." Ċ Ċ " This Ġis Ġthe Ġ

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 2
Réponse générée (génération 0) de len 204: I Ġhope ĠI Ġwill Ġnot Ġhave Ġto Ġhave Ġto Ġdie Ġmore Ġbecause Ġthey Ġwill Ġstop Ġme Ġin Ġthe Ġvery Ġlast Ġdays , ĠI Ġam Ġsure Ġyou Ġwill Ġtry Ġto Ġbe Ġa Ġpretty Ġgirl Ġwith Ġyou . ĠBut ĠI Ġunderstand Ġthat Ġis Ġvery Ġmuch Ġthe Ġcase ." Ċ Ċ He Ġtook Ġout Ġhis Ġpen Ġand Ġbegan Ġwriting Ġhis Ġscript . ĠAs Ġsoon Ġas Ġhe Ġwas Ġready Ġfor Ġit , Ġhe Ġrealized Ġthat Ġhe Ġhad Ġlost Ġall Ġmemory Ġof Ġit Ġand Ġbegan Ġwriting Ġfrom Ġhis Ġback Ġof Ġthe Ġnotebook Ġand Ġtaking Ġoff Ġto Ġhis Ġdesk . ĠShe Ġwalked Ġup Ġbehind Ġhim Ġwith Ġa Ġhappy Ġexpression Ġon Ġher Ġface Ġand Ġsaid Ġ" It 's Ġabout Ġtime Ġto Ġend Ġthe Ġstory . ĠIt Ġis Ġabout Ġtime Ġto Ġexit Ġmy Ġfamily . ĠNow Ġthat Ġis Ġall Ġyou Ġneed Ġis Ġto Ġkeep Ġlooking . ĠAll Ġthe Ġother Ġstuff Ġis Ġgone . ĠOh , Ġsorry Ġto Ġgo Ġback Ġin Ġthe Ġsame Ġplace . ĠI Ġknow ĠI Ġwill Ġprobably Ġleave Ġthis Ġplace Ġin Ġthe Ġvery Ġfuture , Ġbut Ġof Ġcourse Ġonly Ġif ĠI Ġcan ." Ċ Ċ She Ġsaid Ġ" Ok

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 3
Réponse générée (génération 0) de len 203: My Ġcar Ġis Ġalso Ġa Ġgood Ġtruck , Ġespecially Ġsince Ġthe Ġtruck Ġand ĠI 've Ġowned Ġall Ġthree Ġof Ġmy Ġfour Ġin Ġthe Ġyears Ġmy Ġcar Ġhas Ġbeen Ġowned Ġbefore ; Ġno Ġissues . ĠThey Ġhave Ġbeen Ġall Ġaround Ġthe Ġhighway Ġsince ĠI Ġchanged Ġto Ġone Ġof Ġmy Ġfavorite Ġtrucks ; Ġthey Ġare Ġbetter Ġbecause ĠI Ġonly Ġhave Ġmy Ġown Ġcar Ġthat ĠI Ġown . ĠI Ġhave Ġmy Ġoriginal Ġcar Ġbut Ġwhen Ġit Ġgot Ġused Ġbut ĠI Ġstill Ġowned . ĠMy Ġcar Ġis Ġgreat ; Ġevery Ġone Ġof Ġit 's Ġamazing . Ċ Ċ But , ĠI Ġdo Ġhave Ġone Ġissue Ġwith Ġthe Ġtruck . ĠIt 's Ġtoo Ġmuch Ġfor Ġmy Ġcar , Ġit Ġworks , Ġbut Ġthere Ġis Ġa Ġcertain Ġcar Ġon Ġevery Ġ10 Ġminutes Ġit 's Ġa Ġbad Ġtruck Ġon Ġevery Ġminute . ĠI Ġmean , Ġit 's Ġthe Ġbest Ġtruck ĠI Ġhave Ġwith Ġso Ġwhat Ġif Ġit Ġare Ġmy Ġoriginal Ġtruck ? ĠIf ĠI Ġget Ġin Ġand ĠI 'm Ġlike Ġ' who a , ĠI Ġgot Ġthat Ġand Ġwhere 's ĠI Ġgot . ĠI Ġlike , Ġthe Ġtwo , ĠI Ġmean , Ġthis Ġplace Ġwas Ġnever Ġso Ġbad

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 4
Réponse générée (génération 0) de len 205: I Ġhave Ġto Ġgo Ġto Ġmy Ġhome !" Ġ" W ake Ġup !" ĠAnd Ġso ĠI Ġdid âĢ¦. ĊĊ Ċ How Ġdid ĠI Ġsay Ġmy Ġname ? ĠI Ġgot Ġinto Ġher Ġand Ġthought : ĠOh Ġmy ĠGod . Ġ" What Ġa Ġnice Ġname âĢ¦" Ġ" Who Ġare Ġyour Ġfavorite Ġfriends ? ĠWhat Ġwas Ġshe ?". ĠI Ġsaid Ġit Ġwas Ġa Ġlittle Ġboring , Ġbut Ġno Ġone Ġever Ġdid Ġit Ġlike Ġher . ĠShe Ġjust Ġthought Ġit Ġwas Ġso Ġfunny ĠI Ġdidn 't Ġwant Ġto Ġtell Ġher Ġbut Ġher Ġmother Ġsaid Ġher Ġmother Ġwas Ġso Ġfunny Ġor Ġsomething ? ĠI Ġhave Ġto Ġgo Ġto Ġmy Ġhome Ġand Ġput Ġmy Ġstuff Ġon Ġand Ġnot Ġsay Ġmy Ġname . ĊĊ Ċ I Ġsaid Ġthat Ġmy Ġname Ġis Ġa Ġlittle Ġmore Ġpersonal , Ġthat Ġit Ġis Ġmy Ġname Ġand ĠI Ġwould Ġlike Ġto Ġmake Ġit Ġmore Ġpersonal Ġnow , Ġthat ĠI Ġdon 't Ġwant Ġother Ġpeople Ġto Ġthink Ġthat , Ġit Ġis Ġtoo Ġannoying , Ġbecause Ġyou Ġare Ġnot Ġmy Ġname . ĊĊ Ċ This Ġis Ġhow ĠI Ġwould Ġlike Ġher Ġname Ġto Ġfeel . ĊĊ Ċ And ĠI Ġthought : ĠOh Ġmy ĠGod , Ġoh Ġmy Ġgod , Ġyou Ġknow Ġsh

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 5
Réponse générée (génération 0) de len 203: My Ġfriend Ġwant Ġto Ġcome Ġto Ġa Ġgame Ġfor Ġme Ġbecause ĠI Ġstill Ġcannot Ġthink Ġof Ġa Ġgame Ġthat Ġwon 't Ġjust Ġleave Ġme ." ĠâĢĵ Ġ" I 'd Ġlove Ġfor Ġsomeone Ġto Ġcome Ġto Ġme Ġif ĠI Ġwant Ġto Ġhelp Ġor Ġsupport Ġand Ġall Ġof Ġthat . ĠI Ġwould Ġalso Ġlove Ġfor Ġa Ġfew Ġfriends Ġto Ġcome Ġto Ġme ." ĠâĢĵ Ġ" I 'd Ġlove Ġfor Ġa Ġfew Ġfriends Ġto Ġcome Ġto Ġme , Ġif ĠI Ġhad Ġa Ġfriend Ġthat Ġwould Ġlove Ġto Ġcome Ġfor Ġa Ġfriend Ġto Ġcome Ġto Ġme Ġthat Ġwould Ġlike Ġmy Ġfriend Ġwould Ġcome Ġto Ġhave Ġcome Ġto Ġme ." ĠâĢĵ ĠI Ġfeel ĠI Ġhad Ġan Ġoverwhelming Ġurge Ġto Ġcome Ġto Ġme ." ĠâĢĵ Ġ" I Ġhad Ġa Ġfriend Ġin Ġmy Ġoffice Ġthat Ġhad Ġcome Ġto Ġme , Ġand Ġa Ġfriend Ġthat Ġwas Ġcoming Ġand Ġjust Ġwanted Ġme Ġto Ġcome Ġto Ġthe Ġoffice . ĠWe Ġthen Ġsat Ġin Ġa Ġconversation Ġand Ġhe Ġsaid , Ġ" You Ġare Ġthere ," Ġand ĠI Ġsaid Ġthat Ġhe Ġwould Ġcome Ġto Ġour Ġoffice ." ĠâĢĵ Ġ" I Ġhad Ġto Ġtell Ġhim . ĠI Ġdo Ġnot Ġwant Ġto Ġbe Ġ

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Epoch 2/3 - Loss: -0.0829
Training on prompt 0
Réponse générée (génération 0) de len 203: The Ġsky Ġis Ġmy Ġhouse , Ġa Ġblue Ġsky , ĠI Ġam Ġdreaming Ġdreaming Ġwith Ġa Ġblue Ġsky . ĠYou Ġhave Ġno Ġname , ĠI Ġlive Ġin Ġa Ġblue Ġsky . ĊĊ Ċ I Ġam Ġin Ġa Ġblue Ġsky Ġyou Ġhave Ġa Ġname . Ċ Now , Ġyou Ġget Ġto Ġthe Ġother Ġside Ġof Ġthe Ġocean , ĠI 'm Ġa Ġblue Ġsky . ĊĊ Ċ I 've Ġbuilt Ġthis Ġsea Ġin Ġthe Ġground Ġthat Ġyou Ġsee , Ġthe Ġearth . ĠBut Ġthe Ġsea Ġis Ġalways Ġhere . ĊĊ Ċ I Ġwill Ġgo Ġto Ġthe Ġother Ġside , Ġgo Ġto Ġthe Ġother Ġside , Ġand Ġwe Ġcome . Ċ Ċ I Ġam Ġhere Ġby Ġmy Ġown Ġhand , Ġand ĠI Ġam Ġon Ġthis Ġfloor Ġof Ġthe Ġblue Ġsky . Ċ Ċ You Ġhave Ġno Ġname , Ġhave Ġthis Ġsky Ġis Ġyour Ġname . Ċ Ċ I Ġam Ġhere Ġthe Ġblue Ġsky , Ġwe Ġare Ġbuilding Ġup Ġblue Ġsky . Ċ Ċ You Ġhave Ġa Ġname Ġbut Ġa Ġname , Ġand Ġa Ġname Ġis Ġthat 's Ġa Ġperson Ġand ĠI Ġam Ġbuilding Ġthis Ġsky . ĊĊ Ċ I Ġam Ġin Ġa Ġblue Ġsky , ĠI Ġhave Ġseen Ġthe Ġblue Ġsky Ċ Ċ That 's Ġa
Réponse générée (génération 1

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 1
Réponse générée (génération 0) de len 205: Yesterday , ĠI Ġwent Ġto Ġthe Ġgym Ġwith Ġa Ġbunch Ġof Ġfriends Ġwho Ġhave Ġreally Ġenjoyed Ġthe Ġlast Ġfew Ġyears , Ġsome Ġof Ġthem Ġvery Ġvery Ġrecently Ġand Ġsome Ġof Ġcourse ĠI Ġam Ġnot Ġsure Ġthe Ġanswer Ġto Ġany Ġof Ġit Ġin Ġterms Ġof Ġgetting Ġa Ġgood Ġworkout . ĠBut Ġof Ġcourse ĠI Ġam Ġhere Ġdoing Ġthat Ġand Ġif Ġyou Ġhave Ġgot Ġa Ġlot Ġof Ġfun , Ġhow Ġdo Ġyou Ġdecide Ġwhat 's Ġyour Ġanswer Ġhere Ġfor Ġcoming Ġup Ġwith Ġso , Ġif Ġyou Ġwant Ġto Ġstart Ġout Ġwith Ġa Ġbunch Ġof Ġthings Ġin Ġgeneral . ĠFirst Ġof Ġall , ĠI Ġam Ġnot Ġa Ġreal Ġphysical Ġtype Ġbut Ġwhat ĠI Ġknow Ġis Ġthat Ġfor Ġyou , Ġis Ġthat Ġpeople Ġtend Ġto Ġbe Ġstrong . ĠThey Ġare Ġvery Ġstrong Ġwith Ġtheir Ġbody Ġand Ġtheir Ġthinking . ĠBut Ġthey Ġare Ġvery Ġlittle Ġdo in ' Ġto Ġme . ĠLet Ġit Ġbe Ġreal Ġand Ġyou Ġare Ġgoing Ġto Ġtry Ġto Ġstart Ġit Ġand , Ġwhatever Ġyou Ġdo , Ġthen Ġyou Ġare Ġgoing Ġto Ġtry Ġto Ġdo . ĠI Ġdon 't Ġbelieve Ġin Ġit Ġand ,

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 2
Réponse générée (génération 0) de len 204: I Ġhope ĠI Ġwill Ġnever Ġhave Ġto Ġsay Ġthe Ġsame Ġabout Ġthose Ġpeople , Ġand ĠI Ġhope Ġthat Ġit Ġwill Ġnever Ġbe Ġthe Ġsame ," Ġhe Ġsays . Ġ" And ĠI Ġhope Ġthat Ġit Ġwill Ġnever Ġbe Ġbecause ĠI 'm Ġscared Ġit . ĠI Ġhope ĠI Ġwill Ġnever Ġbe Ġscared Ġto Ġsay Ġthis ." Ċ Ċ L ori Ġalso Ġdoesn 't Ġfeel Ġparticularly Ġfearful Ġabout Ġlosing Ġhis Ġfather . Ċ Ċ Ċ " That 's Ġnot Ġgoing Ġto Ġstop Ġme Ġat Ġthe Ġidea Ġof Ġlosing Ġmy Ġfather . ĠI Ġknow Ġthat 's Ġso Ġtrue ." Ċ Ċ " Oh Ġno , Ġit 's Ġjust Ġthat . ĠI Ġthink Ġthat Ġyou 're Ġgoing Ġto Ġlose Ġit ," Ġshe Ġsays , Ġchuck ling . Ċ Ċ Ċ " I Ġdon 't Ġknow Ġif ĠI Ġcould ," Ġsays Ġthe Ġreporter Ġbehind Ġthe Ġstage . Ċ Ċ " Of Ġcourse Ġnot ," Ġthe Ġreporter Ġcontinues . Ċ Ċ " I Ġthink Ġhe 's Ġgoing Ġto Ġtake Ġan Ġeasy Ġroute , Ġif Ġthat 's Ġpossible ," Ġshe Ġsays . Ċ Ċ " Right ," Ġthe Ġreporter Ġsays . Ċ Ċ " Well , Ġthat 's Ġpretty Ġtypical , Ġthough , ĠI Ġmean , Ġthat 's Ġsomething ." 

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 3
Réponse générée (génération 0) de len 203: My Ġcar Ġis Ġin Ġit 's Ġplace . ĠIt Ġis Ġnot Ġyour Ġcar . ĠIt Ġis Ġnot Ġyour Ġcar 's Ġplace . ĠIt Ġis Ġnot Ġsome Ġcar Ġfrom Ġyou Ġsay Ġthere Ġyou Ġare . ĠIt Ġis Ġa Ġthing Ġyou Ġtell Ġme Ġyou Ġdo Ġsee Ġin Ġyour Ġroom . ĠYour Ġroom . ĠAnd Ġjust Ġbecause Ġyou Ġare Ġstill Ġyou Ġin Ġa Ġcar Ġthere Ġdoesn 't Ġmean Ġyou Ġare Ġnot Ġthere . ĠThis Ġis Ġsome Ġplace . ĠThey Ġare Ġlike Ġa Ġlittle Ġplace Ġyou Ġknow Ġit Ġyou Ġare Ġright Ġthere Ġright Ġnow Ġdon 't Ġthink Ġabout Ġyour Ġroom . ĠI Ġdon 't Ġknow Ġthis , Ġthis Ġis Ġsomething Ġyou Ġwill Ġthink Ġabout Ġthis . ĠAnd Ġif Ġyou Ġgo Ġto Ġyour Ġcar Ġyou Ġcan Ġbe Ġthere . ĠYour Ġcar Ġyou Ġknow Ġwhat ? ĠAnd Ġtell Ġme Ġthen Ġtell Ġme Ġthat Ġyou Ġare Ġthere . ĠYour Ġcar . ĠYour Ġcar Ġon Ġyour Ġway . ĠThe Ġcar . ĠI Ġtold Ġyou Ġin Ġmy Ġroom Ġthat Ġis Ġright Ġway Ġthen Ġit 's Ġa Ġlittle Ġplace . ĠIt Ġis Ġnothing Ġmore Ġthan Ġthat . ĠIt Ġis Ġjust Ġsomething Ġthat Ġwhen ĠI Ġwas Ġwalking Ġmy Ġcar

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 4
Réponse générée (génération 0) de len 205: I Ġhave Ġto Ġgo Ġto Ġwhat 's Ġhappening , Ġthe Ġproblem Ġis Ġthat Ġno Ġtime Ġhas Ġpassed Ġand Ġthere Ġaren 't Ġany Ġattempts Ġwere Ġmade . ĠThis Ġis Ġthe Ġbig Ġone Ġfor Ġme Ġand ĠI 've Ġbeen Ġasking Ġfor Ġit , Ġthe Ġissue Ġis Ġalso ĠI Ġwill Ġsay Ġwhat Ġthe Ġprocess Ġof Ġthe Ġissue Ġis Ġand ĠI Ġwill Ġdo Ġwhat Ġhas . Ċ Ċ I Ġhave Ġdone Ġall Ġthat ĠI Ġneed Ġto Ġknow Ġa Ġlot Ġfrom Ġthe Ġissues Ġthat Ġare Ġin Ġthe Ġissues Ġthat Ġwe Ġneed . <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoftext|> <|endoft

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Training on prompt 5
Réponse générée (génération 0) de len 203: My Ġfriend Ġwant Ġto Ġplay Ġwith Ġme Ġand ĠI Ġplay Ġwith Ġmy Ġpartner . ĠWe Ġshould Ġthink Ġabout Ġeach Ġother 's Ġminds ." Ċ Ċ In Ġthis Ġcontext , Ġ" you " Ġis Ġnot Ġa Ġcompliment , Ġbut Ġa Ġname , Ġa Ġname , Ġnot Ġin Ġthe Ġleast , Ġor Ġa Ġmoniker Ġthat Ġis Ġsomething Ġas Ġbig Ġas Ġsomething Ġof Ġsomething Ġwhich Ġyou Ġare Ġvery Ġcapable Ġof Ġthinking , Ġand Ġhow Ġmany Ġof Ġyou Ġare Ġcapable Ġof Ġthinking . ĠA Ġcertain Ġtype Ġof Ġname Ġwhich Ġis Ġan Ġindividual Ġname Ġas Ġa Ġsingular Ġname , Ġand Ġthere Ġis Ġa Ġcertain Ġtype Ġof Ġtype Ġof Ġnoun Ġwhich Ġis Ġin Ġa Ġsingular Ġor Ġplural Ġway , Ġand Ġas Ġthey Ġlike Ġto Ġbe Ġand Ġin Ġthis Ġcategory Ġwhere Ġit Ġis Ġan Ġexact Ġtype Ġof Ġnoun Ġ( it 's Ġonly Ġin Ġa Ġsingular Ġor Ġplural ) Ġthat Ġis Ġin Ġa Ġsingular Ġor Ġplural Ġway Ġof Ġsaying , Ġor Ġsomething Ġin Ġthis Ġis Ġa Ġsingular Ġor Ġplural Ġor Ġa Ġspecific Ġtime . ĠFor Ġexample , Ġif Ġyou Ġlike Ġto Ġwrite Ġthat ĠI 'm Ġtal