# Logit modification methods

In [1]:
from transformers import AutoModelForCausalLM, AutoTokenizer

In [2]:
# Load the model
model_name = "gpt2"  # Try other models depending on how powerful your machine is

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Aussois is"  # Text to test
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)  # Encode the text into tokens

output = model.generate(**inputs, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens (greedy decoding by default)
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is a member of the German National Socialist Party (Bundeswehr), which is a party of the Social Democratic Party (Bundeswehr), which is a party of the Christian Democratic Party (Bundeswehr), which is a'

### Beam Search
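To make the idea concrete, here is a minimal sketch of beam search on a hypothetical three-symbol toy model. The `TOY_MODEL` table and the `beam_search` helper are illustrative assumptions, not part of `transformers`. With one beam the search reduces to greedy decoding; with two beams it can recover a sequence that is more probable overall even though its first token is not the most likely one.

```python
import math

# Hypothetical next-token probabilities, indexed by the previous token
TOY_MODEL = {
    "<s>": {"a": 0.5, "b": 0.4, "c": 0.1},
    "a": {"a": 0.4, "b": 0.3, "c": 0.3},
    "b": {"a": 0.05, "b": 0.05, "c": 0.9},
    "c": {"a": 0.3, "b": 0.3, "c": 0.4},
}

def beam_search(num_beams, steps):
    """Return (log-probability, tokens) of the best sequence found."""
    beams = [(0.0, ["<s>"])]  # (cumulative log-probability, tokens so far)
    for _ in range(steps):
        candidates = []
        for score, tokens in beams:
            for token, prob in TOY_MODEL[tokens[-1]].items():
                candidates.append((score + math.log(prob), tokens + [token]))
        # Keep only the num_beams most probable partial sequences
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
    return beams[0]

greedy_score, greedy_tokens = beam_search(num_beams=1, steps=2)
beam_score, beam_tokens = beam_search(num_beams=2, steps=2)
print(greedy_tokens, math.exp(greedy_score))
print(beam_tokens, math.exp(beam_score))
```

Greedy decoding commits to `a` at the first step, whereas the second beam keeps `b` alive long enough to find the more probable continuation.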

In [3]:
output = model.generate(**inputs, num_beams=5, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is one of the most well-known and well-researched and well-researched scientists in the world. He is also one of the most well-known and well-researched and well-researched'

### Ancestral sampling

In [4]:
output = model.generate(**inputs, num_beams=1, do_sample=True, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

"Aussois is a British actor – he doesn't like to play a bad boy, but he knows what's in his head.\n\nIt's not a bad game: you just gotta make him get into the act with every word.\n\nWe will all"

### Top_k sampling
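As a rough illustration of what `top_k` does, the sketch below keeps the `k` most probable tokens of a hypothetical next-token distribution and renormalises them. The helper name `top_k_filter` and the toy probabilities are assumptions made for the example; the library itself works on logits rather than probabilities.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalise; the rest get probability 0."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order[:k])
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

toy_probs = [0.5, 0.2, 0.15, 0.1, 0.05]  # hypothetical next-token distribution
filtered = top_k_filter(toy_probs, k=2)
print(filtered)
```

Sampling then happens from the filtered distribution, so tokens outside the top `k` can never be drawn.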

In [5]:
output = model.generate(**inputs, num_beams=1, do_sample=True, top_k=40, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is at this moment on trial for an offence connected with an alleged hate crime, allegedly carried out in the name of white supremacy.\n\nAn hour after the verdict was read, he was arrested and sent to a "non-secure address". The investigation'

### Nucleus (top_p) sampling
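The sketch below illustrates the nucleus idea on a hypothetical distribution: tokens are added in decreasing order of probability until their cumulative mass reaches `p`, and everything else is discarded. The helper `top_p_filter` is an assumption for illustration and ignores implementation details of the library (which operates on logits).

```python
def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose total mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += probs[i]
        if mass >= p:
            break
    # Renormalise the surviving tokens
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]

toy_probs = [0.5, 0.2, 0.15, 0.1, 0.05]  # hypothetical next-token distribution
nucleus = top_p_filter(toy_probs, p=0.9)
print(nucleus)
```

Unlike top-k, the number of surviving tokens adapts to the shape of the distribution: a peaked distribution keeps few tokens, a flat one keeps many.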

In [6]:
output = model.generate(**inputs, num_beams=1, do_sample=True, top_p=0.9, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

"Aussois is one of the few countries to offer a free WiFi access to customers.\n\nIt can now also access Wi-Fi to customers with a cellular data plan (sold separately).\n\nThe service is free, though it's priced at an additional $"

### Locally typical sampling (typical_p)
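Locally typical sampling ranks tokens by how close their surprisal (negative log-probability) is to the entropy of the distribution, then keeps the closest ones until their mass reaches `typical_p`. The sketch below shows this on a hypothetical distribution; `typical_filter` is an illustrative helper, and the small mass of 0.3 is chosen only to make the effect visible.

```python
import math

def typical_filter(probs, mass):
    """Keep tokens whose surprisal is closest to the entropy, up to the given mass."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    # Sort tokens by the distance between their surprisal and the entropy
    order = sorted(
        (i for i in range(len(probs)) if probs[i] > 0),
        key=lambda i: abs(-math.log(probs[i]) - entropy),
    )
    keep, total = set(), 0.0
    for i in order:
        keep.add(i)
        total += probs[i]
        if total >= mass:
            break
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

toy_probs = [0.5, 0.2, 0.15, 0.1, 0.05]  # hypothetical next-token distribution
typical = typical_filter(toy_probs, mass=0.3)
print(typical)
```

In this toy case the most probable token is removed because its surprisal is further from the entropy than that of the second and third tokens, which is the main behavioural difference with top-p.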

In [7]:
output = model.generate(**inputs, num_beams=1, do_sample=True, typical_p=0.9, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is a name given to a group of individuals, sometimes referred to as "the humanoids", that were sent out of time to escape their original forms.\n\nSome of the earliest known Homo sapiens were Homo erectus (sometimes abbreviated Homo erect'

### $\eta$-sampling (eta_cutoff)
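The sketch below illustrates the truncation rule usually given for $\eta$-sampling: with $\epsilon$ the value passed as `eta_cutoff` and $H$ the entropy of the next-token distribution, tokens with probability below $\eta = \min(\epsilon, \sqrt{\epsilon}\, e^{-H})$ are dropped. The helper `eta_filter` and the toy distribution are assumptions for illustration; keeping the most probable token unconditionally is a simplification of the library's minimum-tokens safeguard.

```python
import math

def eta_filter(probs, epsilon):
    """Drop tokens below an entropy-dependent threshold, then renormalise."""
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    eta = min(epsilon, math.sqrt(epsilon) * math.exp(-entropy))
    top = max(probs)  # always keep the most probable token
    kept = [p if (p >= eta or p == top) else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept], eta

toy_probs = [0.9, 0.09, 0.009, 0.001]  # hypothetical next-token distribution
filtered, eta = eta_filter(toy_probs, epsilon=0.003)
print(eta, filtered)
```

When the distribution is flat (high entropy), the threshold shrinks, so more low-probability tokens survive than with a fixed cutoff.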

In [8]:
output = model.generate(**inputs, num_beams=1, do_sample=True, eta_cutoff=0.003, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

"Aussois is a man of many senses who doesn't mind seeing others, but when something's being done, he'll go for it, making it possible for others to be seen. If he wants to see a man, he'll do it, but they'll"

In [9]:
output = model.generate(**inputs, num_beams=1, do_sample=True, eta_cutoff=0.003, max_new_tokens=50, repetition_penalty=1.2, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

"Aussois is an English football prodigy in his fifth year of playing. He was a member on the German national team during their World Cup run back at Leipzig, Germany and has played alongside him twice for BND's European Championship championship competition this past summer"

### Playground
All of these parameters can be combined to influence generation.

In [10]:
# Ancestral sampling with several beams
output = model.generate(**inputs, num_beams=5, do_sample=True, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is not the only one who has made a name for himself in the game. In fact, he is one of the most successful players in the game.\n\nAussois is the only one who has made a name for himself in the game'

In [11]:
# Top-k with several beams
output = model.generate(**inputs, num_beams=5, do_sample=True, top_k=40, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is not the only one who has taken up the cause. In fact, it is the only one who has taken up the cause. In fact, it is the only one who has taken up the cause. In fact, it is the only one who'

In [12]:
# A bit of everything
output = model.generate(**inputs, num_beams=1, do_sample=True, top_k=40, top_p=0.9, eta_cutoff=0.003, max_new_tokens=50, pad_token_id=tokenizer.eos_token_id)  # Generate tokens
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is a game of basketball. It is about creating an edge for your opponents when it comes to the ball.\n\nFor the last few seasons, there have been three of the five players that were asked to put up 20 points in the NBA Finals on'

In [13]:
# Try everything
output = model.generate(
    **inputs, 
    num_beams=1, 
    do_sample=True, 
    top_k=40, 
    top_p=0.9, 
    temperature=0.9,
    eta_cutoff=0.003, 
    repetition_penalty=1.2,
    max_new_tokens=50, 
    pad_token_id=tokenizer.eos_token_id) 
tokenizer.decode(output[0])  # Convert the tokens back into text

'Aussois is in the US for a period of time and has been on staff there since 2008, at which point he became an associate Professor Emeritus.\n\n\n...as well as being chairman of the International Institute (IIC) from 2010 to 2012; I'
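The cell above also uses `temperature` and `repetition_penalty`, which rescale logits rather than truncate the distribution. The sketch below shows both on hypothetical logits: the repetition penalty makes already-generated tokens less attractive (positive logits are divided by the penalty, negative ones multiplied), and the temperature divides all logits before the softmax. Both helper names are assumptions for illustration.

```python
import math

def apply_repetition_penalty(logits, previous_token_ids, penalty):
    """Lower the logits of tokens that were already generated."""
    penalised = list(logits)
    for token_id in set(previous_token_ids):
        score = penalised[token_id]
        # Dividing a positive logit or multiplying a negative one both reduce it
        penalised[token_id] = score * penalty if score < 0 else score / penalty
    return penalised

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature (computed stably)."""
    scaled = [x / temperature for x in logits]
    highest = max(scaled)
    exps = [math.exp(x - highest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.4, -1.2, 0.6]  # hypothetical logits for a 3-token vocabulary
penalised = apply_repetition_penalty(toy_logits, previous_token_ids=[0, 1, 0], penalty=1.2)
sharp = softmax_with_temperature(toy_logits, temperature=0.5)
flat = softmax_with_temperature(toy_logits, temperature=2.0)
print(penalised)
print(sharp[0], flat[0])
```

A temperature below 1 concentrates probability on the top token, while a temperature above 1 spreads it out.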

## Measuring generation diversity
Here we measure how much the texts differ when we generate several times from the same prompt.

In [14]:
import torch
import sacrebleu
import random
import numpy as np

def calculate_average_bleu(generated_texts):
    # Compute the BLEU score for every pair of texts
    pairwise_bleu_scores = []
    for i in range(len(generated_texts)):
        for j in range(i+1, len(generated_texts)):
            hypothesis = generated_texts[i]
            references = [generated_texts[j]]
            bleu = sacrebleu.corpus_bleu([hypothesis], [references])
            pairwise_bleu_scores.append(bleu.score)
            #print(f"BLEU score between generation {i} and {j}: {bleu.score:.2f}")

    # Average pairwise BLEU, used as a diversity measure
    average_bleu = sum(pairwise_bleu_scores) / len(pairwise_bleu_scores)
    return average_bleu

In [15]:
prompt = "Aussois is"  # Prompt

# Encode the prompt
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)

# Generate several texts
num_samples = 5  # Number of texts
max_new_tokens = 50  # Number of tokens to generate
generated_texts = []

# Generation loop
for i in range(num_samples):
    # Optional: set a seed to reproduce the same generations on every run
    # torch.manual_seed(i)
    # np.random.seed(i)
    # random.seed(i)

    # Generate text with different parameters
    output = model.generate(
        **inputs,
        num_beams=5,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        #top_k=40,       
        temperature=0.7,
        #top_p=0,
        #eta_cutoff=0,
        #typical_p=0.9,
        #repetition_penalty=1,
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode the tokens
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    generated_texts.append(generated_text)
    print(f"Sample {i}:\n{generated_text}\n")

avg = calculate_average_bleu(generated_texts)
print(f"Average BLEU score: {avg}")

Sample 0:
Aussois is one of the most well-known and well-researched and well-researched and well-researched and well-researched and well-researched and well-researched and

Sample 1:
Aussois is a member of the European Parliament and a member of the Council of Europe. He is also a member of the European Parliament and a member of the Council of Europe. He is also a member of the European Parliament and a member of the Council of Europe

Sample 2:
Aussois is one of the most well-known and well-known names in the world. He is also one of the most well-known and well-known names in the world. He is also one of the most well-known and well-known names

Sample 3:
Aussois is one of the most well-known and well-known names in the history of the world. He is one of the most well-known and well-known names in the history of the world. He is one of the most well-known and

Sample 4:
Aussois is one of the most well-known and well-researched and well-researched people in the world. He is one of th

BLEU measures how similar the sentences are to one another: the lower the average pairwise score, the more diverse the generations.

## Beam-search curse
Beyond a certain size, increasing the number of beams degrades performance.

In [16]:
# Generate several texts for each beam count
num_samples = 5  # Number of texts
max_new_tokens = 50  # Number of tokens to generate

nb_beams = [1]  # Also try: [1, 5, 10, 15, 20, 25, 30]

for j in nb_beams:
    # Start from an empty list so each beam count is scored on its own samples
    generated_texts = []
    # Generation loop
    for i in range(num_samples):
        # Optional: set a seed to reproduce the same generations on every run
        # torch.manual_seed(i)
        # np.random.seed(i)
        # random.seed(i)

        # Generate text with different parameters
        output = model.generate(
            **inputs,
            num_beams=j,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            #top_k=40,       
            temperature=0.7,
            #top_p=0,
            #eta_cutoff=0,
            #typical_p=0.9,
            #repetition_penalty=1,
            pad_token_id=tokenizer.eos_token_id
        )
        # Decode the tokens
        generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
        generated_texts.append(generated_text)
        #print(f"Sample {i}:\n{generated_text}\n")
    avg = calculate_average_bleu(generated_texts)
    print(f"Average BLEU score for {j} beam(s): {avg:.5f}")
    

Average BLEU score for 1 beam(s): 3.31441


With the larger beam counts enabled, the average BLEU score rises as the number of beams grows, which means the generations become more and more similar to one another (only the 1-beam run is shown above).

## Beam search stopping conditions

# Minimum Bayes Risk Decoding
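Minimum Bayes Risk decoding picks, among a set of candidates, the one with the highest average utility against a set of sampled outputs that stand in for the model's distribution. The sketch below shows the selection step on toy strings. The word-overlap utility `unigram_overlap` is a deliberately simple stand-in for BLEU, and both helper names are assumptions for illustration.

```python
def unigram_overlap(hypothesis, reference):
    """Toy utility: Jaccard overlap between the word sets of two texts."""
    hyp, ref = set(hypothesis.split()), set(reference.split())
    union = hyp | ref
    return len(hyp & ref) / len(union) if union else 0.0

def mbr_select(candidates, samples, utility):
    """Return the candidate with the highest average utility against the samples."""
    expected = [
        sum(utility(candidate, sample) for sample in samples) / len(samples)
        for candidate in candidates
    ]
    best_index = max(range(len(candidates)), key=lambda i: expected[i])
    return candidates[best_index], expected[best_index]

candidates = ["the cat sat", "a dog ran", "the cat ran"]
samples = ["the cat ran home", "the cat ran away", "the cat slept"]
best, score = mbr_select(candidates, samples, unigram_overlap)
print(best, round(score, 3))
```

The selected candidate is the one that agrees most with the samples on average, not necessarily the one the model scores as most probable.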

In [17]:
# Optional: load a different model and tokenizer here
# model =
# tokenizer =

prompt = "Aussois is"
# Encode the prompt
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)

# Generate the candidate set (the hypotheses to choose among)
nb_best = 5
best_set = []
for i in range(nb_best):
    output = model.generate(
        **inputs,
        # Add generation arguments here (with the defaults, decoding is greedy and all candidates are identical)
        pad_token_id=tokenizer.eos_token_id
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    best_set.append(generated_text)

# Generate a diverse set of samples (the Monte Carlo set, used as pseudo-references)
nb_divers = 30
monte_carlo_set = []
for i in range(nb_divers):
    output = model.generate(
        **inputs,
        do_sample=True,
        # Add generation arguments here
        pad_token_id=tokenizer.eos_token_id
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    monte_carlo_set.append(generated_text)



In [18]:
import sacrebleu

# Find the candidate that best summarises the sampled texts according to BLEU
def select_best_candidate(best_set, monte_carlo_set):
    scores = []
    for candidate in best_set:
        pairwise_bleu_scores = []
        for reference_text in monte_carlo_set:
            bleu = sacrebleu.corpus_bleu([candidate], [[reference_text]])
            pairwise_bleu_scores.append(bleu.score)

        # Average BLEU for this candidate
        average_bleu = sum(pairwise_bleu_scores) / len(pairwise_bleu_scores)
        scores.append(average_bleu)

    # Select the candidate with the highest average score, i.e. the one closest to all the samples
    best_index = max(range(len(scores)), key=lambda idx: scores[idx])

    return best_set[best_index], scores[best_index]


In [19]:
candidat, score = select_best_candidate(best_set, monte_carlo_set)
print(f"The best sentence is:\n{candidat} \nwith an average BLEU of {score:.5f}")

The best sentence is:
Aussois is a member of the German National Socialist Party (Bundeswehr), 
with an average BLEU of 11.13099


# MCTS
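Monte Carlo Tree Search has to balance exploiting continuations that already scored well against exploring ones that were rarely visited, which is the role of an exploration constant such as the `c` used later in this section. The sketch below shows one common selection rule (PUCT) on hypothetical node statistics. It is an assumption about how such a rule typically looks, not a description of what `mcts_script` actually implements; `puct_score`, `select_child` and the `children` table are illustrative names.

```python
import math

def puct_score(mean_value, prior, parent_visits, child_visits, c):
    """Mean value plus an exploration bonus that shrinks as the child is visited."""
    exploration = c * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return mean_value + exploration

def select_child(children, c):
    """Pick the child token with the highest PUCT score."""
    parent_visits = sum(child["visits"] for child in children.values())
    return max(
        children,
        key=lambda name: puct_score(
            children[name]["mean_value"],
            children[name]["prior"],
            parent_visits,
            children[name]["visits"],
            c,
        ),
    )

# Hypothetical statistics for two candidate next tokens
children = {
    "great": {"mean_value": 0.8, "prior": 0.3, "visits": 20},
    "fine": {"mean_value": 0.6, "prior": 0.5, "visits": 2},
}
print(select_child(children, c=0.1))
print(select_child(children, c=1.0))
```

With a small constant the well-scored, often-visited token is chosen again; with a larger constant the rarely visited token gets its turn.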

In [None]:
from transformers import AutoModelForSequenceClassification

# Load the model used to score our texts
# It is a simple sentiment classifier that gives each text a positive and a negative score, each between 0 and 1
scoring_model_name = "siebert/sentiment-roberta-large-english"
scoring_tokenizer = AutoTokenizer.from_pretrained(scoring_model_name)
scoring_model = AutoModelForSequenceClassification.from_pretrained(scoring_model_name)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

scoring_model.to(device)
scoring_model.eval()

def get_discriminator_score(texts):
    """
    Compute the sentiment (positive or negative) of each text
    """
    inputs = scoring_tokenizer(texts, return_tensors="pt", truncation=True, padding=True).to(device)
    
    with torch.no_grad():
        outputs = scoring_model(**inputs)
    
    # Apply softmax to get probabilities
    probabilities = torch.softmax(outputs.logits, dim=1)
    
    positive_scores = probabilities[:, 1]  # Index 1 is the "positive" score
    negative_scores = probabilities[:, 0]  # Index 0 is the "negative" score

    # Choose whether you want positive or negative text by commenting out the unused line
    scores = positive_scores               # We want the generated text to be rather positive
    #scores = negative_scores               # We want the generated text to be rather negative
    
    return scores

# Example
text = "This movie was terrible"
score = get_discriminator_score([text])
print(f"Sentiment score: {score[0].item():.4f}") 

Sentiment score: 0.0005


In [33]:
from mcts_script import main

# Arguments
c = 1.0              # Exploration constant: higher = more exploration of the least visited nodes
alpha = 1.0          # Priority given to the model's probabilities versus the score we return
temperature = 0.7    # Temperature used during generation
penalty = 1.1        # Repetition penalty
num_it = 50          # Number of MCTS iterations per token
prompt_text = "This movie was"


# Call the main function with parameters
main(c, alpha, temperature, penalty, num_it, get_discriminator_score, prompt_text)

Generating text: 100%|██████████| 50/50 [04:10<00:00,  5.01s/it]

Generated text:
This movie was a huge hit. I think it's one of the best movies ever made, and that is why we are so proud to be able do this film again."
"I'm really excited about what you're doing with The Hobbit: An Unexpected





# Perplexity calculations

In [22]:
import torch
from tqdm import tqdm
import torch.nn.functional as F

# Method 1: score every token with the full preceding context
def compute_token_by_token_ppl(model, encodings):
    input_ids = encodings.input_ids
    seq_len = input_ids.size(1)
    if seq_len > model.config.n_positions:
        return "This text is too long for the context window of the model"
    with torch.no_grad():
        outputs = model(input_ids, labels=input_ids)
    
    logits = outputs.logits  # shape: [batch_size, seq_length, vocab_size]
    
    # Shift so that the logits at position i-1 predict token i
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()

    log_probs = F.log_softmax(shift_logits, dim=-1)
    # Gather the log-probability assigned to each token that actually occurs
    token_log_probs = log_probs.gather(-1, shift_labels.unsqueeze(-1)).squeeze(-1)
    
    average_nll = -token_log_probs.mean() 
    token_by_token_ppl = torch.exp(average_nll)
    return token_by_token_ppl.item()

In [23]:
def compute_context_window_ppl(model, encodings, window_size):
    input_ids = encodings.input_ids
    seq_len = input_ids.size(1)
    max_length = window_size

    window_losses = []
    for start_idx in range(0, seq_len, max_length):
        end_idx = min(start_idx + max_length, seq_len)
        window_input_ids = input_ids[:, start_idx:end_idx]

        with torch.no_grad():
            outputs = model(window_input_ids, labels=window_input_ids)
        
        # outputs.loss is the mean NLL over the tokens predicted in the window
        window_loss = outputs.loss
        window_losses.append(window_loss)

    # Unweighted mean over windows: a shorter final window counts as much as a full one
    average_loss = torch.stack(window_losses).mean()
    context_window_ppl = torch.exp(average_loss)
    return context_window_ppl.item()

In [24]:
def compute_perplexity_with_half_window_context(model, encodings, window_size):
    half_window = window_size // 2

    input_ids = encodings.input_ids
    seq_len = input_ids.size(1)

    nlls = []
    start_positions = range(0, seq_len, half_window)
    for start_pos in tqdm(start_positions):
        end_pos = start_pos + window_size
        # Not enough tokens left for a full window: stop here.
        # (Alternatively, set end_pos = seq_len instead of breaking to score the remaining tokens.)
        if end_pos > seq_len:
            break

        # Select the window
        window_input_ids = input_ids[:, start_pos:end_pos]

        # Only the second half of the window is scored; the first half serves as context
        target_ids = window_input_ids.clone()
        # Mask the first half: -100 is the label value ignored by the loss
        target_ids[:, :half_window] = -100

        with torch.no_grad():
            outputs = model(window_input_ids, labels=target_ids)
            neg_log_likelihood = outputs.loss
            nlls.append(neg_log_likelihood)
        
        if end_pos == seq_len:
            break

    # Average the NLLs over the windows, then exponentiate to get the perplexity
    average_nll = torch.stack(nlls).mean()
    ppl = torch.exp(average_nll)
    return ppl.item()

In [25]:
text = "On a crisp autumn morning, Eleanor stepped outside her small cottage at the edge of the forest and began walking down a narrow path lined with amber and russet leaves. The world had changed colors overnight, as if an unseen painter had swept across the landscape with a palette of warm hues. She breathed in the scent of damp earth, distant pine, and the faint aroma of woodsmoke drifting from a neighbor’s chimney. Although the air carried a slight chill, the day promised gentle sunshine and light winds, perfect for a long and thoughtful stroll. As she continued along the winding trail, Eleanor recalled the stories her grandfather had told her when she was a child—tales of hidden groves and forgotten clearings deep in the woods, where ancient trees whispered secrets to one another. He had insisted that if one listened carefully enough, the wind through the branches carried voices from centuries past. She had never been entirely sure whether to believe him, but as she walked, she allowed herself the luxury of imagining these old legends might hold some truth. There was comfort in thinking that one’s ancestors might leave behind faint echoes, lingering in the quiet corners of nature. About half a mile in, Eleanor reached a small stream. Its waters, clear and cold, danced over smooth stones, creating a soft, melodic murmur. She paused to watch leaves float downstream, each one a tiny vessel drifting toward unknown destinations. The sunlight, filtering through the half-bare branches, reflected in bright spots off the ripples, reminding her that beauty often lay in small, transient details. She took her time before moving on, feeling as if each moment was a gift she shouldn’t rush. Not long after crossing the stream via a makeshift wooden plank, she came across a clearing she did not recognize. It was shaped like an oval and ringed with young birch trees whose bark gleamed pale against the darker backdrop of firs and oaks. 
In the center stood a solitary stone bench, weathered and covered in moss, as though it had been placed there by someone who valued solitude. Eleanor brushed off some of the moss and sat, resting her legs and taking in the silent theater around her. No birds chirped, no squirrels chattered. It was as if this spot had declared itself a sanctuary from noise. While resting, she thought about the countless times she had ventured outdoors in search of nothing in particular: just the quiet companionship of trees, the patient passage of clouds, and the sound of her own footsteps on the trail. She considered how, in these quiet moments, she often found a clarity that eluded her in the hustle of daily life. Back at her cottage, there were chores to be done, letters to answer, and errands waiting. Out here, these demands receded, leaving room for the warmth of memories and the subtle interplay of time and stillness. Refreshed by the pause, Eleanor stood and continued forward, leaving the clearing behind. Eventually, the path widened, and she found herself walking alongside an old stone wall, partially collapsed in places. Vines and moss had claimed it, weaving new textures and patterns that hinted at the slow, persistent artistry of nature over generations. She liked to imagine who might have built the wall—farmers long ago clearing land, or perhaps villagers marking a boundary. The wall, now broken and quiet, bore silent witness to a past she could only guess at. As noon approached, the sunlight grew warmer, and the forest seemed to awaken. A distant woodpecker tapped at a trunk, small birds fluttered between branches, and a gentle breeze carried the distant laughter of someone working in a nearby orchard. Eleanor knew there was a village not far beyond the forest’s border, where life continued its pleasant rhythm: apples harvested, bread baked, stories swapped over steaming cups of tea. Soon, she would turn back toward her cottage, but not just yet. 
The day still had hours left to unfold. After another half-hour of walking, she reached a hillside that offered a view of rolling fields beyond the trees. Patches of farmland, dotted with hay bales, stretched toward a line of distant hills. A single hawk circled overhead, its keen eyes scanning the ground below for a quick meal. Eleanor watched the hawk’s flight, feeling a quiet admiration for its independence and grace. When she finally turned around to retrace her steps home, she found the forest just as welcoming as before. The path felt familiar but not stale; rather, it was like greeting an old friend. She noticed details she had missed earlier—a cluster of mushrooms at the base of a beech tree, the gentle slope of the trail as it curved around a thicket. Each step brought her closer to the cottage, and with it the ordinary tasks of her life, but she carried within her a renewed sense of peace. By the time Eleanor stepped through her front door, the golden afternoon light had begun to angle across the floorboards. She placed a kettle on the stove, choosing a fragrant herbal blend for her tea. Waiting for the water to boil, she looked out the window at the edge of the forest, grateful that she had taken the time to wander among the trees. There was a quiet magic in that forest—subtle, unassuming, yet undeniably present. It asked for nothing and offered everything: a calm place to think, to listen, to remember. And tomorrow, or perhaps the day after, she might return to discover something new, or to simply be, wrapped in the gentle hush of autumn leaves and whispered histories."

inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False)

Token indices sequence length is longer than the specified maximum sequence length for this model (1147 > 1024). Running this sequence through the model will result in indexing errors


In [26]:
compute_token_by_token_ppl(model, inputs)

'This text is too long for the context window of the model'

In [27]:
compute_perplexity_with_half_window_context(model, inputs, window_size=512)

 60%|██████    | 3/5 [00:00<00:00,  4.44it/s]


31.677274703979492

In [28]:
compute_context_window_ppl(model, inputs, window_size=512)

31.246902465820312

The perplexity values differ because each method computes them differently: how much context each token sees, and how the per-window losses are averaged. On very long texts, these differences can become very large.
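One source of discrepancy is easy to reproduce without a model. Perplexity is the exponential of the mean negative log-likelihood per token, so averaging per-window means (as `compute_context_window_ppl` does) differs from averaging over all tokens whenever the windows have unequal lengths. The sketch below uses hypothetical per-token NLL values; the helper names are assumptions for illustration.

```python
import math

def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods: exp of their mean."""
    return math.exp(sum(token_nlls) / len(token_nlls))

def perplexity_mean_of_window_means(windows):
    """Average the mean NLL of each window first, then exponentiate."""
    window_means = [sum(window) / len(window) for window in windows]
    return math.exp(sum(window_means) / len(window_means))

# Hypothetical NLLs: one full window of four tokens, one short window of one token
windows = [[2.0, 2.0, 2.0, 2.0], [4.0]]
all_tokens = [nll for window in windows for nll in window]

token_weighted = perplexity(all_tokens)                      # exp(12 / 5)
window_averaged = perplexity_mean_of_window_means(windows)   # exp((2 + 4) / 2)
print(token_weighted, window_averaged)
```

The short window has the same weight as the full one in the second estimate, which inflates the perplexity when that window happens to be harder to predict.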