It seems to work at the sampling phase, which helps reduce the repetitiveness problem.
Here is a description of repeat_penalty from Kimi Chat:
The function of repeat_penalty is to adjust the model's sensitivity to tokens that have already been generated when creating new tokens. By increasing the value of repeat_penalty, the model is encouraged to avoid reusing recently generated words, thereby increasing the diversity and novelty of the text. Conversely, reducing the value of repeat_penalty lessens the penalty for repetition, which may result in more repetitive content in the generated text.
Here's a simplified explanation of how the repetition penalty works:
When the model generates text, it calculates, for each word or token in its vocabulary, the probability of it being the next one in the sequence.
If the repetition penalty is greater than 1, the probabilities of tokens that have already appeared in the current output are divided by this penalty factor, effectively reducing their chances of being selected again.
The model then samples the next token from the adjusted distribution of probabilities, taking the repetition penalty into account.
import numpy as np

def apply_repetition_penalty(probabilities, token_sequence, penalty=1.2):
    """
    Apply a repetition penalty to the probabilities of tokens.

    :param probabilities: numpy array of original probabilities for each token
    :param token_sequence: list of tokens already generated
    :param penalty: repetition penalty factor (default 1.2)
    :return: adjusted probabilities
    """
    # A penalty of 1 leaves the distribution unchanged
    if penalty == 1:
        return probabilities
    probabilities = probabilities.copy()  # avoid mutating the caller's array
    # Divide the probability of every previously generated token by the penalty
    for token in set(token_sequence):
        token_idx = token_to_index[token]  # token_to_index maps tokens to vocabulary indices (defined below)
        probabilities[token_idx] /= penalty
    # Re-normalize so the probabilities sum to 1 again
    probabilities /= probabilities.sum()
    return probabilities
# Example usage
original_probabilities = np.array([0.1, 0.2, 0.3, 0.4])  # probabilities for 4 hypothetical tokens
generated_sequence = ['token2', 'token3']                # tokens generated so far
token_to_index = {'token1': 0, 'token2': 1, 'token3': 2, 'token4': 3}  # token -> vocabulary index
index_to_token = {idx: tok for tok, idx in token_to_index.items()}     # inverse mapping

# Apply the repetition penalty
adjusted_probabilities = apply_repetition_penalty(original_probabilities, generated_sequence, penalty=1.2)

# Sample the next token from the adjusted distribution
next_token_index = np.random.choice(len(adjusted_probabilities), p=adjusted_probabilities)
next_token = index_to_token[next_token_index]
print(f"Next token: {next_token}")
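Worth noting: the snippet above penalizes normalized probabilities for clarity, but most real implementations (llama.cpp, the CTRL paper, Hugging Face's RepetitionPenaltyLogitsProcessor) apply the penalty to the raw logits before softmax. Since logits can be negative, dividing a negative logit by a penalty > 1 would actually *raise* its probability; so positive logits are divided by the penalty and negative logits are multiplied by it. A minimal sketch of that logit-space variant (the function name and example values are my own, not from any particular library):

```python
import numpy as np

def apply_repetition_penalty_logits(logits, seen_indices, penalty=1.2):
    """Penalize the logits of previously generated token indices.

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a penalty > 1 always lowers the token's score.
    """
    logits = logits.copy()  # avoid mutating the caller's array
    for idx in set(seen_indices):
        if logits[idx] > 0:
            logits[idx] /= penalty
        else:
            logits[idx] *= penalty
    return logits

# Softmax over the penalized logits gives the sampling distribution
logits = np.array([2.0, -1.0, 0.5, 1.5])       # hypothetical raw logits
penalized = apply_repetition_penalty_logits(logits, [0, 1], penalty=1.2)
probs = np.exp(penalized) / np.exp(penalized).sum()
```

Both seen tokens end up less likely: the positive logit 2.0 drops to 2.0/1.2, and the negative logit -1.0 drops further to -1.2, which the naive divide-probabilities version would not get right for negative logits.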
related links: https://www.reddit.com/r/LocalLLaMA/comments/15s7ln1/potential_fix_to_the_repetitiveness_problem_of/