
bad_words_ids not working #17504

Closed
Jack000 opened this issue Jun 1, 2022 · 10 comments

Jack000 commented Jun 1, 2022

Feature request

I'm using gpt2 for text generation with a word blacklist and noticed that some words on the blacklist were still being generated.

I found that even though the word ["badword"] would not be generated, it would still generate ["bad", "word"] in two tokens.

An example of this is [11908] versus [7286, 1754].

This seems to be a different issue from the leading-space issue and the padding issue. I think I could get around it by adding the split tokens to the blacklist, but I can't seem to get the tokenizer to split the string to produce [7286, 1754]. Is there a way to get all possible permutations of a string to add to the blacklist?

Motivation

Without this feature, bad_words_ids basically doesn't work most of the time.

Your contribution

Not familiar with the tokenizer code, unfortunately.


Jack000 commented Jun 1, 2022

I wrote a function to enumerate all possible permutations of " Badword", but it quickly blows up into hundreds of permutations like [" B","a","d","w","o","r","d"]. Limiting the token length works OK, but still doesn't prevent generation of variations like [" Bad","words"].

I think this overall approach just doesn't really work for preventing the generation of bad words. I don't know whether there's a better solution than generate-then-filter.

def get_bad_words_ids(tokenizer, bad_words, min_strlen=2):
    """Enumerate every token sequence whose decoded string equals a bad word."""
    # Map each vocabulary entry's decoded string form back to the raw token.
    vocab = {}
    for token in tokenizer.get_vocab():
        vocab[tokenizer.convert_tokens_to_string([token])] = token

    results = []

    for bad_word in bad_words:
        confirmed_tokens = []  # complete decompositions of bad_word
        possible_tokens = []   # partial decompositions (strict prefixes)
        for token in vocab:
            if bad_word == token:
                confirmed_tokens.append([token])
            elif bad_word.startswith(token):
                possible_tokens.append([token])
        # Breadth-first extension of each prefix until it completes or dead-ends.
        while len(possible_tokens) > 0:
            new_possible_tokens = []
            for prefixes in possible_tokens:
                prefix = ''.join(prefixes)
                for token in vocab:
                    # Skip very short tokens to limit the combinatorial blow-up.
                    if len(token) < min_strlen:
                        continue
                    if bad_word == prefix + token:
                        confirmed_tokens.append(prefixes + [token])
                    elif bad_word.startswith(prefix + token):
                        new_possible_tokens.append(prefixes + [token])
            possible_tokens = new_possible_tokens
        results += confirmed_tokens

    # Convert each string decomposition back to token ids.
    ids = []
    for tokens in results:
        gtokens = [vocab[token] for token in tokens]
        ids.append(tokenizer.convert_tokens_to_ids(gtokens))
    return ids
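The enumeration idea above can be shown in isolation with a toy vocabulary of decoded string pieces standing in for tokenizer output (the vocabulary here is made up; the real function iterates over tokenizer.get_vocab()):

```python
# Toy illustration: find every way to segment a target word into vocabulary
# pieces. Each result corresponds to one token sequence the model could use
# to spell out the word.

def segmentations(word, vocab):
    """Return every list of vocab pieces whose concatenation equals word."""
    if word == "":
        return [[]]
    results = []
    for piece in vocab:
        if piece and word.startswith(piece):
            for rest in segmentations(word[len(piece):], vocab):
                results.append([piece] + rest)
    return results

toy_vocab = {"bad", "word", "b", "ad", "w", "ord", "badword"}
splits = sorted(segmentations("badword", toy_vocab))
# ["b","ad","w","ord"], ["b","ad","word"], ["bad","w","ord"],
# ["bad","word"], ["badword"] — five decompositions in total
```

Even this tiny vocabulary yields five decompositions, which illustrates why the count explodes with a real 50k-token vocabulary.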


gante commented Jun 1, 2022

Hey @Jack000 👋 It is not clear from your description -- have you tried using the tokenizer with the instructions given in the NoBadWordsLogitsProcessor docs?

["...in order to get the token ids of the words that should not appear in the generated text, use tokenizer(bad_words, add_prefix_space=True, add_special_tokens=False).input_ids."]


Jack000 commented Jun 1, 2022

That's what I did. This consistently tokenizes [" Badword"] as [11908], but during inference the model will still generate [7286, 1754], which is [" Bad", "word"].

As I mentioned above, I wrote a function to enumerate all possible ways of combining tokens to form "Badword", but the problem is that it doesn't cover variations like "Badwords" and "Badwordo". Extending the permutations to include these variations results in thousands of permutations per bad word, which doesn't really scale.


gante commented Jun 1, 2022

Okay, I think I got your issue :) When you add a word to bad_words_ids, you would like to have its sub-words and/or related words banned as well, correct?

There are a few things worth mentioning here:

  1. It is intentional that sub-words do NOT get banned. Think about the word "doctorate", which is very different from two of its subwords ("doctor" and "ate"). Banning a word doesn't imply banning its subwords in most scenarios, and our implementation has to be flexible in that regard.
  2. When a long word gets broken into more than one token, the first token has a prefix space and will be different from the corresponding token without the space. This is to avoid banning valid sequences that contain the same characters. Example: if you ban "doctorate", "doctor ate" is still a valid sequence, because the banned tokens will be " doctor" and "ate", not " doctor" and " ate" (notice the spaces).
  3. Banned tokens resulting from a long word are never considered in isolation. Example: if you ban "doctorate", you can still generate " doctor" and "ate" separately; "the doctor wants to dictate" is a valid sequence.
  4. I've tried running the "Badword" example you mentioned, and I do get two tokens (one for " Bad", the other for "word").

You can see an example for a few cases mentioned above here.

The solution for banning subwords is to explicitly add them to the list of bad_words_ids. @patrickvonplaten have you seen tools to generate sub-words and/or derived words from a list of candidate words?
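One simple way to feed such a list is to expand each banned word into its obvious surface variants before tokenizing. The helper below is hypothetical (not part of transformers) and only covers casing and a plural "s"; each returned variant would then be tokenized (e.g. with add_prefix_space=True, per the docs quoted above) and appended to bad_words_ids:

```python
# Hypothetical helper: expand a banned word into simple surface variants
# (casing and plural "s") so each variant can be tokenized separately.

def expand_variants(word):
    """Return sorted casing/plural variants of word, including word itself."""
    base = {word, word.lower(), word.capitalize()}
    return sorted(base | {w + "s" for w in base})

variants = expand_variants("Badword")
# ['Badword', 'Badwords', 'badword', 'badwords']
```

This stays linear in the number of variants, unlike enumerating all token-level permutations, but it of course only blocks the variants you thought to generate.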


Jack000 commented Jun 1, 2022

Ah, the actual bad word I was trying to ban was [" Hitler"].

I do understand how the bad_words_ids feature works, but my issue is that I don't want the word "Hitler" generated under any circumstances, subwords or otherwise. As you can see, I did implement a function to enumerate all possible ways tokens can be combined to form "Hitler" and add them to bad_words_ids, but if I include "Hitlers" and other such variations, the permutations number in the thousands.

Anyway, I don't see a simple solution to this, but the function I wrote, combined with filtering afterwards, works OK for now.


gante commented Jun 1, 2022

I do understand how the bad_words_ids feature works

My apologies :D Better safe than sorry, in case there was some confusion about the intended behavior.

@patrickvonplaten

@patil-suraj could you maybe also take a look here? Otherwise happy to dive deeper if necessary

@patrickvonplaten

Sorry, could I ping @ArthurZucker or @gante on this one maybe? :-)

@github-actions github-actions bot closed this as completed Nov 2, 2022
@huggingface huggingface deleted a comment from github-actions bot Nov 2, 2022
@ArthurZucker

Hey! I looked at the problem a bit, and as you mentioned, the permutations would be too problematic.

We can probably work this out by banning a normalized string instead. Rather than checking whether [Bad_id, Word_id] was generated, we could decode the generated ids to a string, normalize it, and remove the bad word. This is more efficient, but it might not have its place in the generate function, as the tokenizer is not available there. It probably makes sense to have a custom logits processor that is initialized with the tokenizer. Let me ask around 🤗
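A minimal sketch of that string-level check might look like the following. A real implementation would live in a custom logits processor holding the tokenizer; here a stub decode function and made-up vocabulary stand in for it, and all names are hypothetical:

```python
# Decode a candidate token sequence, normalize the text, and scan it for
# banned words — catching any tokenization of the word, not just one.
import unicodedata

def contains_bad_word(ids, decode, bad_words):
    """True if the decoded, normalized sequence contains any banned word."""
    text = unicodedata.normalize("NFKC", decode(ids)).lower()
    return any(bad.lower() in text for bad in bad_words)

# Stub decode: maps ids straight to string pieces, like tokenizer.decode would.
stub_vocab = {0: " Bad", 1: "word", 2: " hello"}
decode = lambda ids: "".join(stub_vocab[i] for i in ids)

contains_bad_word([0, 1], decode, ["Badword"])  # True  — " Badword" matches
contains_bad_word([2], decode, ["Badword"])     # False — " hello" is clean
```

Because the check operates on the decoded string, [" Bad", "word"], [" Badword"], and any other token split all hit the same substring match, which is exactly what the permutation-enumeration approach was trying to achieve.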

@ArthurZucker ArthurZucker reopened this Nov 3, 2022
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this as completed Dec 5, 2022