
Support returning raw logits in generate #17521

Closed
shijie-wu opened this issue Jun 2, 2022 · 19 comments

Comments

@shijie-wu
Contributor

Feature request

Support returning raw logits in generate by either:

  1. creating a new arg that enables returning the raw logits
  2. or supporting a callback that allows users to collect the raw logits

Motivation

  • Raw logits "would be the most understandable & consistent across generation methods" (@patrickvonplaten)
  • For testing, returning raw logits would help "identify which parts get wrong if any test failure occurs" (@ydshieh)
  • There's concern about "rampant too many options" (@Narsil), so I would prefer the second option for supporting this feature.
  • However, the second option still needs a code change to support it, as the user-provided logits_processor is appended to a new instance of LogitsProcessorList. As a result, users cannot get the raw logits with the current implementation, even with a custom LogitsProcessor.

See further discussion in #17424

Your contribution

I could open a PR to reorder how the user-provided logits_processor is merged with the predefined LogitsProcessorList.
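
For reference, a minimal sketch (not part of the original proposal; the class name is illustrative) of what the callback route would look like with a score-capturing LogitsProcessor. Under the current merge order it only sees already-processed scores, not the raw logits:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessor, LogitsProcessorList

class CaptureScoresProcessor(LogitsProcessor):
    """Records every score tensor it sees and passes it through unchanged."""
    def __init__(self):
        self.captured = []

    def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
        self.captured.append(scores.detach().clone())
        return scores

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
capture = CaptureScoresProcessor()
inputs = tokenizer("Hello", return_tensors="pt")
model.generate(**inputs, max_new_tokens=3, logits_processor=LogitsProcessorList([capture]))
# capture.captured holds one (batch, vocab_size) tensor per generated step; because user-supplied
# processors are appended after the built-in ones, these are not guaranteed to be the raw logits.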

@patrickvonplaten
Contributor

cc @patil-suraj @gante as well

@patrickvonplaten
Contributor

I'm personally fine with adding an output_logits flag to generate; since it already has 50+ flags, one more won't make a difference, and it's a useful feature indeed. What do you think @patil-suraj @gante?

@gante
Member

gante commented Jun 2, 2022

I'm cool with it 👍 (and it might be interesting to use as part of PT-TF cross tests)

@patrickvonplaten
Contributor

@patil-suraj what do you think? Do you want to open a PR to work on it?

@ydshieh
Collaborator

ydshieh commented Jun 3, 2022

@patil-suraj what do you think? Do you want to open a PR to work on it?

@shijie-wu seems willing to open a PR, as mentioned at the end of the issue description.

@shijie-wu
Contributor Author

I could open a PR for this.

@patil-suraj
Contributor

I'm okay with this, let me know if you need any help @shijie-wu :)

@patrickvonplaten
Contributor

Cool thanks for taking care of it @shijie-wu

@huggingface huggingface deleted a comment from github-actions bot Jul 4, 2022
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot closed this as completed Aug 5, 2022
@lxianl455

So, is there any work on this? I did not find a new feature for getting the raw logits.

@gante
Member

gante commented Oct 23, 2022

I don't think so -- gently pinging @shijie-wu, who manifested interest in opening a PR :)

@shijie-wu
Contributor Author

Sorry about the delay! I will resume working on it in the coming week.

@xkianteb

Gently pinging @shijie-wu --- any updates on this?

@xkianteb

@gante should I open a PR? I think the change is fairly minor.

@gante
Member

gante commented Mar 26, 2023

@xkianteb sounds good 👍

@khalidsaifullaah

Is there any update on this...?

@gante
Member

gante commented Aug 3, 2023

None that I know of. Open to contributors :)

@vwxyzjn
Contributor

vwxyzjn commented Feb 1, 2024

Hey, for folks running into this issue: I have a snippet that already gets the raw logits. Probably related to your quest as well, @xkianteb. It's for RLHF PPO, so you don't have to do another forward pass to get the logprobs.

import torch
import transformers
import torch.nn.functional as F

tokenizer = transformers.AutoTokenizer.from_pretrained("gpt2", padding_side="right")
tokenizer.add_special_tokens({"pad_token": "[PAD]"})
pad_id = tokenizer.pad_token_id
policy = transformers.AutoModelForCausalLM.from_pretrained("gpt2")
policy.generation_config.pad_token_id = policy.generation_config.eos_token_id

query = torch.tensor([
    [pad_id, pad_id, 23073],
    [pad_id, pad_id, 234],
])
temperature = 0.7
context_length = query.shape[1]

def forward(model, query_responses, tokenizer):
    attention_mask = query_responses != tokenizer.pad_token_id
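    # position ids start at 0 on the first non-pad token, so left padding does not shift the real positions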
    position_ids = attention_mask.cumsum(1) - attention_mask.long()
    input_ids = torch.masked_fill(query_responses, ~attention_mask, 0)
    return model(
        input_ids=input_ids,
        attention_mask=attention_mask,
        position_ids=position_ids,
        return_dict=True,
        output_hidden_states=True,
    )

def generate_and_return_logits(lm_backbone, queries, tokenizer, generation_config):
    """generate in a way that does not affect padding tokens"""
    context_length = queries.shape[1]
    attention_mask = queries != tokenizer.pad_token_id
    input_ids = torch.masked_fill(queries, ~attention_mask, 0)
    output = lm_backbone.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,
        # position_ids=attention_mask.cumsum(1) - attention_mask.long(), # already handled in generation
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True
    )
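    # output.scores holds one (batch, vocab_size) tensor per generated token; these are the
    # post-processing scores (essentially logits / temperature here), not the raw model logits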
    logits = torch.stack(output.scores, 1)
    return torch.cat((queries, output.sequences[:, context_length:]), dim=1), logits

generation_config = transformers.GenerationConfig(
    max_new_tokens=5,
    min_new_tokens=5,
    temperature=temperature,
    top_k=0.0,
    top_p=1.0,
    do_sample=True,
)
query_response, logits = generate_and_return_logits(policy, query, tokenizer, generation_config)
response = query_response[:, context_length:]
all_logprob = F.log_softmax(logits, dim=-1)
logprob = torch.gather(all_logprob, 2, response.unsqueeze(-1)).squeeze(-1)
print(f"{response=}")
print(f"{logprob=}")

output = forward(policy, query_response, tokenizer)
logits = output.logits[:, context_length - 1 : -1]
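# dividing the raw forward-pass logits by temperature reproduces the temperature-scaled scores
# used during sampling, so the logprobs below match the ones printed above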
logits /= temperature
all_logprob = F.log_softmax(logits, dim=-1)
logprob = torch.gather(all_logprob, 2, response.unsqueeze(-1)).squeeze(-1)
print(f"{logprob=}")

Output:

response=tensor([[  198,   198,     3,   399,   532],
        [  198,   198, 48412,  4803, 19321]])
logprob=tensor([[-3.2519e+00, -5.9604e-06, -5.2666e+00, -7.8440e+00, -2.6367e+00],
        [-1.5943e+00, -5.6028e-06, -9.8833e+00, -2.3764e+00, -4.8006e+00]])
logprob=tensor([[-3.2519e+00, -5.9604e-06, -5.2666e+00, -7.8440e+00, -2.6367e+00],
        [-1.5943e+00, -5.6028e-06, -9.8833e+00, -2.3764e+00, -4.8006e+00]],
       grad_fn=<SqueezeBackward1>)

@gante
Member

gante commented Feb 7, 2024

(see #28667)
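
For readers landing here later: #28667 adds a generate flag for this. A minimal usage sketch, assuming the flag landed as output_logits as proposed in this thread (check the PR for the exact name), which returns the unprocessed logits alongside the processed scores:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello world", return_tensors="pt")

out = model.generate(
    **inputs,
    max_new_tokens=5,
    do_sample=False,
    return_dict_in_generate=True,
    output_logits=True,  # raw, unprocessed logits (assumed flag name from #28667)
    output_scores=True,  # processed scores, for comparison
)
# out.logits and out.scores are tuples with one (batch, vocab_size) tensor per generated token
raw_logits = torch.stack(out.logits, dim=1)
processed_scores = torch.stack(out.scores, dim=1)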
