This repository has been archived by the owner on Jul 31, 2024. It is now read-only.

eval_bleu with pretrained gpt model #3

Closed
Gugutse opened this issue Oct 4, 2021 · 5 comments
Labels
question Further information is requested

Comments


Gugutse commented Oct 4, 2021

Hi @wasiahmad,
I'm trying to evaluate a gpt-2 model with your code. Thus, I run run.py with microsoft/CodeGPT-small-py in pretrain_dir parameter and do_infer. In eval_blue script outputs equal to model(inputs)[1] – these are hidden states of pretrained gpt – and it's a tuple of 12 elements (n_layers) consisting of 2 elements each, and these two have [1, 12, 48, 64]. When it goes to this line past_hidden = [x[:, i:i + 1].expand(-1, beam_size, -1, -1, -1) for x in outputs] an error occurs: TypeError: tuple indices must be integers or slices, not tuple – and it also implies that the shape of each element in outputs should have 5 dimensions.
Which corrections should be done in this case?

@wasiahmad
Owner

There is too much information here, so I am unable to understand the problem. I also don't see any eval_bleu script. Please look at the code for the other models.


Gugutse commented Oct 5, 2021

eval_bleu is here: https://github.com/wasiahmad/AVATAR/blob/main/codegpt/run.py

for step, (batch, token_labels) in enumerate(test_dataloader):
    inputs = batch.to(args.device)
    with torch.no_grad():
        beam_size = args.beam_size
        m = torch.nn.LogSoftmax(dim=-1)
        outputs = model(inputs)[1]
        p = []
        zero = torch.cuda.LongTensor(1).fill_(0)
        for i in range(inputs.shape[0]):
            past_hidden = [x[:, i:i + 1].expand(-1, beam_size, -1, -1, -1) for x in outputs]

I would like to evaluate the model, but TypeError: tuple indices must be integers or slices, not tuple is raised on the line past_hidden = [x[:, i:i + 1].expand(-1, beam_size, -1, -1, -1) for x in outputs].
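For context, the TypeError itself is plain Python tuple behavior: each element of outputs is a tuple of two tensors rather than one tensor, and the expression x[:, i:i + 1] builds a (slice, slice) tuple as the index, which tuples do not accept. A minimal illustration (with placeholder strings standing in for the key/value tensors):

```python
# Each element of `outputs` is a (key, value) pair, not a tensor.
layer = ("key_tensor", "value_tensor")

layer[0]    # an integer index works on a tuple
layer[0:1]  # a single slice works too

try:
    layer[:, 0:1]  # x[:, i:i + 1] builds the index (slice(None), slice(0, 1))
except TypeError as e:
    print(e)  # tuple indices must be integers or slices, not tuple
```

A torch.Tensor accepts such multi-dimensional indices, which is why the original code worked when each layer was a single stacked tensor.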

@wasiahmad
Owner

I am not sure. The CodeGPT codebase is supposed to work fine, and I don't know why you are getting this error. I will run CodeGPT training and evaluation again in my environment to check that they work correctly.


Gugutse commented Oct 5, 2021

The problem is that the GPT-2 model's output format changed in newer versions of Hugging Face's transformers library: past_key_values is now a tuple of per-layer (key, value) tensor pairs instead of one stacked tensor per layer, so torch.stack() needs to be applied to each layer's output.
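A minimal sketch of that conversion, using dummy zero tensors with the shapes reported in this issue (batch 1, 12 heads, sequence length 48, head dimension 64, 12 layers) in place of a real model call, and assuming the newer per-layer (key, value) tuple format:

```python
import torch

batch, heads, seq_len, head_dim, n_layers = 1, 12, 48, 64, 12

# Simulated newer-style model(inputs)[1]: a tuple of n_layers elements,
# each a (key, value) pair of [batch, heads, seq_len, head_dim] tensors.
outputs = tuple(
    (torch.zeros(batch, heads, seq_len, head_dim),
     torch.zeros(batch, heads, seq_len, head_dim))
    for _ in range(n_layers)
)

# Stack each (key, value) pair into the single 5-D tensor per layer that
# the beam-search code in run.py expects: [2, batch, heads, seq_len, head_dim].
outputs = [torch.stack(layer) for layer in outputs]

# The original indexing now works: expand the batch slice across the beam.
beam_size, i = 4, 0
past_hidden = [x[:, i:i + 1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
print(past_hidden[0].shape)  # torch.Size([2, 4, 12, 48, 64])
```

In the actual script the same one-line change would go right after outputs = model(inputs)[1]; expand() only creates a view, so the per-beam copies add no extra memory.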

@wasiahmad
Owner

I see, you can modify the code to be compatible with the newer versions of the transformers API. I am closing this issue for now.

@wasiahmad wasiahmad added the question Further information is requested label Dec 3, 2022