Questions about calculating MRR and predicting the code block #7

Open
Zero0one1 opened this issue Sep 2, 2022 · 1 comment

@Zero0one1

From my understanding, the mrr function uses the variable where to filter out tokens that are in memory (e.g., the first half of the tokens for intermediate blocks, as explained in #4), unknown tokens, and padding tokens. Then the _mrr function sorts the top-10 predicted tokens for each position and uses the index of the correctly predicted token at each position to calculate MRR.
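
To check my understanding, here is a minimal sketch of that computation; the name mrr_sketch and the exact tensor shapes are my own assumptions, not code from this repo:

import torch

def mrr_sketch(logits, targets, where, k=10):
    # logits: (num_positions, vocab_size); targets: (num_positions,)
    # where: boolean mask keeping only the positions that count
    # (non-memory, non-UNK, non-padding)
    logits, targets = logits[where], targets[where]
    _, topk = torch.topk(logits, k=k, dim=-1)      # (n, k) candidate indices
    hit = topk == targets.unsqueeze(-1)            # (n, k) boolean matches
    rank = hit.float().argmax(dim=-1) + 1          # 1-based rank of the hit
    rr = torch.where(hit.any(dim=-1),
                     1.0 / rank.float(),
                     torch.zeros_like(rank, dtype=torch.float32))
    return rr.mean().item()                        # 0 if target not in top-k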

My question is: are all positions in this code block predicted at the same time, or are they all predicted only when calculating MRR (i.e., in the _step() function of the LightningModel class)? Let's say this code block is [range(250, 750), 250]; then range(250, 500) is the memory and range(500, 750) is to be predicted. So when we calculate the MRR of this block, is it computed over all the tokens in range(500, 750)?
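
To make the question concrete, this is how I picture the mask for that example (a toy reconstruction of my own, reusing the extended/ids logic from my script below):

import torch

# toy reconstruction of the [range(250, 750), 250] example:
# the block holds 500 tokens and 'extended' = 250 of them are memory
seq_len, extended = 500, 250
ids = torch.arange(seq_len)
where = ids >= extended        # True for local positions 250..499,
                               # i.e. file positions range(500, 750)
print(where.sum().item())      # -> 250 positions enter the MRR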

How does it complete the code block? Does it predict the 501st token based on range(250, 500) and then predict the 502nd token based on range(250, 501)? Could you explain this in more detail and point out the related code?

Besides, how can I use a trained model for inference and map token indices back to vocabulary entries? Here is the script I wrote; I'm not sure it is correct. Please let me know what you think. Thanks for your help!

...
import torch
from torch.utils.data import DataLoader

args.load = 'path to the trained model.ckpt'

model = LightningModel(args)
model.eval()

# use eval_dataset for example; DataLoader's default batch_size is 1
eval_dataset = model.eval_dataset
eval_dataloader = DataLoader(eval_dataset, num_workers=args.num_workers,
                             collate_fn=model.collate_fn, drop_last=False,
                             pin_memory=True, shuffle=False)

# map tensor indices back to vocabulary strings
def idx2vocab_func(batch, idx2vocab):  # assuming batch_size = 1
    vocab_len = len(idx2vocab)
    batch = batch.view(-1).tolist()
    # indices beyond the vocabulary are rendered as 'UNK'
    return [idx2vocab[i] if i < vocab_len else 'UNK' for i in batch]

def get_pred(batch):
    y = batch['input_seq']['values']
    y_pred_types, y_pred_values = model(batch['input_seq'], rel=batch['rel_mask'],
                                        positions=batch['positions'])

    # build the same mask as in mrr: 'extended' holds the number of leading
    # memory tokens, so keep only positions at or past that boundary
    ext = batch['extended'].unsqueeze(-1).repeat(1, y.size(-1))
    ext_ids = torch.arange(y.size(-1), device=ext.device).view(1, -1).repeat(*(y.size()[:-1] + (1,)))
    where = (ext_ids >= ext).view(-1)

    y_pred_values = y_pred_values.view(-1, y_pred_values.size(-1))[where]

    # keep only the top-1 predicted value token at each remaining position
    _, y_pred_values = torch.topk(y_pred_values, k=1, dim=-1)
    return y_pred_values.view(-1)

# run inference without gradient tracking
with torch.no_grad():
    for i, sample in enumerate(eval_dataloader):
        print('---------------Input----------------')
        input_v = sample['input_seq']['values']
        print(input_v.shape)
        print('Vocab:', ' '.join(idx2vocab_func(input_v, idx2vocab_value)))

        print('---------------Target----------------')
        target_v = sample['target_seq']['values']
        print(target_v.shape)
        print('Vocab:', ' '.join(idx2vocab_func(target_v, idx2vocab_value)))

        print('---------------Predicted----------------')
        pred_v = get_pred(sample)
        print(pred_v.shape)
        print('Vocab:', ' '.join(idx2vocab_func(pred_v, idx2vocab_value)))
@serjtroshin
Collaborator

Hi, thank you for the question! For code completion evaluation, we only predict the next type/value given the ground-truth context, i.e., in the teacher-forcing regime. Predicting the whole subtree given the context is out of scope for this work.
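
In other words, every position is scored in a single forward pass against the ground-truth prefix; no predicted token is fed back in. A rough sketch of the distinction (model_step is a stand-in for a single forward pass returning per-position logits, not an actual function in this repo):

import torch

def teacher_forcing_eval(model_step, tokens):
    # one forward pass: position t is scored from the ground-truth
    # tokens 0..t-1, so all positions are evaluated simultaneously
    logits = model_step(tokens[:-1])             # (seq_len - 1, vocab)
    preds = logits.argmax(dim=-1)
    return (preds == tokens[1:]).float().mean()  # next-token accuracy

def autoregressive_complete(model_step, context, n_new):
    # what the question describes: feed each prediction back in;
    # this is generation, and it is not how the evaluation works
    tokens = context.clone()
    for _ in range(n_new):
        next_tok = model_step(tokens)[-1].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_tok])
    return tokens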
