Problem when reproducing experiment #5

Closed
cat538 opened this issue Mar 27, 2024 · 1 comment
Comments


cat538 commented Mar 27, 2024

Thanks for the great work!
I'm having a small problem reproducing the PPL results in the paper. I used the code snippet from the GPTQ repo for measuring PPL and was able to reproduce the FP16 baselines reported in the paper for the LLaMA family, but I was unable to reproduce the FP16 baseline for Mistral-7B using the same test code:

import torch
import torch.nn as nn
from datasets import load_dataset
from tqdm import tqdm

# `model`, `tokenizer`, and `input_len` (the evaluation sequence length) are assumed
# to be defined already, e.g. a Mistral-7B checkpoint loaded with transformers.

testdata = load_dataset('wikitext', 'wikitext-2-raw-v1', split='test')
testenc = tokenizer("\n\n".join(testdata['text']), return_tensors='pt')["input_ids"]

nsamples = testenc.numel() // input_len
nlls = []

loss_fct = nn.CrossEntropyLoss()
for i in tqdm(range(nsamples)):
    batch = testenc[:, (i * input_len) : ((i + 1) * input_len)].to(model.device)
    outputs = model.model(batch)
    hidden_states = outputs[0]
    logits = model.lm_head(hidden_states)
    # Next-token prediction: drop the last logit and the first label
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = batch[:, 1:].to(model.lm_head.weight.device)
    loss = loss_fct(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
    )
    neg_log_likelihood = loss.float() * input_len
    nlls.append(neg_log_likelihood)

ppl = torch.exp(torch.stack(nlls).sum() / (nsamples * input_len)).item()

Specifically, I used Mistral-7B-v0.1 and tried both seqlen=8000 and seqlen=8192; in both cases the result is slightly lower than the number in the paper, which has been giving us a bit of trouble.

Would you be willing to release the code you used for measuring PPL?

@chooper1
Collaborator

chooper1 commented Apr 5, 2024

Thank you for your interest in our work!

I was able to reproduce this. The gap in scores depends on which attention implementation Transformers uses. I measured our PPL numbers with seqlen=8192 using "_attn_implementation": "eager" in the config.json file. Newer versions of transformers default to "_attn_implementation": "sdpa" instead. When using "sdpa" I get 4.73 instead of 4.76. Let me know if this doesn't fix the issue.
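
For reference, the same setting can also be forced at load time instead of editing config.json. This is just a minimal sketch, assuming a recent transformers version (>= 4.36, which supports the attn_implementation argument) and the standard Hugging Face model id:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # assumed model id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    attn_implementation="eager",  # match the paper's eager setting rather than the "sdpa" default
    device_map="auto",
)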

@chooper1 chooper1 closed this as completed Apr 5, 2024