Why can't I reproduce the results of the paper? #37
Comments
@ilmarkov hi, could you reply when you have some spare time? I am stuck on this problem and don't know how to reproduce the results shown in the paper. Please. |
Hi, what command are you running? Generally, the code has been running fine for many people; however, it is possible that some recent HuggingFace update broke things. |
Hi, in the paper we run with calibration data from
|
We evaluate on wikitext but always use c4 as calibration data (this should be noted in the paper). Yes, this is a bit weird, however there can sometimes be outlier models which do not follow the trend. |
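For readers comparing numbers in this thread: perplexity is commonly computed as the exponential of the mean per-token negative log-likelihood on the evaluation set (wikitext here), independently of which dataset was used for calibration (c4). The following is a minimal sketch of that formula only; the function name and the example losses are hypothetical and are not taken from this repository.

```python
import math


def perplexity(token_nlls):
    """Return exp(mean negative log-likelihood per token).

    token_nlls: per-token losses in nats (natural log), as a list of floats.
    """
    if not token_nlls:
        raise ValueError("need at least one token loss")
    return math.exp(sum(token_nlls) / len(token_nlls))


# Hypothetical per-token losses, for illustration only (not real model output).
example_nlls = [2.5, 2.9, 2.6, 2.8]
ppl = perplexity(example_nlls)
print(f"perplexity: {ppl:.2f}")
```

Because the metric depends only on the evaluation tokens, a change in calibration data shows up purely through the pruned weights, which is why the wrong calibration set can shift the reported number so much.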
Yes, I realize I made a mistake: I didn't use c4 as calibration data. I have now solved the problem. Thank you! |
Hi, I am very confused. I am trying to reproduce the opt-1.3b sparsity results. I get the same dense-model perplexity on wikitext2 (14.62), but the perplexity after pruning to 50% sparsity is 26.71, which is much higher than the 17.46 reported in the paper. I didn't modify any code in this repo. Are the hyperparameters used in the paper's experiments different?
I look forward to your reply.
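To put the gap in perspective, here is a small sketch that computes the relative perplexity increase over the dense baseline, using only the three numbers quoted above. The helper name `relative_increase` is made up for illustration.

```python
def relative_increase(dense_ppl, pruned_ppl):
    """Fractional perplexity increase of a pruned model over its dense baseline."""
    if dense_ppl <= 0:
        raise ValueError("dense perplexity must be positive")
    return (pruned_ppl - dense_ppl) / dense_ppl


DENSE = 14.62      # dense opt-1.3b on wikitext2, as quoted in this thread
PAPER = 17.46      # 50% sparsity, value reported in the paper
OBSERVED = 26.71   # 50% sparsity, value obtained in this reproduction attempt

paper_gap = relative_increase(DENSE, PAPER)
observed_gap = relative_increase(DENSE, OBSERVED)

print(f"paper:    +{paper_gap:.1%} over dense")
print(f"observed: +{observed_gap:.1%} over dense")
```

A degradation several times larger than the reported one usually points to a setup difference (such as the calibration dataset) rather than run-to-run noise.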