add simple demo ppl test with wikitext2 #17

Merged
merged 2 commits into from
Apr 26, 2023
Conversation

@qwopqwop200 (Collaborator) commented Apr 26, 2023

This is demo code that uses perplexity (ppl) to test the performance of AutoGPTQ.

opt-125m on wikitext2:

| Method | PPL |
| --- | --- |
| RTN baseline | 37.27 |
| GPTQ-for-LLaMa | 29.21 |
| AutoGPTQ | 29.87 |

Some performance degradation is observed relative to GPTQ-for-LLaMa, but the gap seems to be a minor error.
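For context on the metric being compared: perplexity is the exponential of the average per-token negative log-likelihood over the evaluation text. A minimal sketch of that calculation (the function name and the synthetic inputs are illustrative, not taken from the demo code in this PR):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp of the mean per-token negative log-likelihood.

    token_nlls: a list of per-token NLL values (natural log), as a language
    model would produce over an evaluation corpus such as wikitext2.
    """
    return math.exp(sum(token_nlls) / len(token_nlls))

# A model that is uniform over a 4-symbol vocabulary assigns each token
# probability 1/4, so every per-token NLL is ln(4) and perplexity is 4.
uniform_nlls = [math.log(4)] * 10
print(perplexity(uniform_nlls))  # prints 4.0 (up to float rounding)
```

In practice the demo slides a fixed-length window over the tokenized wikitext2 test set, accumulates the model's NLL on each window, and exponentiates the average; lower perplexity means the quantized model predicts the text better.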

@PanQiWei (Collaborator) commented:
Yeah, I can also confirm that there is a small gap between AutoGPTQ and GPTQ-for-LLaMa by running this example.

I'm not quite sure whether it's because the logic AutoGPTQ uses in the model.quant() function, and the way the attention_mask is processed, differ slightly from GPTQ-for-LLaMa.

This finding is important; maybe there's room for improvement here.

I think this pr can be merged.

@PanQiWei PanQiWei merged commit 5a70052 into main Apr 26, 2023
@qwopqwop200 qwopqwop200 deleted the simple-demo branch April 26, 2023 12:00