Rollback model loading to match the code from the paper #23

justheuristic · 2023-07-05T20:25:17Z

This PR fixes the bad perplexity that was observed with the following config:

    CUDA_VISIBLE_DEVICES=3 OMP_NUM_THREADS=16 MKL_NUM_THREADS=16 python main.py decapoda-research/llama-7b-hf custom --custom_data_path data/red_pajama_n=1024.pth --nsamples 128 --wbits 3 --perchannel --percdamp 1.0 --groupsize 16 --qq_scale_bits 3 --qq_zero_bits 3 --qq_groupsize 64 --outlier_threshold=0.7 --permutation_order act_order

... and with all dependency versions set by requirements.txt

p.s. kind thanks to the authors (esp. @Godofnothing @Vahe1994 ) for helping me figure out what was causing the problem

justheuristic · 2023-07-05T20:27:42Z

@Vahe1994 i'm re-running the main config now, results will be available in 40-ish minutes

Would you like me to run any additional tests to make sure this PR does not introduce more bugs?

Vahe1994 · 2023-07-05T21:11:19Z

> @Vahe1994 i'm re-running the main config now, results will be available in 40-ish minutes
>
> Would you like me to run any additional tests to make sure this PR does not introduce more bugs?

I think your experiments are sufficient.

Vahe1994

I looked at the code and the experiments you provided; all seems good. Thank you for the bug fix!

poedator · 2023-09-25T19:25:53Z

I tried to reproduce the problem fixed here. It turned out to come from the omission of this code:

    if dtype == "auto":
        dtype = AutoConfig.from_pretrained(model_path).torch_dtype or "auto"  # force transformers 4.29.2 to follow the same rules as 4.30.x

which still needed to be kept while we were testing the code with transformers==4.29.2
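
For readers following along, here is a minimal sketch of the fallback rule in the snippet above, written as a standalone function so it can be run without downloading a model. The helper name `resolve_dtype` and the `SimpleNamespace` stand-in configs are assumptions made for illustration; they are not part of the repository, and this is not necessarily how modelutils.py structures the logic.

```python
from types import SimpleNamespace


def resolve_dtype(dtype, config):
    """Pick the dtype to load weights in (hypothetical helper, not from the repo)."""
    if dtype == "auto":
        # Prefer the dtype recorded in the checkpoint config;
        # fall back to "auto" when the config does not specify one.
        return getattr(config, "torch_dtype", None) or "auto"
    # An explicit dtype requested by the caller is passed through unchanged.
    return dtype


# Stand-in configs so the sketch runs without any model files.
fp16_config = SimpleNamespace(torch_dtype="float16")
empty_config = SimpleNamespace(torch_dtype=None)

print(resolve_dtype("auto", fp16_config))     # float16
print(resolve_dtype("auto", empty_config))    # auto
print(resolve_dtype("float32", fp16_config))  # float32
```

In real use the config would presumably come from `AutoConfig.from_pretrained(model_path)`, as in the snippet above, and the resolved value would be passed as `torch_dtype=` when loading the model.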

Modify the code s.t. the new number of errors is even

8c5fcd9

justheuristic added 2 commits July 5, 2023 23:41

Update modelutils.py

fb4c1a9

Update modelutils.py

779534e

Vahe1994 approved these changes Jul 5, 2023

Vahe1994 merged commit e75c55b into Vahe1994:main Jul 5, 2023

poedator mentioned this pull request Jul 19, 2023

undo low_cpu_mem_usage=True in huggingface.py #26

Merged
