
AMD quantize #6

Open
rraulison opened this issue Dec 1, 2023 · 4 comments

@rraulison

I'm trying to quantize, but no model is generated. My hardware is AMD.

Loading model ...
Quantizing model weights for int8 weight-only symmetric per-channel quantization
Morto

("Morto" is the Portuguese-locale rendering of "Killed" — the kernel terminated the process.)
@Chillee
Contributor

Chillee commented Dec 1, 2023

Nothing is generated in the model folder? Can you provide more details on what's being printed?

@rraulison
Author

rraulison commented Dec 1, 2023

I can run inference:

python generate.py --compile --checkpoint_path checkpoints/$MODEL_REPO/model.pth --prompt "name 10 animals similar to duck"
Loading model ...
Time to load model: 75.21 seconds
/home/pai/pytorch/gpt-fast/model.py:182: UserWarning: 1Torch was not compiled with flash attention. (Triggered internally at ../aten/src/ATen/native/transformers/hip/sdp_utils.cpp:254.)
  y = F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0)
/home/pai/pytorch/gpt-fast/model.py:182: UserWarning: 1Torch was not compiled with memory efficient attention. (Triggered internally at ../aten/src/ATen/native/transformers/hip/sdp_utils.cpp:292.)
  y = F.scaled_dot_product_attention(q, k, v, attn_mask=mask, dropout_p=0.0)
Compilation time: 68.25 seconds
name 10 animals similar to duck. Here are 10 animals that are similar to ducks:

1. Geese - Geese are similar to ducks in many ways, but they are generally larger and have longer necks.

2. Swans - Swans are larger than ducks and have a more slender, graceful appearance.

3. Coots - Coots are small to medium-sized birds that are similar to ducks in many ways, but they have a more rounded body shape and a distinctive red beak.

4. Grebes - Grebes are small to medium-sized birds that are similar to ducks in many ways, but they have a more slender body shape and a distinctive long neck.

5. Mergansers - Mergansers are small to medium-sized birds that are similar to ducks in many ways, but they have a more slender body shape and a distinctive black and
Time for inference 1: 56.20 sec total, 3.56 tokens/sec
Bandwidth achieved: 47.96 GB/s
name 10 animals similar to duck.

1. Goose
2. Swan
3. Turkey
4. Pheasant
5. Chicken
6. Quail
7. Pigeon
8. Crow
9. Heron
10. Ostrich HM Revenue & Customs (HMRC) is the UK’s tax, payments and customs authority. Its purpose is to collect taxes, pay benefits, and manage national insurance. HMRC also enquires into and investigates tax evasion and avoidance.

How does HMRC collect taxes?
HMRC collects taxes through various methods, including:

1. PAYE (Pay As You Earn) - employers deduct tax and National Insurance contributions from their employees' wages and pay them over to HMRC.
2. Self Assessment - individuals who are self-employed or have
Time for inference 2: 56.82 sec total, 3.52 tokens/sec
Bandwidth achieved: 47.44 GB/s
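As a sanity check on these numbers (rough arithmetic, assuming Llama-2-7B's ~6.74B parameters stored in bf16), the reported bandwidth and tokens/sec are consistent with memory-bandwidth-bound decoding:

```python
# Decode throughput on this workload is memory-bandwidth-bound: every
# generated token reads all of the weights once, so achieved bandwidth
# should be roughly checkpoint size × tokens/sec.
params = 6.74e9                      # Llama-2-7B parameter count
checkpoint_gb = params * 2 / 1e9     # bf16 weights, ~13.5 GB
tokens_per_sec = 3.56                # from the log above
bandwidth = checkpoint_gb * tokens_per_sec
print(f"{bandwidth:.1f} GB/s")       # ≈ 48 GB/s, matching the reported 47.96
```

So the numbers are internally consistent; whether ~48 GB/s is reasonable depends on the hardware.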

but when I try to quantize, the output is:

(pyenv) (base) pai@localhost:~/pytorch/gpt-fast> python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int8
Loading model ...
Quantizing model weights for int8 weight-only symmetric per-channel quantization
Morto

Trying GPTQ:

python quantize.py --mode int4-gptq --calibration_tasks wikitext --calibration_seq_length 2048
Loading model ...
Quantizing model weights for int4 weight-only affine per-channel groupwise quantization using GPTQ...
Traceback (most recent call last):
  File "/home/pai/pytorch/gpt-fast/quantize.py", line 612, in <module>
    quantize(args.checkpoint_path, args.mode, args.groupsize, args.calibration_tasks, args.calibration_limit, args.calibration_seq_length, args.pad_calibration_inputs, args.percdamp, args.blocksize, args.label)
  File "/home/pai/pytorch/gpt-fast/quantize.py", line 573, in quantize
    quantized_state_dict = quant_handler.create_quantized_state_dict(
  File "/home/pai/pytorch/pyenv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/pai/pytorch/gpt-fast/quantize.py", line 281, in create_quantized_state_dict
    inputs = GPTQQuantHandler.get_inputs(self.mod, tokenizer, calibration_tasks, calibration_limit, calibration_seq_length, pad_calibration_inputs)
  File "/home/pai/pytorch/gpt-fast/quantize.py", line 252, in get_inputs
    input_recorder = InputRecorder(
NameError: name 'InputRecorder' is not defined. Did you mean: 'input_recorder'?

Contents of my folder (checkpoints/meta-llama/Llama-2-7b-chat-hf> ls):

config.json                       pytorch_model-00001-of-00002.bin
generation_config.json            pytorch_model-00002-of-00002.bin
LICENSE.txt                       pytorch_model.bin.index.json
model-00001-of-00002.safetensors  README.md
model-00002-of-00002.safetensors  special_tokens_map.json
model.pth                         tokenizer_config.json
model.safetensors.index.json      tokenizer.json
USE_POLICY.md                     tokenizer.model

pip list

Package             Version
------------------- --------------------------
certifi             2022.12.7
charset-normalizer  2.1.1
filelock            3.9.0
fsspec              2023.10.0
huggingface-hub     0.19.4
idna                3.4
Jinja2              3.1.2
MarkupSafe          2.1.3
mpmath              1.2.1
networkx            3.0rc1
numpy               1.24.1
packaging           23.2
Pillow              9.3.0
pip                 23.3.1
pytorch-triton-rocm 2.1.0+dafe145982
PyYAML              6.0.1
requests            2.28.1
sentencepiece       0.1.99
setuptools          65.5.0
sympy               1.11.1
torch               2.2.0.dev20231130+rocm5.7
torchaudio          2.2.0.dev20231130+rocm5.7
torchvision         0.17.0.dev20231130+rocm5.7
tqdm                4.66.1
typing_extensions   4.8.0
urllib3             1.26.13

python --version
Python 3.10.13

@Chillee
Contributor

Chillee commented Dec 4, 2023

The performance here is a lot lower than I'd expect. What GPU are you using?

As for the quantization note, perhaps the issue is that you're running out of CPU memory at some point during the process? I don't see any reason why the quantization script would stop in the middle.
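For rough scale (a back-of-the-envelope sketch, assuming the bf16 weights and the int8 copy are both resident while the quantized state dict is built):

```python
# Rough host-RAM estimate for int8 weight-only quantization of Llama-2-7B:
# the original bf16 weights and the new int8 copy coexist in memory while
# the quantized state dict is assembled.
params = 6.74e9
bf16_gib = params * 2 / 2**30    # original weights, ~12.6 GiB
int8_gib = params * 1 / 2**30    # quantized copy, ~6.3 GiB
total_gib = bf16_gib + int8_gib
print(f"~{total_gib:.1f} GiB of host RAM")  # ~18.8 GiB, before any overhead
```

On a 16 GiB desktop that alone would trigger the OOM killer, which would be consistent with the "Morto" ("Killed") message in the log.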

@rraulison
Author

I am using the iGPU of a Ryzen 5600G CPU.

Yes, it looks like I need more memory to quantize. Thanks.
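A quick way to confirm how much host RAM is available before rerunning quantize.py (a sketch using POSIX sysconf; assumes a Linux host, as in the logs above):

```python
import os

# Query total physical memory via POSIX sysconf (Linux/Unix only).
page_size = os.sysconf("SC_PAGE_SIZE")
phys_pages = os.sysconf("SC_PHYS_PAGES")
total_gib = page_size * phys_pages / 2**30
print(f"Total RAM: {total_gib:.1f} GiB")
```

If the total is well under the ~19 GiB estimated above, the int8 pass will likely be killed again; `dmesg` should also show an out-of-memory kill record for the Python process.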
