Hi,
I think this bug is similar to #252.
I found that the following code can load any GPTQ v2 model successfully:
model = GPTQModel.from_quantized(args.model, device_map='auto', torch_dtype=torch.float16)
However, when I try to save it with:
model.save_pretrained(args.save_dir)
no checkpoint files are saved at all. Additionally, if I save via:
model.model.save_pretrained(args.save_dir)
the shard checkpoints are written. However, when I reload the checkpoints saved through model.model.save_pretrained, the model produces wrong outputs.
Overall: model.save_pretrained does not save shard checkpoints at all, while model.model.save_pretrained saves them but the reloaded model produces wrong outputs.
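For reference, here is a minimal repro sketch combining the steps above into one function. Assumptions: the gptqmodel package is importable as `gptqmodel`, `model_path` points at a local GPTQ v2 checkpoint, and `save_dir` is writable; the function name `reproduce` is mine, not part of any API.

```python
def reproduce(model_path: str, save_dir: str) -> None:
    """Repro sketch: load a GPTQ v2 checkpoint, then try both save paths."""
    import torch
    from gptqmodel import GPTQModel  # assumption: package import name

    # Loading works for any GPTQ v2 model.
    model = GPTQModel.from_quantized(
        model_path, device_map="auto", torch_dtype=torch.float16
    )

    # 1) Wrapper-level save: expected to write shard checkpoints,
    #    but nothing appears in save_dir.
    model.save_pretrained(save_dir)

    # 2) Inner HF model save: shard checkpoints are written,
    #    but reloading them yields wrong outputs.
    model.model.save_pretrained(save_dir)
```

Pointing either saved directory back at GPTQModel.from_quantized should make the discrepancy between the two save paths easy to verify.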