[DRAFT] Enable CPU data layout convert to XPU #2441

jiqing-feng · 2025-06-25T07:29:34Z

Enable quantizing a model on CPU and reloading it on XPU:

import torch
from transformers import TorchAoConfig, AutoModelForCausalLM, AutoTokenizer
from torchao.quantization import Int4WeightOnlyConfig
from torchao.dtypes import Int4CPULayout

quant_config = Int4WeightOnlyConfig(group_size=32, layout=Int4CPULayout())
quantization_config = TorchAoConfig(quant_type=quant_config)

# Load and quantize the model
quantized_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype="auto",
    device_map="cpu",
    quantization_config=quantization_config
)
# Save the quantized model (int4 weight-only, so name the directory accordingly)
output_dir = "llama-3.1-8b-torchao-int4"
quantized_model.save_pretrained(output_dir, safe_serialization=False)

# reload the quantized model
reloaded_model = AutoModelForCausalLM.from_pretrained(
    output_dir,
    device_map="auto",
    torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
input_text = "What are we having for dinner?"
input_ids = tokenizer(input_text, return_tensors="pt").to(reloaded_model.device.type)

output = reloaded_model.generate(**input_ids, max_new_tokens=10)
print(tokenizer.decode(output[0], skip_special_tokens=True))
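
The reload step above uses `device_map="auto"`, which only lands on XPU if an XPU is actually visible to PyTorch. As a rough sketch (not part of this PR), the device could be checked explicitly before reloading. `pick_device` and `detect_device` are hypothetical helper names introduced here for illustration, and the sketch assumes a PyTorch build recent enough to expose `torch.xpu.is_available()`:

```python
def pick_device(xpu_available: bool, cuda_available: bool = False) -> str:
    """Hypothetical helper: prefer XPU, then CUDA, otherwise CPU."""
    if xpu_available:
        return "xpu"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which accelerators are visible; fall back to CPU."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed: nothing to probe, so report CPU.
        return "cpu"
    # Older PyTorch builds may not ship the torch.xpu module at all.
    xpu = getattr(torch, "xpu", None)
    xpu_ok = xpu is not None and xpu.is_available()
    return pick_device(xpu_ok, torch.cuda.is_available())


if __name__ == "__main__":
    print(f"Reload target: {detect_device()}")
```

The returned string could presumably be passed as `device_map=detect_device()` in the reload call, which makes it explicit when the checkpoint would silently stay on CPU.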

pytorch-bot · 2025-06-25T07:29:39Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2441

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

facebook-github-bot added the CLA Signed label Jun 25, 2025

jiqing-feng added 2 commits June 25, 2025 15:05

enable cpu to xpu

1c9f2fe

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

fix format

affe779

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
