
XPU support #25965

Closed

Serizao opened this issue Sep 4, 2023 · 3 comments

Comments
@Serizao
Contributor

Serizao commented Sep 4, 2023

Feature request

I would like to know whether XPU support is in the pipeline. When I try to use the current package, I get this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:0 and cpu! (when checking argument for argument index in method wrapper_XPU__index_select)

I read the documentation carefully and didn't find any mention of an XPU backend, so I think it is not implemented yet. I found a repository that implements it, https://github.com/rahulunair/transformers_xpu, but I get another error with it; I still think that implementation is sound.
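
For reference, that RuntimeError typically means the model weights sit on xpu:0 while the tokenized inputs are still on the CPU, so the embedding lookup (the index_select call in the trace) sees two devices. A minimal sketch of the mismatch and the usual fix, assuming IPEX is installed and using the same checkpoint as the full script further down:

import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the "xpu" device
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

device = torch.device("xpu" if torch.xpu.is_available() else "cpu")
name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name).to(device)

inputs = tokenizer("Bonjour le monde", return_tensors="pt")
# Without the next line the input_ids stay on the CPU while the embedding
# weights are on xpu:0, which raises the mixed-device error above.
inputs = {k: v.to(device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))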

Motivation

Support Intel GPUs such as the Arc A770 and A750, and the many that will come after them.

Your contribution

I can test if needed

@amyeroberts
Collaborator

Hi @Serizao, thanks for opening this issue!

There's a PR, #25714, adding XPU support to the Trainer.

Could you share a minimal reproducer so we can replicate the issue? The underlying problem might be the weight allocation rather than XPU itself.
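
For example, a quick check of where the weights actually land might look like this (a minimal sketch, assuming IPEX is installed):

import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the "xpu" backend
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name).to("xpu")

# Every parameter should report xpu:0 if the weights were allocated correctly...
print({p.device for p in model.parameters()})
# ...while a freshly tokenized batch stays on the CPU unless it is moved explicitly.
print({k: v.device for k, v in tokenizer("test", return_tensors="pt").items()})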

cc @muellerzr @pacman100

@abhilash1910
Contributor

Hi @Serizao, once PR #25714 is in, the Hugging Face suite should be functional with the next-gen Arc systems from Intel. If you face any issues on any Intel device, please post a reproducer here. Yes, Rahul (the repository mentioned) is my colleague, and this PR aims to resolve the issues arising from the HF suite on the Intel side.
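
When posting a reproducer, it also helps to confirm the device is visible first; a minimal sanity check using the same IPEX calls as the script below:

import intel_extension_for_pytorch as ipex
import torch

print("IPEX version:", ipex.__version__)
print("XPU available:", torch.xpu.is_available())
if torch.xpu.is_available():
    print("Device:", ipex.xpu.get_device_name())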

@Serizao
Contributor Author

Serizao commented Sep 5, 2023

My code to reproduce the issue:

import intel_extension_for_pytorch as ipex  # registers the "xpu" device with PyTorch
import torch

from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq
from transformers import Trainer, TrainingArguments

import warnings

warnings.filterwarnings("ignore", category=UserWarning, module="intel_extension_for_pytorch")
warnings.filterwarnings("ignore", category=UserWarning, module="torchvision.io.image", lineno=13)


# Pick the XPU if it is visible, otherwise fall back to the CPU.
DEVICE = torch.device("xpu" if torch.xpu.is_available() else "cpu")
print(f"Finetuning on device: {ipex.xpu.get_device_name()}")
print(f"Using device: {DEVICE}")

model_name = "facebook/nllb-200-distilled-600M"
#model_name = "Helsinki-NLP/opus-mt-tc-big-fr-en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.to(DEVICE)


dataset = load_dataset("json", data_files="dataset-3/unit/data.json")
dataset = dataset["train"].shuffle(seed=42)
dataset = dataset.shard(num_shards=10, index=0)


def preprocess_function(examples):
    padding = "max_length"
    max_length = 512

    inputs = [ex for ex in examples["fr"]]
    targets = [ex for ex in examples["en"]]
    model_inputs = tokenizer(inputs, padding=padding, truncation=True)
    labels = tokenizer(targets, padding=padding, truncation=True)
    #model_inputs = tokenizer(inputs, max_length=max_length, padding=padding, truncation=True)
    #labels = tokenizer(targets, max_length=max_length, padding=padding, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


train_dataset = dataset.map(preprocess_function, batched=True, desc="Running tokenizer")
data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=tokenizer.pad_token_id,
    pad_to_multiple_of=64,
)

# Training arguments (PyTorch)
training_args = TrainingArguments(
    gradient_accumulation_steps=2,
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_dir="./logs",
    logging_steps=100,
    save_steps=500,
    bf16=True,  # set dtype to bfloat16
    save_total_limit=2,  # keep only the 2 most recent checkpoints
    push_to_hub=False,
    #use_ipex=True,  # optimize the model and optimizer with Intel Extension for PyTorch (optional)
)

trainer = Trainer(
    model=model, 
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)


trainer.train()

save_directory = "./dataset-3/models/new-finetune-fb-nllb-600M"
model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)


Please excuse the quality of my code; I'm a novice in artificial intelligence.

@Serizao Serizao closed this as completed Sep 14, 2023