
XPU support #25965

Closed

Serizao opened this issue Sep 4, 2023 · 3 comments

Comments
@Serizao
Contributor

Serizao commented Sep 4, 2023

Feature request

I would like to know whether XPU support is in the pipeline. When I try to use the current package, I get this error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, xpu:0 and cpu! (when checking argument for argument index in method wrapper_XPU__index_select)

I read the documentation carefully and didn't find any mention of an XPU backend, so I think it is not implemented yet. I found a repository that implements it, https://github.com/rahulunair/transformers_xpu, but I get another error with it; I still think that implementation is sound.
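
For reference, that RuntimeError typically means the model weights sit on xpu:0 while the tokenized inputs are still on the CPU, so the embedding lookup (the index_select call in the trace) sees two devices. A minimal sketch of the mismatch and the usual fix, assuming IPEX is installed and using the same checkpoint as the full script further down:

import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the "xpu" device
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

device = torch.device("xpu" if torch.xpu.is_available() else "cpu")
name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name).to(device)

inputs = tokenizer("Bonjour le monde", return_tensors="pt")
# Without the next line the input_ids stay on the CPU while the embedding
# weights are on xpu:0, which raises the mixed-device error above.
inputs = {k: v.to(device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))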

Motivation

Support Intel GPUs such as the Arc A770 and A750, and the many that will come after them.

Your contribution

I can test if needed

@amyeroberts
Collaborator

Hi @Serizao, thanks for opening this issue!

There's a PR, #25714, adding XPU support to the Trainer.

Could you share a minimal reproducer so we can replicate the issue? The underlying problem might be the weight allocation rather than XPU itself.
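
For example, a quick check of where the weights actually land might look like this (a minimal sketch, assuming IPEX is installed):

import intel_extension_for_pytorch as ipex  # noqa: F401 -- registers the "xpu" backend
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name).to("xpu")

# Every parameter should report xpu:0 if the weights were allocated correctly...
print({p.device for p in model.parameters()})
# ...while a freshly tokenized batch stays on the CPU unless it is moved explicitly.
print({k: v.device for k, v in tokenizer("test", return_tensors="pt").items()})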

cc @muellerzr @pacman100

@abhilash1910
Contributor

Hi @Serizao, once PR #25714 is in, the Hugging Face suite should be functional with the next-gen Arc systems from Intel. If you face any issues on any Intel device, please post a reproducer here. Yes, Rahul (the repository mentioned) is my colleague, and this PR aims to resolve the issues arising from the HF suite on the Intel side.
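
When posting a reproducer, it also helps to confirm the device is visible first; a minimal sanity check using the same IPEX calls as the script below:

import intel_extension_for_pytorch as ipex
import torch

print("IPEX version:", ipex.__version__)
print("XPU available:", torch.xpu.is_available())
if torch.xpu.is_available():
    print("Device:", ipex.xpu.get_device_name())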

@Serizao
Contributor Author

Serizao commented Sep 5, 2023

My code to reproduce the issue:

import intel_extension_for_pytorch as ipex  # registers the "xpu" device with PyTorch
import torch

from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq
from transformers import Trainer, TrainingArguments

import warnings

warnings.filterwarnings("ignore", category=UserWarning, module="intel_extension_for_pytorch")
warnings.filterwarnings("ignore", category=UserWarning, module="torchvision.io.image", lineno=13)


# Pick the XPU if it is visible, otherwise fall back to the CPU.
DEVICE = torch.device("xpu" if torch.xpu.is_available() else "cpu")
print(f"Finetuning on device: {ipex.xpu.get_device_name()}")
print(f"Using device: {DEVICE}")

model_name = "facebook/nllb-200-distilled-600M"
#model_name = "Helsinki-NLP/opus-mt-tc-big-fr-en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
model.to(DEVICE)


dataset = load_dataset("json", data_files="dataset-3/unit/data.json")
dataset = dataset["train"].shuffle(seed=42)
dataset = dataset.shard(num_shards=10, index=0)


def preprocess_function(examples):
    padding = "max_length"
    max_length = 512

    inputs = [ex for ex in examples["fr"]]
    targets = [ex for ex in examples["en"]]
    model_inputs = tokenizer(inputs, padding=padding, truncation=True)
    labels = tokenizer(targets, padding=padding, truncation=True)
    #model_inputs = tokenizer(inputs, max_length=max_length, padding=padding, truncation=True)
    #labels = tokenizer(targets, max_length=max_length, padding=padding, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


train_dataset = dataset.map(preprocess_function, batched=True, desc="Running tokenizer")
data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=tokenizer.pad_token_id,
    pad_to_multiple_of=64,
)

# Training arguments (PyTorch)
training_args = TrainingArguments(
    gradient_accumulation_steps=2,
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_dir="./logs",
    logging_steps=100,
    save_steps=500,
    bf16=True,  # set dtype to bfloat16
    save_total_limit=2,  # keep only the 2 most recent checkpoints
    push_to_hub=False,
    #use_ipex=True,  # optimize the model and optimizer with Intel Extension for PyTorch (optional)
)

trainer = Trainer(
    model=model, 
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)


trainer.train()

save_directory = "./dataset-3/models/new-finetune-fb-nllb-600M"
model.save_pretrained(save_directory)
tokenizer.save_pretrained(save_directory)


Please excuse the quality of my code; I'm a novice in artificial intelligence.

@Serizao Serizao closed this as completed Sep 14, 2023