I am training a simple translation model using DPO Trainer and the code is below:
from datasets import Dataset
from transformers import Trainer, TrainingArguments, AutoTokenizer, AutoModelForSeq2SeqLM
from torch.nn.functional import cross_entropy
import torch
# Load pre-trained T5 tokenizer and model
model_name = "t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
from trl import DPOConfig, DPOTrainer
# Example tokenized dataset (replace with your actual dataset)
tokenized_dataset = Dataset.from_dict({
    'prompt': ["Translate to French: The house is beautiful.", "Translate to French: The cat sat on the mat."],
    'chosen': ["La maison est belle.", "Le chat était sur le tapis."],
    'rejected': ["La maison est laide.", "Le chat a mangé le tapis."],
})
# Preprocess function
def preprocess_function(examples):
    inputs = examples['prompt']
    chosen_targets = examples['chosen']
    rejected_targets = examples['rejected']
    model_inputs = tokenizer(inputs, max_length=512, truncation=True, padding="max_length")
    model_inputs["labels"] = tokenizer(chosen_targets, max_length=512, truncation=True, padding="max_length").input_ids
    model_inputs["chosen_labels"] = tokenizer(chosen_targets, max_length=512, truncation=True, padding="max_length").input_ids
    model_inputs["rejected_labels"] = tokenizer(rejected_targets, max_length=512, truncation=True, padding="max_length").input_ids
    return model_inputs
# Tokenize the dataset
tokenized_dataset = tokenized_dataset.map(preprocess_function, batched=False)
# Training arguments
training_args = DPOConfig(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    num_train_epochs=3,
    logging_dir="./logs",
)
# Initialize Trainer
trainer = DPOTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    tokenizer=tokenizer,
)
# Train the model
trainer.train()
The error:
AttributeError Traceback (most recent call last)
<ipython-input-24-a566970b894e> in <cell line: 64>()
62
63 # Train the model
---> 64 trainer.train()
4 frames
/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py in estimate_tokens(self, input_dict)
1186 self.warnings_issued = {}
1187 if self.main_input_name in input_dict:
-> 1188 return input_dict[self.main_input_name].numel()
1189 elif "estimate_tokens" not in self.warnings_issued:
1190 logger.warning(
AttributeError: 'list' object has no attribute 'numel'
I tried different environments, including SageMaker and Google Colab, but the error persists.
I don't know the proper fix, but I have worked around it by doing:
model.floating_point_ops = lambda s: 0
More detail: I was passing data to DPO as plain Python objects rather than tensors, since that's what DPO expects. But the floating_point_ops method of some models expects tensors and calls .numel() on the main input, which a list does not have. Since this method was only used for monitoring, I just replaced it with a no-op.
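To illustrate why the one-line patch sidesteps the error, here is a minimal sketch that needs neither transformers nor torch. `ToyModel` is a hypothetical stand-in, not the real model class: its `estimate_tokens` only imitates the `.numel()` call visible in the traceback, and the factor of 6 in `floating_point_ops` is an arbitrary placeholder. The sketch assumes the trainer calls `model.floating_point_ops(inputs)` with a single argument, which is what the `lambda s: 0` signature implies.

```python
class ToyModel:
    """Hypothetical stand-in for a model whose FLOPs estimate expects tensors."""

    main_input_name = "input_ids"

    def estimate_tokens(self, input_dict):
        # Imitates the failing line in the traceback: .numel() exists on
        # tensors but not on plain Python lists.
        return input_dict[self.main_input_name].numel()

    def floating_point_ops(self, input_dict):
        # Placeholder arithmetic; only the call into estimate_tokens matters here.
        return 6 * self.estimate_tokens(input_dict)


model = ToyModel()
batch = {"input_ids": [1, 2, 3]}  # a plain list, as in the un-collated dataset

# Before the patch: the same AttributeError as in the issue.
error_message = None
try:
    model.floating_point_ops(batch)
except AttributeError as exc:
    error_message = str(exc)

# The workaround: shadow the method on this instance with a no-op.
# The lambda is stored as an instance attribute, so it is not bound and
# receives only the inputs argument.
model.floating_point_ops = lambda s: 0
patched_result = model.floating_point_ops(batch)

print(error_message)
print(patched_result)
```

The cost of this patch is that any FLOPs figure reported during training will be zero, so it hides the symptom rather than addressing why lists reach the model in the first place.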
Versions:
transformers: 4.41.2
trl: 0.9.4
torch: 2.3.0+cu121